Neural network base class and organization of forward and backward pass processes

We have already done preparatory work to create constants and an interface for transferring the architecture of the created neural network. Let’s continue. Now I propose to move on to creating a top-level class CNet, which will act as the manager of our neural network.

To do this, we will create a new include file, neuronnet.mqh, in a subdirectory of our library and collect all the code of our CNet neural network class in it. Going forward, we will create a separate file for each new class, with file names matching class names — this keeps the project structured and makes it quick to find the code of a specific class.

We won't be able to write the complete code for the methods of this class right now, as during their implementation we will need to refer to the neural layer classes and their methods. There are currently no such classes. Why have I decided to start by creating the top-level object instead of creating the lower-level objects first? Here, I am addressing the issue of the integrity of the structure and the standardization of methods and data transfer interfaces between individual blocks of our neural network.

Later, when examining the architectural features of the neural layers, you will be able to notice differences in their functionality and, to some extent, in the information flow. When solving the problem from the bottom up, we run the risk of obtaining quite different methods and interfaces, which will then be difficult to integrate into a unified system. On the contrary, I want to create a top-level "skeleton" of our development right from the beginning, and later fill it with functionality. By early planning the architecture and functionality of the interfaces, we will simply integrate new neural layer architectures into the already established information flow.

Let's define the functionality of the CNet class. The first thing this class should do is directly assemble the neural network with the architecture provided by the user. This can be done in the class constructor, or you can create a separate method, Create. I picked the second option. Using the base class constructor without parameters will allow us to create an "empty" class instance, for example, to load a previously trained neural network. It will also make it easier to inherit the class for possible future development.

Since we have touched on loading a pre-trained network, two more pieces of class functionality follow directly: saving (Save) and loading (Load) our model.

Whether it is newly created (generated) neural layers or loaded from a file, we will need to store them and work with them. When elaborating and defining constants, we allocated a separate constant for the dynamic array storing the neural layers. We will add an instance of this object to the class variables (m_cLayers).

Let's take a look at how the work of the neural network is organized. Here we need to implement the feed-forward (FeedForward) and backpropagation (Backpropagation) algorithms. We will separate the weight update process into its own method, UpdateWeights.

Of course, you could update the weights inside the backpropagation method, which is what is most commonly seen in practice. But we are building a universal constructor. At the time of writing the code, we don't know whether weight updates will be batched, or what the batch size will be. Therefore, there is no clear understanding of the point at which the weights should be updated.

A complex problem is always easier to solve step by step. Dividing a process into smaller subprocesses makes it easier to both write code and debug it. Therefore, I decided to separate the process of updating the weights.

Let's recall the weight optimization methods. Almost all of them use a learning rate, and some require additional parameters, such as decay coefficients. We need to allow the user to specify them: the user sets them once, while we need them at every iteration, so they must be stored somewhere. Let's add a method for specifying the learning parameters (SetLearningRates) and variables for storing them (m_dLearningRate and m_adBeta). For the decay coefficients, we will create a two-element vector, which, in my opinion, makes the code more readable.
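For illustration, here is how such stored parameters are typically reused at every iteration by an Adam-style optimizer. The struct and function names here are hypothetical sketches, not part of our class:

```cpp
#include <cmath>

// Illustrative container mirroring the idea of m_dLearningRate and the
// two-element m_adBeta vector: set once, reused at each iteration.
struct LearningParams {
    double rate;      // learning rate
    double beta[2];   // decay coefficients (beta1, beta2)
};

// One Adam-style update of a single weight; m and v are running moments
// carried between iterations, t is the 1-based iteration counter.
double adam_step(double w, double grad, double &m, double &v,
                 const LearningParams &p, int t) {
    m = p.beta[0] * m + (1.0 - p.beta[0]) * grad;          // 1st moment
    v = p.beta[1] * v + (1.0 - p.beta[1]) * grad * grad;   // 2nd moment
    double m_hat = m / (1.0 - std::pow(p.beta[0], t));     // bias correction
    double v_hat = v / (1.0 - std::pow(p.beta[1], t));
    return w - p.rate * m_hat / (std::sqrt(v_hat) + 1e-8);
}
```

The point of the sketch: the rate and both betas appear in every single update, which is why they belong in class variables rather than method parameters.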

In practical use of a neural network, the user may need to obtain the results of processing the same source data several times. This should be possible without repeating the forward pass each time, so we will expose the results of the last forward pass through a separate GetResults method.

In addition, while training and operating the neural network, we will need to monitor the accuracy and correctness of the forward pass results. The main indicator of the neural network's correct operation is the value of the loss function. The actual calculation of the loss function will be carried out in the Backpropagation method, and the calculated value will be stored in the m_dNNLoss variable. Let's add the GetRecentAverageLoss method to return this value at the user's request.
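The "recent average" bookkeeping reduces to a simple exponential-smoothing recurrence: each new loss value shifts the average by 1/smooth_factor of the deviation, and the first recorded value seeds the average. A standalone sketch with assumed names:

```cpp
#include <cmath>

// Sketch of the smoothed-loss logic; mirrors the roles of m_dNNLoss and
// m_iLossSmoothFactor, but the names here are illustrative.
struct LossTracker {
    double avg;          // negative means "no loss recorded yet"
    int smooth_factor;   // larger value = slower, smoother average
    void record(double loss) {
        avg = (avg < 0 ? loss : avg + (loss - avg) / smooth_factor);
    }
};
```

With smooth_factor = 10, a single bad batch moves the reported average only one tenth of the way toward it, which makes the training curve far easier to read.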

Now, about the loss function itself. The specific loss function should be selected by the user, so we need a method to obtain it from the user (LossFunction). The actual calculation of the loss value will be carried out by the standard matrix operations of MQL5. Here we will create a variable to store the type of the loss function (m_eLossFunction).

When defining constants, we didn't create a separate enumeration for regularization methods. We agreed instead to implement Elastic Net and manage the process through regularization coefficients. I suggest adding the regularization coefficients to the parameters of the loss function method. Notice how the number of class methods keeps growing. The point is not just implementing our constructor: when building it, we should anticipate all possible usage scenarios. This will make it more flexible.
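As a reminder of what those two coefficients control, here is a sketch of the Elastic Net term: lambda1 weights the L1 (absolute value) penalty, lambda2 the L2 (squared) penalty, and either can be zeroed to get pure L1 or L2 regularization. The function names are illustrative:

```cpp
#include <cmath>
#include <vector>

// Elastic Net penalty added to the loss: lambda1*sum|w| + lambda2*sum(w^2).
double elastic_net_penalty(const std::vector<double> &w,
                           double lambda1, double lambda2) {
    double l1 = 0, l2 = 0;
    for (double x : w) { l1 += std::fabs(x); l2 += x * x; }
    return lambda1 * l1 + lambda2 * l2;
}

// Contribution of the penalty to one weight's gradient.
double elastic_net_grad(double w, double lambda1, double lambda2) {
    double sign = (w > 0) - (w < 0);
    return lambda1 * sign + 2.0 * lambda2 * w;
}
```

Setting lambda1 = lambda2 = 0 turns the regularization off entirely, which is why an enumeration of regularization methods is unnecessary.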

At the same time, the actual use of such a constructor should be as easy and intuitive as possible. In other words, we should provide the user with an interface that allows for the most flexible configuration of a new neural network with the minimum number of iterations required from the user.

Note that the algorithms of the normalization layers and Dropout differ depending on the mode of use (training or operation). Of course, this could have been handled by a separate parameter in the forward and backward pass methods, but it is important to keep a strict correspondence between the operations of the two passes. Performing a backward pass in training mode after a forward pass in operation mode, or vice versa, can only destabilize the neural network. Therefore, to avoid overloading the aforementioned methods with additional checks, we will create separate methods to set and query the TrainMode operating mode.
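A quick sketch of why the mode matters, using inverted dropout as the example (this is an illustration, not our future Dropout layer): in training mode random activations are zeroed and the survivors rescaled, while in operation mode the layer is a pure pass-through. A forward and backward pass that disagree on the mode would therefore disagree on which units even participated.

```cpp
#include <cstdlib>
#include <vector>

// Inverted dropout forward pass; behavior branches on the train_mode flag.
std::vector<double> dropout_forward(const std::vector<double> &x,
                                    double keep_prob, bool train_mode,
                                    unsigned seed = 42) {
    if (!train_mode)
        return x;                                    // operation mode: identity
    std::srand(seed);
    std::vector<double> out(x.size());
    for (size_t i = 0; i < x.size(); i++) {
        bool keep = (std::rand() / (double)RAND_MAX) < keep_prob;
        out[i] = keep ? x[i] / keep_prob : 0.0;      // rescale kept units
    }
    return out;
}
```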

There's another aspect regarding the operating mode of the neural network, specifically, the choice of tool for conducting computational operations. We have already discussed the topic of using OpenCL technology for parallel computing. This will allow parallel computation of mathematical operations on the GPU and speed up calculations during the operation of the neural network. The standard MQL5 library OpenCL.mqh provides the COpenCL class for working with OpenCL.

In the process of working with this class, I decided to slightly supplement its functionality, for which I created a new class CMyOpenCL that inherits the standard COpenCL class. Inheritance allowed me to write the code for just a couple of methods while still utilizing the full power of the parent class.

To use the CMyOpenCL class, we add a pointer to its instance, m_cOpenCL. We also add the m_bOpenCL flag, which indicates whether this functionality is enabled in our neural network, along with methods for initializing and managing it (InitOpenCL, UseOpenCL).

Let's not forget that we plan to use neural networks to work with time series, which imposes certain requirements on their operation. Do you remember the plot of the correlation of the initial data against the time shift? As the time lag increases, the influence of the indicator on the target result decreases. This once again confirms the importance of taking into account the position of the analyzed value on the timeline. Therefore, we need to implement such a mechanism.

We will talk about the method itself a little later. For now, let's create an instance of the CPositionEncoder class to implement positional encoding, along with a flag controlling whether this feature is active and methods for managing it.
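For reference, the classic sinusoidal positional encoding from the transformer literature can be precomputed like this; whether CPositionEncoder uses exactly this formula is covered later, so treat this sketch as an assumption. The key property is that the codes depend only on the position and buffer size, so they can be computed once at network creation and reused:

```cpp
#include <cmath>
#include <vector>

// Precompute a [count x window] table of sinusoidal position codes:
// even indices get sin, odd indices get cos, with geometrically
// decreasing frequencies across the window dimension.
std::vector<std::vector<double>> build_position_codes(int count, int window) {
    std::vector<std::vector<double>> pe(count, std::vector<double>(window));
    for (int pos = 0; pos < count; pos++)
        for (int i = 0; i < window; i++) {
            double angle = pos / std::pow(10000.0, 2.0 * (i / 2) / window);
            pe[pos][i] = (i % 2 == 0) ? std::sin(angle) : std::cos(angle);
        }
    return pe;
}
```

At runtime the stored table is simply added to the input buffer, which is far cheaper than recalculating the trigonometry on every forward pass.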

Let's add another class identification method to our list and get the following CNet class structure.

class CNet  : public CObject
  {
protected:
   bool               m_bTrainMode;
   CArrayLayers*      m_cLayers;
   CMyOpenCL*         m_cOpenCL;
   bool               m_bOpenCL;
   TYPE               m_dNNLoss;
   int                m_iLossSmoothFactor;
   CPositionEncoder*  m_cPositionEncoder;
   bool               m_bPositionEncoder;
   ENUM_LOSS_FUNCTION m_eLossFunction;
   VECTOR             m_adLambda;
   TYPE               m_dLearningRate;
   VECTOR             m_adBeta;

public:
                      CNet(void);
                     ~CNet(void);
   //--- Methods for creating an object
   bool               Create(........);
   //--- Organization of work with OpenCL
   void               UseOpenCL(bool value);
   bool               UseOpenCL(void)          const { return(m_bOpenCL);          }
   bool               InitOpenCL(void);

   //--- Methods of working with positional coding
   void               UsePositionEncoder(bool value);
   bool               UsePositionEncoder(void) const { return(m_bPositionEncoder); }
   //--- Organization of the basic algorithms of the model
   bool               FeedForward(........);
   bool               Backpropagation(........);
   bool               UpdateWeights(........);
   bool               GetResults(........);
   void               SetLearningRates(TYPE learning_rate, TYPE beta1 = defBeta1,
                                                           TYPE beta2 = defBeta2);
   //--- Methods of the loss function
   bool               LossFunction(ENUM_LOSS_FUNCTION loss_function,
                          TYPE lambda1 = defLambdaL1, TYPE lambda2 = defLambdaL2);
   ENUM_LOSS_FUNCTION LossFunction(void)       const { return(m_eLossFunction);    }
   ENUM_LOSS_FUNCTION LossFunction(TYPE &lambda1, TYPE &lambda2);

   TYPE               GetRecentAverageLoss(void) const { return(m_dNNLoss);        }
   void               LossSmoothFactor(int value)   { m_iLossSmoothFactor = value; }
   int                LossSmoothFactor(void)   const { return(m_iLossSmoothFactor);}
   //--- Model operation mode control
   bool               TrainMode(void)          const { return m_bTrainMode;        }
   void               TrainMode(bool mode);
   //--- Methods for working with files
   virtual bool       Save(........);
   virtual bool       Load(........);
   //--- object identification method
   virtual int        Type(void)               const { return(defNeuronNet);       }
   //--- Retrieving pointers to internal objects
   virtual CBufferType* GetGradient(uint layer)      const;
   virtual CBufferType* GetWeights(uint layer)       const;
   virtual CBufferType* GetDeltaWeights(uint layer)  const;
  };

You can note that in the declaration of several methods, I left ellipsis instead of specifying parameters. Now we will analyze the class methods and add the missing data.

Let's start with the class constructor. In it, we initialize the variables with initial values and create instances of the classes used.

CNet::CNet(void)     :  m_bTrainMode(false),
                        m_bOpenCL(false),
                        m_bPositionEncoder(false),
                        m_dNNLoss(-1),
                        m_iLossSmoothFactor(defLossSmoothFactor),
                        m_dLearningRate(defLearningRate),
                        m_eLossFunction(LOSS_MSE)
  {
   m_adLambda.Init(2);
   m_adBeta.Init(2);
   m_adLambda[0] = defLambdaL1;
   m_adLambda[1] = defLambdaL2;
   m_adBeta[0]   = defBeta1;
   m_adBeta[1]   = defBeta2;
   m_cLayers     = new CArrayLayers();
   m_cOpenCL     = new CMyOpenCL();
   m_cPositionEncoder = new CPositionEncoder();
  }

In the class destructor, we will clear the memory by deleting the instances of the previously created objects.

CNet::~CNet(void)
  {
   if(!!m_cLayers)
      delete m_cLayers;
   if(!!m_cPositionEncoder)
      delete m_cPositionEncoder;
   if(!!m_cOpenCL)
      delete m_cOpenCL;
  }

Let’s consider the Create method that creates a neural network. I omitted the parameters of this method earlier, and now I suggest we discuss them.

The interface for passing the structure of a neural network to a class was described in the previous chapter. Of course, we will pass it to this method. But is this data enough or not? From a technical perspective, this data is quite sufficient to specify the architecture of the neural network. We have provided additional methods for specifying learning rates and loss functions.

But if we look at the question from the user's perspective: how convenient is it to use three methods to specify all the necessary parameters when initializing the neural network? In fact, it is a matter of personal habits and preferences of the user. Some prefer to use multiple methods specifying one or two parameters and monitor the process at each step. Others would prefer to 'throw' all the parameters into one method in a single line of code, check the result once, and move on.

When we work directly with the customer, we can discuss their preferences and make the product convenient for them. But when creating a universal product, it's logical to try to satisfy the preferences of all potential users. Moreover, the user can choose different options depending on the task at hand. Therefore, we will use the ability to overload functions and create several methods with the same name to satisfy all possible usage scenarios.

First, we'll create a method with a minimal number of parameters, which will only receive a dynamic array describing the architecture of the neural network. At the beginning of the method, we will check the validity of the pointer to the object received in the method parameter. Then we check the number of neural layers in the passed description.

We already mentioned earlier that there cannot be less than two layers, as the first input layer is used to input the initial data, and the last layer is for outputting the result of the neural network's operation. If at least one check fails, we exit the method with a false result.

bool CNet::Create(CArrayObj *descriptions)
  {
//--- Control block
   if(!descriptions)
      return false;
//--- Check the number of layers to be created
   int total = descriptions.Total();
   if(total < 2)
      return false;

After successfully passing the controls, we initialize the class to work with the OpenCL technology. Unlike the previous checks, we will not return false in the case of initialization errors. We will simply disable this functionality and continue operating in the standard mode. This approach is implemented to enable the replication of the finished product on various computing machines without altering the program code. This, in general, expands the potential customer base for distributing the end product.

//--- Initialize OpenCL objects
   if(m_bOpenCL)
      m_bOpenCL = InitOpenCL();
   if(!m_cLayers.SetOpencl(m_cOpenCL))
      m_bOpenCL = false;

For all objects of our neural network to work in the same OpenCL context, we will pass a pointer to an instance of the CMyOpenCL class to the storage array of neural layers. From there, it will subsequently be passed to each neural layer.  

Then we will organize a loop with the number of iterations equal to the number of layers in our network. In it, we will sequentially iterate through all the elements of the dynamic array describing the neural layers. During this process, we check the validity of the description object for each layer and make sure that the specified parameters preserve the model's integrity. In the method's code, you can see the validation of specific parameters for various types of neural layers, which we will become acquainted with a little later.

After that, we will call the method to create the corresponding layer. It is worth noting that we will entrust the creation of the neural layer directly to the element creation method, CreateElement of the m_cLayers dynamic storage array of neural layers.

//--- Organize a loop to create neural layers
   for(int i = 0; i < total; i++)
     {
      CLayerDescription *temp = descriptions.At(i);
      if(!temp)
         return false;
      if(i == 0)
        {
         if(temp.type != defNeuronBase)
            return false;
         temp.window = 0;
        }

      else
        {
         CLayerDescription *prev = descriptions.At(i - 1);
         if(temp.window <= 0 || temp.window > prev.count ||
            temp.type == defNeuronBase)
           {
            switch(prev.type)
              {
               case defNeuronConv:
               case defNeuronProof:
                  temp.window = prev.count * prev.window_out;
                  break;
               case defNeuronAttention:
               case defNeuronMHAttention:
                  temp.window = prev.count * prev.window;
                  break;
               case defNeuronGPT:
                  temp.window = prev.window;
                  break;
               default:
                  temp.window = prev.count;
                  break;
              }

            switch(temp.type)
              {
               case defNeuronAttention:
               case defNeuronMHAttention:
               case defNeuronGPT:
                  break;
               default:
                  temp.step = 0;
              }
           }
        }
      if(!m_cLayers.CreateElement(i, temp))
         return false;
     }

At the end of the method, we initialize the positional encoding class. Please note that the actual code for each position remains unchanged throughout the training and utilization of the neural network. The elements will change, but the size of the input layer of neurons will stay the same. That means, upon creating the network, we can calculate and store the position code for each element right away, and subsequently use the saved values instead of repeatedly recalculating the code.

//--- Initialize positional coding objects
   if(m_bPositionEncoder)
     {
      if(!m_cPositionEncoder)
        {
         m_cPositionEncoder = new CPositionEncoder();
         if(!m_cPositionEncoder)
            m_bPositionEncoder = false;
         return true;
        }
      CLayerDescription *temp = descriptions.At(0);
      if(!m_cPositionEncoder.InitEncoder(temp.count, temp.window))
         UsePositionEncoder(false);
     }
//---
   return true;
  }

When organizing method overloads for Create, we won't rewrite the entire code; we'll only carry out the user's tasks and make calls to the necessary methods with the received parameters. Below are the possible variations of the overloaded method.

bool CNet::Create(CArrayObj *descriptions,
                  TYPE learning_rate,
                  TYPE beta1, TYPE beta2,
                  ENUM_LOSS_FUNCTION loss_function,
                  TYPE lambda1, TYPE lambda2)
  {
   if(!Create(descriptions))
      return false;
   SetLearningRates(learning_rate, beta1, beta2);
   if(!LossFunction(loss_function, lambda1, lambda2))
      return false;
//---
   return true;
  }

bool CNet::Create(CArrayObj *descriptions,
                  ENUM_LOSS_FUNCTION loss_function,
                  TYPE lambda1, TYPE lambda2)
  {
   if(!Create(descriptions))
      return false;
   if(!LossFunction(loss_function, lambda1, lambda2))
      return false;
//---
   return true;
  }

bool CNet::Create(CArrayObj *descriptions,
                  TYPE learning_rate,
                  TYPE beta1, TYPE beta2)
  {
   if(!Create(descriptions))
      return false;
   SetLearningRates(learning_rate, beta1, beta2);
//---
   return true;
  }

When creating overloaded methods, be sure to declare any method overloads that you use in the class declaration.

Let's move on and talk about the FeedForward method. Its parameters are omitted in the declaration above. Let's think about what data we need to perform a forward pass. First of all, we need the initial data, which must be passed to the neural network from the outside. So we add a CBufferType dynamic data buffer to the parameters. We will create this class later; it will serve all our data buffers.

During the forward pass, the input data is multiplied by the weights stored in the neural layer objects. This means that the neural network already knows them. The obtained values are passed through an activation function. The functions used for each layer are specified during the neural network's creation stage in the architecture description.

Thus, to implement the forward pass, it is enough for us to receive an array of initial data as input.
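To make the computation described above concrete, here is a minimal sketch of what one fully connected layer's forward pass does: a weighted sum of the inputs plus a bias, passed through an activation function. The names, the tanh activation, and the row-per-neuron weight layout are illustrative assumptions, not the book's actual CNeuronBase code:

```cpp
#include <cmath>
#include <vector>

// Forward pass of one dense layer: out[n] = activation(W[n] . input + bias[n]).
std::vector<double> dense_forward(const std::vector<std::vector<double>> &W,
                                  const std::vector<double> &bias,
                                  const std::vector<double> &input) {
    std::vector<double> out(W.size());
    for (size_t n = 0; n < W.size(); n++) {
        double sum = bias[n];
        for (size_t i = 0; i < input.size(); i++)
            sum += W[n][i] * input[i];
        out[n] = std::tanh(sum);   // activation chosen purely for illustration
    }
    return out;
}
```

Note that the only external input is the data vector: the weights and the activation choice already live inside the layer, exactly as argued above.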

In the method body, we will validate the pointers to the array of input data and the first neural layer of our network. We will not create a separate type of neural layer for the initial data. Instead, we take a basic fully connected neural layer and write the received initial data to the buffer of output (resulting) values of neurons. Thus, we get the unification of neural layers.

bool CNet::FeedForward(const CBufferType *inputs)
  {
//--- control block
   if(!inputs)
      return false;
   CNeuronBase *InputLayer = m_cLayers.At(0);
   if(!InputLayer)
      return false;

In the next step, if necessary, we apply positional encoding to the initial values.

   CBufferType *Inputs = InputLayer.GetOutputs();
   if(!Inputs)
      return false;
   if(Inputs.Total() != inputs.Total())
      return false;
//--- Transfer the source data to the neural layer
   Inputs.m_mMatrix = inputs.m_mMatrix;
//--- Apply positional coding
   if(m_bPositionEncoder && !m_cPositionEncoder.AddEncoder(Inputs))
      return false;
   if(m_bOpenCL)
      Inputs.BufferCreate(m_cOpenCL);

At this stage, the preparation of the initial data can be considered complete. Let's proceed directly to the forward pass: we will organize a loop that iterates through all the neural layers in our network sequentially, from the first to the last. For each layer, we will call its corresponding forward pass method. Note that the loop starts at layer index 1. The neural layer with the initial data recorded has an index of 0.

Another point to which you should also pay attention. In the process of enumeration, we use one class CNeuronBase for all objects of neural layers. This is our base class for the neural layer. All other classes of neural layers will inherit from it.

In addition, we will create the virtual method FeedForward that will be overridden in all other types of neural layers. This implementation allows us to hold objects through the neural layer base class and call the forward pass virtual method. The task of dispatching the call to the forward pass method of the specific layer type is handled by the virtual function mechanism on our behalf.

//--- Create a loop with a complete search of all neural layers
//--- and call the forward pass method for each of them
   CNeuronBase *PrevLayer = InputLayer;
   int total = m_cLayers.Total();
   for(int i = 1; i < total; i++)
     {
      CNeuronBase *Layer = m_cLayers.At(i);
      if(!Layer)
         return false;
      if(!Layer.FeedForward(PrevLayer))
         return false;
      PrevLayer = Layer;
     }

It should be noted here that when using the OpenCL technology, when the kernel is sent for execution, it is queued. To "push" its execution, we need to initiate the retrieval of the operation results. We have previously discussed the need to minimize the exchange of data between RAM and the OpenCL context. Therefore, we will not retrieve data after each kernel is added to the queue. Instead, we will enqueue the entire chain of operations and only after completing the loop iterating through all the neural layers, we will request the results of the operations from the last neural layer. Since our data is passed sequentially from one layer to another, the entire queue of operations will be pulled along. But do not forget that data loading is only necessary when using the OpenCL technology.

   if(m_bOpenCL)
      if(!PrevLayer.GetOutputs().BufferRead())
         return false;
//---
   return true;
  }

During the feed-forward pass, we obtained certain calculated data. On an untrained neural network, the obtained result will be quite random. We aim for our neural network to produce results that are as close as possible to real outcomes. And in order to get closer to them, we need to train a neural network. The supervised learning process is based on an iterative approach with the gradual adjustment of weights to the correct answers. As we said earlier, this process consists of two stages: forward and backward (backpropagation) pass. We have already written about the forward pass method. Let's look at the backpropagation method.

Above, when describing the class, I also omitted the parameters of this method. Take another look at the backward pass algorithm: the only thing it needs from the external system is the correct answers. Therefore, we will add a dynamic array of correct answers to the method parameters. However, the method receives reference values only for the output layer, so we need to calculate the error gradient for every neuron in the network ourselves. The only exception is the input layer: its values come from the external system and do not depend on the state of the neural network, so computing an error gradient for the input data would be wasted work with no practical value or logical meaning.

At the beginning of the method, as always, we will perform data validation for the method operation. In this block, we will validate the received pointer to the dynamic array of target values and compare the result buffer size with the size of the obtained vector of target values. After that, we calculate the value of the loss function. The calculation of the loss function itself is hidden in the standard MQL5 matrix operations. The algorithm for calculating the value of the function was shown when considering possible options for the loss function. We will check the obtained loss function value and calculate the smoothed error over the entire training period.

bool CNet::Backpropagation(CBufferType *target)
  {
//--- Control block
   if(!target)
      return false;
   int total = m_cLayers.Total();
   CNeuronBase *Output = m_cLayers.At(total - 1);
   if(!Output || Output.Total() != target.Total())
      return false;
//--- Calculate the value of the loss function
   TYPE loss = Output.GetOutputs().m_mMatrix.Loss(target.m_mMatrix,
                                                  m_eLossFunction);

   if(loss == FLT_MAX)
      return false;
   m_dNNLoss = (m_dNNLoss < 0 ? loss :
                m_dNNLoss + (loss - m_dNNLoss) / m_iLossSmoothFactor);

In the next block of our backward pass method, we will propagate the error gradient to each neuron of our network. To achieve this, we will first calculate the error gradient at the output layer and then set up a backward loop. While iterating from the output of the neural network to its input, for each neural layer, we will invoke the gradient calculation method. We will discuss the differences in gradient calculation algorithms for the output and hidden layers of the neural network a bit later while exploring the fully connected neural layer.

Right here, we will calculate how the weights of our neural network should change in order for it to produce correct results for the current set of input data. In the sequential enumeration of neural layers, for each layer we will call the method for calculating deltas.

//--- Calculate the error gradient at the output of a neural network
   CBufferType *grad = Output.GetGradients();
   grad.m_mMatrix = target.m_mMatrix;
   if(m_cOpenCL)
     {
      if(!grad.BufferWrite())
         return false;
     }
   if(!Output.CalcOutputGradient(grad, m_eLossFunction))
      return false;
//--- Create a loop with enumeration of all neural layers in reverse order
   for(int i = total - 2; i >= 0; i--)
     {
      CNeuronBase *temp = m_cLayers.At(i);
      if(!temp)
         return false;
      //--- Call the method for distributing the error gradient through the hidden layer
      if(!Output.CalcHiddenGradient(temp))
         return false;
      //--- Call the method for distributing the error gradient to the weight matrix
      if(!Output.CalcDeltaWeights(temp, i == 0))
         return false;
      Output = temp;
     }

Similarly to the forward pass, when using the OpenCL technology we need to read back the results of the last kernel in the queue.

   if(m_cOpenCL)
     {
       for(int i = 1; i < m_cLayers.Total(); i++)
        {
         Output = m_cLayers.At(i);
         if(!Output.GetDeltaWeights() || !Output.GetDeltaWeights().BufferRead())
            continue;
         break;
        }
     }
//---
   return true;
  }

The goal of training a neural network is not to find deviations but to tune it toward the maximum likelihood of producing accurate results. A neural network is tuned by selecting correct weights. Therefore, after calculating the deltas, we must update the weights. For the reasons given above, I moved the weight update into a separate method, UpdateWeights.

When the method was declared in the class description, its parameters were not specified. Let's think: we have already calculated the deltas for updating the weights, and the learning and regularization coefficients are set when initializing the neural network. At first glance, we have everything we need to update the weights. But look at the deltas: at each iteration we accumulate them. If a batch of a certain size is used for updating the coefficients, there is a high likelihood of obtaining an inflated delta. In such a situation, it is logical to use the average delta; to get it, it is enough to divide the accumulated sum by the batch size. Of course, mathematically speaking, the batch size can be factored into the learning rate: if we pre-divide the learning rate by the batch size, the final result will remain unchanged.

But this is manual control, and as always, it's a matter of user preference. We will give the opportunity to use both options: we will add a parameter to the method to specify the batch size and set its default value to one. Thus, the user can specify the batch size in the method parameters or can call the method without specifying parameters. In that case, the batch size will be set to the default value, and the delta will be adjusted only by the learning coefficient.
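The equivalence argued above is easy to verify numerically. The two functions below (illustrative names) compute the same weight step: one averages the accumulated delta over the batch, the other pre-divides the learning rate by the batch size:

```cpp
#include <cmath>

// Option 1: scale the accumulated delta by the batch size.
double step_avg_delta(double lr, double delta_sum, int batch) {
    return lr * (delta_sum / batch);
}

// Option 2: scale the learning rate by the batch size instead.
double step_scaled_rate(double lr, double delta_sum, int batch) {
    return (lr / batch) * delta_sum;
}
```

Because the two options are interchangeable, supporting a batch_size parameter with a default of 1 costs nothing while letting the user pick whichever bookkeeping they prefer.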

The algorithm of the method is quite straightforward. First, we will validate the specified batch size as it must be a positive integer value. Next, we will set up a loop to iterate through all the neural layers in our network, calling the corresponding method for each layer. The very process of updating the weights will be carried out at the level of the neural layer.

bool CNet::UpdateWeights(uint batch_size = 1)
  {
//--- Control block
   if(batch_size == 0)
      return false;
//--- Organize a loop of enumeration of all hidden layers
   int total = m_cLayers.Total();
   for(int i = 1; i < total; i++)
     {
      //--- Check the validity of the pointer to the neural layer object
      CNeuronBase *temp = m_cLayers.At(i);
      if(!temp)
         return false;
      //--- Call the method of updating the matrix of weights of the inner layer
      if(!temp.UpdateWeights(batch_size, m_dLearningRate, m_adBeta, m_adLambda))
         return false;
     }
//---
   return true;
  }
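To show where this method fits into the overall training cycle, here is a hedged sketch of one weight update (a hypothetical call site; net, inputs, and target are assumed to be prepared elsewhere, and error handling is abbreviated):

```mql5
//--- one weight update over a batch of 32 samples
for(uint i = 0; i < 32; i++)
  {
   if(!net.FeedForward(inputs))       // forward pass
      return false;
   if(!net.Backpropagation(target))   // backward pass accumulates the deltas
      return false;
  }
if(!net.UpdateWeights(32))            // average the deltas and update the weights
   return false;
```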

Of course, the user should be able to obtain the results of the neural network operation after the forward pass is executed. This is implemented by the GetResults method.

What external data should the method receive? Logically, the function should not receive but rather return data to an external program. However, we do not know in advance what this data will be or how much of it there will be. Knowing the possible options for neuron activation functions, it is logical to assume that the output of each neuron will be a single number. The number of such values will be equal to the number of neurons in the output layer and, accordingly, will be known at the stage of neural network generation. The logical way out of this situation is a dynamic array of the appropriate type. Previously, we used the CBufferType data buffer class for passing data into our model. Here we will use a similar object. Thus, for data exchange between the main program and the model, we will always use one dynamic array class.

In the method body, we first obtain a pointer to the buffer of output layer values and validate this pointer. Then we check the validity of the pointer to the dynamic array for storing the results, a reference to which we received from the external program in the method parameters. If this pointer is invalid, we create a new instance of the data buffer class. When working in the OpenCL context, we additionally read the output buffer from the context memory. After that, we copy the values of the output layer neurons into the result buffer and exit the method.

bool CNet::GetResults(CBufferType *&result)
  {
   int total = m_cLayers.Total();
   CNeuronBase *temp = m_cLayers.At(total - 1);
   if(!temp)
      return false;
   CBufferType *output = temp.GetOutputs();
   if(!output)
      return false;
   if(!result)
     {
      if(!(result = new CBufferType()))
         return false;
     }
   if(m_cOpenCL)
      if(!output.BufferRead())
         return false;
   result.m_mMatrix = output.m_mMatrix;
//---
   return true;
  }
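A possible call sketch (hypothetical; when a NULL pointer is passed, the method creates the buffer itself, and deleting it remains the caller's responsibility):

```mql5
CBufferType *result = NULL;
if(net.GetResults(result))
   Print("Model output[0] = ", result.m_mMatrix[0][0]);  // first output value
if(!!result)
   delete result;   // free the buffer created by the method
```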

It's important to note that, depending on the complexity of the task, neural networks can vary significantly in architectural complexity and the number of synaptic connections. The training time of a network depends heavily on its complexity. Retraining the neural network from scratch at every start is inefficient and, in most cases, simply impossible. Therefore, a once-trained neural network must be saved, and at the next start all the coefficients should be loaded from the file. Only after that, if necessary, can the neural network be fine-tuned to current conditions.

The method responsible for saving the trained neural network is called Save. This virtual method is created in the CObject base class and is overridden in every new class. I intentionally did not immediately rewrite the method parameters from the parent class. The reason is that the parameters there are designed to receive a file handle for writing the object. That is, the file must first be opened in an external program, and after saving the data, the external program closes the file.

In other words, the control over opening and closing the file is removed from the class and placed onto the calling program. This approach is convenient when the object is part of a larger project and allows sequentially writing all project objects into a single shared file. And we will definitely use this when saving the objects that make up our neural network.

However, when we're talking about the top level of our program, it would be desirable to have a single method for saving the entire project. This method should handle the task of opening and closing the file, iterating through and saving all the necessary information for reconstructing the entire neural network from the file. At the same time, we cannot exclude the possibility that the neural network will be just a part of something larger.

Taking into consideration the ideas presented above, we will create two methods with the same name: one will receive a file handle in its parameters similar to the parent class method, and the other will be passed a file name for data writing.

Now, let's think about the minimum information we need to fully reconstruct a trained neural network. Of course, we need the architecture of the network, the number of layers and the number of neurons in them. Besides, we need all weights. To do this, we need to save the entire array of neural layers.

However, it's important to understand that a trained neural network will work correctly only within the environment for which it was trained. Therefore, we will also save information about the loss function and the positional encoding.

I propose to write information about the symbol and timeframe in the name of the file. This will allow the Expert Advisor to quickly determine the presence of a pre-trained network on the disk in the future. Moreover, changing just the file name would be sufficient to transfer and test a pre-trained neural network on a different tool or timeframe. In most cases, fine-tuning a neural network will be easier than training it from random weights.
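One possible naming scheme is sketched below (the exact format is a matter of convention; defFileName is the default name constant mentioned earlier, and the net object is illustrative):

```mql5
// e.g. "EURUSD_PERIOD_H1_" followed by the default file name
string file_name = StringFormat("%s_%s_%s", _Symbol,
                                EnumToString((ENUM_TIMEFRAMES)_Period),
                                defFileName);
if(!net.Save(file_name))
   Print("Failed to save the model to ", file_name);
```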

To gauge the extent of training for the neural network saved in the file, let's add the final average loss value and the smoothing coefficient. For convenient continuation of training, we will save the training and regularization parameters. To complete the picture, we will also add a flag indicating whether to use OpenCL.

Let's look at the algorithm of the method with the file handle in the parameters. At the beginning of the method, we check the validity of the received file handle for data writing, as well as the pointer to the dynamic array of neural layers.

bool CNet::Save(const int file_handle)
  {
   if(file_handle == INVALID_HANDLE ||
      !m_cLayers)
      return false;

Next, we will save the above parameters.

//--- Storing constants
   if(!FileWriteInteger(file_handle, (int)m_bOpenCL) ||
      !FileWriteDouble(file_handle, m_dNNLoss) ||
      !FileWriteInteger(file_handle, m_iLossSmoothFactor) ||
      !FileWriteInteger(file_handle, (int)m_bPositionEncoder) ||
      !FileWriteDouble(file_handle, (double)m_dLearningRate) ||
      !FileWriteDouble(file_handle, (double)m_adBeta[0]) ||
      !FileWriteDouble(file_handle, (double)m_adBeta[1]) ||
      !FileWriteDouble(file_handle, (double)m_adLambda[0]) ||
      !FileWriteDouble(file_handle, (double)m_adLambda[1]) ||
      !FileWriteInteger(file_handle, (int)m_eLossFunction))
      return false;

Let's check the flag for using the positional encoding of the input sequence and, if necessary, call the CPositionEncoder class instance saving method. At the end of the method, let's call the method that saves a dynamic array of neural layers. We will get acquainted with the called methods in more detail while analyzing the classes containing them.

//--- Save the positional coding object if necessary
   if(m_bPositionEncoder)
     {
      if(!m_cPositionEncoder ||
         !m_cPositionEncoder.Save(file_handle))
         return false;
     }
//-- Call the method for saving the data of a dynamic array of neural layers
   return m_cLayers.Save(file_handle);
  }
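Thanks to the handle-based overload, the neural network can be written into a shared file managed by the calling program. A hedged sketch (the file name and the EA's own data written before the model are purely illustrative):

```mql5
int handle = FileOpen("ea_state.bin", FILE_WRITE | FILE_BIN);
if(handle != INVALID_HANDLE)
  {
   FileWriteInteger(handle, 123456);   // the EA's own data goes first
   if(!net.Save(handle))               // then the neural network
      Print("Model saving error");
   FileClose(handle);                  // the calling program closes the file
  }
```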

The algorithm of the method with the file name in its parameters is a bit simpler. We will not rewrite the entire data saving algorithm. We simply open the file for writing and pass the obtained file handle to the method discussed above. After the method completes, we close the file.

Please note that if an empty file name is provided in the parameters, we will replace it with the default file name and then proceed to execute the method in the standard mode.

Also, after executing the file opening function, we should check the success of the operation by checking the received handle. I deliberately omitted this step as it is the first operation in the Save method discussed above, and doing the same operation twice will only slow things down.

bool CNet::Save(string file_name = NULL)
  {
   if(file_name == NULL || file_name == "")
      file_name = defFileName;
//---
   int handle = FileOpen(file_name, FILE_WRITE | FILE_BIN);
//---
   bool result = Save(handle);
   FileClose(handle);
//---
   return result;
  }

For the reverse operation of loading neural network data from a file, we will create two similar Load methods with a handle and a file name in the parameters. While the algorithm for loading data with a specified file name in the parameters is identical to the corresponding data saving method, the algorithm for the second method becomes slightly more complex due to the initialization operations of objects.

At the beginning of the method, just as during saving, we check the validity of the received file handle for loading data.

bool CNet::Load(const int file_handle)
  {
   if(file_handle == INVALID_HANDLE)
      return false;

Then we load all the previously saved parameters of the neural network. At the same time, we make sure that the sequence of reading data strictly corresponds to the sequence of their recording.

//--- Reading constants
   m_bOpenCL = (bool)FileReadInteger(file_handle);
   m_dNNLoss = FileReadDouble(file_handle);
   m_iLossSmoothFactor = FileReadInteger(file_handle);
   m_bPositionEncoder = (bool)FileReadInteger(file_handle);
   m_dLearningRate = (TYPE)FileReadDouble(file_handle);
   m_adBeta[0] = (TYPE)FileReadDouble(file_handle);
   m_adBeta[1] = (TYPE)FileReadDouble(file_handle);
   m_adLambda[0] = (TYPE)FileReadDouble(file_handle);
   m_adLambda[1] = (TYPE)FileReadDouble(file_handle);
   m_eLossFunction = (ENUM_LOSS_FUNCTION)FileReadInteger(file_handle);

Please note that when saving the data, we wrote the positional encoding object to the file only when this function was enabled. Consequently, we first check whether the function was enabled when the data was saved and, if necessary, initiate loading of the positional encoding object. Before loading, we check that the corresponding object has been created; if it has not, we create a new instance before reading its data.

//--- Load the positional coding object
   if(m_bPositionEncoder)
     {
      if(!m_cPositionEncoder)
        {
         m_cPositionEncoder = new CPositionEncoder();
         if(!m_cPositionEncoder)
            return false;
        }
      if(!m_cPositionEncoder.Load(file_handle))
         return false;
     }

To initialize the OpenCL context object, we won't repeat the entire initialization code. Instead, we will use the appropriate method. We just need to call it and control the result of the operations.

//--- Initialize the object for working with OpenCL
   if(m_bOpenCL)
     {
      if(!InitOpenCL())
         m_bOpenCL = false;
     }
   else
      if(!!m_cOpenCL)
        {
         m_cOpenCL.Shutdown();
         delete m_cOpenCL;
        }

Next, we need to load the neural layers of the model and their parameters. For this, it is sufficient to call the loading method of the dynamic array of neural layers. But before accessing a class method, we need to ensure the validity of the pointer to the class instance; otherwise, we risk a critical program execution error. Therefore, we validate the pointer and, if necessary, create a new instance of the dynamic array object. When OpenCL is used, we also pass the valid pointer to the OpenCL context object into the array. Only after this preparatory work do we call the method that loads the dynamic array of neural layers.

//--- Initialize and load the data of a dynamic array of neural layers
   if(!m_cLayers)
     {
      m_cLayers = new CArrayLayers();
      if(!m_cLayers)
         return false;
     }
   if(m_bOpenCL)
      m_cLayers.SetOpencl(m_cOpenCL);
//---
   return m_cLayers.Load(file_handle);
  }

Perhaps, here we should explain why we're only loading the dynamic array instead of all the neural layers. The reason is that our dynamic array of neural layers serves as a container containing pointers to all the neural layer objects in the model. During saving, all the neural layers were sequentially stored in the array. Now, when loading the data, objects will also be sequentially created while preserving the pointers in the array. We will get acquainted with this mechanism in more detail when considering the methods of this class.
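Putting it all together, restoring a model at program start could look like this (a hedged sketch; the file name follows the symbol-and-timeframe convention suggested earlier, and the fallback branch is illustrative):

```mql5
CNet net;   // "empty" instance created by the parameterless constructor
string file_name = StringFormat("%s_%s_%s", _Symbol,
                                EnumToString((ENUM_TIMEFRAMES)_Period),
                                defFileName);
if(!net.Load(file_name))
  {
   Print("No pre-trained model found, creating a new one");
   // fall back to net.Create(descriptions) with random initial weights
  }
```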

So, we've covered the main methods of our neural network class. In conclusion, taking into account everything mentioned above, its final structure will look as follows.

class CNet  : public CObject
  {
protected:
   bool               m_bTrainMode;
   CArrayLayers*      m_cLayers;
   CMyOpenCL*         m_cOpenCL;
   bool               m_bOpenCL;
   TYPE               m_dNNLoss;
   int                m_iLossSmoothFactor;
   CPositionEncoder*  m_cPositionEncoder;
   bool               m_bPositionEncoder;
   ENUM_LOSS_FUNCTION m_eLossFunction;
   VECTOR             m_adLambda;
   TYPE               m_dLearningRate;
   VECTOR             m_adBeta;

public:
                      CNet(void);
                     ~CNet(void);
   //--- Methods for creating an object
   bool               Create(CArrayObj *descriptions);
   bool               Create(CArrayObj *descriptions, TYPE learning_rate,
                                                      TYPE beta1, TYPE beta2);
   bool               Create(CArrayObj *descriptions,
                 ENUM_LOSS_FUNCTION loss_function, TYPE lambda1, TYPE lambda2);
   bool               Create(CArrayObj *descriptions, TYPE learning_rate,
                             TYPE beta1, TYPE beta2,
                 ENUM_LOSS_FUNCTION loss_function, TYPE lambda1, TYPE lambda2);
   //--- Implement work with OpenCL
   void               UseOpenCL(bool value);
   bool               UseOpenCL(void)          const { return(m_bOpenCL);         }
   bool               InitOpenCL(void);
   //--- Methods for working with positional coding
   void               UsePositionEncoder(bool value);
   bool               UsePositionEncoder(void) const { return(m_bPositionEncoder);}
   //--- Implement the main algorithms of the model
   bool               FeedForward(const CBufferType *inputs);
   bool               Backpropagation(CBufferType *target);
   bool               UpdateWeights(uint batch_size = 1);
   bool               GetResults(CBufferType *&result);
   void               SetLearningRates(TYPE learning_rate, TYPE beta1 = defBeta1,
                                                           TYPE beta2 = defBeta2);
   //--- Loss function methods
   bool               LossFunction(ENUM_LOSS_FUNCTION loss_function,
                          TYPE lambda1 = defLambdaL1, TYPE lambda2 = defLambdaL2);
   ENUM_LOSS_FUNCTION LossFunction(void)       const { return(m_eLossFunction);    }
   ENUM_LOSS_FUNCTION LossFunction(TYPE &lambda1, TYPE &lambda2);
   TYPE               GetRecentAverageLoss(void) const { return(m_dNNLoss);        }
   void               LossSmoothFactor(int value)    { m_iLossSmoothFactor = value;}
   int                LossSmoothFactor(void)   const { return(m_iLossSmoothFactor);}
   //--- Model operation mode control
   bool               TrainMode(void)          const { return m_bTrainMode;        }
   void               TrainMode(bool mode);
   //--- Methods for working with files
   virtual bool       Save(string file_name = NULL);
   virtual bool       Save(const int file_handle);
   virtual bool       Load(string file_name = NULL, bool common = false);
   virtual bool       Load(const int file_handle);
   //--- Object identification method
   virtual int        Type(void)               const { return(defNeuronNet);      }
   //--- Retrieve pointers to internal objects
   virtual CBufferType* GetGradient(uint layer)     const;
   virtual CBufferType* GetWeights(uint layer)      const;
   virtual CBufferType* GetDeltaWeights(uint layer) const;
  };