Construction using MQL5

As we have already seen in the description of the convolutional network architecture, to construct it we need to create two new types of neural layers: convolutional and pooling. The first is responsible for filtering the data and extracting the desired features, while the second pinpoints the positions of maximum correspondence to the filter and reduces the dimensionality of the data array. The convolutional layer has a weight matrix, but it is much smaller than that of a fully connected layer because it searches for a small pattern. The pooling layer has no weighting coefficients at all. This reduction in the size of the weight matrix lowers the number of mathematical operations, during both the forward and backward passes, and thereby speeds up information processing. As a result, the time required to train the neural network is significantly reduced. In addition, the algorithm's ability to filter out noise improves the quality of the neural network.
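To get a feel for the savings, we can compare the weight counts of the two layer types. The sketch below is plain C++ (MQL5 syntax is very close), and the layer sizes in the test are hypothetical, chosen only for illustration:

```cpp
#include <cassert>
#include <cstddef>

// A fully connected layer stores one weight per input-output pair,
// plus one bias per output neuron:
std::size_t dense_params(std::size_t inputs, std::size_t outputs)
  {
   return outputs * (inputs + 1);   // +1 accounts for the bias term
  }

// A convolutional layer stores only one small pattern per filter,
// regardless of the input length:
std::size_t conv_params(std::size_t window, std::size_t filters)
  {
   return filters * (window + 1);   // +1 accounts for the bias term
  }
```

For example, a fully connected layer with 100 inputs and 50 outputs holds 5,050 weights, while 8 convolutional filters with a window of 4 hold only 40, however long the input is.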

Pooling layer

We begin the implementation of the algorithm by constructing a pooling layer. To do this, we will create a class CNeuronProof. We have previously stated that, to ensure continuity between neural layers, they will all inherit from one base class. Adhering to this concept, we will derive the new neural layer from the previously created CNeuronBase class. The inheritance will be public, so all methods not overridden within the CNeuronProof class will be accessible through the parent class.

To cover additional requirements due to the peculiarities of the convolutional network algorithm, we will add variables to the new class to store additional information:

  • m_iWindow — window size at the input of the neural layer
  • m_iStep — step size of the input window
  • m_iNeurons — output size of one filter
  • m_iWindowOut — number of filters
  • m_eActivation — activation function

Note that, unlike the base class CNeuronBase, we did not use a separate activation function class CActivation but introduced a new variable m_eActivation. The reason is that the pooling layer does not use the activation function in the previously considered form. Its functionality is slightly different here. Usually, the result of the pooling layer is the maximum or the arithmetic mean value of the analyzed window. Therefore, we will implement the new functionality within the methods of this class and create a new enumeration with two elements:

  • AF_AVERAGE_POOLING — the arithmetic mean of the input data window
  • AF_MAX_POOLING — the maximum value of the input data window

At the same time, we deliberately will not make changes to the code of the base class regarding new activation functions, as they will not be used in other neural layer architectures.

//--- activation functions of the pooling layer
enum ENUM_PROOF
  {
   AF_MAX_POOLING,
   AF_AVERAGE_POOLING
  };
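As a standalone illustration of these two modes, here is a minimal sketch in plain C++; the function names and sample data are mine, while the book's version operates on MQL5 matrices:

```cpp
#include <algorithm>
#include <cassert>
#include <numeric>
#include <vector>

// AF_MAX_POOLING: the maximum value of the input data window
double max_pooling(const std::vector<double> &window)
  {
   return *std::max_element(window.begin(), window.end());
  }

// AF_AVERAGE_POOLING: the arithmetic mean of the input data window
double average_pooling(const std::vector<double> &window)
  {
   return std::accumulate(window.begin(), window.end(), 0.0) /
          (double)window.size();
  }
```

For the window {1, 3, 2}, max pooling returns 3 and average pooling returns 2.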

Another feature of the pooling layer is the absence of a weight matrix. Therefore, the layer will not participate in the process of training and updating the weights. In this case, we can even delete some objects to free up memory. At the same time, the pooling layer cannot be completely excluded from the backward pass, as it will be involved in the propagation of the error gradient. To avoid cluttering the dispatcher class methods with excessive checks and at the same time to exclude the invocation of unnecessary parent class methods, we will replace a number of methods with "stubs" that will return the value required for the normal operation of the integrated neural network algorithm.

  • CalcOutputGradient always returns false because it is not intended to use the layer as an output layer for the neural network.
  • CalcDeltaWeights and UpdateWeights always return true. The absence of a weight matrix makes these methods redundant, but for the correct operation of the entire model, it is necessary to return a positive result from the methods.
  • GetWeights and GetDeltaWeights always return NULL. Methods have been overridden to prevent errors due to accessing a non-existent object.

Let's also add a method that returns the number of elements at the output of one filter, which gives us the following class structure.

class CNeuronProof    :  public CNeuronBase
  {
protected:
   uint              m_iWindow;             //Window size at the input of the neural layer
   uint              m_iStep;               //Input window step size
   uint              m_iNeurons;            //Output size of one filter
   uint              m_iWindowOut;          //Number of filters
   ENUM_PROOF        m_eActivation;         //Activation function
public:
                     CNeuronProof(void);
                    ~CNeuronProof(void) {};
   //---
   virtual bool      Init(const CLayerDescription *desc) override;
   virtual bool      FeedForward(CNeuronBase *prevLayer) override;
   virtual bool      CalcOutputGradient(CBufferType *target) override
                                                               { return false; }
   virtual bool      CalcHiddenGradient(CNeuronBase *prevLayer) override;
   virtual bool      CalcDeltaWeights(CNeuronBase *prevLayer) { return true; }
   virtual bool      UpdateWeights(int batch_size, TYPE learningRate,
                                         VECTOR &Beta, VECTOR &Lambda) override
                                                               { return true; }
   //---
   virtual CBufferType     *GetWeights(void)       const {  return(NULL);     }
   virtual CBufferType     *GetDeltaWeights(void)  const {  return(NULL);     }
   virtual uint      GetNeurons(void)              const {  return m_iNeurons;}
   //--- Methods for working with files
   virtual bool      Save(const int file_handle) override;
   virtual bool      Load(const int file_handle) override;
   //--- Object identification method
   virtual int       Type(void) const override { return(defNeuronProof); }
  };

In the class constructor, we only initialize the added variables using initial values.

CNeuronProof::CNeuronProof(void) :  m_eActivation(AF_MAX_POOLING),
                                    m_iWindow(2),
                                    m_iStep(1),
                                    m_iWindowOut(1),
                                    m_iNeurons(0)
  {
  }

We did not add any new objects, and the destructor of the base class is responsible for deleting those created in the base class. Therefore, the destructor of our class will remain empty.

Let's now look at the methods of the new pooling layer class CNeuronProof, starting with the Init method that initializes the neural layer. Like the corresponding method of the parent class, it receives a layer description object in its parameters. At the beginning of the method, we check the validity of the received object as well as the match between the requested layer type and the current neural layer class.

bool CNeuronProof::Init(const CLayerDescription *description)
  {
//--- control block
   if(!description || description.type != Type() ||
      description.count <= 0)
      return false;

After successfully passing the initial check, we will save and verify the parameters of the created layer:

  • input window size
  • input window step
  • the number of filters
  • the number of elements at the output of one filter

All specified parameters must be non-zero positive values.

//--- Save constants
   m_iWindow = description.window;
   m_iStep = description.step;
   m_iWindowOut = description.window_out;
   m_iNeurons = description.count;
   if(m_iWindow <= 0 || m_iStep <= 0 || m_iWindowOut <= 0 || m_iNeurons <= 0)
      return false;

Let's also check the specified activation function. For the pooling layer, we can only use two variants of the activation function, AF_AVERAGE_POOLING and AF_MAX_POOLING. In other cases, we will exit the method with the result false.

//--- Checking the activation function
   switch((ENUM_PROOF)description.activation)
     {
      case AF_AVERAGE_POOLING:
      case AF_MAX_POOLING:
         m_eActivation = (ENUM_PROOF)description.activation;
         break;
      default:
         return false;
         break;
     }

After successfully passing all the control blocks, we proceed directly to the initialization of the neural layer. First, we initialize the results vector m_cOutputs with zero values. We will create this buffer in the form of a rectangular matrix, with its rows representing individual filters.

//--- Initializing the results buffer
   if(!m_cOutputs)
      if(!(m_cOutputs = new CBufferType()))
         return false;
    if(!m_cOutputs.BufferInit(m_iWindowOut, m_iNeurons, 0))
      return false;

The use of matrices allows us to distribute data across filters within a single object. This gives us a transparent data structure that can be exchanged between the CPU and the OpenCL context, which saves some time when transferring data and lets us process all filters in parallel.

A similar approach is used for the m_cGradients error gradient buffer.

//--- Initialize the error gradient buffer
   if(!m_cGradients)
      if(!(m_cGradients = new CBufferType()))
         return false;
    if(!m_cGradients.BufferInit(m_iWindowOut, m_iNeurons, 0))
      return false;

After completing the initialization of the result and gradient buffers, we will remove unused objects and exit the method with a positive result.

//---
   m_eOptimization = None;
//--- Deleting unused objects
   if(!!m_cActivation)
      delete m_cActivation;
   if(!!m_cWeights)
      delete m_cWeights;
   if(!!m_cDeltaWeights)
      delete m_cDeltaWeights;
    for(int i = 0; i < 2; i++)
      if(!!m_cMomenum[i])
         delete m_cMomenum[i];
//---
   return true;
  }

Now that we have completed the initialization of the neural layer, let's move on to implementing the feed-forward pass in the FeedForward method. Similar to the previous method, the forward pass method is constructed following the concept of inheritance and overriding virtual methods of the base class while adding new functionality. In its parameters, the method receives a pointer to an object of the previous neural layer. As always, at the beginning of the method, we will set up a validation block to check the input data. Here, we are checking the validity of pointers to the previous neural layer and the result buffers of both the previous and current neural layers.

bool CNeuronProof::FeedForward(CNeuronBase *prevLayer)
  {
//--- Control block
   if(!prevLayer || !m_cOutputs ||
      !prevLayer.GetOutputs())
      return false;
   CBufferType *input_data = prevLayer.GetOutputs();

After successfully passing the control block, we will save a pointer to the result buffer of the previous layer and create a branching algorithm in the method based on the computational device in use: CPU or OpenCL context. We will return to the multi-threaded calculation algorithm a little later. Now, let's consider the implementation in MQL5.

Once again, we emphasize that the pooling layer does not have a weight matrix. Just like all other neural layers, it applies the same activation function to all neurons and filters. Therefore, the filter outputs can differ only when they receive different input data. In other words, the number of filters in the pooling layer must match the number of filters in the preceding convolutional layer. Hence, we first copy the input data matrix and reformat it if necessary.

//--- Branching of the algorithm depending on the execution device
   if(!m_cOpenCL)
     {
      MATRIX inputs = input_data.m_mMatrix;
      if(inputs.Rows() != m_iWindowOut)
        {
         ulong cols = (input_data.Total() + m_iWindowOut - 1) / m_iWindowOut;
          if(!inputs.Reshape(m_iWindowOut, cols))
            return false;
        }

It should be noted that despite the assumption of using a pooling layer after convolutional layers, our method allows for its use after the base class of a fully connected neural layer. That is why we copy the initial data matrix. This allows us to seamlessly reformat it into the desired format without the fear of disrupting the structure of the preceding layer.
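The reshaping arithmetic above can be written out as two small helpers. This is a hedged C++ sketch: the function names are mine, and the output count assumes the layer description's count was set consistently with the window and step (the book's code zero-pads when a window runs past the input):

```cpp
#include <cassert>

// Elements per filter after reshaping a flat buffer of `total` values
// into `filters` rows, rounding up as in the method above.
unsigned cols_per_filter(unsigned total, unsigned filters)
  {
   return (total + filters - 1) / filters;
  }

// Number of window positions, i.e., outputs of one filter, assuming
// the window fits the input at least once.
unsigned outputs_per_filter(unsigned cols, unsigned window, unsigned step)
  {
   return (cols - window) / step + 1;
  }
```

For instance, 8 elements per filter with a window of 2 yield 4 outputs at step 2 but 7 overlapping outputs at step 1.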

It must be noted that MQL5 does not support three-dimensional matrices, so from this point on we will process each filter separately. First, we create a local matrix whose numbers of rows and columns equal the output size of one filter and the input window, respectively. Then we organize two nested loops: an outer loop with as many iterations as there are filters, and an inner loop with as many iterations as there are elements in one filter of the current layer.

//--- Create a local matrix to collect data from one filter
       MATRIX array = MATRIX::Zeros(m_iNeurons, m_iWindow);
       m_cOutputs.m_mMatrix.Fill(0);
//--- Filter iteration cycle
       for(uint f = 0; f < m_iWindowOut; f++)
         {
//--- Loop through the elements of the results buffer
          for(uint o = 0; o < m_iNeurons; o++)
            {
             uint shift = o * m_iStep;
             for(uint i = 0; i < m_iWindow; i++)
                array[o, i] = ((shift + i) >= inputs.Cols() ? 0 :
                               inputs[f, shift + i]);
           }

In the inner loop, we implement another nested loop. In its body, we distribute the input data of one filter into the previously created matrix according to the size of the data window and its step. Using a loop here provides a unified approach for the cases where the window size and step are not equal.

After distributing the initial data, we will use matrix operations according to the given activation function. The resulting vector is stored in the results matrix. The row of the results matrix corresponds to the number of the analyzed filter.

//--- Saving the current result in accordance with the activation function
         switch(m_eActivation)
           {
            case AF_MAX_POOLING:
               if(!m_cOutputs.Row(array.Max(1), f))
                   return false;
               break;
            case AF_AVERAGE_POOLING:
               if(!m_cOutputs.Row(array.Mean(1), f))
                  return false;
               break;
            default:
               return false;
           }
        }
     }

I use the term 'filter' to maintain a clear chain in your understanding: the filter of the convolutional layer transitions into the filter of the pooling layer, even though the pooling operation itself can hardly be called a filter. At the same time, I want to make it clear that the convolutional and pooling layers, while organized as two neural layer objects, form a single integrated structure. That is why the same terminology is used.

After successfully completing all iterations of the loop system, we exit the method with the result true.

   else
     {
//--- The multi-threaded calculation block will be added in the next chapter
      return false;
     }
//--- Successful completion of the method
   return true;
  }

The feed-forward pass is followed by the backpropagation pass. The absence of a weight matrix in the pooling layer allows the backpropagation pass to be organized in a single method, unlike the base class of the neural network CNeuronBase, in which the backpropagation pass is divided into several functional methods.

Essentially, for the pooling layer, the backpropagation pass is the CalcHiddenGradient method that propagates the error gradient to the hidden layer. We have replaced the remaining methods with placeholders, as mentioned earlier.

The CalcHiddenGradient method itself is built within the framework of our concept of using a single format of virtual methods for all classes of neural networks with common inheritance from a single base class of the neural layer. Therefore, similar to the method of the base class of the neural layer CNeuronBase::CalcHiddenGradient, the method receives a pointer to the object of the previous neural layer in its parameters. At the beginning of the method, a control block for checking incoming data is organized. Here, we are checking the correctness of the pointer received as a parameter, which points to the object of the previous neural layer, and the presence of active result buffers and error gradients in the previous layer. We also check the correctness of the result buffers and error gradients of the current layer.

bool CNeuronProof::CalcHiddenGradient(CNeuronBase *prevLayer)
  {
//--- Control block
   if(!prevLayer || !m_cOutputs ||
      !m_cGradients || !prevLayer.GetOutputs() ||
      !prevLayer.GetGradients())
      return false;
   CBufferType *input_data = prevLayer.GetOutputs();
   CBufferType *input_gradient = prevLayer.GetGradients();
   if(!input_gradient.BufferInit(input_data.Rows(), input_data.Cols(), 0))
      return false;

After successfully passing the control block, similar to the forward pass method, we copy and reformat the matrix of input data. We also create a zero-filled local matrix of the same size to accumulate the error gradients.

Note that in the base neural layer class, we did not pre-zero the gradient buffer. The difference lies in the approach to passing the error gradients to the previous layer. The base class algorithm includes recalculation and saving of the gradient value for each element. With this approach, pre-clearing the buffer doesn't make sense because any value will be overwritten with a new one. In the pooling layer algorithm, recording the error gradient into each buffer element of the previous layer is only envisaged when using Average Pooling (arithmetic mean value). In the case of Max Pooling (maximum value), the error gradient is transferred only to the element with the maximum value, because only it affects the subsequent result of the neural network. The remaining elements receive a zero error gradient. Therefore, we immediately clear the entire buffer and only insert the gradient value for elements that affect the result.

Next, we divide the algorithm depending on the computational device. We will not now discuss the implementation of multi-threaded calculations in OpenCL but will focus on implementation using MQL5.

Here, just like in the forward pass, we organize a system of nested loops to iterate through filters and their elements. Inside the loops, the error gradient is distributed to the elements of the previous layer depending on the activation function.

//--- Branching of the algorithm depending on the execution device
   if(!m_cOpenCL)
     {
      MATRIX inputs = input_data.m_mMatrix;
      ulong cols = (input_data.Total() + m_iWindowOut - 1) / m_iWindowOut;
      if(inputs.Rows() != m_iWindowOut)
        {
          if(!inputs.Reshape(m_iWindowOut, cols))
            return false;
        }
//--- Create a local matrix to collect data from one filter
       MATRIX inputs_grad = MATRIX::Zeros(m_iWindowOut, cols);

//--- Filter iteration cycle
       for(uint f = 0; f < m_iWindowOut; f++)
         {
//--- Loop through the elements of the results buffer
          for(uint o = 0; o < m_iNeurons; o++)
            {
             uint shift = o * m_iStep;
             TYPE out = m_cOutputs.m_mMatrix[f, o];
             TYPE gradient = m_cGradients.m_mMatrix[f, o];
//--- Propagate the gradient in accordance with the activation function
             switch(m_eActivation)
               {
                case AF_MAX_POOLING:
                   for(uint i = 0; i < m_iWindow; i++)
                     {
                      if((shift + i) >= cols)
                         break;
                      if(inputs[f, shift + i] == out)
                        {
                         inputs_grad[f, shift + i] += gradient;
                         break;
                        }
                     }
                   break;
                case AF_AVERAGE_POOLING:
                   gradient /= (TYPE)m_iWindow;
                   for(uint i = 0; i < m_iWindow; i++)
                     {
                      if((shift + i) >= cols)
                         break;
                      inputs_grad[f, shift + i] += gradient;
                     }
                   break;
                default:
                   return false;
               }
           }
        }
//--- copy the gradient matrix to the buffer of the previous neural layer
      if(!inputs_grad.Reshape(input_gradient.Rows(), input_gradient.Cols()))
         return false;
      input_gradient.m_mMatrix = inputs_grad;
     }

When using the arithmetic average (AF_AVERAGE_POOLING), the error gradient is equally distributed to all elements in the input data window corresponding to the result element.

When using the maximum value (AF_MAX_POOLING), the entire error gradient is passed on to the element with the maximum value. Moreover, when there are multiple elements with the same maximum value, the error gradient is passed to the element with the minimum index in the result buffer of the previous layer. This choice was made deliberately to enhance the overall efficiency of the neural network. The reason for this is that when passing the same gradient to elements with the same value, we risk getting into a situation where two or more neurons will work synchronously, producing identical results. Duplicating the signal with different neurons doesn't increase the significance of the signal; it only reduces the efficiency of the neural network's operation. After all, when working synchronously, the efficiency of such neurons becomes equal to the work of one neuron. Therefore, by passing the error gradient to only one neuron, we hope that the next time, another element will receive a different gradient value and disrupt the synchronization of neurons' operation.
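This tie-breaking rule can be isolated in a short sketch. The C++ below uses names of my own invention and mirrors only the routing logic: the whole gradient goes to the first input element equal to the window maximum, and later duplicates receive nothing.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Route the gradient of one Max Pooling output back to its input
// window: only the lowest-index element equal to `out` receives it.
void route_max_gradient(const std::vector<double> &window, double out,
                        double gradient, std::vector<double> &grads)
  {
   for(std::size_t i = 0; i < window.size(); i++)
      if(window[i] == out)
        {
         grads[i] += gradient;   // only the first maximum is updated
         break;                  // stop after the first match
        }
  }
```

With the window {2, 5, 5, 1} and output 5, the entire gradient lands on index 1, while the duplicate maximum at index 2 gets zero, which is exactly the desynchronization effect described above.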

After filling the local gradient matrix, we transfer the obtained result to the gradient buffer of the previous layer and exit the method with the result of the operations.

   else
     {
//--- The multi-threaded calculation block will be added in the next chapter
      return false;
     }
//--- Successful completion of the method
   return true;
  }

The methods discussed above describe the main functionality of the pooling layer. For the completeness of the class functionality, it's necessary to add methods for working with files to save information about the trained neural network to a file. The main characteristic of the pooling layer is the absence of a weight matrix. Hence, there are no trainable elements and no need to store any data buffers. To fully restore the functionality of the layer, it's sufficient to save the values of its variables that define the operational parameters of the class.

  • m_iWindow — window size at the input of the neural layer
  • m_iStep — step size of the input window
  • m_iNeurons — output size of one filter
  • m_iWindowOut — number of filters
  • m_eActivation — activation function

bool CNeuronProof::Save(const int file_handle)
  {
//--- Control block
   if(file_handle == INVALID_HANDLE)
      return false;
//--- Save constants
    if(FileWriteInteger(file_handle, Type()) <= 0)
      return false;
   if(FileWriteInteger(file_handle, (int)m_iWindow) <= 0)
      return false;
   if(FileWriteInteger(file_handle, (int)m_iStep) <= 0)
      return false;
   if(FileWriteInteger(file_handle, (int)m_iWindowOut) <= 0)
      return false;
   if(FileWriteInteger(file_handle, (int)m_iNeurons) <= 0)
      return false;
   if(FileWriteInteger(file_handle, (int)m_eActivation) <= 0)
      return false;
//--- Successful completion of the method
   return true;
  }

The method for restoring the layer from a file is slightly more complex than the saving method. In this case, I think the term 'recovery' is more appropriate than 'loading', because the file contains no training information for this layer. From the file, we first read the layer parameters, which carry roughly the same information as the layer description object passed to the initialization method. Then we initialize the result and error gradient buffers.

bool CNeuronProof::Load(const int file_handle)
  {
//--- Control block
   if(file_handle == INVALID_HANDLE)
      return false;
//--- Load constants
   m_iWindow = (uint)FileReadInteger(file_handle);
   m_iStep = (uint)FileReadInteger(file_handle);
   m_iWindowOut = (uint)FileReadInteger(file_handle);
   m_iNeurons = (uint)FileReadInteger(file_handle);
   m_eActivation = (ENUM_PROOF)FileReadInteger(file_handle);
//--- Initialize the results buffer
   if(!m_cOutputs)
     {
      m_cOutputs = new CBufferType();
      if(!m_cOutputs)
         return false;
     }
    if(!m_cOutputs.BufferInit(m_iWindowOut, m_iNeurons, 0))
      return false;
//--- Initialize the error gradient buffer
   if(!m_cGradients)
     {
      m_cGradients = new CBufferType();
      if(!m_cGradients)
         return false;
     }
    if(!m_cGradients.BufferInit(m_iWindowOut, m_iNeurons, 0))
      return false;
//---
   return true;
  }

At this stage, we can say that we have completed the first part of the work on constructing convolutional neural network objects. Now we will move on to the second stage, building a convolutional layer class.

Convolutional layer

The construction of the convolutional layer is carried out in the CNeuronConv class, which we will inherit from the CNeuronProof pooling layer class created above. Inheriting from the pooling layer class does not violate our concept of having all classes in our neural network inherit from a common base class. The pooling layer class is a direct descendant of the base class, and all its descendants will also be descendants of the base class.

At the same time, by inheriting from the pooling layer class, we immediately gain access to all the added and overridden functionality, including variables for working with data windows. Moreover, inheriting objects and variables reinforces the connection between classes and underscores the unity of approaches in data processing.

Thus, thanks to inheritance, in the convolutional layer class CNeuronConv, we will use objects and variables declared in the parent classes. We don't need to declare any new objects and variables. As a consequence, the constructor and destructor of our class remain empty methods. At the same time, the convolutional layer class uses the weight matrix. In this case, we will need to override some previously set stubs.

  • UpdateWeights — the algorithm of the base class method CNeuronBase::UpdateWeights fully meets our needs, so we simply call the parent implementation.
  • GetWeights and GetDeltaWeights return pointers to the corresponding data buffers.

As a result, the class structure will take the following form.

class CNeuronConv    :  public CNeuronProof
  {
public:
                     CNeuronConv(void) {};
                    ~CNeuronConv(void) {};
   //---
   virtual bool      Init(const CLayerDescription *desc) override;
   virtual bool      FeedForward(CNeuronBase *prevLayer);
   virtual bool      CalcHiddenGradient(CNeuronBase *prevLayer);
   virtual bool      CalcDeltaWeights(CNeuronBase *prevLayer);
   virtual bool      UpdateWeights(int batch_size, TYPE learningRate,
                                   VECTOR &Beta, VECTOR &Lambda)
     {
      return CNeuronBase::UpdateWeights(batch_size, learningRate,
                                        Beta, Lambda);
     }
   //---
   virtual CBufferType*  GetWeights(void)      const { return(m_cWeights);     }
   virtual CBufferType*  GetDeltaWeights(void) const { return(m_cDeltaWeights);}
   bool              SetTransposedOutput(const bool value);
   //--- methods for working with files
   virtual bool      Save(const int file_handle);
   virtual bool      Load(const int file_handle);
   //--- object identification method
   virtual int       Type(void)       const { return(defNeuronConv); }
  };

Let's examine the implementation of the Init method that initializes the convolutional layer. It partially combines the initialization methods of both parent classes. Unfortunately, we cannot use either of them directly: the initialization method of the base class creates buffers of incorrect sizes that would need to be re-created, while the initialization method of the pooling layer deletes objects that would later have to be recreated. Therefore, we will write the entire algorithm into the method.

Like similar methods in the parent classes, the initialization method receives a pointer to an object describing the created neural layer in its parameters. As before, the method starts with a control block in which we validate the received pointer, the specified type of the layer being created, and the layer parameters.

bool CNeuronConv::Init(const CLayerDescription *desc)
  {
//--- control block
   if(!desc || desc.type != Type() || desc.count <= 0 || desc.window <= 0)
      return false;

After executing the control block, we save the layer parameters into special variables and initialize the necessary buffers.

//--- save constants
   m_iWindow = desc.window;
   m_iStep = desc.step;
   m_iWindowOut = desc.window_out;
   m_iNeurons = desc.count;
//--- save parameter optimization method
   m_eOptimization = desc.optimization;

First, we initialize the results buffer m_cOutputs. Similar to the pooling layer, we set the number of rows and columns of the buffer matrix equal to the number of filters and the number of elements in one filter, respectively. The buffer is initialized with zero values.

Next, we initialize the m_cGradients error gradient buffer with zero values. We set its size equal to the size of the m_cOutputs results buffer.

//--- initialize the results buffer
   if(!m_cOutputs)
      if(!(m_cOutputs = new CBufferType()))
         return false;
//--- initialize the error gradient buffer
   if(!m_cGradients)
      if(!(m_cGradients = new CBufferType()))
         return false;
    if(!m_cOutputs.BufferInit(m_iWindowOut, m_iNeurons, 0))
      return false;
    if(!m_cGradients.BufferInit(m_iWindowOut, m_iNeurons, 0))
      return false;

Next, we will need to initialize an instance of the activation function object. As you may recall, during the development of the base neural layer class, we decided to separate all the work related to initializing the activation function instance into a separate method called SetActivation. Here we just call this method of the parent class, and check the result of the operations.

//--- initialize the activation function class
    VECTOR params = desc.activation_params;
    if(!SetActivation(desc.activation, params))
      return false;

Then we initialize the weight matrix with random values. The number of rows in the weight matrix equals the number of filters being used, and the number of columns is one greater than the size of the analyzed window; the extra element is used for the bias.

//--- initialize the weight matrix buffer
   if(!m_cWeights)
      if(!(m_cWeights = new CBufferType()))
         return false;
    if(!m_cWeights.BufferInit(desc.window_out, desc.window + 1))
      return false;
   double weights[];
   double sigma = desc.activation == AF_LRELU ?
                  2.0 / (double)(MathPow(1 + desc.activation_params[0], 2) *
                                                                 desc.window) :
                  1.0 / (double)desc.window;
    if(!MathRandomNormal(0, MathSqrt(sigma), m_cWeights.Total(), weights))
      return false;
    for(uint i = 0; i < m_cWeights.Total(); i++)
      if(!m_cWeights.m_mMatrix.Flat(i, (TYPE)weights[i]))
         return false;
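The variance used above can be restated as a standalone function. This is a C++ sketch with a name of my own: it reproduces the He-style scaling by the window size, with the (1 + a)^2 correction applied for leaky ReLU's negative slope a, exactly as in the code above.

```cpp
#include <cassert>
#include <cmath>

// Variance of the normal distribution used to draw initial weights.
double init_sigma(bool leaky_relu, double a, double window)
  {
   return leaky_relu ? 2.0 / (std::pow(1.0 + a, 2.0) * window)
                     : 1.0 / window;
  }
```

For a window of 4, this gives a variance of 0.25 for ordinary activations and 0.5 for leaky ReLU with a zero negative slope.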

At the end of the method, we initialize the buffers involved in the learning process. These are: a buffer of weight deltas (also known as a buffer of accumulated gradients) and moment buffers. Recall that the number of moment buffers used depends on the user-specified method for optimizing model parameters. The sizes of the specified buffers will correspond to the size of the weights matrix.

//--- initialize the gradient buffer at the weight matrix level
   if(!m_cDeltaWeights)
      if(!(m_cDeltaWeights = new CBufferType()))
         return false;
   if(!m_cDeltaWeights.BufferInit(desc.window_out, desc.window + 1, 0))
      return false;
//--- initialize moment buffers
   switch(desc.optimization)
     {
      case None:
      case SGD:
         for(int i = 0; i < 2; i++)
            if(m_cMomenum[i])
               delete m_cMomenum[i];
         break;
      case MOMENTUM:
      case AdaGrad:
      case RMSProp:
         if(!m_cMomenum[0])
            if(!(m_cMomenum[0] = new CBufferType()))
               return false;
          if(!m_cMomenum[0].BufferInit(desc.window_out, desc.window + 1, 0))
            return false;
         if(m_cMomenum[1])
            delete m_cMomenum[1];
         break;
      case AdaDelta:
      case Adam:
         for(int i = 0; i < 2; i++)
           {
            if(!m_cMomenum[i])
               if(!(m_cMomenum[i] = new CBufferType()))
                  return false;
             if(!m_cMomenum[i].BufferInit(desc.window_out, desc.window + 1, 0))
               return false;
           }
         break;
      default:
         return false;
         break;
     }
   return true;
  }
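The optimizer switch above boils down to how many moment buffers each parameter-update method maintains. A hedged C++ sketch of that mapping (the enum names are illustrative and do not match the library constants exactly):

```cpp
#include <stdexcept>

// Illustrative mirror of the optimizer switch: the number of moment
// buffers each update method requires.
enum Optimizer { SGD, MOMENTUM, ADAGRAD, RMSPROP, ADADELTA, ADAM };

int moment_buffers(Optimizer opt) {
    switch (opt) {
        case SGD:      return 0; // plain gradient descent, no moments
        case MOMENTUM:
        case ADAGRAD:
        case RMSPROP:  return 1; // one running average
        case ADADELTA:
        case ADAM:     return 2; // two running averages
    }
    throw std::invalid_argument("unknown optimizer");
}
```

In the listing above, the same decision is expressed by deleting unused buffers and initializing the required ones to the size of the weight matrix.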

After initializing the class, we will move on to the forward pass method, which we will create in the overridden virtual method FeedForward. This way, we continue to exploit the concepts of inheritance and virtualization of class methods. In its parameters, the feed-forward pass method receives a pointer to the object of the previous layer, just like all the similar methods in the parent classes.

At the beginning of the method, as usual, we will insert a control block for checking the source data. In this method, we validate the received pointer to the object of the preceding neural layer and check for the presence of an 'active' result buffer in it. We also check whether the result buffer and weight matrix of the current layer have been created. To simplify the data access procedure for the result buffer of the preceding layer, we will store a pointer to this object in a local variable.

bool CNeuronConv::FeedForward(CNeuronBase *prevLayer)
  {
//--- control block
   if(!prevLayer || !m_cOutputs || !m_cWeights || !prevLayer.GetOutputs())
      return false;
   CBufferType *input_data = prevLayer.GetOutputs();
   ulong total = input_data.Total();

Next, we divide the algorithm into two threads depending on the execution device. We will discuss the algorithm for constructing multi-threaded calculations using OpenCL technology in the next chapter. Now let's look at the algorithm for arranging operations using MQL5.

The feed-forward algorithm of the convolutional layer somewhat resembles the analogous pooling layer method. This is understandable: both layers work with a data window that moves across the source data array with a given step. The difference lies in how the set of values falling into the window is processed.

Another difference lies in the approach to the perception of the array of initial data. The pooling layer in the convolutional neural network algorithm is placed after the convolutional layer, which can contain multiple filters. Consequently, the result buffer will contain the results of processing the data by multiple filters. The pooling layer is supposed to separate the results of one filter from another. In the convolutional layer, I chose to simplify this aspect, so I treat the entire input array as a single vector of input data. This approach allows us to simplify the method algorithm without losing the quality of the neural network in general.

Let's return to the algorithm. Before using matrix operations, we need to transform the vector of input data into a matrix whose number of rows equals the output size of one filter and whose number of columns corresponds to the size of the analyzed input window. Here, two scenarios are possible: the size of the analyzed window is either equal to its step or not.

In the first case, we can simply reformat the vector into a matrix. In the second case, we need to create a loop system for copying data.

//--- branching of the algorithm depending on the execution device
   if(!m_cOpenCL)
     {
      MATRIX m;
      if(m_iWindow == m_iStep && total == (m_iNeurons * m_iWindow))
        {
         m = input_data.m_mMatrix;
          if(!m.Reshape(m_iNeurons, m_iWindow))
            return false;
        }
      else
        {
          if(!m.Init(m_iNeurons, m_iWindow))
            return false;
          for(ulong r = 0; r < m_iNeurons; r++)
           {
            ulong shift = r * m_iStep;
            for(ulong c = 0; c < m_iWindow; c++)
              {
               ulong k = shift + c;
                m[r, c] = (k < total ? input_data.At((uint)k) : 0);
              }
           }
        }

Then we append a bias column consisting of ones to the resulting matrix and multiply the weight matrix by the transposed data matrix.

//--- add a bias column
      if(!m.Resize(m.Rows(), m_iWindow + 1) ||
         !m.Col(VECTOR::Ones(m_iNeurons), m_iWindow))
         return false;
//--- Calculate the weighted sum of elements of the input window
      m_cOutputs.m_mMatrix = m_cWeights.m_mMatrix.MatMul(m.Transpose());
     }

Finally, we call the Activation method of the activation function object and complete the method.

   else
     {
//--- The multi-threaded calculation block will be added in the next chapter
      return false;
     }
   if(!m_cActivation.Activation(m_cOutputs))
      return false;
//--- Successful completion of the method
   return true;
  }

After completing work on the feed-forward pass, we will move on to working on the backpropagation pass. Unlike the pooling layer, the convolutional layer contains a weight matrix. Therefore, to organize the pass, we need a full set of methods.

Looking ahead a little, I will say that the weight matrix update method from the base class suits us perfectly. However, since we inherited not directly from the CNeuronBase class but from the pooling layer CNeuronProof, where this method was replaced by a stub, we have to call the base class method explicitly.

bool CNeuronConv::UpdateWeights(int batch_size, TYPE learningRate,
                                   VECTOR &Beta, VECTOR &Lambda)
     {
      return CNeuronBase::UpdateWeights(batch_size, learningRate, Beta, Lambda);
     }
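This explicitly qualified call, which skips the stub in the intermediate class and reaches the base implementation, is standard inheritance mechanics. A minimal C++ analogue (class names are illustrative, not the library's):

```cpp
#include <string>

// Minimal analogue of the class chain: the middle class stubs out a
// virtual method, and the leaf restores base behavior via a qualified call.
struct Base {
    virtual std::string Update() { return "base update"; }
    virtual ~Base() {}
};
struct Proof : Base {
    std::string Update() override { return "stub"; }   // pooling-layer stub
};
struct Conv : Proof {
    std::string Update() override { return Base::Update(); } // skip the stub
};
```

A virtual call through a base pointer on a `Conv` object now reaches the base implementation, even though the intermediate class overrode it.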

But let's return to the logical chain of the backpropagation algorithm and take a look at the method for distributing the gradient through the hidden layer, CNeuronConv::CalcHiddenGradient.

If you look at the influence of the elements of the initial data on the elements of the results, you will notice a dependence. Each element of the resulting vector analyzes a block of data from the initial data vector in the size of the specified window. Similarly, each element of the initial data affects the value of elements in the result vector within a certain influence window. The size of this window depends on the step with which the input window moves across the source data array. With a step equal to one, both windows are equal. However, as the step increases, the size of the influence window decreases. Consequently, to propagate the error gradient, we need to collect error gradients from elements of the subsequent layer within the influence window.

I propose to look at the practical implementation of this method. We continue working with the virtual methods of the parent classes. In the parameters, the method receives a pointer to the object of the previous layer. Following the same pattern as with other methods, we start with a data validation block at the beginning of the method. Here, we validate the received pointer in the parameters and check for the presence of valid objects for output value buffers and error gradients of the previous layer. We also check for the presence of the error gradient buffer and weight matrix of the current layer.

bool CNeuronConv::CalcHiddenGradient(CNeuronBase *prevLayer)
  {
//--- control block
   if(!prevLayer || !prevLayer.GetOutputs() || !prevLayer.GetGradients() ||
      !m_cGradients || !m_cWeights)
      return false;

After successfully passing the control block, we will adjust the error gradient by the derivative of the activation function of the current layer.

//--- adjusting error gradients to the derivative of the activation function
   if(m_cActivation)
     {
      if(!m_cActivation.Derivative(m_cGradients))
         return false;
     }

Next comes the branching of the algorithm depending on the computing device used. We are currently looking at the MQL5 branch.

The backpropagation method mirrors the feed-forward method. During the feed-forward pass, we first transfer the input data to a local matrix and then multiply it by the weight matrix. During the backpropagation pass, we first reformat the current layer's error gradient matrix, received from the subsequent layer, into the required shape and then multiply it by the weight matrix.

//--- branching of the algorithm depending on the execution device
   CBufferType *input_gradient = prevLayer.GetGradients();
   if(!m_cOpenCL)
     {
      MATRIX g = m_cGradients.m_mMatrix;
       if(!g.Reshape(m_iWindowOut, m_iNeurons))
         return false;
      g = g.Transpose();
      g = g.MatMul(m_cWeights.m_mMatrix);
       if(!g.Resize(m_iNeurons, m_iWindow))
         return false;

As a result of matrix multiplication, we obtain a matrix of gradients for the previous layer. However, the process becomes more complex due to the presence of the analyzed window and its step. If they are equal, we just need to reformat the matrix and copy its value to the buffer of the previous layer. But if the size of the analyzed window of the source data is not equal to its step, then we will need to organize a loop system for copying and summing gradients. Indeed, in this case, one neuron of the source data influences several neurons of the results of each filter.

      if(m_iWindow == m_iStep && input_gradient.Total() == (m_iNeurons * m_iWindow))
        {
         if(!g.Reshape(input_gradient.Rows(), input_gradient.Cols()))
            return false;
         input_gradient.m_mMatrix = g;
        }
      else
        {
         input_gradient.m_mMatrix.Fill(0);
         ulong total = input_gradient.Total();
          for(ulong r = 0; r < m_iNeurons; r++)
           {
            ulong shift = r * m_iStep;
            for(ulong c = 0; c < m_iWindow; c++)
              {
               ulong k = shift + c;
               if(k >= total)
                  break;
               if(!input_gradient.m_mMatrix.Flat(k,
                                   input_gradient.m_mMatrix.Flat(k) + g[r, c]))
                  return false;
              }
           }
        }
     }

After completing the loop iterations, we exit the method with a positive result.

   else
     {
//--- The multi-threaded calculation block will be added in the next chapter
      return false;
     }
//--- Successful completion of the method
   return true;
  }

After distributing the gradient through the hidden layer, it's time to calculate the error gradient on the elements of the weight matrix. After all, it is the weights that we will select for optimal operation of the neural network. All the work on propagating the error gradient is necessary only to determine the direction and magnitude of the weight adjustments. This approach makes selecting the optimal weight matrix directed and controllable.

Work on distributing the error gradient over the elements of the weight matrix is implemented in the CalcDeltaWeights method. This method is also virtual and is overridden in each class. In the parameters, the method receives a pointer to the object of the previous layer. At the beginning of the method, we immediately check the correctness of the received pointer and the presence of operational data buffers in the current and previous neural layers. To calculate the gradient on the weight matrix, we will need a buffer for incoming gradients, a buffer for input data (results from the previous layer), and a buffer to store the obtained results (m_cDeltaWeights). Let me remind you that our algorithm includes gradient distribution at each iteration of the backward pass, and the weight matrix update is triggered by a request from an external program. Therefore, in the m_cDeltaWeights buffer, we will accumulate the error gradient value. During the update, we will divide the accumulated value by the number of completed iterations. Thus, we obtain the average error for each weight.

bool CNeuronConv::CalcDeltaWeights(CNeuronBase *prevLayer)
  {
//--- control block
   if(!prevLayer || !prevLayer.GetOutputs() || !m_cGradients || !m_cDeltaWeights)
      return false;

To simplify access to the data buffer of the previous layer, we will save the pointer to the object in a local variable.

Next, we divide the algorithm into two logical threads of operations depending on the computational device in use.

//--- branching of the algorithm depending on the execution device
   CBufferType *input_data = prevLayer.GetOutputs();
   if(!m_cOpenCL)
     {

We will discuss the implementation of the OpenCL algorithm in the next chapter. Now we will focus on the implementation using MQL5.

We have a two-dimensional weight matrix in which one dimension represents the filters of our layer. Each row of the weight matrix is a separate filter, so the number of rows equals the number of filters used. The second dimension (columns) represents the elements of a filter; their number equals the size of the input window plus one element for the bias.

However, since the filter window moves across the input data array, each element of the filter affects the result of every element in the current layer's result vector. Therefore, for each filter element, we need to collect error gradients from all elements of the result vector, which are stored in the m_cGradients buffer. Matrix operations will help us with this. But first, let me remind you that during the forward pass, we transformed the vector of the original data. Let's repeat this process.

      MATRIX inp;
      uint input_total = input_data.Total();
      if(m_iWindow == m_iStep && input_total == (m_iNeurons * m_iWindow))
        {
         inp = input_data.m_mMatrix;
          if(!inp.Reshape(m_iNeurons, m_iWindow))
            return false;
        }
      else
        {
          if(!inp.Init(m_iNeurons, m_iWindow))
            return false;
          for(ulong r = 0; r < m_iNeurons; r++)
           {
            ulong shift = r * m_iStep;
            for(ulong c = 0; c < m_iWindow; c++)
              {
               ulong k = shift + c;
                inp[r, c] = (k < input_total ? input_data.At((uint)k) : 0);
              }
           }
        }
      //--- add a bias column
      if(!inp.Resize(inp.Rows(), m_iWindow + 1) ||
         !inp.Col(VECTOR::Ones(m_iNeurons), m_iWindow))
         return false;

Next, we will directly collect error gradients for filter elements. Similar to the fully connected layer, the weight gradient in the convolutional layer is equal to the product of the neuron error gradient and the value of the corresponding element of the input data. In terms of matrix operations, all we need to do is multiply the gradient matrix before the activation function by the reformatted matrix of input data.

      MATRIX g = m_cGradients.m_mMatrix;
       if(!g.Reshape(m_iWindowOut, m_iNeurons))
         return false;
       m_cDeltaWeights.m_mMatrix += g.MatMul(inp);
     }

We will add the obtained result to the previously accumulated error gradients in the m_cDeltaWeights matrix.

   else
     {
//--- The multi-threaded calculation block will be added in the next chapter
      return false;
     }
//--- Successful completion of the method
   return true;
  }

We will become familiar with the algorithm for implementing multi-threaded computations in the next chapter, and at this stage, we exit the method with a positive result.

We've already discussed the weight update method earlier. We still need to create the methods for working with files, because we should be able to save and later load a previously trained neural network. Here, too, we will rely on the groundwork laid earlier: similar methods already exist in two parent classes, the base neural layer class CNeuronBase and the pooling layer CNeuronProof. The pooling layer methods are greatly simplified since that layer contains neither a weight matrix nor the objects for training it. Therefore, we will use the base class method, calling it explicitly from the CNeuronConv::Save method. This approach eliminates redundant checks, as they are already implemented in the parent class method; we only have to verify its result. However, that alone is not enough, because the new layers introduce additional variables. So, after executing the parent class method, we write the missing parameters to the file.

bool CNeuronConv::Save(const int file_handle)
  {
//--- call the method of the parent class
   if(!CNeuronBase::Save(file_handle))
      return false;
//--- save constant values
   if(FileWriteInteger(file_handle, (int)m_iWindow) <= 0)
      return false;
   if(FileWriteInteger(file_handle, (int)m_iStep) <= 0)
      return false;
   if(FileWriteInteger(file_handle, (int)m_iWindowOut) <= 0)
      return false;
   if(FileWriteInteger(file_handle, (int)m_iNeurons) <= 0)
      return false;
   if(FileWriteInteger(file_handle, (int)m_bTransposedOutput) <= 0)
      return false;
//---
   return true;
  }

The data loading is organized on the same principle. First, we need to read the data from the file in the same order in which it was written there. Hence, we will first call the method of the parent class. In it, all the controls are already implemented, and the sequence of data loading is observed. We only need to check the result returned by the parent class method, and after successful execution, read additional parameters from the file in the same sequence in which they were saved.

bool CNeuronConv::Load(const int file_handle)
  {
//--- calling the method of the parent class
   if(!CNeuronBase::Load(file_handle))
      return false;
//--- reading the values of constants
   m_iWindow = (uint)FileReadInteger(file_handle);
   m_iStep = (uint)FileReadInteger(file_handle);
   m_iWindowOut = (uint)FileReadInteger(file_handle);
   m_iNeurons = (uint)FileReadInteger(file_handle);
   m_bTransposedOutput = (bool)FileReadInteger(file_handle);
   m_eActivation = -1;
//---
   if(!m_cOutputs.Reshape(m_iWindowOut, m_iNeurons))
      return false;
   if(!m_cGradients.Reshape(m_iWindowOut, m_iNeurons))
      return false;
//---
   return true;
  }
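The write-order/read-order symmetry that Save and Load rely on can be demonstrated with a minimal round trip (a string stream stands in for the MQL5 file handle; the struct and function names are illustrative):

```cpp
#include <sstream>

// The save/load contract: read back in exactly the order written.
struct LayerParams { int window, step, window_out, neurons; };

void save(std::ostream &os, const LayerParams &p) {
    os << p.window << ' ' << p.step << ' '
       << p.window_out << ' ' << p.neurons;
}

bool load(std::istream &is, LayerParams &p) {
    // Any reordering here would silently scramble the restored layer.
    return static_cast<bool>(is >> p.window >> p.step
                                >> p.window_out >> p.neurons);
}
```

A round trip restores the original values only because both sides agree on the sequence, which is exactly why the Load method mirrors the Save method field by field.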

In this section, we created two new types of neural layers: pooling and convolutional. In the next section, we will extend their functionality with the ability to use OpenCL to organize parallel computations using multi-threading technologies. Then, in the comparative testing block, we will assemble a small neural network and compare the performance of the new architectural solution with the testing results previously obtained for fully connected neural networks.