Building an LSTM block in MQL5

Among all the architectural options for recurrent neurons, I have chosen the classical LSTM block to implement in our library. In my opinion, the presence of filters for new information and memory content in the form of gates will help minimize the influence of the noisy component of the signal, while a separate memory channel will help retain information over a longer period.

As before, to create a new type of neural layer, we will create a new class CNeuronLSTM. To maintain inheritance, the new class will be created based on our CNeuronBase neural layer base class.

class CNeuronLSTM    :  public CNeuronBase
  {
public:
                     CNeuronLSTM(void);
                    ~CNeuronLSTM(void);
   //--- method of identifying the object
   virtual int       Type(void)               const { return(defNeuronLSTM); }
  };

Since we apply the inheritance mechanism, our new class immediately possesses the basic functionality that was previously implemented in the parent class. Now we need to refine this functionality for the correct operation of our recurrent block. The first step, shown in the declaration above, is to override the virtual identification method.

As you know from the description of the LSTM block architecture presented in the previous chapter, we will need four fully connected layers for its proper operation. We'll declare them in the protected block of our class. And to maintain code readability, we will name them in accordance with the functionality laid out in the algorithm.

class CNeuronLSTM    :  public CNeuronBase
  {
protected:
   CNeuronBase*       m_cForgetGate;
   CNeuronBase*       m_cInputGate;
   CNeuronBase*       m_cNewContent;
   CNeuronBase*       m_cOutputGate;

In addition to the created neural layers, the block algorithm uses a memory flow and a hidden state. We will need separate buffers to store them. We will also need the history of the internal layers' states during training. Therefore, to store this information, we will create dynamic arrays, which we will also declare in the protected block:

  • m_cMemorys — memory state;
  • m_cHiddenStates — hidden state;
  • m_cInputs — concatenated array of raw data and hidden state;
  • m_cForgetGateOuts — state of the forget gate;
  • m_cInputGateOuts — state of the input gate;
  • m_cNewContentOuts — new content;
  • m_cOutputGateOuts — output gate state.

class CNeuronLSTM    :  public CNeuronBase
  {
protected:
   ....   
   CArrayObj*       m_cMemorys;
   CArrayObj*       m_cHiddenStates;
   CArrayObj*       m_cInputs;
   CArrayObj*       m_cForgetGateOuts;
   CArrayObj*       m_cInputGateOuts;
   CArrayObj*       m_cNewContentOuts;
   CArrayObj*       m_cOutputGateOuts;

Of course, in the process of operating a neural network, we cannot accumulate a history of states indefinitely because our resources are finite. Therefore, we need a limit on how much history the buffers may hold. Once a buffer grows beyond this limit, we will remove the oldest data and replace it with new data. The depth of history used for training the recurrent block will serve as this limit. This parameter will be user-defined and stored in the m_iDepth variable.

class CNeuronLSTM    :  public CNeuronBase
  {
protected:
   ....   
   int               m_iDepth;

Continuing the discussion about declaring auxiliary variables for the class, there is another point to pay attention to. All four internal neural layers use the same input data: the concatenated tensor of the original data and the hidden state. The CalcHiddenGradient method of our base class, which passes the gradient to the previous layer, is constructed so that it replaces the error gradient values in the buffer of the previous layer. However, we need to sum the error gradients from all internal flows. Therefore, to accumulate the sum of the gradients, we will add another buffer, m_cInputGradient.

class CNeuronLSTM    :  public CNeuronBase
  {
protected:
   ....   
   CBufferType*       m_cInputGradient;
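
To illustrate the idea (the actual backpropagation methods are built later), the accumulation boils down to an element-wise summation of the gradients that the four internal flows propagate to the shared concatenated input. The buffer names in the sketch below are hypothetical and serve only to show the pattern:

// Sketch of the accumulation idea only; the real backward pass comes later.
// grad_forget, grad_input, grad_new, grad_output are hypothetical CBufferType
// objects holding the gradient each internal flow passes to the shared input.
   m_cInputGradient.m_mMatrix = grad_forget.m_mMatrix;                                 // first flow replaces the buffer
   m_cInputGradient.m_mMatrix = m_cInputGradient.m_mMatrix + grad_input.m_mMatrix;     // the remaining flows are summed
   m_cInputGradient.m_mMatrix = m_cInputGradient.m_mMatrix + grad_new.m_mMatrix;
   m_cInputGradient.m_mMatrix = m_cInputGradient.m_mMatrix + grad_output.m_mMatrix;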

It seems we've sorted out the variables. Now let's start building the class methods, beginning with the constructor, CNeuronLSTM::CNeuronLSTM. In this method, we create instances of the objects used and set initial values for the internal variables.

CNeuronLSTM::CNeuronLSTM(void)   : m_iDepth(2)
  {
   m_cForgetGate = new CNeuronBase();
   m_cInputGate = new CNeuronBase();
   m_cNewContent = new CNeuronBase();
   m_cOutputGate = new CNeuronBase();
   m_cMemorys = new CArrayObj();
   m_cHiddenStates = new CArrayObj();
   m_cInputs = new CArrayObj();
   m_cForgetGateOuts = new CArrayObj();
   m_cInputGateOuts = new CArrayObj();
   m_cNewContentOuts = new CArrayObj();
   m_cOutputGateOuts = new CArrayObj();
   m_cInputGradient = new CBufferType();
  }

We immediately create the class destructor, CNeuronLSTM::~CNeuronLSTM, in which the reverse operation takes place: memory is released once the class has finished its work. Here it is important to ensure complete memory cleanup so that nothing is missed.

CNeuronLSTM::~CNeuronLSTM(void)
  {
   if(m_cForgetGate)
      delete m_cForgetGate;
   if(m_cInputGate)
      delete m_cInputGate;
   if(m_cNewContent)
      delete m_cNewContent;
   if(m_cOutputGate)
      delete m_cOutputGate;
   if(m_cMemorys)
      delete m_cMemorys;
   if(m_cHiddenStates)
      delete m_cHiddenStates;
   if(m_cInputs)
      delete m_cInputs;
   if(m_cForgetGateOuts)
      delete m_cForgetGateOuts;
   if(m_cInputGateOuts)
      delete m_cInputGateOuts;
   if(m_cNewContentOuts)
      delete m_cNewContentOuts;
   if(m_cOutputGateOuts)
      delete m_cOutputGateOuts;
   if(m_cInputGradient)
      delete m_cInputGradient;
  }

 

Object initialization

Next, let's take a look at the initialization method of our class, CNeuronLSTM::Init. It is in this method that all internal objects and variables are created and initialized, and the foundation necessary for the normal operation of the neural layer is prepared in accordance with the user-defined requirements. We declared a similar virtual method in our neural layer base class and override it in each of our new classes.

class CNeuronLSTM    :  public CNeuronBase
  {
protected:
   ....   
public:
                     CNeuronLSTM(void);
                    ~CNeuronLSTM(void);
   //---
   virtual bool      Init(const CLayerDescription *desc) override;

As you know, the corresponding method of the base class receives the description of the neural layer being created in its parameters. So, our method will receive a pointer to an instance of the CLayerDescription class. Therefore, at the beginning of the method, we check the validity of the received pointer and the parameters specified in it. First of all, the type of neural layer it specifies must match our class. Also, our LSTM block cannot be used as an input layer and must contain at least one neuron at the output.

bool CNeuronLSTM::Init(const CLayerDescription *desc)
  {
//--- Control block
   if(!desc || desc.type != Type() || desc.count <= 0 || desc.window == 0)
      return false;

Using the LSTM block as a source data layer would simply be a waste of resources: we would create a large number of additional objects that would never be used, since in the input layer the information is written directly to the output buffer.

Next, we have to initialize our internal neural layers. To do this, we will call the same Init method on those objects, which means we need to pass each of them an appropriate instance of the CLayerDescription class. We can't simply pass on the description of the recurrent block received from the user, because we need to create objects of a different kind. So, first, we will prepare a description of the objects to be created:

  • All internal neural layers are fully connected. Hence, we create base class objects. Therefore, we will specify the defNeuronBase type in the type parameter.
  • All of them take as input a single tensor, which is a combination of the original data vector and the hidden state. We get the size of the source data vector in the method parameters (the CLayerDescription.window parameter). The size of the hidden state vector is equal to the size of the output buffer of the current layer, a value we also get in the method parameters (the CLayerDescription.count parameter). The sum of these two values is written to the window parameter.
  • If you look carefully at the LSTM block diagram in the previous section, you will see that all internal information flows have the same size. The forget gate output vector is multiplied element-wise by the memory flow, which means their sizes are equal. Similarly, the input gate result vector is multiplied element-wise by the result of the new content layer. This product is then summed element-wise with the memory flow. Finally, everything is multiplied element-wise by the output gate. It becomes clear that all flows have the size of the output buffer of the current block. So, in the count parameter, we write the value of the corresponding element from the external parameters of the method.
  • The activation function is defined by the architecture of the LSTM block. All gates are activated by a sigmoid and the new content layer by a hyperbolic tangent. Along with the activation function, we will specify its corresponding parameters.
  • We will transfer the optimization method specified by the user.

//--- create a description for the inner neural layers
   CLayerDescription *temp = new CLayerDescription();
   if(!temp)
      return false;
   temp.type = defNeuronBase;
   temp.window = desc.window + desc.count;
   temp.count = desc.count;
   temp.activation = AF_SIGMOID;
   temp.activation_params[0] = 1;
   temp.activation_params[1] = 0;
   temp.optimization = desc.optimization;

After preparing the description for the internal neural layers, we return to the part inherited from the parent class. All the block parameters are hidden within the internal neural layers, so there is no need for us to keep an additional weight matrix in memory, nor the associated delta and momentum buffers. In addition, we do not plan to use the CActivation activation class object. Essentially, the functionality of the input layer from the base class is sufficient for us. To initialize the necessary objects and remove the excess ones, we zero out the size of the input data in the description of the recurrent block and call the initialization method of the parent class.

//--- call the parent class initialization method
   CLayerDescription *temp2=new CLayerDescription();
   if(!temp2 || !temp2.Copy(desc))
     return false;
   temp2.window = 0;
   if(!CNeuronBase::Init(temp2))
      return false;
   delete temp2;

To obtain the history depth for training the recurrent block from the user, we will use the window_out element. We will save the received value in a specially prepared variable. We did not check this value at the beginning of the method so as not to block the operation of the neural network. Instead, we simply limit the lower bound of the stored value. Therefore, if the user forgets to specify a value or indicates an unreasonably low one, the neural network will use the value that we have set.

   if(!InsertBuffer(m_cHiddenStates, m_cOutputs, false))
      return false;
   m_iDepth = (int)fmax(desc.window_out, 2);
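
To make the role of these parameters a bit more tangible, a user-side description of the recurrent layer might look roughly as follows. This is only an illustrative sketch: the numeric values and the optimization constant are assumptions, not values taken from the library.

// Illustrative only: a hypothetical description of an LSTM layer in a model.
   CLayerDescription *lstm = new CLayerDescription();
   lstm.type         = defNeuronLSTM;  // our new recurrent block
   lstm.window       = 10;             // size of the input data vector (illustrative)
   lstm.count        = 20;             // number of neurons at the block output (illustrative)
   lstm.window_out   = 5;              // depth of history for training, stored in m_iDepth
   lstm.optimization = Adam;           // parameter optimization method (constant name assumed)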

Next, we move on to initializing our gates. The forget gate will be initialized first. Before calling the gate object's initialization method, we need to verify the validity of the pointer to the object and, if necessary, create a new instance of the object. If the attempt to create a new instance fails, we exit the method with the result false. Once we have a valid object instance, we initialize the gate.

//--- initialize ForgetGate
   if(!m_cForgetGate)
     {
      if(!(m_cForgetGate = new CNeuronBase()))
         return false;
     }
   if(!m_cForgetGate.Init(temp))
      return false;
   if(!InsertBuffer(m_cForgetGateOuts, m_cForgetGate.GetOutputs(), false))
      return false;

The same operations are performed for the other two gates.

//--- initialize InputGate
   if(!m_cInputGate)
     {
      if(!(m_cInputGate = new CNeuronBase()))
         return false;
     }
   if(!m_cInputGate.Init(temp))
      return false;
   if(!InsertBuffer(m_cInputGateOuts, m_cInputGate.GetOutputs(), false))
      return false;

//--- initialize OutputGate
   if(!m_cOutputGate)
     {
      if(!(m_cOutputGate = new CNeuronBase()))
         return false;
     }
   if(!m_cOutputGate.Init(temp))
      return false;
   if(!InsertBuffer(m_cOutputGateOuts, m_cOutputGate.GetOutputs(), false))
      return false;

The new content layer will be initialized in the same way; we only change the activation function type in the layer description beforehand.

//--- initialize NewContent
   if(!m_cNewContent)
     {
      if(!(m_cNewContent = new CNeuronBase()))
         return false;
     }
   temp.activation = AF_TANH;
   if(!m_cNewContent.Init(temp))
      return false;
   if(!InsertBuffer(m_cNewContentOuts, m_cNewContent.GetOutputs(), false))
      return false;

After initializing the internal layers, we will move on to the other objects of our LSTM recurrent block. We initialize the gradient accumulation buffer. As in the case of neural layers, we first verify the validity of the object pointer. If necessary, we create a new instance of the class. Then we fill the entire buffer with zero values. We take the buffer size from the previously prepared description of the internal neural layers.

//--- initialize the InputGradient buffer
   if(!m_cInputGradient)
     {
      if(!(m_cInputGradient = new CBufferType()))
         return false;
     }
   if(!m_cInputGradient.BufferInit(1, temp.window, 0))
      return false;
   delete temp;

It should be noted that after initializing the buffer for accumulating gradient values, we will no longer use the object describing the internal neural layers. Therefore, we can delete the unnecessary object.

To conclude, all that remains is to create the buffers for the memory flow and the hidden state and fill them with zero values. Note that both buffers will be used on the very first forward pass, and their absence would paralyze the entire neural network. A separate method, CreateBuffer, has been added to create these buffers; we will consider it later.

So, first, we create a memory buffer. We declare a temporary variable and call the CreateBuffer method, which is expected to return a pointer to the buffer object. After obtaining the pointer, we of course check its validity. If an error occurs, we exit the method with the result false.

Next, we check whether there are already objects in the memory stack. Since this is the class instance initialization method, we expect the stack to be empty. If, however, the stack contains any information, we clear it and fill the created buffer with zero values. After that, we place our buffer into the memory stack.

//--- initialize Memory
   CBufferType *buffer =  CreateBuffer(m_cMemorys);
   if(!buffer)
      return false;
   if(!InsertBuffer(m_cMemorys, buffer, false))
     {
      delete buffer;
      return false;
     }

As a result of executing this code block, we expect the memory stack to contain a single zero-filled memory buffer. Please note that at the end of the block we do not delete the buffer object, even though the variable's scope does not extend beyond this method. The reason is that we are working with object pointers here. By putting a pointer on the stack, we can always retrieve it from there later. Conversely, if we deleted the object pointed to by the variable, the stack would be left holding a pointer to a deleted object, with all the resulting consequences. The object will actually be deleted either when the stack exceeds its limit or when the class instance itself is destroyed.

We repeat the same operations for the hidden state buffer.

//--- initialize HiddenStates
   if(!(buffer =  CreateBuffer(m_cHiddenStates)))
      return false;
   if(!InsertBuffer(m_cHiddenStates, buffer, false))
     {
      delete buffer;
      return false;
     }

Lastly, we pass the current pointer to the OpenCL object to all internal objects and exit the method.

//---
   SetOpenCL(m_cOpenCL);
//---
   return true;
  }

We have considered the algorithm of the class initialization method. However, as you may have noticed, during the execution of the algorithm, we used two methods of the class: SetOpenCL and CreateBuffer. The first method exists in the parent class, but for proper functionality, we will need to override it. The second method is new.

The CreateBuffer method was used in the initialization method to create a new buffer. Looking a bit ahead, we will use it in a broader context. As you know from the architecture of the LSTM recurrent block we are building, on each feed-forward pass we will need to extract the latest hidden state and memory vectors from the stack. We will also assign this functionality to the CreateBuffer method.

Since we anticipate the method working with multiple stacks, we will pass a pointer to a specific stack as a parameter to the method. The result of the method execution will be a pointer to the desired buffer. We declare the method in the protected block of our class.

class CNeuronLSTM    :  public CNeuronBase
  {
protected:
   ....   
   CBufferType*       CreateBuffer(CArrayObj *&array);

At the beginning of the method body, as usual, we check the received stack pointer. However, if we receive an invalid pointer, we don't rush to exit the method with an error; instead, we try to create a new stack. Only if we fail to create a new stack do we exit the method.

Remember, the code that invokes the method expects to receive not just the logical state of the method execution but a pointer to the buffer. Therefore, in case of an error, we return NULL instead of the expected pointer.

CBufferType *CNeuronLSTM::CreateBuffer(CArrayObj *&array)
  {
   if(!array)
     {
      array = new CArrayObj();
      if(!array)
         return NULL;
     }

Next, we create a new buffer and immediately check the result.

   CBufferType *buffer = new CBufferType();
   if(!buffer)
      return NULL;

After successfully creating the buffer, we split the algorithm into two branches. If there are no buffers on the stack yet, we fill the newly created buffer with zero values. If there is already information on the stack, we copy the latest state into the buffer. Then we return the buffer pointer to the calling program.

   if(array.Total() <= 0)
     {
      if(!buffer.BufferInit(m_cOutputs.Rows(), m_cOutputs.Cols(), 0))
        {
         delete buffer;
         return NULL;
        }
     }

   else
     {
      CBufferType *temp = array.At(0);
      if(!temp)
        {
         delete buffer;
         return NULL;
        }
      buffer.m_mMatrix = temp.m_mMatrix;
     }
//---
   if(m_cOpenCL)
     {
      if(!buffer.BufferCreate(m_cOpenCL))
        {
         delete buffer;
         return NULL;
        }
     }
//---
   return buffer;
  }

Note that when I refer to the latest data, I copy the buffer with index 0. This class implements a reverse stack logic: each new buffer is inserted at the beginning of the stack, pushing the older ones down, and when the stack is full, the oldest entries at the end are removed.

The second point: we don't take a pointer to an existing buffer; instead, we create a new one. This is because we will change the contents of the buffer during the forward pass, and it's important for us to preserve the previous state. If we used a pointer to the old buffer, we would simply overwrite its values, effectively discarding the previous states we want to keep.
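
The InsertBuffer helper used throughout the initialization code is implemented elsewhere in the class and is not shown in this section. Purely to illustrate the reverse-stack logic just described, here is a minimal sketch of how such a method could look. The exact signature, the meaning of the last flag, and the trimming rule are assumptions on my part and may differ from the actual library implementation.

// Sketch only: the real InsertBuffer is defined elsewhere in the library.
// Assumed behavior: place the buffer (or a copy of it) at index 0 so that the
// newest state is always on top, then trim the oldest entries once the stack
// grows beyond the training depth m_iDepth.
bool CNeuronLSTM::InsertBuffer(CArrayObj *&array, CBufferType *element, bool create_copy)
  {
   if(!element)
      return false;
//--- create the stack on first use
   if(!array)
     {
      array = new CArrayObj();
      if(!array)
         return false;
     }
//--- store either the pointer itself or a copy of the buffer contents
   CBufferType *buffer = element;
   if(create_copy)
     {
      buffer = new CBufferType();
      if(!buffer)
         return false;
      buffer.m_mMatrix = element.m_mMatrix;
     }
   if(!array.Insert(buffer, 0))
     {
      if(create_copy)
         delete buffer;
      return false;
     }
//--- keep only the depth of history needed for training
   while(array.Total() > m_iDepth + 1)
      array.Delete(array.Total() - 1);
//---
   return true;
  }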

The second method, SetOpenCL, is an overriding method of the parent class and has the same functionality of passing a pointer to the OpenCL context to all internal objects involved in the computation process. Similar to the method in the parent class, our method will receive a pointer to the OpenCL context as a parameter and will return a logical result indicating the readiness of the class to operate within the specified context.

class CNeuronLSTM    :  public CNeuronBase
  {
protected:
   ....   
public:
   ....   
   virtual bool      SetOpenCL(CMyOpenCL *opencl) override;

The algorithm of the method is quite simple. First, we call the parent class method and pass the received pointer to it. The validation of the pointer is already implemented in the parent class method, so we don't need to repeat it here.

Then, we pass the OpenCL context pointer stored in our class variable to all internal objects. The key point here is that the method of the parent class has verified the received pointer and has saved the corresponding pointer in a variable. To ensure that all objects operate within the same context, we propagate the processed pointer.

bool CNeuronLSTM::SetOpenCL(CMyOpenCL *opencl)
  {
//--- call the parent class method
   CNeuronBase::SetOpenCL(opencl);
//--- call the relevant method for all internal layers
   m_cForgetGate.SetOpenCL(m_cOpenCL);
   m_cInputGate.SetOpenCL(m_cOpenCL);
   m_cOutputGate.SetOpenCL(m_cOpenCL);
   m_cNewContent.SetOpenCL(m_cOpenCL);
   m_cInputGradient.BufferCreate(m_cOpenCL);
   for(int i = 0; i < m_cMemorys.Total(); i++)
     {
      CBufferType *temp = m_cMemorys.At(i);
      temp.BufferCreate(m_cOpenCL);
     }
   for(int i = 0; i < m_cHiddenStates.Total(); i++)
     {
      CBufferType *temp = m_cHiddenStates.At(i);
      temp.BufferCreate(m_cOpenCL);
     }
//---
   return(!!m_cOpenCL);
  }

At this point, we can say that we have completed the work on the class initialization algorithm. We can now move on to the next phase, which is to create a feed-forward algorithm.