5. File operations

We have already built the feed-forward and backpropagation methods for our attention layer. We can add the layer to our model and train it, but we certainly don't want to retrain the model from scratch every time we use it. We need to be able to save a once-trained model to a file and, when necessary, load a ready-to-use neural network back from that file. In our base neural layer, two methods are responsible for working with files: Save and Load. To ensure the proper functioning of the new layer, we need to override both of these methods.

We go through a similar routine when creating each new type of neural layer, so we will follow the familiar path: examine the structure of our class and decide what needs to be saved to the file, and which variables and objects we can simply create and initialize with default values.

First of all, it is necessary to save the internal neural layers containing the weight matrices m_cQuerys, m_cKeys, m_cValues, m_cFF1, and m_cFF2. In addition, we need to save the values of the variables that define the architecture of the neural layer: m_iWindow, m_iUnits, and m_iKeysSize.

We do not need to save any information from the m_cScores buffer to the file, since it contains only intermediate data that is overwritten on each forward pass. Its size is easy to determine based on the number of elements in the sequence recorded in the variable m_iUnits.

The m_cAttentionOut inner layer does not contain weight matrices either, and its data, like the data of the m_cScores buffer, is overwritten at each iteration of the forward and backward passes. However, let's look at the situation from another angle. Recall the procedure for initializing a neural layer:

  • Create a neural layer description object
  • Fill in the neural layer description object with the necessary information
  • Call the method that initializes the neural layer with the transfer of a description
  • Delete the neural layer description object

If we did not save m_cAttentionOut, the Load method would have to repeat all of these steps to re-create the object. At the same time, calling the Save method for a base neural layer that has no weight matrices writes only 3 integers to the file, 12 bytes in total. So, by sacrificing 12 bytes of disk space, we spare ourselves the effort of writing the initialization code for this neural layer in the data loading method; the sketch below gives an idea of the code we avoid.
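
For illustration only, here is a rough sketch of the re-initialization code that would otherwise have to appear in the Load method. It is not code from this section: the CLayerDescription fields used below (type, count, window, activation, optimization) and their values are assumptions about how the m_cAttentionOut layer is configured in the Init method.

//--- hypothetical re-creation of m_cAttentionOut inside Load
   CLayerDescription *desc = new CLayerDescription();
   if(!desc)
      return false;
   desc.type = defNeuronBase;             // plain base layer without weight matrices
   desc.count = m_iUnits * m_iWindow;     // assumed size of the attention output buffer
   desc.window = 0;
   desc.activation = AF_NONE;             // assumed enumeration values
   desc.optimization = Adam;
   if(!m_cAttentionOut.Init(desc))
     {
      delete desc;
      return false;
     }
   delete desc;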

class CNeuronAttention    :  public CNeuronBase
  {
protected:
   CNeuronConv       m_cQuerys;
   CNeuronConv       m_cKeys;
   CNeuronConv       m_cValues;
   CBufferType       m_cScores;
   int               m_cScoreGrad;
   int               m_cScoreTemp;
   CNeuronBase       m_cAttentionOut;
   CNeuronConv       m_cFF1;
   CNeuronConv       m_cFF2;
   //---
   int               m_iWindow;
   int               m_iUnits;
   int               m_iKeysSize;
   CBufferType       m_cStd;
   //---
   virtual bool      NormlizeBuffer(CBufferType *buffer, CBufferType *std,
                                                                 uint std_shift);
   virtual bool      NormlizeBufferGradient(CBufferType *output,
                        CBufferType *gradient, CBufferType *std, uint std_shift);
 
public:
                     CNeuronAttention(void);
                    ~CNeuronAttention(void);
   //---
   virtual bool      Init(const CLayerDescription *desc) override;
   virtual bool      SetOpenCL(CMyOpenCL *opencl) override;
   virtual bool      FeedForward(CNeuronBase *prevLayer) override;
   virtual bool      CalcHiddenGradient(CNeuronBase *prevLayer) override;
   virtual bool      CalcDeltaWeights(CNeuronBase *prevLayer) override;
   virtual bool      UpdateWeights(int batch_size, TYPE learningRate,
                                   VECTOR &Beta, VECTOR &Lambda) override;
   //--- methods for working with files
   virtual bool      Save(const int file_handle) override;
   virtual bool      Load(const int file_handle) override;
   //--- object identification method
   virtual int       Type(void) override const { return(defNeuronAttention); }
  };

Once we have decided which objects to write to the file, we can start working on the methods themselves. Let's begin with the Save method that writes data to the file. In the parameters, the method receives the handle of the file to write the data to. However, we will not check the received handle right away. Instead, we will call the analogous method of the parent class, where all the necessary controls and the saving of inherited objects are already implemented. The result of the parent class method tells us whether this control block completed successfully.

bool CNeuronAttention::Save(const int file_handle)
  {
   if(!CNeuronBase::Save(file_handle))
      return false;

After executing the parent class method, we call the save method for internal objects one by one. At the same time, we check the results of the operations.

   if(!m_cQuerys.Save(file_handle))
      return false;
   if(!m_cKeys.Save(file_handle))
      return false;
   if(!m_cValues.Save(file_handle))
      return false;
   if(!m_cAttentionOut.Save(file_handle))
      return false;
   if(!m_cFF1.Save(file_handle))
      return false;
   if(!m_cFF2.Save(file_handle))
      return false;

After saving the data of the internal objects, we save the values of the variables that define the architecture of the neural layer. As always, we check the results of the operations.

   if(FileWriteInteger(file_handle, m_iUnits) <= 0)
      return false;
   if(FileWriteInteger(file_handle, m_iWindow) <= 0)
      return false;
   if(FileWriteInteger(file_handle, m_iKeysSize) <= 0)
      return false;
//---
   return true;
  }

After successfully saving all the necessary data, we complete the method with a positive result.
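
To show how this method fits into the overall workflow, here is a minimal usage sketch of saving a single trained layer to a binary file. The file name and the attention variable are hypothetical; in the full framework, saving is normally initiated by the model object, which passes a file handle of this kind down to each of its layers.

//--- hypothetical example: saving a trained layer to a binary file
   CNeuronAttention attention;
   //--- ... the layer is initialized and trained above ...
   int handle = FileOpen("attention.net", FILE_WRITE | FILE_BIN);
   if(handle != INVALID_HANDLE)
     {
      if(!attention.Save(handle))
         Print("Error saving the attention layer");
      FileClose(handle);
     }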

After creating the data writing method, we move on to the Load data reading method. In the parameters, the method receives the handle of the file to read the data from. Just as when writing data, we do not create a new control block in our method. Instead, we call the method of the parent class, where all the controls and the reading of inherited objects and variables are already implemented. Checking the result of the parent class method immediately tells us both that the control block has passed and that the data of the inherited objects and variables has been loaded.

bool CNeuronAttention::Load(const int file_handle)
  {
   if(!CNeuronBase::Load(file_handle))
      return false;

After successfully executing the data loading method of the parent class, we sequentially read the data of the internal objects. Recall that data is read from the file in strict accordance with the sequence in which it was written. When writing to the file, we first saved the m_cQuerys internal neural layer, so we load data into this object first. However, don't forget about the nuance of loading internal neural layers: since the Save method of each object begins by writing its type identifier, we first read and check the type of the saved object and only then call the Load method of the corresponding object.

   if(FileReadInteger(file_handle) != defNeuronConv || !m_cQuerys.Load(file_handle))
      return false;

We repeat the same algorithm for all previously saved objects.

   if(FileReadInteger(file_handle) != defNeuronConv || !m_cKeys.Load(file_handle))
      return false;
   if(FileReadInteger(file_handle) != defNeuronConv || !m_cValues.Load(file_handle))
      return false;
   if(FileReadInteger(file_handle) != defNeuronBase ||
      !m_cAttentionOut.Load(file_handle))
      return false;
   if(FileReadInteger(file_handle) != defNeuronConv || !m_cFF1.Load(file_handle))
      return false;
   if(FileReadInteger(file_handle) != defNeuronConv || !m_cFF2.Load(file_handle))
      return false;

After loading the data of the internal neural layer objects, we read the values of the variables that determine the architecture of our attention neural layer from the file.

   m_iUnits = FileReadInteger(file_handle);
   m_iWindow = FileReadInteger(file_handle);
   m_iKeysSize = FileReadInteger(file_handle);

Then we need to initialize the m_cScores buffer of dependency coefficients with zero values. We do not resize the buffer beforehand, since the buffer initialization method itself resizes it to the required dimensions.

   if(!m_cScores.BufferInit(m_iUnits, m_iUnits, 0))
      return false;

We have now loaded all the data and initialized the objects. It is worth remembering that, to avoid unnecessary data copying, we made the result and gradient buffer pointers of the attention layer refer to the corresponding buffers of the m_cFF2 inner layer. Without this pointer substitution, the neural layer will not work correctly. But if for some reason the m_cFF2 inner layer object has been re-created, new buffer objects have been created for it as well, and we need to perform the pointer substitution again. At the same time, if both variables hold pointers to the same object, deleting the object through one pointer leaves an invalid pointer in the other variable. This is a subtle point that requires care.

We will, of course, add buffer replacement, but we will first check the correspondence of the pointers.

   if(m_cFF2.GetOutputs() != m_cOutputs)
     {
      if(m_cOutputs)
         delete m_cOutputs;
      m_cOutputs = m_cFF2.GetOutputs();
     }

   if(m_cFF2.GetGradients() != m_cGradients)
     {
      if(m_cGradients)
         delete m_cGradients;
      m_cGradients = m_cFF2.GetGradients();
     }
//---
   SetOpenCL(m_cOpenCL);
//---
   return true;
  }

After the successful completion of all operations, we exit the method with a positive result.
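
For completeness, here is the matching hypothetical sketch of restoring the layer from the file saved earlier. Note that the caller reads the object type before invoking Load, mirroring the pattern used above for the internal layers; the file name and the attention variable are again assumptions.

//--- hypothetical example: restoring the layer from the previously saved file
   CNeuronAttention attention;
   int handle = FileOpen("attention.net", FILE_READ | FILE_BIN);
   if(handle != INVALID_HANDLE)
     {
      if(FileReadInteger(handle) != defNeuronAttention || !attention.Load(handle))
         Print("Error loading the attention layer");
      FileClose(handle);
     }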

At this point, the work of creating the attention neural layer using standard MQL5 tools can be considered complete. In this form, we can already insert the attention layer into our model and check its performance. However, to make the most efficient use of the created class, we still need to enhance its methods with multithreading capabilities.