5. File operations

We have already made good progress in implementing the Multi-Head Self-Attention algorithm. In the previous sections, we implemented the feed-forward and backpropagation operations of our CNeuronMHAttention class using standard MQL5 tools. Now, to make full use of the class in our models, we need to complement it with file handling methods. For industrial use, the correct operation of these methods is just as important as the correct operation of the feed-forward and backpropagation methods.

True, we can create a model and test its performance without saving the training results. However, to run the test again, we would have to retrain the model from scratch. In real-life operation, we do not want to repeat the training process every time. On the contrary, significant effort is usually invested in developing and training a model on large datasets, which is what makes a truly functional model possible. During practical application, it should be enough to launch the model, and it should be fully ready to operate on real data. Therefore, when developing the file handling methods, we must design their functionality so that the model's state can be fully restored with minimal effort. We have done this work several times already, so let's apply the established algorithm once again.

First, let's look at the structure of our multi-head attention class CNeuronMHAttention.

class CNeuronMHAttention    :  public CNeuronAttention
  {
protected:
   CNeuronConv       m_cW0;
   int               m_iHeads;
 
public:
                     CNeuronMHAttention(void);
                    ~CNeuronMHAttention(void);
   //---
   virtual bool      Init(const CLayerDescription *desc) override;
   virtual bool      SetOpenCL(CMyOpenCL *opencl) override;
   virtual bool      FeedForward(CNeuronBase *prevLayer) override;
   virtual bool      CalcHiddenGradient(CNeuronBase *prevLayer) override;
   virtual bool      CalcDeltaWeights(CNeuronBase *prevLayer, bool read) override;
   virtual bool      UpdateWeights(int batch_size, TYPE learningRate,
                                   VECTOR &Beta, VECTOR &Lambda) override;
   //--- methods of working with files
   virtual bool      Save(const int file_handle) override;
   virtual bool      Load(const int file_handle) override;
   //--- object identification method
   virtual int       Type(void) override const { return(defNeuronMHAttention);  }
  };

Seemingly, there's nothing complicated here. In the class body, we declare only one convolutional layer m_cW0 and one variable m_iHeads indicating the number of attention heads used. Most of the objects are inherited from the parent class CNeuronAttention. We already created a similar save method when working on the parent class, and now we can rely on it. I suggest looking again at the CNeuronAttention::Save method of the parent class and making sure it saves all the data we need. After that, we can start working on the method for saving the data of the current class. This time, everything is indeed very simple.
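As a quick reminder of what the parent class already takes care of, below is an illustrative outline of such a method. It is only a sketch inferred from the structure of its Load counterpart shown later in this section, not the verbatim listing, so refer to the section on CNeuronAttention for the exact code.

//--- illustrative outline of the parent class save method (not the verbatim listing)
bool CNeuronAttention::Save(const int file_handle)
  {
//--- call the method of its own parent class, which validates the file handle
//--- and saves the data common to all neural layers
//   ...
//--- save the layer constants, including the number of sequence elements
   if(FileWriteInteger(file_handle, m_iUnits) <= 0)
      return false;
//--- call the same method for all inner objects of the attention block
//   ...
//---
   return true;
  }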

In the parameters, the CNeuronMHAttention::Save method gets the handle of the file to which it will write the data. In the body of the method, we immediately pass the obtained handle to a similar method of the parent class, where all the control logic is already implemented. In addition to controls, the parent class method also implements the saving of inherited objects and their data. Therefore, by checking the result of the parent class method, we immediately get a consolidated result of passing through the control block and saving inherited objects. We only need to save the number of attention heads used and the m_cW0 convolutional layer data.

bool CNeuronMHAttention::Save(const int file_handle)
  {
//--- call the method of the parent class
   if(!CNeuronAttention::Save(file_handle))
      return false;
//--- save constants
   if(FileWriteInteger(file_handle, m_iHeads) <= 0)
      return false;
//--- call the same method for all inner layers
   if(!m_cW0.Save(file_handle))
      return false;
//---
   return true;
  }

The CNeuronMHAttention::Load method reads the data from the file in the same order in which it was written. Therefore, in the body of the method, we immediately pass the received file handle to the corresponding method of the parent class and check the result of its execution.

bool CNeuronMHAttention::Load(const int file_handle)
  {
//--- call the method of the parent class
   if(!CNeuronAttention::Load(file_handle))
      return false;

After executing the operations of the parent class method, we read from the file the number of attention heads used and the data of the m_cW0 internal convolutional layer. Loading a constant is very simple: we just read the value from the file and store it in our m_iHeads variable. But before calling the load method of the internal object, we must check the type of the object saved in the file. Only if the object types match do we call the data loading method and check its result.

//--- read the constants
   m_iHeads = FileReadInteger(file_handle);
//--- create the W0 layer object if necessary
   if(CheckPointer(m_cW0) == POINTER_INVALID)
     {
      m_cW0 = new CNeuronConv();
      if(CheckPointer(m_cW0) == POINTER_INVALID)
         return false;
     }
//--- check the type of the saved object before loading its data
   if(FileReadInteger(file_handle) != defNeuronConv ||
      !m_cW0.Load(file_handle))
      return false;

It is expected that after the successful execution of the parent class operations, the inherited objects will be fully restored. However, although we inherited those objects, in this class we initialized them with parameters different from those of the parent class: almost all of them were sized for the number of attention heads used. Moreover, the data loading method of the parent class not only loads data from the file but also initializes the objects that are not saved, that is, the objects whose data are used only within a single iteration of the feed-forward and backpropagation passes.

So, let's return to the parent class method and critically evaluate all the operations once again. Pay attention to the following lines of code.

bool CNeuronAttention::Load(const int file_handle)
  {
  ......
   m_iUnits = FileReadInteger(file_handle);
  ......
   if(!m_cScores.BufferInit(m_iUnits, m_iUnits, 0))
      return false;
  ......
//---
   return true;
  }

They initialize the m_cScores buffer of dependency coefficients. As you can see, the buffer is initialized with zeros and with a size sufficient for only one attention head. This does not satisfy the requirements of our Multi-Head Self-Attention algorithm. Therefore, it makes sense to add, in the loading method of our class, a reinitialization of this buffer with the required size.

//--- initialize Scores
   if(!m_cScores.BufferInit(m_iHeads, m_iUnits * m_iUnits))
      return false;
//---
   return true;
  }

After completing all the operations, we exit the method with a positive result.
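For convenience, here is the complete Load method assembled from the fragments discussed above.

bool CNeuronMHAttention::Load(const int file_handle)
  {
//--- call the method of the parent class
   if(!CNeuronAttention::Load(file_handle))
      return false;
//--- read the constants
   m_iHeads = FileReadInteger(file_handle);
//--- create the W0 layer object if necessary
   if(CheckPointer(m_cW0) == POINTER_INVALID)
     {
      m_cW0 = new CNeuronConv();
      if(CheckPointer(m_cW0) == POINTER_INVALID)
         return false;
     }
//--- check the type of the saved object before loading its data
   if(FileReadInteger(file_handle) != defNeuronConv ||
      !m_cW0.Load(file_handle))
      return false;
//--- initialize Scores with a size sufficient for all attention heads
   if(!m_cScores.BufferInit(m_iHeads, m_iUnits * m_iUnits))
      return false;
//---
   return true;
  }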

This completes the implementation of the CNeuronMHAttention class using standard MQL5 tools. We have implemented the Multi-Head Self-Attention algorithm. In the next section, we will add the ability to perform multi-threaded operations using OpenCL.
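To illustrate how these methods are used, below is a minimal sketch of saving the layer to a binary file and restoring it from that file. In the complete model, it is the network object that opens the file and calls Save and Load for each of its layers in turn; the file name, the function name, and the direct call on a single layer are assumptions made only for this illustration. Note that, as with m_cW0 above, the type identifier written by Save is read by the caller before Load is invoked.

//--- a minimal usage sketch with assumed names; not part of the library code
void SaveAndRestoreExample(CNeuronMHAttention &attention)
  {
//--- save the layer state to a binary file
   int handle = FileOpen("mha_layer.bin", FILE_WRITE | FILE_BIN);
   if(handle == INVALID_HANDLE)
      return;
   if(!attention.Save(handle))
      Print("Failed to save the layer");
   FileClose(handle);
//--- restore the layer state from the file:
//--- first read the object type written by Save, then call Load
   handle = FileOpen("mha_layer.bin", FILE_READ | FILE_BIN);
   if(handle == INVALID_HANDLE)
      return;
   if(FileReadInteger(handle) != defNeuronMHAttention || !attention.Load(handle))
      Print("Failed to load the layer");
   FileClose(handle);
  }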