Integrating ML models with the Strategy Tester (Part 3): Managing CSV files (II)

MetaTrader 5, Machine learning | 4 July 2023, 16:08
Jonathan Pereira

Introduction

This article is the third part of the series on integrating the Strategy Tester with Python. We will create the CFileCSV class for efficient management of CSV files, and walk through its code and some examples so that readers can see how the class is used in practice.

So, what is CSV?

CSV (Comma Separated Values) is a simple and widely used file format for storing and exchanging data. It is similar to a table in which each row represents a set of data while each column represents a field in that data. Values are separated by a delimiter to make them easier to read and write across different tools and programming languages.

The CSV format appeared in the early 1970s and was first used on mainframe systems. It cannot be traced to a single creator: it became established through widespread use rather than through a formal design.

It is often used to import and export data in various applications such as spreadsheets, databases, data analysis programs, etc. Its popularity is due to both ease of use and understanding, and compatibility with many systems and tools. This is especially useful when we need to share data between different applications, for example to transfer information from one system to another.

So, the key advantages of using CSV are ease of use and compatibility. However, it also has some limitations, such as lack of support for complex data types and reduced ability to handle very large amounts of data. Also, the lack of a universal standard for the CSV format can cause compatibility issues between different applications. In addition, you can accidentally lose or modify data since the format does not provide validation. In general, CSV is a versatile and easy-to-use option for storing and sharing data. Nevertheless, it's important to know and fully understand its limitations and take steps to ensure data accuracy.


Motivation

The CFileCSV class grew out of the need to integrate the MetaTrader 5 Strategy Tester environment with Python. While developing trading strategies based on machine learning (ML) models, I ran into difficulty using models created in Python. I would either have to create a machine learning library in MQL5, which was beyond my main goal, or create an Expert Advisor entirely in Python.

Although the MQL5 language provides resources for creating ML libraries, I did not want to spend time and effort developing them, since my main goal was to analyze data and build models in a fast and efficient way.

So, the task was to find an intermediate solution. I wanted to take advantage of ML models built in Python, but also be able to apply them directly to my work with MQL5. So, I started to look for a way to overcome this limitation and find a solution for integrating these two environments.

The idea was to create a messaging system through which MetaTrader 5 and Python could communicate with each other in a timely fashion. This makes it possible to control the initialization, the transfer of data from MetaTrader 5 to Python, and the sending of predictions from Python back to MetaTrader 5. The CFileCSV class was designed to facilitate this interaction by allowing efficient data storage and loading.


Introduction to the CFileCSV class

CFileCSV is a class for working with CSV (Comma Separated Values) files. It is derived from CFile and adds functionality specific to CSV files. Its purpose is to make CSV files easier to read and write when working with different data types.

One of the big benefits of CSV files is that they are easy to share and provide a convenient way to import and export data. They can be opened and edited in programs like Excel or Google Sheets and read from various programming languages. Moreover, since the format is not rigidly standardized, the delimiter and layout can be adapted to different needs.

The CFileCSV class has four main public methods: Open, WriteHeader, WriteLine, and Read. In addition, it has two private helper methods (overloads of ToString) that convert arrays or matrix rows to delimited strings, which the public methods then write to the file.

class CFileCSV : public CFile
  {
private:
   template<typename T>
   string            ToString(const int, const T &[][]);
   template<typename T>
   string            ToString(const T &[]);
   short             m_delimiter;

public:
                     CFileCSV(void);
                    ~CFileCSV(void);
   //--- methods for working with files
   int               Open(const string,const int, const short);
   template<typename T>
   uint              WriteHeader(const T &values[]);
   template<typename T>
   uint              WriteLine(const T &values[][]);
   string            Read(void);
  };  

When using this class, keep in mind that it was designed to work with specific CSV files. If the data in the file is not formatted correctly, the results may be unexpected. It is also very important to make sure that the file has been opened before you attempt to write to it, and that it has write permission.

As an example of using the CFileCSV class, we can create a CSV file from a data matrix. First, we create an instance of the class and open the file using the Open method, passing the file name and the open flags. Next, we use the WriteHeader method to write the header to the file, and the WriteLine method to write the data rows from the matrix. Let's illustrate these steps with an example function:

#include "FileCSV.mqh"

void CreateCSVFile(string fileName, string &headers[], string &data[][])
  {
   // Creates an object of the CFileCSV class
   CFileCSV csvFile;

   // Checks if the file can be opened for writing in the ANSI format
   // (Open returns INVALID_HANDLE on failure, which is a non-zero value)
   if(csvFile.Open(fileName, FILE_WRITE|FILE_ANSI) != INVALID_HANDLE)
     {
      int rows = ArrayRange(data, 0);
      int cols = ArrayRange(data, 1);
      int headerSize = ArraySize(headers);
      // Checks that the data matrix has as many columns as the header array
      // has elements, and that it contains at least one row
      if(cols != headerSize || rows == 0)
        {
         Print("Error: Invalid number of columns or rows. Data array must have the same number of columns as the headers array and at least one row.");
         // Closes the file before leaving the function
         csvFile.Close();
         return;
        }
      // Writes header to file
      csvFile.WriteHeader(headers);
      // Writes data rows to file
      csvFile.WriteLine(data);
      // Closes the file
      csvFile.Close();
     }
   else
     {
      // Shows an error message if the file cannot be opened
      Print("Error opening file!");
     }
  }

The purpose of this function is to create a CSV file from an array of headers and a matrix of data. It starts by creating an object of the CFileCSV class and then checks whether the file can be opened for writing in ANSI format. If the file can be opened, it makes sure that the number of columns in the data matrix is equal to the number of elements in the header array and that the data matrix has at least one row. If these conditions are met, the function writes the header to the file using the WriteHeader() method and then writes the data rows using the WriteLine() method. Finally, it closes the file. If the file cannot be opened, an error message is displayed.

This function will be demonstrated with an example shortly. Note that its implementation can be extended to perform other tasks. For example, you can add more validations, such as checking whether the file exists before trying to open it, or add an option to choose which delimiter to use.
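The two extensions mentioned above can be sketched as follows. This is a minimal sketch, not part of the original class: OpenCsvForWriting is a hypothetical helper name, and it assumes that the file lives in the terminal's local files folder, which is where FileIsExist looks by default.

```mql5
#include "FileCSV.mqh"

// Hypothetical helper (not part of CFileCSV): opens a CSV file for writing
// with a caller-chosen delimiter and warns when an existing file is about
// to be overwritten.
bool OpenCsvForWriting(CFileCSV &file, const string fileName, const short delimiter=';')
  {
   // FileIsExist checks the local files folder of the terminal by default
   if(FileIsExist(fileName))
      Print("Note: ", fileName, " already exists and will be overwritten.");

   // Open returns INVALID_HANDLE on failure
   if(file.Open(fileName, FILE_WRITE|FILE_ANSI, delimiter) == INVALID_HANDLE)
     {
      Print("Error opening ", fileName, ", error code ", GetLastError());
      return false;
     }
   return true;
  }
```

A caller would then write, for example, `if(OpenCsvForWriting(csvFile, "dados.csv", ',')) { ... }` before calling WriteHeader and WriteLine.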

The CFileCSV class provides a simple and practical way to read and write CSV files. However, you should be careful when using it: make sure that the files are in the expected format, and check the return values of the methods to confirm that each operation succeeded.


Implementation

As mentioned above, the CFileCSV class has four main public methods: Open, WriteHeader, WriteLine, and Read. It also has two private helper methods that share the overloaded name ToString.

  • Method Open(const string file_name,const int open_flags, const short delimiter=';') is used to open a CSV file. It receives the file name, the open flags (for example, FILE_WRITE or FILE_READ) and the delimiter to be used in the file (the default is ';'). It saves the specified delimiter in a private variable and calls the Open method of the base CFile class, adding the FILE_CSV flag. It returns the file handle, or INVALID_HANDLE if the file could not be opened.

int CFileCSV::Open(const string file_name,const int open_flags, const short delimiter=';')
  {
   m_delimiter=delimiter;
   //--- the delimiter is passed as a separate argument, not OR-ed into the flags
   return(CFile::Open(file_name,open_flags|FILE_CSV,delimiter));
  }
  • Method WriteHeader(const T &values[]) is used to write a header to the open CSV file. It receives an array of values representing the file's column headers. It uses the ToString method to convert the array to a string and writes this string to the file using the FileWrite function. It returns the number of bytes written to the file, or 0 if no file is open.
template<typename T>
uint CFileCSV::WriteHeader(const T &values[])
  {
   string header=ToString(values);
//--- check handle
   if(m_handle!=INVALID_HANDLE)
      return(::FileWrite(m_handle,header));
//--- failure
   return(0);
  }
  • The WriteLine(const T &values[][]) method is used to write data rows to the open CSV file. It receives a matrix of values representing the data rows. It iterates over the rows of the matrix, uses the ToString method to convert each row to a string, and concatenates those strings into a single string separated by line breaks. It then writes this string to the file using the FileWrite function. It returns the number of bytes written to the file, or 0 if the matrix is empty or no file is open.

template<typename T>
uint CFileCSV::WriteLine(const T &values[][])
  {
   int len=ArrayRange(values, 0);

   if(len<1)
      return 0;

   string lines="";
   for(int i=0; i<len; i++)
      if(i<len-1)
         lines += ToString(i, values)  + "\n";
      else
         lines += ToString(i, values);

   if(m_handle!=INVALID_HANDLE)
      return(::FileWrite(m_handle, lines));
   return 0;
  }
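  • The Read(void) method is used to read data from the open CSV file. It calls the FileReadString function, which, for a file opened with the FILE_CSV flag, reads from the current position up to the next delimiter or the end of the line. Each call therefore returns one field as a string (or an empty string if no file is open). To read the whole file, call Read repeatedly until the end of the file is reached.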

string CFileCSV::Read(void)
  {
   string res="";
   if(m_handle!=INVALID_HANDLE)
      res = FileReadString(m_handle);

   return res;
  }

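Since Read wraps a single FileReadString call, reading a complete file takes a loop. The following is a minimal sketch of one way to do it: PrintCsv is a hypothetical function name, the ';' delimiter is an assumption that must match the one used when the file was written, and Handle() is the accessor that CFileCSV inherits from CFile.

```mql5
#include "FileCSV.mqh"

// Hypothetical example: reads a ';'-separated file field by field and
// prints one log line per CSV row.
void PrintCsv(const string fileName)
  {
   CFileCSV csvFile;
   if(csvFile.Open(fileName, FILE_READ|FILE_ANSI, ';') == INVALID_HANDLE)
     {
      Print("Error opening ", fileName);
      return;
     }

   int    handle = csvFile.Handle();   // raw file handle exposed by CFile
   string row    = "";
   while(!FileIsEnding(handle))
     {
      row += csvFile.Read();           // one field per call
      // a row is complete when the line or the file ends
      if(FileIsLineEnding(handle) || FileIsEnding(handle))
        {
         Print(row);
         row = "";
        }
      else
         row += ";";                   // restore the separator between fields
     }
   csvFile.Close();
  }
```

Collecting the fields into an array instead of a string would be the natural next step when the values have to be converted back to numbers.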
The ToString methods are private helper methods of the CFileCSV class. They convert arrays or matrix rows to delimited strings, which the public methods then write to the file.

  • Method ToString(const int row, const T &values[][]) converts one row of a matrix to a string. It receives the index of the row to be converted and the matrix itself as parameters. The method iterates over each element of that row, adding it to the resulting string. The delimiter is added after each element except for the last element in the row.

template<typename T>
string CFileCSV::ToString(const int row, const T &values[][])
  {
   string res="";
   int cols=ArrayRange(values, 1);

   for(int x=0; x<cols; x++)
      if(x<cols-1)
         res+=values[row][x] + ShortToString(m_delimiter);
      else
         res+=values[row][x];

   return res;
  }
  • Method ToString(const T &values[]) converts an array to a string. It iterates over each element of the array and adds it to the resulting string. The delimiter is added at the end of each element except for the last element in the array.

template<typename T>
string CFileCSV::ToString(const T &values[])
  {
   string res="";
   int len=ArraySize(values);

   if(len<1)
      return res;

   for(int i=0; i<len; i++)
      if(i<len-1)
         res+=values[i] + ShortToString(m_delimiter);
      else
         res+=values[i];

   return res;
  }

These methods are used by WriteHeader and WriteLine to convert the values passed as parameters to strings, which are then written to the open file. They ensure that the values reach the file in the expected format, separated by the specified delimiter, so that the data in the CSV file is written correctly and in an organized form.

In addition, these methods provide the CFileCSV class with more flexibility, allowing it to handle different kinds of data because they are implemented as templates. This means that these methods can be applied to any kind of data that can be converted to a string, including integers, floats, strings, and others. This makes the CFileCSV class very versatile and easy to use.

These methods are mainly intended to ensure that values are written to the file in the correct format. They add a delimiter after every element except the last element of a row or array. This ensures that the values in the CSV file are properly separated, which is very important for later reading and interpreting the data stored in the file.


An example of using ToString(const int row, const T &values[][]):

int data[2][3] = {{1, 2, 3}, {4, 5, 6}};
string str = csvFile.ToString(1, data);
//str -> "4;5;6"

In this example, we pass the index of the second row of the data matrix to the ToString method. The method iterates over each element in that row, appending it to the resulting string and inserting a delimiter after every element except the last one. The resulting string will be '4;5;6'.

Example of using ToString(const T &values[]):

string headers[] = {"Name", "Age", "Gender"};
string str = csvFile.ToString(headers);
//str -> "Name;Age;Gender"

In this example, the 'headers' array is passed to the ToString method. The method iterates over each element of the array, appending it to the resulting string and inserting a delimiter at the end of each element except the last element of the array. The resulting string will be 'Name;Age;Gender'.

These are just examples of how the two ToString overloads behave. They can be applied to any data type that can be converted to a string. However, please note that they are declared as private, so they can only be called from inside the CFileCSV class; the calls above are shown for illustration and would not compile from outside the class.


Algorithmic complexity 

How can we measure the complexity of algorithms and use this information to optimize the performance of algorithms and systems?

The Big O notation is an important tool for analyzing algorithms. The notation comes from mathematics, where it was introduced in the late 19th century, and it has been used in computer science since the early days of the field. It allows programmers to roughly estimate how the cost of an algorithm grows with the size of its input. Using this tool, it is possible to compare different algorithms and identify those which provide better performance for specific tasks.

The amount of data and the complexity of the problems that must be solved keep growing rapidly. That is why the Big O notation is so relevant: as more and more data is generated daily, we need more efficient algorithms to process it.

The Big O concept is based on the idea that the execution time of an algorithm grows with the input size n no faster than some mathematical function f(n), up to a constant factor. This is written as O(f(n)), where f(n) describes the complexity of the algorithm.

Let's now look at a few examples of using Big O notation:

  • O(1), which is a constant time algorithm, i.e. it does not change depending on the size of the data.
  • O(n), which represents a linear time algorithm, where the execution time increases in proportion to the size of the data.
  • O(n^2), which corresponds to a quadratic time algorithm, in which the execution time grows as the square of the data size.
  • O(log n), which denotes a logarithmic time algorithm with the execution time growing as a function of the logarithm of the data size.

Big O will help to decide which algorithm to choose for solving your particular problem, and also to optimize the performance of systems.
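The four classes listed above can be illustrated with small MQL5 functions. This is only a sketch: the function names are hypothetical, FirstElement assumes a non-empty array, and BinarySearch assumes that the array is sorted in ascending order.

```mql5
// O(1): a single array access, independent of the array size.
// Assumes the array is not empty.
double FirstElement(const double &values[])
  {
   return values[0];
  }

// O(n): visits every element exactly once.
double Sum(const double &values[])
  {
   double total = 0.0;
   int    n     = ArraySize(values);
   for(int i = 0; i < n; i++)
      total += values[i];
   return total;
  }

// O(n^2): compares every pair of elements.
int CountEqualPairs(const double &values[])
  {
   int count = 0;
   int n     = ArraySize(values);
   for(int i = 0; i < n; i++)
      for(int j = i + 1; j < n; j++)
         if(values[i] == values[j])
            count++;
   return count;
  }

// O(log n): binary search, halving the search range at every step.
// Assumes 'sorted' is in ascending order; returns -1 if not found.
int BinarySearch(const double &sorted[], const double target)
  {
   int lo = 0;
   int hi = ArraySize(sorted) - 1;
   while(lo <= hi)
     {
      int mid = lo + (hi - lo) / 2;
      if(sorted[mid] == target)
         return mid;
      if(sorted[mid] < target)
         lo = mid + 1;
      else
         hi = mid - 1;
     }
   return -1;
  }
```

Doubling the array size leaves FirstElement unchanged, doubles the work in Sum, roughly quadruples it in CountEqualPairs, and adds a single extra step to BinarySearch.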



The time complexity of each method of the CFileCSV class varies depending on the size of the data provided as a parameter.

  • The Open method has the complexity of O(1) because it performs one operation to open the file, regardless of the data size.
  • The Read method has the complexity of O(k), where k is the length of the field being read: each call reads from the current position up to the next delimiter or the end of the line. Reading a whole file field by field is therefore O(n), where n is the size of the file.
  • The WriteHeader method also has O(n) complexity, where n is the size of the array provided as a parameter. It converts an array to a string and writes it to a file.
  • The WriteLine method has the complexity of O(mn), where m is the number of rows in the matrix and n is the number of elements in each row. It iterates over each row, converts it to a string, and writes the result to a file. Because it builds a single string holding all rows before writing, its memory use also grows with the size of the matrix.

Please note that these complexities are estimates, as actual running times are also influenced by other factors, such as the size of the file's write buffer, the file system, etc. Moreover, the Big O notation describes an upper bound on how the running time grows, not the running time itself: the complexity class of a method stays the same, but the more data is passed to it, the longer it takes.

In general, the CFileCSV class has an acceptable time complexity and is efficient for working with files that are not too large. However, if you need to handle very large files, you may need to take other approaches or optimize the class to handle specific use cases.
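For large matrices, one possible approach is to write the file row by row instead of first building one string that holds all rows, as WriteLine does. The following is a minimal sketch under stated assumptions: WriteRowsStreaming is a hypothetical free function, not part of the original class, and it works on the raw handle that can be obtained from an open CFileCSV object through the Handle() accessor inherited from CFile.

```mql5
// Hypothetical streaming writer (not part of CFileCSV): one FileWriteString
// call per row, so no string ever holds more than a single line.
// Returns the total number of bytes written.
uint WriteRowsStreaming(const int handle, const string &values[][], const short delimiter=';')
  {
   if(handle == INVALID_HANDLE)
      return 0;

   int    rows    = ArrayRange(values, 0);
   int    cols    = ArrayRange(values, 1);
   uint   written = 0;
   string sep     = ShortToString(delimiter);

   for(int r = 0; r < rows; r++)
     {
      string line = "";
      for(int c = 0; c < cols; c++)
         line += (c > 0 ? sep : "") + values[r][c];
      // FileWriteString does not add a line break, so it is appended here
      written += FileWriteString(handle, line + "\r\n");
     }
   return written;
  }
```

It could be called as `WriteRowsStreaming(csvFile.Handle(), data, ';')` after the header has been written. The time complexity is still O(mn), but the peak memory use is bounded by the length of one row.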

 


Usage Example

//+------------------------------------------------------------------+
//|                                                    exemplo_2.mq5 |
//|                                     Copyright 2022, Lethan Corp. |
//|                           https://www.mql5.com/pt/users/14134597 |
//+------------------------------------------------------------------+
#property copyright "Copyright 2023, Lethan Corp."
#property link      "https://www.mql5.com/pt/users/14134597"
#property version   "1.00"
#include "FileCSV.mqh"

CFileCSV csvFile;
string fileName = "dados.csv";
string headers[] = {"Timestamp", "Close", "Last"};
string data[1][3];

//Script entry point
void OnStart(void)
  {
//Fill the 'data' array with the timestamp, Close and Last values
   data[0][0] = TimeToString(TimeCurrent());
   data[0][1] = DoubleToString(iClose(Symbol(), PERIOD_CURRENT, 0), 2);
   data[0][2] = DoubleToString(SymbolInfoDouble(Symbol(), SYMBOL_LAST), 2);

//Open the CSV file (Open returns INVALID_HANDLE on failure)
   if(csvFile.Open(fileName, FILE_WRITE|FILE_ANSI) != INVALID_HANDLE)
     {
      //Write the header
      csvFile.WriteHeader(headers);
      //Write data rows
      csvFile.WriteLine(data);
      //Close the file
      csvFile.Close();
     }
   else
     {
      Print("File opening error!");
     }
  }
//+------------------------------------------------------------------+

This code is an example of using the CFileCSV class in MQL5. It does the following:

  • Opens a CSV file with the specified name and the appropriate write flags.
  • Handles the error case in which the file cannot be opened.
  • Writes a header to the file, defined as an array of strings.
  • Writes data rows to the file, defined as a two-dimensional array of strings.
  • Closes the file when the write is completed.


Conclusion

The CFileCSV class provides a practical and efficient way of working with CSV files. It includes methods for opening files, writing headers and data rows, and reading data back. The Open, WriteHeader, WriteLine and Read methods cover the basic operations with CSV files and ensure that data is written and organized in a readable manner. Thank you for your time! In the next article, we will look at how to use ML models through file sharing using the CFileCSV class that was introduced in this article.

Translated from Portuguese by MetaQuotes Ltd.
Original article: https://www.mql5.com/pt/articles/12069
