Machine learning in trading: theory, models, practice and algo-trading - page 3257

 
fxsaber #:
I'm wrong somewhere, but I don't see it.

But it works with this line

const matrix<double> matrix1 = {{2, 2, 3}, {3, 2, 3}, {1, 2, 1}};

[[1,0,0.8660254037844387]
[0,0,0]
[0.8660254037844387,0,1]]

Apparently, if all the values in a column are the same, the calculation for that column is skipped.
In the 2nd column I set every value to 2, and the 2nd row of the result matrix stayed zero. Although it would probably be more correct to fill the diagonal with 1s.
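To make the observation concrete, here is a minimal Python sketch (not the Alglib or MQL5 implementation) of a Pearson correlation between columns. The function name `column_corr` is hypothetical, and the rule "a zero-variance column yields zeros, including on the diagonal" is an assumption chosen to mimic the output shown above.

```python
import math

def column_corr(rows):
    """Pearson correlation between the columns of a row-major matrix.

    Assumption for this sketch: a column with zero variance gets zeros
    everywhere (including its diagonal entry), mimicking the output above.
    """
    n, m = len(rows), len(rows[0])
    cols = [[row[j] for row in rows] for j in range(m)]
    means = [sum(c) / n for c in cols]
    centered = [[x - mu for x in c] for c, mu in zip(cols, means)]
    norms = [math.sqrt(sum(x * x for x in c)) for c in centered]
    out = [[0.0] * m for _ in range(m)]
    for i in range(m):
        for j in range(m):
            if norms[i] == 0.0 or norms[j] == 0.0:
                continue  # correlation is undefined here, leave 0
            dot = sum(a * b for a, b in zip(centered[i], centered[j]))
            out[i][j] = dot / (norms[i] * norms[j])
    return out

matrix1 = [[2, 2, 3], [3, 2, 3], [1, 2, 1]]
for row in column_corr(matrix1):
    print(row)
```

With this input the entry for columns 1 and 3 comes out at about 0.866, and the constant 2nd column produces a zero row and a zero column.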

PS. At first I thought it was an Alglib bug.

In the old code, element values were set through
m[row].Set(col, val);
And now it is
m.Set(row,col, val);

It's a pity there is no backward compatibility. It doesn't matter to me, since I'm not working through Alglib now. But if someone's old code stops working, it will have to be fixed.
The saddest thing is that the old form

m[row].Set(col, val);

doesn't produce any error message, it just does nothing. People won't replace it and won't know that the code needs changing. It will compute something, but with unchanged matrices.

 
Forester #:

Apparently, if all data in a column are the same, the calculation is skipped.

Pearson does not calculate between rows, but between columns?

Looks like it.
const matrix<double> matrix1 = {{1, 1, 1}, {2, 2, 2}, {3, 3, 3}};
It gives a unit matrix.
 
fxsaber #:

Pearson does not calculate between rows, but between columns?

PS. It seems so. It produces a unit matrix.
You can transpose it.
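The difference between the two orientations, and the transposition workaround, can be illustrated with a short Python sketch. The helper names (`pearson`, `corr_between_columns`, `transpose`) are hypothetical, and returning 0.0 for a constant series is an assumption of the sketch, not documented library behaviour.

```python
import math

def pearson(x, y):
    """Pearson correlation of two equal-length sequences.

    Assumption of this sketch: return 0.0 when either sequence is constant.
    """
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    dx = [a - mx for a in x]
    dy = [b - my for b in y]
    denom = math.sqrt(sum(a * a for a in dx) * sum(b * b for b in dy))
    if denom == 0.0:
        return 0.0
    return sum(a * b for a, b in zip(dx, dy)) / denom

def corr_between_columns(rows):
    """Correlate every column of a row-major matrix with every other column."""
    cols = list(zip(*rows))
    return [[pearson(a, b) for b in cols] for a in cols]

def transpose(rows):
    return [list(c) for c in zip(*rows)]

m = [[1, 1, 1], [2, 2, 2], [3, 3, 3]]
by_columns = corr_between_columns(m)          # every column is (1, 2, 3)
by_rows = corr_between_columns(transpose(m))  # every row is constant
print(by_columns)
print(by_rows)
```

Every column of the test matrix is (1, 2, 3), so all pairwise column correlations equal 1, while every row is constant, so the row-wise result is all zeros under the assumption above. Transposing the input before the call is what switches a column-wise routine to row-wise correlation.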
 

Alglib is a good library; it has everything for ML. Its neural networks are extremely slow, though, and that was already the case in early versions.

 
Forester #:
in statistics.mqh.

PearsonCorrM - Correlation of all rows to all rows is the fastest.

On its basis I calculated the correlation matrix.

#include <Math\Alglib\statistics.mqh> // https://www.mql5.com/ru/code/11077

const matrix<double> CorrMatrix( const matrix<double> &Matrix )
{
  matrix<double> Res = {};
  
  const CMatrixDouble MatrixIn(Matrix);
  CMatrixDouble MatrixOut;  

  if (CBaseStat::PearsonCorrM(MatrixIn, MatrixIn.Rows(), MatrixIn.Cols(), MatrixOut)) // https://www.mql5.com/ru/code/11077
    Res = MatrixOut.ToMatrix();
  
  return(Res);
}


I measured the performance.

#property script_show_inputs

input int inRows = 100; // Row length
input int inCols = 15000; // Number of rows

void FillArray( double &Array[], const int Amount )
{
  for (uint i = ArrayResize(Array, Amount); (bool)i--;)
    Array[i] = MathRand();
}

bool IsEqual( matrix<double> &Matrix1, const matrix<double> &Matrix2 )
{
//  return(MathAbs((Matrix1 - Matrix2).Mean()) < 1e-15); // Expensive in memory.
  
  Matrix1 -= Matrix2;  
  
  const bool Res = (MathAbs(Matrix1.Mean()) < 1e-15);
  
  Matrix1 += Matrix2;
  
  return(Res);
}

#define  TOSTRING(A) #A + " = " + (string)(A) + " "

#define  BENCH(A)                                                              \
  StartMemory = MQLInfoInteger(MQL_MEMORY_USED);                              \
  StartTime = GetMicrosecondCount();                                          \
  A;                                                                          \
  Print(#A + " - " + (string)(GetMicrosecondCount() - StartTime) + " mcs, " + \
       (string)(MQLInfoInteger(MQL_MEMORY_USED) - StartMemory) + " MB"); 

void PrintCPU()
{
#ifdef _RELEASE
  Print("EX5: " + (string)__MQLBUILD__ + " " + __CPU_ARCHITECTURE__ + " Release.");
#else // #ifdef _RELEASE
  Print("EX5: " + (string)__MQLBUILD__ + " " + __CPU_ARCHITECTURE__ + " Debug.");
#endif // #ifdef _RELEASE #else
  Print(TOSTRING(TerminalInfoString(TERMINAL_CPU_NAME)));
  Print(TOSTRING(TerminalInfoInteger(TERMINAL_CPU_CORES)));
  Print(TOSTRING(TerminalInfoString(TERMINAL_CPU_ARCHITECTURE)));
}

void OnStart()
{  
  PrintCPU();
  
  double Array[];
  FillArray(Array, inRows * inCols);
  
  matrix<double> Matrix;  
  Matrix.Assign(Array);
  Matrix.Init(inCols, inRows);
  Matrix = Matrix.Transpose();
  
  ulong StartTime, StartMemory;
  
  Print(TOSTRING(inRows) + TOSTRING(inCols));

  BENCH(matrix<double> Matrix1 = CorrMatrix(Matrix)) // https://www.mql5.com/ru/code/11077
  BENCH(matrix<double> Matrix2 = Matrix.CorrCoef(false)); // https://www.mql5.com/ru/docs/basis/types/matrix_vector
//  BENCH(matrix<double> Matrix3 = CorrMatrix(Array, inRows)); // https://www.mql5.com/ru/code/17982 

  Print(TOSTRING(IsEqual(Matrix1, Matrix2)));
//  Print(TOSTRING(IsEqual(Matrix3, Matrix2)));  
}


Result.

EX5: 3981 AVX Release.
TerminalInfoString(TERMINAL_CPU_NAME) = Intel Core i7-2700K  @ 3.50GHz 
TerminalInfoInteger(TERMINAL_CPU_CORES) = 8 
TerminalInfoString(TERMINAL_CPU_ARCHITECTURE) = AVX 
inRows = 100 inCols = 15000 
matrix<double> Matrix1 = CorrMatrix(Matrix) - 14732702 mcs, 1717 MB
matrix<double> Matrix2 = Matrix.CorrCoef(false) - 40318390 mcs, 1717 MB
IsEqual(Matrix1, Matrix2) = true 


It is clear that Alglib computes the matrix faster than the standard matrix method.

However, for pattern search, computing the correlation matrix is madness in terms of RAM consumption.
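The memory figure in the log can be checked with simple arithmetic. The sketch below assumes 8-byte doubles and that the reported "MB" are units of 2^20 bytes; the function name `corr_matrix_mib` is hypothetical.

```python
def corr_matrix_mib(n_series, bytes_per_value=8):
    """Size of a dense n_series x n_series matrix in MiB (2**20 bytes)."""
    return n_series * n_series * bytes_per_value / 2**20

def input_matrix_mib(n_series, length, bytes_per_value=8):
    """Size of the source matrix (n_series series of the given length) in MiB."""
    return n_series * length * bytes_per_value / 2**20

print(round(corr_matrix_mib(15000)))           # 1717
print(round(input_matrix_mib(15000, 100), 1))  # 11.4
```

So the 15000 x 15000 result alone accounts for the 1717 MB in the log, while the source data is only about 11 MB. The result grows quadratically with the number of series, which is why the full matrix is so costly for pattern search.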


How long does Python take to compute it for a source matrix of the same size as in the example above?

 
fxsaber #:

However, computing a correlation matrix for pattern search is madness in terms of RAM consumption.

The built-in one ran faster on my i7-6700

inRows = 100 inCols = 15000 
matrix<double> Matrix1 = CorrMatrix(Matrix) - 14648864 mcs, 1717 MB
matrix<double> Matrix2 = Matrix.CorrCoef(false) - 29589590 mcs, 1717 MB
IsEqual(Matrix1, Matrix2) = true 

It's strange that the built-in one is slower; they could have simply copied it. It's unlikely that Alglib has some unique accelerated algorithm under a licence.

Have you tried the other 2 variants from Alglib?
If you compute in loops, each row against each row or each row against all rows, memory use is more economical (2 rows, or 1 row + the matrix). But it will take longer; I don't remember exactly, but I think it will be slower than the built-in function.
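The "1 row + the matrix" idea can be sketched in Python as a generator that produces one row of the correlation matrix at a time, so the full N x N result never sits in memory. This is only an illustration of the approach, not one of Alglib's variants; the names `standardize` and `corr_rows`, and the choice to map constant series to zeros, are assumptions of the sketch.

```python
import math

def standardize(series):
    """Center a series and scale it to unit length; constant series become zeros."""
    n = len(series)
    mean = sum(series) / n
    centered = [v - mean for v in series]
    norm = math.sqrt(sum(v * v for v in centered))
    if norm == 0.0:
        return [0.0] * n
    return [v / norm for v in centered]

def corr_rows(series_list):
    """Yield the correlation matrix one row at a time.

    Only the standardized copy of the input (same size as the input)
    and a single output row are held at once.
    """
    z = [standardize(s) for s in series_list]
    for a in z:
        yield [sum(p * q for p, q in zip(a, b)) for b in z]

series = [[1, 2, 3, 4], [2, 4, 6, 9], [4, 3, 2, 1]]
best = []
for i, row in enumerate(corr_rows(series)):
    # Keep only what the search needs, e.g. the most similar other series.
    j = max((k for k in range(len(row)) if k != i), key=lambda k: row[k])
    best.append((i, j))
print(best)
```

Each row is consumed and discarded as soon as it has been examined, so extra memory stays proportional to the input rather than to its square. The price is that nothing is reused between rows, which matches the expectation above that this is slower.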

 
fxsaber #:

However, computing a correlation matrix for pattern search is madness in terms of RAM consumption.

Memory is even worse.
Before launching:



While Alglib's PearsonCorrM runs, memory keeps growing: I saw 5 GB, and the screenshot caught 4.6 GB.


And while the standard Matrix.CorrCoef runs:

Apparently, the standard one is optimised for minimum memory usage, and the Alglib one is optimised for speed.

 
Forester #:

The built-in one ran faster on my i7-6700.

I added the CPU and EX5 instruction-set information to the code's output.
 
Forester #:

While Alglib's PearsonCorrM runs, memory keeps growing: I saw 5 GB, and the screenshot caught 4.6 GB.

Consumption almost doubles because of this line.

Forum on trading, automated trading systems and testing trading strategies

fxsaber, 2023.09.25 18:01

#include <Math\Alglib\statistics.mqh> // https://www.mql5.com/ru/code/11077

const matrix<double> CorrMatrix( const matrix<double> &Matrix )
{
  matrix<double> Res = {};
  
  const CMatrixDouble MatrixIn(Matrix);
  CMatrixDouble MatrixOut;  

  if (CBaseStat::PearsonCorrM(MatrixIn, MatrixIn.Rows(), MatrixIn.Cols(), MatrixOut)) // https://www.mql5.com/ru/code/11077
    Res = MatrixOut.ToMatrix();
  
  return(Res);
}

That line is just the conversion from CMatrixDouble to matrix<double>. I even had to write the matrix comparison this way because of memory.

fxsaber, 2023.09.25 18:01

bool IsEqual( matrix<double> &Matrix1, const matrix<double> &Matrix2 )
{
//  return(MathAbs((Matrix1 - Matrix2).Mean()) < 1e-15); // Expensive in memory.
  
  Matrix1 -= Matrix2;  
  
  const bool Res = (MathAbs(Matrix1.Mean()) < 1e-15);
  
  Matrix1 += Matrix2;
  
  return(Res);
}
 
fxsaber #:
Consumption almost doubles because of this line.

That line is just the conversion from CMatrixDouble to matrix<double>. I even had to write the matrix comparison this way because of memory.

The conversion also adds about 40% to the run time. With the line commented out (// Res = MatrixOut.ToMatrix();):

matrix<double> Matrix1 = CorrMatrix(Matrix) - 10482307 mcs, 0 MB
matrix<double> Matrix2 = Matrix.CorrCoef(false) - 28882536 mcs, 1717 MB

I.e. if you work only with Alglib's functionality (without converting its matrices into the terminal's matrices), it will be faster.
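The percentages follow directly from the logged timings. A small Python check, assuming the two i7-6700 runs quoted in this thread are comparable (they are separate launches, so this is only a rough estimate):

```python
with_conversion = 14_648_864     # mcs, CorrMatrix including ToMatrix()
without_conversion = 10_482_307  # mcs, ToMatrix() commented out
builtin = 28_882_536             # mcs, Matrix.CorrCoef(false), same run as above

overhead = with_conversion / without_conversion - 1  # cost of the conversion
speedup = builtin / without_conversion               # Alglib alone vs built-in

print(f"conversion overhead: {overhead:.0%}")
print(f"Alglib without conversion vs built-in: {speedup:.2f}x")
```

This gives an overhead of about 40% for the conversion, and PearsonCorrM without the conversion comes out roughly 2.8 times faster than the built-in method in that run.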
