Machine learning in trading: theory, models, practice and algo-trading - page 3260

 
Forester #:

This is the input matrix.
The output will be the correlation of each of the 15,000 rows with each of the 15,000 rows. As in all the other examples, that is about 1.7 GB (in double, 8 bytes per value).

I agree that it isn't computed that way.

 
fxsaber #:

So far, I don't see any technical obstacle to calculate a million-by-million matrix on a simple home machine. But the comparison of NumPy vs MQL5 is very important for me.

Are you sure?


For example, an input matrix with 50,000 columns and 100 rows will give a correlation matrix of 50,000 x 50,000 x 8 bytes / (1024 x 1024 x 1024) = 18.63 GB
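The arithmetic above can be checked with a short sketch. The function name `corr_matrix_gib` is made up for illustration, and it assumes a dense square matrix stored as 8-byte doubles.

```python
def corr_matrix_gib(n_series: int, bytes_per_value: int = 8) -> float:
    """Size in GiB of a dense n_series x n_series correlation matrix."""
    return n_series * n_series * bytes_per_value / 1024**3

# Sizes for the cases discussed in the thread
for n in (15_000, 50_000, 1_000_000):
    print(f"{n:>9} series -> {corr_matrix_gib(n):,.2f} GiB")
```

Under the same assumption, a million-by-million result comes out at several thousand GiB, so it would have to be computed and stored in blocks rather than as one dense matrix in RAM.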

 
input int inRows = 100; // Row length
input int inCols = 15000; // Number of rows

bool IsEqual( matrix<double> &Matrix1, const matrix<double> &Matrix2 )
{
  Matrix1 -= Matrix2;  
  
  const bool Res = (MathAbs(Matrix1.Mean()) < 1e-15);
  
  Matrix1 += Matrix2;
  
  return(Res);
}

#define  TOSTRING(A) #A + " = " + (string)(A) + " "

void OnStart()
{  
  double Array[];  
  Print(FileLoad("qwe\\arr.csv", Array)); // RAM-drive. https://www.mql5.com/ru/forum/86386/page3258#comment_49549438
  
  matrix<double> Matrix;  
  Matrix.Assign(Array);
  Matrix.Init(inCols, inRows);
  Matrix = Matrix.Transpose();
  
  ArrayFree(Array);  
  Print(FileLoad("qwe\\matr.csv", Array)); // RAM-drive. https://www.mql5.com/ru/forum/86386/page3258#comment_49549438

  matrix<double> Matrix2;
  Matrix2.Assign(Array);
  Matrix2.Init(inCols, inCols);
  Matrix2 = Matrix2.Transpose();
    
  ArrayFree(Array);
  
  matrix<double> Matrix1 = CorrMatrix(Matrix); // https://www.mql5.com/ru/forum/86386/page3256#comment_49538685

  Print(TOSTRING(IsEqual(Matrix1, Matrix2)));
}


The NumPy results match the MQL5 results completely.

1500000
225000000
IsEqual(Matrix1, Matrix2) = true 
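
A note on the check itself: comparing the mean of the differences can report a match even when individual elements differ, because positive and negative deviations cancel out. Below is a minimal Python sketch of an element-wise check, assuming both matrices are already loaded as NumPy arrays. The helper name `matrices_match` and the tolerance are illustrative and are not taken from the script above.

```python
import numpy as np

def matrices_match(a: np.ndarray, b: np.ndarray, tol: float = 1e-12) -> bool:
    """True if the shapes agree and no element differs by more than tol."""
    if a.shape != b.shape:
        return False
    return bool(np.max(np.abs(a - b)) <= tol)

a = np.array([[1.0, 0.5], [0.5, 1.0]])
# Two deviations of opposite sign: they cancel in the mean
b = a + np.array([[0.0, 0.125], [-0.125, 0.0]])

mean_check = bool(abs((a - b).mean()) < 1e-15)  # mean-of-differences check
elementwise_check = matrices_match(a, b)        # largest absolute difference check
print(mean_check, elementwise_check)
```

In this toy case the mean-based check passes while the element-wise check fails, so a maximum absolute difference is the safer criterion when comparing two implementations.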
 
Forester #:

This is the input matrix.
The output will be the correlation of each of the 15,000 rows with each of the 15,000 rows. As in all the other examples, that is about 1.7 GB (in double, 8 bytes per value).

In general, alas, Python apparently can't do this calculation in int; it seems to convert the data to double.

import numpy as np
import time

def calc_corr_matrix():
    arr = np.random.randint(1, 101, size=(15000,100), dtype=np.int32)
    corr_matrix = np.corrcoef(arr)
    size_in_mb = corr_matrix.nbytes / 1024**2
    print("Array size:", size_in_mb, "MB")
    return corr_matrix

np.random.seed(123)

start_time = time.time()
corr_matrix = calc_corr_matrix()
end_time = time.time()

print("Time taken:", end_time - start_time, "seconds")

Array size: 1716.61376953125 MB
Time taken: 4.62926459312439 seconds
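
The conversion mentioned above can be seen directly: `np.corrcoef` returns a float64 result even when the input is int32. A small check, using a much smaller array than in the run above so that it executes instantly:

```python
import numpy as np

rng = np.random.default_rng(123)
arr = rng.integers(1, 101, size=(50, 100), dtype=np.int32)  # 50 rows of 100 ints

corr = np.corrcoef(arr)  # rows are treated as variables
print(arr.dtype, "->", corr.dtype)
print(corr.shape, corr.nbytes, "bytes")
```

So the size of the result depends only on the number of rows, at 8 bytes per element, regardless of the input type.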
[Deleted]  
Aleksey Vyazmikin #:

In general, alas, Python apparently can't do this calculation in int; it seems to convert the data to double.

Stop spamming rubbish. Correlation isn't computed in ints.

 
Maxim Dmitrievsky #:

Stop spamming rubbish. Correlation isn't computed in ints.

No need to discover America. It isn't usually computed that way, but it's worth thinking about how it could be done.

[Deleted]  
Aleksey Vyazmikin #:

No need to discover America. It isn't usually computed that way, but it's worth thinking about how it could be done.

Go and think it up in a new thread.

 
Maxim Dmitrievsky #:

Go and think it up in a new thread.

What people... I go and spend my time on him, and he's rude.

What the hell...

 
Aleksey Vyazmikin #:

No need to discover America. It isn't usually computed that way, but it's worth thinking about how it could be done.

I have already described the way: take the Alglib functions (8 of them are called from PearsonCorrM) and change the data types. Even to 1-byte uchar. 4-byte ints won't give much gain.
Do it yourself if you need it.
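
One way to read this suggestion, sketched in Python rather than in Alglib/MQL5: keep the inputs as 1-byte unsigned integers, accumulate the sums and cross-products in integers, and switch to double only for the final normalization. This is an assumption about the approach, not the Alglib code. The function name `pearson_from_uint8` is hypothetical, and the resulting matrix is still stored as doubles.

```python
import numpy as np

def pearson_from_uint8(x: np.ndarray) -> np.ndarray:
    """Row-wise Pearson correlation for uint8 data using integer sums."""
    n = x.shape[1]
    # Widened copy for simplicity; a real implementation would accumulate
    # into a wider type without copying the input.
    xi = x.astype(np.int64)
    s = xi.sum(axis=1)                 # sum of each row
    sxy = xi @ xi.T                    # exact integer cross-products
    cov_n = n * sxy - np.outer(s, s)   # n^2 * covariance, still exact integers
    d = np.sqrt(np.diag(cov_n).astype(np.float64))  # n * std of each row
    return cov_n / np.outer(d, d)      # float64 only at this last step

rng = np.random.default_rng(123)
data = rng.integers(1, 101, size=(200, 100), dtype=np.uint8)
result = pearson_from_uint8(data)
print(np.allclose(result, np.corrcoef(data)))
```

Any saving here is on the input side and in exactness of the sums. NumPy's integer matrix product does not go through BLAS, so this sketch is not expected to be faster than `np.corrcoef`.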
[Deleted]  
Aleksey Vyazmikin #:

I go and spend my time on him, and he's rude.

What the hell...

I didn't ask you to waste your time for me.