Machine learning in trading: theory, practice, trading and more

fxsaber 2023.09.24 11:47 #32561

Forester #:

Apparently, if all the data in a column are identical, the calculation is skipped.

So Pearson is computed between columns, not between rows?

It looks that way.

const matrix<double> matrix1 = {{1, 1, 1}, {2, 2, 2}, {3, 3, 3}};

It returns an identity matrix.

Aleksei Kuznetsov 2023.09.24 12:02 #32562

fxsaber #:

So Pearson is computed between columns, not between rows?

P.S. It looks that way. It returns an identity matrix.

You can transpose it.
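Editor's sketch of the rows-versus-columns question, in NumPy rather than MQL5. It only illustrates the axis convention and the effect of transposing. NumPy's handling of zero-variance data (NaN) is its own behaviour and differs from the identity matrix reported above for MQL5.

```python
import numpy as np

# The 3x3 matrix from the post: every row is constant, every column is [1, 2, 3].
m = np.array([[1, 1, 1], [2, 2, 2], [3, 3, 3]], dtype=float)

# rowvar=False: each column is a variable, so all columns correlate perfectly.
col_corr = np.corrcoef(m, rowvar=False)

# rowvar=True (NumPy's default): each row is a variable. Constant rows have
# zero variance, so NumPy computes 0/0 and reports NaN.
with np.errstate(invalid="ignore", divide="ignore"):
    row_corr = np.corrcoef(m)
    # Transposing the input swaps the two interpretations.
    row_corr_via_transpose = np.corrcoef(m.T, rowvar=False)

print(col_corr)
print(row_corr)
```

If a routine correlates the "wrong" axis for your data layout, transposing the input is enough; no other change is needed.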

[Deleted] 2023.09.25 12:48 #32563

Alglib is a good library; it has everything you need for ML. Its neural networks are very slow, though. At least that was the case in the early versions.

fxsaber 2023.09.25 18:01 #32564

Forester #:
in statistics.mqh.

PearsonCorrM, which correlates all rows with all rows in one call, is the fastest.

I built a correlation-matrix calculation on top of it.

#include <Math\Alglib\statistics.mqh> // https://www.mql5.com/ru/code/11077

const matrix<double> CorrMatrix( const matrix<double> &Matrix )
{
  matrix<double> Res = {};
  
  const CMatrixDouble MatrixIn(Matrix);
  CMatrixDouble MatrixOut;  

  if (CBaseStat::PearsonCorrM(MatrixIn, MatrixIn.Rows(), MatrixIn.Cols(), MatrixOut)) // https://www.mql5.com/ru/code/11077
    Res = MatrixOut.ToMatrix();
  
  return(Res);
}
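Editor's sketch, in NumPy, of what a full Pearson correlation matrix amounts to: standardize each series, then do one matrix product. It assumes columns are the series and rows are observations, which is what the 15000 x 15000 result in the benchmark below implies. The function name corr_matrix_columns is hypothetical, and this is not a description of how Alglib implements PearsonCorrM internally.

```python
import numpy as np

def corr_matrix_columns(x: np.ndarray) -> np.ndarray:
    """Pearson correlation between the columns of x (rows are observations)."""
    x = np.asarray(x, dtype=np.float64)
    n_obs = x.shape[0]
    centered = x - x.mean(axis=0)            # remove each column's mean
    std = centered.std(axis=0, ddof=1)       # sample standard deviation per column
    z = centered / std                       # standardized columns
    return (z.T @ z) / (n_obs - 1)           # one product gives every pairwise correlation

rng = np.random.default_rng(0)
data = rng.random((100, 20))                 # 100 observations of 20 series

ours = corr_matrix_columns(data)
reference = np.corrcoef(data, rowvar=False)  # NumPy's own result for comparison
print(np.allclose(ours, reference))
```

The single matrix product is why the all-against-all variant tends to be the fastest: the work is done by one dense multiplication instead of many small loops.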

I measured the performance.

#property script_show_inputs

input int inRows = 100; // Row length
input int inCols = 15000; // Number of rows

void FillArray( double &Array[], const int Amount )
{
  for (uint i = ArrayResize(Array, Amount); (bool)i--;)
    Array[i] = MathRand();
}

bool IsEqual( matrix<double> &Matrix1, const matrix<double> &Matrix2 )
{
//  return(MathAbs((Matrix1 - Matrix2).Mean()) < 1e-15); // Expensive in memory.
  
  Matrix1 -= Matrix2;  
  
  const bool Res = (MathAbs(Matrix1.Mean()) < 1e-15);
  
  Matrix1 += Matrix2;
  
  return(Res);
}

#define  TOSTRING(A) #A + " = " + (string)(A) + " "

#define  BENCH(A)                                                              \
  StartMemory = MQLInfoInteger(MQL_MEMORY_USED);                              \
  StartTime = GetMicrosecondCount();                                          \
  A;                                                                          \
  Print(#A + " - " + (string)(GetMicrosecondCount() - StartTime) + " mcs, " + \
       (string)(MQLInfoInteger(MQL_MEMORY_USED) - StartMemory) + " MB"); 

void PrintCPU()
{
#ifdef _RELEASE
  Print("EX5: " + (string)__MQLBUILD__ + " " + __CPU_ARCHITECTURE__ + " Release.");
#else // #ifdef _RELEASE
  Print("EX5: " + (string)__MQLBUILD__ + " " + __CPU_ARCHITECTURE__ + " Debug.");
#endif // #ifdef _RELEASE #else
  Print(TOSTRING(TerminalInfoString(TERMINAL_CPU_NAME)));
  Print(TOSTRING(TerminalInfoInteger(TERMINAL_CPU_CORES)));
  Print(TOSTRING(TerminalInfoString(TERMINAL_CPU_ARCHITECTURE)));
}

void OnStart()
{  
  PrintCPU();
  
  double Array[];
  FillArray(Array, inRows * inCols);
  
  matrix<double> Matrix;  
  Matrix.Assign(Array);
  Matrix.Init(inCols, inRows);
  Matrix = Matrix.Transpose();
  
  ulong StartTime, StartMemory;
  
  Print(TOSTRING(inRows) + TOSTRING(inCols));

  BENCH(matrix<double> Matrix1 = CorrMatrix(Matrix)) // https://www.mql5.com/ru/code/11077
  BENCH(matrix<double> Matrix2 = Matrix.CorrCoef(false)); // https://www.mql5.com/ru/docs/basis/types/matrix_vector
//  BENCH(matrix<double> Matrix3 = CorrMatrix(Array, inRows)); // https://www.mql5.com/ru/code/17982 

  Print(TOSTRING(IsEqual(Matrix1, Matrix2)));
//  Print(TOSTRING(IsEqual(Matrix3, Matrix2)));  
}

Result.

EX5: 3981 AVX Release.
TerminalInfoString(TERMINAL_CPU_NAME) = Intel Core i7-2700K  @ 3.50GHz 
TerminalInfoInteger(TERMINAL_CPU_CORES) = 8 
TerminalInfoString(TERMINAL_CPU_ARCHITECTURE) = AVX 
inRows = 100 inCols = 15000 
matrix<double> Matrix1 = CorrMatrix(Matrix) - 14732702 mcs, 1717 MB
matrix<double> Matrix2 = Matrix.CorrCoef(false) - 40318390 mcs, 1717 MB
IsEqual(Matrix1, Matrix2) = true

Alglib clearly computes the matrix faster than the standard matrix method: about 14.7 s versus 40.3 s in this run.

However, for pattern search, computing a full correlation matrix is insane in terms of RAM consumption.
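Editor's sketch of the memory arithmetic behind that remark: a full correlation matrix of N series stores N x N values of 8 bytes each (double), so the size grows quadratically. The helper name corr_matrix_mb is hypothetical.

```python
def corr_matrix_mb(n_series: int, bytes_per_value: int = 8) -> float:
    """Size in MB of a dense n_series x n_series matrix of doubles."""
    return n_series * n_series * bytes_per_value / 1024**2

# 15000 series, as in the benchmark above.
print(corr_matrix_mb(15000))
# Doubling the number of series quadruples the memory.
print(corr_matrix_mb(30000))
```

For 15000 series this gives roughly the 1717 MB that the benchmark reports for each result matrix.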

How long does Python take to compute this for an original matrix of the same size as in the example above?

Aleksei Kuznetsov 2023.09.25 19:32 #32565

fxsaber #:

However, computing a correlation matrix to search for patterns is insane in terms of RAM consumption.

The built-in one ran faster on my i7-6700:

inRows = 100 inCols = 15000 
matrix<double> Matrix1 = CorrMatrix(Matrix) - 14648864 mcs, 1717 MB
matrix<double> Matrix2 = Matrix.CorrCoef(false) - 29589590 mcs, 1717 MB
IsEqual(Matrix1, Matrix2) = true

It's strange that the built-in one is slower; they could simply have copied it. Alglib is unlikely to have some unique, licensed accelerated algorithm.

Have you tried the other two Alglib variants?
If you loop over each row against each row, or each row against all rows, memory use is lower (2 rows, or 1 row + the matrix). But it will take longer. I don't remember exactly, but I think it will be slower than the built-in function.
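Editor's sketch of the lower-memory idea, in NumPy: process a block of series against all series, use the result, and discard it, so the full N x N matrix never exists. It assumes columns are the series. The function name max_abs_corr_blockwise and the choice of "largest absolute correlation with any other series" as the quantity to keep are hypothetical, picked only to show the pattern.

```python
import numpy as np

def max_abs_corr_blockwise(x: np.ndarray, block: int = 256) -> np.ndarray:
    """For each column of x, the largest |correlation| with any other column.

    Only a (block x n_cols) slice of the correlation matrix is held at a time.
    """
    x = np.asarray(x, dtype=np.float64)
    n_cols = x.shape[1]
    z = x - x.mean(axis=0)
    z /= np.sqrt((z * z).sum(axis=0))   # unit-length columns: z.T @ z is the correlation
    best = np.empty(n_cols)
    for start in range(0, n_cols, block):
        stop = min(start + block, n_cols)
        part = np.abs(z[:, start:stop].T @ z)   # correlations of this block with all columns
        # Zero out each column's correlation with itself before taking the maximum.
        part[np.arange(stop - start), np.arange(start, stop)] = 0.0
        best[start:stop] = part.max(axis=1)
    return best

rng = np.random.default_rng(1)
data = rng.random((100, 50))
best = max_abs_corr_blockwise(data, block=16)
print(best[:5])
```

The block size trades memory for speed: block=1 is the "one row + the matrix" variant from the post, while a larger block keeps more of the benefit of a single matrix product.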

Aleksei Kuznetsov 2023.09.25 20:17 #32566

fxsaber #:

However, computing a correlation matrix to search for patterns is insane in terms of RAM consumption.

Memory is even worse.
Before launch: (screenshot not preserved)

While Alglib's PearsonCorrM is running, memory keeps growing: I saw 5 GB, and 4.6 GB made it into the screenshot.

And while the standard Matrix.CorrCoef is running: (screenshot not preserved)

Apparently, the standard one is optimized for minimal memory use, and the Alglib one is optimized for speed.

fxsaber 2023.09.25 20:23 #32567

Forester #:

The built-in one ran faster on my i7-6700.

I've added the CPU data and the EX5 instruction set to the code.

fxsaber 2023.09.25 20:36 #32568

Forester #:

While Alglib's PearsonCorrM is running, memory keeps growing: I saw 5 GB, and 4.6 GB made it into the screenshot.

Consumption almost doubles because of this line.

Forum on trading, automated trading systems and testing trading strategies

Machine learning in trading: theory, models, practice and algo-trading

fxsaber, 2023.09.25 18:01

#include <Math\Alglib\statistics.mqh> // https://www.mql5.com/ru/code/11077

const matrix<double> CorrMatrix( const matrix<double> &Matrix )
{
  matrix<double> Res = {};
  
  const CMatrixDouble MatrixIn(Matrix);
  CMatrixDouble MatrixOut;  

  if (CBaseStat::PearsonCorrM(MatrixIn, MatrixIn.Rows(), MatrixIn.Cols(), MatrixOut)) // https://www.mql5.com/ru/code/11077
    Res = MatrixOut.ToMatrix();
  
  return(Res);
}

This line (Res = MatrixOut.ToMatrix();) is just a conversion from CMatrixDouble to matrix<double>. I even had to write the matrix comparison this way because of memory.

Forum on trading, automated trading systems and testing trading strategies

Machine learning in trading: theory, models, practice and algo-trading

fxsaber, 2023.09.25 18:01

bool IsEqual( matrix<double> &Matrix1, const matrix<double> &Matrix2 )
{
//  return(MathAbs((Matrix1 - Matrix2).Mean()) < 1e-15); // Expensive in memory.
  
  Matrix1 -= Matrix2;  
  
  const bool Res = (MathAbs(Matrix1.Mean()) < 1e-15);
  
  Matrix1 += Matrix2;
  
  return(Res);
}
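Editor's sketch of the same memory-saving comparison in NumPy: subtract in place, test, then add back, so no temporary matrix of the full size is allocated. The function name is_equal_inplace and the strict option are hypothetical additions. The strict option is there because a test on the mean of the differences can pass when positive and negative differences cancel.

```python
import numpy as np

def is_equal_inplace(a: np.ndarray, b: np.ndarray,
                     tol: float = 1e-15, strict: bool = False) -> bool:
    """Compare a and b without allocating a temporary for a - b."""
    a -= b                                    # a now holds the differences
    if strict:
        deviation = max(a.max(), -a.min())    # largest absolute difference
    else:
        deviation = abs(a.mean())             # same test as the MQL5 code above
    a += b                                    # restore a (up to rounding)
    return bool(deviation < tol)

b = np.zeros((4, 4))
c = b.copy()
c[0, 0] = 1.0     # two errors of opposite sign
c[0, 1] = -1.0    # cancel in the mean

mean_check = is_equal_inplace(c, b)
strict_check = is_equal_inplace(c, b, strict=True)
print(mean_check, strict_check)
```

For two routines computing the same correlation matrix, the mean-based check is usually adequate as a sanity test; the strict variant costs no extra memory and catches cancelling errors.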

Aleksei Kuznetsov 2023.09.25 20:46 #32569

fxsaber #:
Consumption almost doubles because of this line.

It's just a conversion from CMatrixDouble to matrix<double>. I even had to write the matrix comparison this way because of memory.

And this conversion adds about 40% to the run time. With the line commented out (// Res = MatrixOut.ToMatrix();):

matrix<double> Matrix1 = CorrMatrix(Matrix) - 10482307 mcs, 0 MB
matrix<double> Matrix2 = Matrix.CorrCoef(false) - 28882536 mcs, 1717 MB

That is, if you work only with Alglib's functionality (without converting its matrices into the terminal's native matrices), it will be faster.
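Editor's check of the 40% figure, using the timings posted in this thread. It assumes the earlier i7-6700 run (with the conversion) and this run (without it) are comparable, which the thread implies but does not state.

```python
# Timings in microseconds, copied from the posts above.
with_conversion_us = 14_648_864      # CorrMatrix including MatrixOut.ToMatrix()
without_conversion_us = 10_482_307   # same call with the conversion commented out
builtin_us = 28_882_536              # Matrix.CorrCoef(false) in the same run

# Extra time caused by the conversion, relative to the run without it.
overhead = with_conversion_us / without_conversion_us - 1.0
# How many times faster Alglib is than the built-in method without the conversion.
speedup_vs_builtin = builtin_us / without_conversion_us

print(f"conversion overhead: {overhead:.1%}")
print(f"speedup over built-in: {speedup_vs_builtin:.2f}x")
```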

[Deleted] 2023.09.26 01:04 #32570

fxsaber #:

How long does Python take to compute this for an original matrix of the same size as in the example above?

import numpy as np
import time

def calc_corr_matrix():
    arr = np.random.rand(15000, 100).astype(np.float32)
    corr_matrix = np.corrcoef(arr)
    size_in_mb = corr_matrix.nbytes / 1024**2
    print("Array size:", size_in_mb, "MB")
    return corr_matrix

start_time = time.time()
corr_matrix = calc_corr_matrix()
end_time = time.time()

print("Time taken:", end_time - start_time, "seconds")

Array size: 1716.61376953125 MB
Time taken: 2.08686900138855 seconds

The timing includes creation of the matrix.
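Editor's sketch of how the two costs could be timed separately, so the correlation step is comparable with the MQL5 benchmark, which times only the calculation. The function name time_corrcoef is hypothetical, and the sizes in the demo call are deliberately small; pass 15000 and 100 to reproduce the thread's dimensions.

```python
import time
import numpy as np

def time_corrcoef(n_series: int, n_obs: int, seed: int = 0):
    """Return (creation seconds, correlation seconds, result shape)."""
    rng = np.random.default_rng(seed)
    t0 = time.perf_counter()
    arr = rng.random((n_series, n_obs))   # one series per row, as in the post above
    t1 = time.perf_counter()
    corr = np.corrcoef(arr)               # rows are variables by default
    t2 = time.perf_counter()
    return t1 - t0, t2 - t1, corr.shape

create_s, corr_s, shape = time_corrcoef(500, 100)
print("create:", create_s, "s, corrcoef:", corr_s, "s, shape:", shape)
```

time.perf_counter is used instead of time.time because it is a monotonic, high-resolution clock intended for measuring short durations.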

Machine learning in trading: theory, practice, trading and more - page 3257