Linear regression in MQL5

 

Hi everybody,

I want to solve a linear regression problem:

I have the function

y = a*x1 + b*x2 + c*x3 + d*x4 + e*x5 + f*x6

y and x1 to x6 are known, and I have a matrix of their empirical values (100 examples).

I want to estimate the coefficients a to f. I have downloaded the library "alglib.mqh", which should be useful.

Can somebody tell me how to do this? Some code would be helpful. Thanks!

 
one_monk, I think you are confusing something.

If you are talking about linear regression, then maybe you have 6 points (x1,y1) ... (x6,y6) and you need to find a and b in the formula of the line (y=a+b*x) for which the sum of the squared deviations from the known points is minimal.
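For the single-variable case described above, the least-squares line has a closed form: b = Sxy / Sxx and a = mean(y) - b * mean(x). Here is a minimal sketch of that calculation. It is written in plain Python only to keep it short; the function name `fit_line` and the sample points are made up for illustration, and the same arithmetic should port directly to MQL5 arrays.

```python
def fit_line(xs, ys):
    """Least-squares fit of y = a + b*x; returns (a, b)."""
    n = len(xs)
    if n < 2 or n != len(ys):
        raise ValueError("need at least two (x, y) pairs of equal length")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of squared deviations of x from its mean.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("all x values are identical; the slope is undefined")
    # Sum of cross-deviations of x and y.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx             # slope
    a = mean_y - b * mean_x   # intercept
    return a, b


# Hypothetical sample: six points lying exactly on y = 1 + 2*x.
xs = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0, 11.0]
a, b = fit_line(xs, ys)
```

Because the sample points lie exactly on a line, the fit recovers a = 1 and b = 2; with noisy data it returns the line that minimizes the sum of squared vertical deviations.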

linear regression (first-degree polynomial): y=a+b*x

parabolic regression (second-degree polynomial): y=a+b*x+c*x²

cubic regression (third-degree polynomial): y=a+b*x+c*x²+d*x³
 

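This is a standard data science problem: a linear regression with several independent variables (multiple linear regression), and it can be solved. The relevant routine in alglib.mqh is: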

//+------------------------------------------------------------------+
//| Linear regression                                                |
//| Subroutine builds model:                                         |
//|     Y = A(0)*X[0] + ... + A(N-1)*X[N-1] + A(N)                   |
//| and model found in ALGLIB format, covariation matrix, training   | 
//| set errors (rms, average, average relative) and leave-one-out    |
//| cross-validation estimate of the generalization error. CV        |
//| estimate calculated using fast algorithm with O(NPoints*NVars)   |
//| complexity.                                                      |
//| When  covariation  matrix  is  calculated  standard deviations of| 
//| function values are assumed to be equal to RMS error on the      |
//| training set.                                                    |
//| INPUT PARAMETERS:                                                |
//|     XY          -   training set, array [0..NPoints-1,0..NVars]: |
//|                     * NVars columns - independent variables      |
//|                     * last column - dependent variable           |
//|     NPoints     -   training set size, NPoints>NVars+1           |
//|     NVars       -   number of independent variables              |
//| OUTPUT PARAMETERS:                                               |
//|     Info        -   return code:                                 |
//|                     * -255, in case of unknown internal error    |
//|                     * -4, if internal SVD subroutine haven't     |
//|                           converged                              |
//|                     * -1, if incorrect parameters was passed     |
//|                           (NPoints<NVars+2, NVars<1).            |
//|                     *  1, if subroutine successfully finished    |
//|     LM          -   linear model in the ALGLIB format. Use       |
//|                     subroutines of this unit to work with the    |
//|                     model.                                       |
//|     AR          -   additional results                           |
//+------------------------------------------------------------------+
static void CAlglib::LRBuild(CMatrixDouble &xy,const int npoints,const int nvars,
                             int &info,CLinearModelShell &lm,CLRReportShell &ar)
  {
//--- initialization
   info=0;
//--- function call
   CLinReg::LRBuild(xy,npoints,nvars,info,lm.GetInnerObj(),ar.GetInnerObj());
//--- exit the function
   return;
  }

The catch is that the code is poorly documented, with few examples, and I am not sure whether anyone has verified how well it works.
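Since examples are scarce, here is a library-independent sketch of the calculation for the model in the original question. Note two differences from the routine quoted above. The question's model has no constant term, while LRBuild's model includes A(N); the ALGLIB C++ edition documents `lrbuildz` for a zero constant term, and I am assuming, without having confirmed it, that alglib.mqh exposes it in a similar way. This sketch also solves the normal equations (XᵀX)c = Xᵀy by Gaussian elimination, which is simpler but less numerically robust than the SVD the library comments mention. All function names and the data are hypothetical. It is in plain Python for brevity; the loops should translate directly to MQL5 arrays.

```python
import random


def solve_linear_system(m, rhs):
    """Solve m * x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    # Work on an augmented copy so the caller's data is untouched.
    aug = [list(m[i]) + [rhs[i]] for i in range(n)]
    for col in range(n):
        # Pick the row with the largest entry in this column as pivot.
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        # Crude absolute threshold; assumes the data is not tiny in scale.
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("singular system: regressors are linearly dependent")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    # Back substitution on the upper-triangular system.
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - tail) / aug[i][i]
    return x


def fit_no_intercept(rows, ys):
    """Least-squares coefficients for y = c[0]*x[0] + ... + c[k-1]*x[k-1]."""
    k = len(rows[0])
    # Normal equations: (X^T X) c = X^T y.
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(rows, ys)) for i in range(k)]
    return solve_linear_system(xtx, xty)


# Hypothetical data: 100 noise-free examples built from known coefficients a..f.
random.seed(1)
true_coeffs = [0.5, -1.0, 2.0, 0.0, 3.0, -0.25]
rows = [[random.uniform(-1.0, 1.0) for _ in range(6)] for _ in range(100)]
ys = [sum(c * x for c, x in zip(true_coeffs, r)) for r in rows]
coeffs = fit_no_intercept(rows, ys)  # estimates of a..f
```

With noise-free data the estimates match the generating coefficients up to rounding error. With real data they are the values of a to f that minimize the sum of squared residuals over the 100 examples.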

You have to ask Sergey Bochkanov, the author of the code. His forum is here: http://forum.alglib.net/viewforum.php?f=2

Alternatively, the thread starter could run the regression in Azure Machine Learning (it is free) instead of using alglib.mqh and its CLinReg class.

I recommend it because it is popular: many people have used the tool, so it has been tested thoroughly.

You will probably have to scale your variables (part of data preparation) before you fit the LR model, following the usual data science process.

Check this link for more info: https://docs.microsoft.com/en-us/azure/machine-learning/service/tutorial-data-prep
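As a rough illustration of the scaling step mentioned above, one common choice is to standardize each column to mean 0 and standard deviation 1. This is a minimal sketch under that assumption; the function name `standardize_columns` and the sample rows are made up, and it is plain Python rather than MQL5 or Azure code.

```python
import math


def standardize_columns(rows):
    """Return (scaled_rows, means, stds) with each column at mean 0, std 1."""
    n = len(rows)
    k = len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(k)]
    stds = []
    for j in range(k):
        # Population variance of column j.
        var = sum((r[j] - means[j]) ** 2 for r in rows) / n
        # A constant column has std 0; use 1.0 so it is left unscaled.
        stds.append(math.sqrt(var) or 1.0)
    scaled = [[(r[j] - means[j]) / stds[j] for j in range(k)] for r in rows]
    return scaled, means, stds


# Hypothetical sample: two columns on very different scales.
rows = [[1.0, 100.0], [2.0, 200.0], [3.0, 300.0]]
scaled, means, stds = standardize_columns(rows)
```

Coefficients fitted on scaled data refer to the scaled variables, so keep `means` and `stds` to convert back. Centering shifts the data, so it pairs with a model that has a constant term, such as the A(N) term in LRBuild's model; for a model without a constant, dividing by the standard deviation alone keeps the form of the model unchanged.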


Thank you, Zee Zhou Ma. I have registered on Sergey Bochkanov's forum and am waiting for an answer.

Azure may not be a good idea: I want to calculate the regression inside my MQL5 bot, and I don't know how to do that with Azure.

Thank you very much for the comprehensive answer.