Discussion of article "Evaluation and selection of variables for machine learning models"

MetaQuotes Software Corp.
Moderator

New article Evaluation and selection of variables for machine learning models has been published:

This article focuses on the specifics of choosing, preconditioning and evaluating the input variables for use in machine learning models. Multiple methods of normalization and their features are described. Key points of the process that strongly influence the final result of model training are also highlighted. We take a closer look at, and evaluate, new and little-known methods for determining the informativeness of the input data and for visualizing it.

With the "randomUniformForest" package we will calculate and analyze the concept of variable importance at different levels and in various combinations, the correspondence between predictors and the target, the interaction between predictors, and the selection of an optimal set of predictors that takes all aspects of importance into account.

With the "RoughSets" package we will look at the same problem of choosing predictors from a different angle and on the basis of a different concept. We will show that not only the set of predictors can be optimized, but also the set of examples used for training.

All calculations and experiments are carried out in the R language, specifically in Revolution R Open 3.2.1.

Fig. 2. Training error (OOB error) depending on the number of trees

Author: Vladimir Perervenko

mg64ve

Dear Vladimir, thanks for your nice article.

I am reading all your articles and they are very interesting.

Regarding this one, I still do not fully understand why you use preProcess to partition the data, and how the data are split.

In my experiments this function returns the data in a different order.

The question is: how can I re-establish the original order after I have the results from the predict function?

It appears that the result is in a different order.

Thanks in advance for your comments.

Cheers 

Vladimir Perervenko

Hi,

You have probably misunderstood. The split is performed by the holdout() function. The preProcess() function then determines the normalization parameters on the training set only. The code follows after my signature.

Best regards

Vladimir

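> # Split the row indices into training (idx$tr) and test (idx$ts) sets
> idx <- rminer::holdout(y = data.f$Class)
> # Estimate the normalization parameters on the training predictors only
> prep <- caret::preProcess(x = data.f[idx$tr, -ncol(data.f)],
+             method = c("spatialSign"))
> # Apply the same transformation to the training and test predictors
> x.train <- predict(prep, data.f[idx$tr, -ncol(data.f)])
> x.test <- predict(prep, data.f[idx$ts, -ncol(data.f)])
> # The target is the last column of data.f
> y.train <- data.f[idx$tr, ncol(data.f)]
> y.test <- data.f[idx$ts, ncol(data.f)]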
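
On the question of restoring the original order: since the split is kept as row indices, predictions for the test rows can be written back into a vector aligned with the original data. The following is only a minimal base R sketch, not the article's code. The toy data.f, the sample()-based split standing in for rminer::holdout(), and the pred.test values standing in for a model's predict() output are all assumptions made for illustration.

```r
set.seed(1)

# Hypothetical toy data standing in for data.f: 3 predictors and a Class column
n <- 20
data.f <- data.frame(x1 = rnorm(n), x2 = rnorm(n), x3 = rnorm(n),
                     Class = factor(rep(c(-1, 1), length.out = n)))

# Stand-in for rminer::holdout(): a random split stored as row indices.
# The test indices are deliberately left shuffled.
tr  <- sample(n, size = round(2 * n / 3))
idx <- list(tr = tr, ts = sample(setdiff(seq_len(n), tr)))

# Stand-in for predict() on the test rows: one value per row of idx$ts,
# in the same (shuffled) order as idx$ts
pred.test <- rowSums(data.f[idx$ts, 1:3]) > 0

# Write the predictions back at their original row positions;
# rows that were used for training stay NA
pred.all <- rep(NA, n)
pred.all[idx$ts] <- pred.test

# Alternative: keep the row number next to each prediction and sort by it
pred.sorted <- data.frame(row = idx$ts, pred = pred.test)
pred.sorted <- pred.sorted[order(pred.sorted$row), ]
```

With a real model the same two lines (creating pred.all and assigning at idx$ts) should apply unchanged, provided predict() returns one value per row of the test set in the order the rows were passed in.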
JunCheng Li

Is this wrong?  " best <- Cs(cci, cmo,  slowD, oscK, signal, tr, ADX. chv, atr, ar)"

 


JunCheng Li

Should the correct code be:

>library(Hmisc)

>best <- Cs(cci, cmo,  slowD, oscK, signal, tr, ADX, chv, atr, ar)

Should the "ADX." be "ADX,"?

Vladimir Perervenko

Yes, this is a typo in the article.
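
For readers without Hmisc installed, here is a small base R sketch of what the corrected line is meant to produce. Hmisc::Cs() turns unquoted names into a character vector, so the plain c() call below is an equivalent written out by hand. The parse() check is only an illustration of why the misprinted version fails.

```r
# Base R equivalent of the corrected Cs(...) call: a character vector of names
best <- c("cci", "cmo", "slowD", "oscK", "signal", "tr",
          "ADX", "chv", "atr", "ar")

# The misprinted call, kept as text so that it can be checked without running it
bad <- "Cs(cci, cmo, slowD, oscK, signal, tr, ADX. chv, atr, ar)"

# parse() returns an expression on success; on a syntax error the handler
# returns the error message instead
parsed <- tryCatch(parse(text = bad),
                   error = function(e) conditionMessage(e))

# TRUE here means the text could not be parsed as R code
is.character(parsed)
```

The reason is that "ADX." is itself a valid R name, so "ADX. chv" reads as two names in a row with no comma between them.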

JulInParis

Hi Vlad,

I'm trying to rerun your example step by step.

In the section "Input data", the In(p = 16) function works with a price object. What is its R format or class (zoo, xts or data frame), and what does it look like (its column names, etc.)? Without this information, it is impossible to run the command    x <- In(p = 16) ...

 

Best regards.

 

Julien 

Vladimir Perervenko

Hi Julien,

> class(price)
[1] "matrix"
> colnames(price)
[1] "Open"  "High"  "Low"   "Close" "Med"   "CO"

I have attached a snapshot of the session. Open it in RStudio and run your experiments.

Good luck

Vladimir
 

Files:
EURUSD30.zip 302 kb
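
For anyone who wants to try the code before opening the attached session, an object with the same class and column names as shown above can be sketched as follows. This is a toy stand-in, not the author's data: the quotes are made up, and the definitions of Med as (High + Low) / 2 and CO as Close - Open are assumptions based on the column names only.

```r
# Hypothetical toy quotes; the real data are in the attached session snapshot
Open  <- c(1.1010, 1.1015, 1.1008)
High  <- c(1.1020, 1.1022, 1.1016)
Low   <- c(1.1005, 1.1006, 1.1001)
Close <- c(1.1015, 1.1008, 1.1012)

# Numeric matrix with the six columns listed in the reply above
price <- cbind(Open, High, Low, Close,
               Med = (High + Low) / 2,  # assumption: median price of the bar
               CO  = Close - Open)      # assumption: close minus open

# In R 4.0 and later class() also reports "array" for a matrix
class(price)
colnames(price)
```

Three rows will be too few for indicators with a period of 16, so this only shows the expected structure of the object.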
hzmarrou

Dear all, 


Can someone tell me what the Dig variable used in the ZZ function means? Is it a constant? If so, what should its value be?

Vladimir Perervenko

I have answered you in the neighbouring thread.