Machine learning in trading: theory, models, practice and algo-trading - page 839

 

At first, when only a couple of hundred people were submitting their forecasts, everything was normal; I even received payments for placing. Here on the forum, at least two other people also showed great results and took prizes.

It was quite easy for the Numerai admins to pick the best results and trade on them. Then the participants grew into the thousands and began sending in all sorts of garbage instead of models, and there were cheaters with hundreds of accounts who simply brute-forced forecasts. The admins got fed up with this whole circus. They made it very simple: "if you want a cash prize, leave a deposit; if the model is a fake, it will lose the deposit, and there will be no profit."
In my opinion, the admins took the path of least resistance. They never learned how to identify potentially good models, or even how to prepare features, and turned everything into a lottery. I would have done things differently in their place.

 
Dr. Trader:

At first, when only a couple of hundred people were submitting their forecasts, everything was normal; I even received payments for placing. Here on the forum, at least two other people also showed great results and took prizes.

It was quite easy for the Numerai admins to pick the best results and trade on them. Then the participants grew into the thousands and began sending in all sorts of garbage instead of models, and there were cheaters with hundreds of accounts who simply brute-forced forecasts. The admins got fed up with this whole circus. They made it very simple: "if you want a cash prize, leave a deposit; if the model is a fake, it will lose the deposit, and there will be no profit."
In my opinion, the admins took the path of least resistance. They never learned how to identify potentially good models, or even how to prepare features, and turned everything into a lottery. I would have done things differently in their place.

Maybe you're right, but now, IMHO, it is all too arbitrary. The logloss changes look strange: before, more than half of the models were below (better than) random (0.96315) on live data; now suddenly almost all are above (worse than) random... In short, it's arbitrary, IMHO. I don't trust them enough to "stake" when nothing can be verified. And the whole idea is dubious: classification itself is not the hard part of the challenge, so there is no point in delegating it; constructing features and targets from the raw data is another matter...

 

I will not let this thread drop out of the top.

Gentlemen of the neural networks - the common people are waiting for the Grail from you. Don't ruin it.

 

Experimented with randomUniformForest - didn't like it.

The importance of the predictors keeps jumping up and down the list.

Here is a reproducible example on the data from the article https://www.mql5.com/ru/articles/4227 :
Start RStudio, load the Cotir.RData file, with quotes obtained from the terminal, from GitHub/Part_I, and the FunPrepareData.R file with the data preparation functions from GitHub/Part_IV.
Then:

evalq({
  dt <- PrepareData(Data, Open, High, Low, Close, Volume)
}, env)

# remove the time (1st) and class (last) columns, rescale with spatialSign
prep <- caret::preProcess(x = env$dt[, -c(1, ncol(env$dt))],
                          method = c("spatialSign"))
x.train <- as.matrix(predict(prep, env$dt[, -c(1, ncol(env$dt))]))
y.train <- as.matrix(env$dt[, ncol(env$dt)])

require(randomUniformForest)

# fit the forest and print the global variable importance four times
# on the same data
for (i in 1:4) {
  ruf <- randomUniformForest(X = x.train, Y = y.train, mtry = 1, ntree = 300,
                             threads = 2, nodesize = 2, regression = FALSE)
  print(ruf$forest$variableImportance)
}

Here we calculate the global importance of predictors on the same data 4 times. The result is almost random:

----------------------------------------------------------------------------------------------
Run 1 (columns after the rank: variables, score, class, class.frequency, percent, percent importance):
1 ftlm 9204 2 0.52 100.00 8
2 rbci 9197 2 0.52 99.92 8
3 stlm 9150 2 0.52 99.41 8
4 v.fatl 9147 2 0.51 99.38 8
5 v.rftl 9122 2 0.52 99.11 8
6 v.satl 9110 2 0.51 98.98 8
7 v.stlm 9096 2 0.51 98.82 8
8 v.rbci 9084 2 0.51 98.69 8
9 pcci 9082 2 0.52 98.68 8
10 v.rstl 9049 2 0.52 98.31 8
11 v.pcci 8980 2 0.51 97.57 8
12 v.ftlm 8953 2 0.52 97.28 8
----------------------------------------------------------------------------------------------

Run 2:
1 v.fatl 9130 2 0.51 100.00 8
2 ftlm 9079 2 0.52 99.45 8
3 v.rbci 9071 2 0.52 99.35 8
4 v.rftl 9066 2 0.52 99.30 8
5 stlm 9058 2 0.51 99.21 8
6 v.satl 9033 2 0.51 98.94 8
7 pcci 9033 2 0.51 98.94 8
8 v.stlm 9019 2 0.51 98.78 8
9 v.rstl 8977 2 0.51 98.33 8
10 rbci 8915 2 0.52 97.64 8
11 v.pcci 8898 2 0.51 97.46 8
12 v.ftlm 8860 2 0.51 97.04 8
----------------------------------------------------------------------------------------------

Run 3:
1 v.fatl 9287 2 0.51 100.00 9
2 stlm 9191 2 0.52 98.96 8
3 v.rbci 9172 2 0.52 98.76 8
4 v.rftl 9134 2 0.51 98.35 8
5 v.satl 9115 2 0.51 98.14 8
6 ftlm 9109 2 0.51 98.08 8
7 v.stlm 9072 2 0.51 97.69 8
8 v.rstl 9072 2 0.51 97.68 8
9 v.ftlm 9036 2 0.51 97.30 8
10 pcci 9014 2 0.52 97.05 8
11 rbci 9002 2 0.52 96.93 8
12 v.pcci 8914 2 0.51 95.98 8
----------------------------------------------------------------------------------------------

Run 4:
1 v.satl 9413 2 0.51 100.00 8
2 ftlm 9389 2 0.52 99.75 8
3 v.stlm 9371 2 0.51 99.55 8
4 v.rftl 9370 2 0.51 99.54 8
5 v.rbci 9337 2 0.51 99.19 8
6 v.pcci 9314 2 0.51 98.95 8
7 v.fatl 9311 2 0.52 98.91 8
8 stlm 9295 2 0.52 98.75 8
9 pcci 9281 2 0.51 98.60 8
10 v.rstl 9261 2 0.51 98.39 8
11 v.ftlm 9257 2 0.51 98.35 8
12 rbci 9238 2 0.52 98.14 8

With the other two packages I tested, the importance of the predictors comes out the same on repeated runs.
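If the single-run ranking is this unstable, one workaround (my sketch, not from the original post; it reuses x.train and y.train from the example above and assumes the importance table has the "variables" and "score" columns shown in the runs) is to average the scores over several forests and rank by the mean:

require(randomUniformForest)

n.runs <- 10
imp <- sapply(seq_len(n.runs), function(i) {
  ruf <- randomUniformForest(X = x.train, Y = y.train, mtry = 1, ntree = 300,
                             threads = 2, nodesize = 2, regression = FALSE)
  vi <- ruf$forest$variableImportance
  # align scores by predictor name so rows match across runs
  setNames(vi$score, as.character(vi$variables))[colnames(x.train)]
})
# mean score per predictor across runs gives a stabler ranking
sort(rowMeans(imp), decreasing = TRUE)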

Deep neural networks (Part VI). Ensemble of neural network classifiers: bagging
  • 2018.03.01
  • Vladimir Perervenko
  • www.mql5.com
In the previous article of this series, we optimized the hyperparameters of the DNN model, trained it in several variants and tested it. The quality of the resulting model turned out to be quite high. We also discussed ways to improve the quality of classification. One of them is to use an ensemble of neural networks. It is this variant of improvement that we...
 
elibrarius:

Experimented with randomUniformForest - didn't like it.

The importance of the predictors keeps jumping up and down the list.

I don't see it.

But that's not the point; the point is the principle itself.

The importance of predictors, as defined in this model and in other models, is a characteristic of how a particular predictor is used in a particular model.

But you can also pose the problem as the importance of a predictor for the target variable, rather than within a particular model.

The functions in caret do exactly that. You can use them to form a general set of predictors that are "useful" for the target variable. There is one very interesting nuance: if we move the window and re-select among the already selected predictors, for example for a particular algorithm, this set will constantly change.

Generally speaking, you need an answer to the question: what do you need the importance of a predictor for? For selection in a particular algorithm? Then the algorithm has already expressed its opinion on that and informed you of it. So the numbers you show mean nothing at all; whether they change or not is irrelevant. What matters is the model's prediction outside the training sample and the relationship between the list of predictors and the success of the prediction OUTSIDE the sample.
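For reference, a minimal sketch of the model-free importance caret offers (filterVarImp scores each predictor against the target on its own, via the area under the ROC curve for a classification target, with no model fitted); it reuses x.train and y.train from the example above:

require(caret)

# model-independent importance: each predictor is scored against the
# target by itself (ROC area per class), no model is involved
y.fact <- factor(y.train)
fvi <- filterVarImp(x = as.data.frame(x.train), y = y.fact)
# rank predictors by their mean AUC over the class columns
fvi[order(-rowMeans(fvi)), , drop = FALSE]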

 
elibrarius:

Experimented with randomUniformForest - didn't like it.

Try uploading your predictors here

https://www.mql5.com/ru/articles/3856

and then look at their importance on the automatically generated matrix, after training the agent

I have more or less decent results on mine so far, but there is still room for improvement

I think it is useless to fit targets to predictors in a non-stationary market; the importance changes stochastically as well.

Random Decision Forest in reinforcement learning
  • 2018.04.12
  • Maxim Dmitrievsky
  • www.mql5.com
Random Forest (RF) with bagging is one of the strongest machine learning methods, slightly inferior only to gradient boosting. A random forest consists of a committee of decision trees (also called classification or regression trees, "CART", which solve the tasks of the same name). They are used in statistics...
 
SanSanych Fomenko:

I don't see it.

But that's not the point; the point is the principle itself.

The importance of predictors, as defined in this model and in other models, is a characteristic of how a particular predictor is used in a particular model.

But you can also pose the problem as the importance of a predictor for the target variable, rather than within a particular model.

The functions in caret do exactly that. You can use them to form a general set of predictors that are "useful" for the target variable. There is one very interesting nuance: if we move the window and re-select among the already selected predictors, for example for a particular algorithm, this set will constantly change.

Generally speaking, you need an answer to the question: what do you need the importance of a predictor for? For selection in a particular algorithm? Then the algorithm has already expressed its opinion on that and informed you of it. So the numbers you show mean nothing at all; whether they change or not is irrelevant. What matters is the model's prediction outside the training sample and the relationship between the list of predictors and the success of the prediction outside the sample.

A seed would simply fix one of these random sets. It would still be random, just reproducible. I think the point is that the importance scores (3rd column) differ by only about 3% between the minimum and the maximum, so with small changes in the forest the predictors easily jump around the list. In the other packages these weights differ by factors or even orders of magnitude.

The importance of predictors is needed to sift out the unimportant and noisy ones and to use the rest in a neural network or an ensemble.

In this dataset stlm badly degrades the training result. I use it as a marker: if it doesn't drop out, the predictor selection package has failed.
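As a minimal sketch of both points (reusing x.train and y.train from above; the 98% cutoff is a hypothetical number, and threads = 1 because a parallel run may not be reproducible even with a fixed seed):

require(randomUniformForest)

set.seed(42)  # pins the RNG so the importance table is reproducible
ruf <- randomUniformForest(X = x.train, Y = y.train, mtry = 1, ntree = 300,
                           threads = 1, nodesize = 2, regression = FALSE)
vi <- ruf$forest$variableImportance

# keep only predictors above a (hypothetical) relative-importance cutoff;
# with this package the whole spread is a few percent, so such a cutoff
# filters out almost nothing - which is exactly the complaint
keep <- as.character(vi$variables[vi$percent > 98])
x.train.sel <- x.train[, keep, drop = FALSE]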

 
Maxim Dmitrievsky:

Try uploading your predictors here

https://www.mql5.com/ru/articles/3856

and then look at their importance on the automatically generated matrix, after training the agent

I have more or less decent results on mine so far on the OOS, but there is still room for improvement

I think it is useless to fit targets to predictors in a non-stationary market; the importance changes stochastically as well.

Oh - a new article. Interesting...
 
elibrarius:
Oh - a new article. Interesting...

Yes, with this approach the problem of target selection falls away, but you need to learn how to give the agent meaningful rewards

it works fine on any predictors, but for it to work on the OOS you have to put serious effort into choosing them

 
Alexander_K2:

I will not let this thread drop out of the top.

Gentlemen of the neural networks - the common people are waiting for the Grail from you. Don't ruin it.

I was getting worried myself when I lost it :-)
