Machine learning in trading: theory, models, practice and algo-trading - page 186

 
Yury Reshetov:

Because it is frozen.

I'm sorry, but the answer is only as good as the question.

I don't get the joke. To make a decision, the classifier's output has to be compared with something, for example a threshold value. In your formulation of the problem the values to compare against are for some reason unknown, and only values that are not needed for classification are given, so some clarification would help.

Forget it.
 

I have finished many days of computation (forex models built on 6 predictors selected out of 114).

Here is the headline picture: the distribution of regression accuracy (measured with the L1 norm, i.e. the sum of absolute errors) on validation, for the models that were selected as the best (by the same measure) on the test blocks.

Each box contains 99 values, each of which is the metric 1 - sum(abs(X-Y))/sum(abs(X-mean(X))) on a unique validation sample. It is an analogue of R^2.
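For concreteness, the metric above can be written out in a few lines of Python. This is only a sketch under assumptions: the post does not say which of X and Y is the target, so here X is taken to be the actual values and Y the model's predictions, and the function name l1_skill is made up for illustration.

```python
from statistics import fmean


def l1_skill(actual, predicted):
    """Return 1 - sum|X - Y| / sum|X - mean(X)|.

    Assumption: X = actual target values, Y = model predictions.
    """
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    centre = fmean(actual)
    # Total absolute error of the model.
    model_error = sum(abs(x - y) for x, y in zip(actual, predicted))
    # Total absolute error of the naive "always predict the mean" baseline.
    baseline_error = sum(abs(x - centre) for x in actual)
    if baseline_error == 0:
        raise ValueError("actual values are constant; the metric is undefined")
    return 1.0 - model_error / baseline_error


print(l1_skill([1, 2, 3, 4], [2, 2, 3, 3]))  # 0.5
```

A score of 1 means a perfect fit, 0 means no better than always predicting mean(X), and a negative score means worse than that baseline.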

In total there were 8908 models, across all the instruments and targets studied.

The average error reduction is only 0.2%, but it is significant. A unique validation sample was generated for each model.

I want to publish the whole study. It continues with an evaluation of the models' expected payoff (MO) and goes on from there to its logical conclusion. If I publish it (not on MQL), I will send a link to a few people I talk to here or post it in my profile.

Also from the same study, here is a picture that is much more interesting from a practical point of view: the relationship between a model's expected payoff on the test blocks (inside cross-validation) and on validation.

Here we need to check right away whether the positive correlation is significant (a negative correlation cannot be reasonably explained at all) and whether there are positive values of expected payoff (MO) on validation. Well, you can see for yourself.
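One way to run the significance check mentioned above is a one-sided permutation test on the correlation between the two columns of points. This is a minimal sketch, assuming the (test, validation) pairs are available as two Python lists. The function names, the number of permutations and the seed are illustrative choices, not something taken from the study.

```python
import random
from math import sqrt


def pearson(xs, ys):
    """Plain Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant data")
    return cov / sqrt(var_x * var_y)


def positive_correlation_pvalue(xs, ys, n_permutations=2000, seed=0):
    """One-sided permutation test: is the correlation larger than chance?

    Returns (observed correlation, p-value). The p-value is the share of
    shuffles whose correlation is at least as large as the observed one.
    """
    rng = random.Random(seed)
    observed = pearson(xs, ys)
    shuffled = list(ys)
    hits = 0
    for _ in range(n_permutations):
        rng.shuffle(shuffled)  # break any real pairing between xs and ys
        if pearson(xs, shuffled) >= observed:
            hits += 1
    # The +1 terms keep the estimate away from an impossible p-value of 0.
    return observed, (hits + 1) / (n_permutations + 1)
```

A small p-value says that a correlation at least this positive would rarely arise if the test-block and validation results were unrelated.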

Each of the 99 points is a model.

 
Dr.Trader:


Well, this is a good example of why 99% of naive traders lose. If you move the window, this mishmash of points will also morph randomly; that is, it is only noise, and machine learning is of no help here.
 
It will be excellent, and if the ticker is also voiced... (brainwave)))
 
Idon't know:
This is a good example of why 99% of naive traders lose...
And this thread is a good, clear example of the fact that machine learning in trading is only a theory...
 
revers45:
And this thread is a good, clear example of the fact that machine learning in trading is only a theory...
If you look at the thread carefully, its inhabitants fall into three camps:
  1. R users (referred to here as "parasites"). They resemble a destructive sect. They are forever poking around in packages: classification today, regression tomorrow, some clustering the day after, and so on in a circle. The activity looks lively but is useless, because whatever they take on gets done incorrectly and crookedly, so they do not succeed. This shows in their complaints: about fate, about the "problem" of overfitting, about noisy predictors, and about all sorts of "radishes" (bad people) who do not recognize R, such as Reshetov.
  2. Those who do not use R. As a rule, they have chosen a particular direction in which they are good at something. They do not complain about fate and do not dig through assorted methods, i.e. they do not spread themselves thin. They work on what gives a result and gradually improve in their chosen direction.
  3. Those who just dropped by. They sometimes put in their two cents, but often at the wrong moment.
 
Alexey Burnakov:

I have completed many days of calculations.

I follow your research; it is very informative, thank you for posting it. But it seems to me that although you solve such complex problems successfully, you skip the preparatory tasks, and that spoils the result. Namely, you neglect predictor selection.

You took 114 predictors, then somehow selected 6, and after training the models you draw a conclusion about which target is better. But this result is only a local maximum. You cannot say globally that "eurusd is best predicted 16 bars ahead", only that "with the set of 114 predictors (pre1, pre2, pre3, ...), gbm best predicts price direction 16 bars ahead".

If you take a neural network instead of gbm, the best target will be different. If you take a different set of 114 predictors, the best target will be different again. Your 114 predictors are such an important foundation that the whole further course of the experiment depends on them, and you simply pulled them out of thin air without any preparation.

About six months ago SanSanych posted a file with his predictors. What is special about them is that most models in rattle show a small error on them, and the error does not grow on new data. You can train models on any segment, run an out-of-sample (oos) test on the remaining data, and see that nothing has deteriorated. The predictors and the target are related in such a way that the models find the only possible relationship between them on any bars.
I am trying to replicate that. I start from more than 10,000 predictors (indicators with different parameters and lags from mt5) and am learning to select them so that they, too, have the only possible relationship with the target. I think the ability to find predictors and a target that are related in this way is the real key to the grail.

MQL5 recently got an Expert Advisor generator: you select a list of indicators and a ready-made Expert Advisor with code is created immediately. The resulting Expert Advisor has 20 indicators and no machine learning models (all there is are importance coefficients assigned to each indicator).
I simply added custom code to the fitness function of the genetic optimizer, including a few criteria by which, in my opinion, the target and the indicators can be considered closely related. It turned out like this:
(eurusd h1)

The first 2/3 is the backtest (in-sample), the last third is the forward test (oos). What happens after 2/3 of the time is not a blown account; the balance is reset to its initial value for the oos test. With such a poor set of options, and simply by adding "crude and unfinished criteria of predictor and target dependence", we get a result that is not great, but not a loss either. 51% of trades are successful on oos. Isn't that great? And one could take 20,000 indicators instead of 20, add some machine learning model, and remove the 10,000-iteration limit of the mt5 genetic optimizer, and then even a profitable Expert Advisor would appear.
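The 2/3 and 1/3 scheme described above can be sketched as follows. It assumes the trade results are already in chronological order as a plain list of profits. The function names and the sample numbers are hypothetical and do not come from the tester report.

```python
def chronological_split(rows, in_sample_parts=2, total_parts=3):
    """Split time-ordered rows into (in_sample, out_of_sample), no shuffling."""
    if not 0 < in_sample_parts < total_parts:
        raise ValueError("in_sample_parts must be between 0 and total_parts")
    # Integer arithmetic avoids floating-point surprises at the cut point.
    cut = len(rows) * in_sample_parts // total_parts
    return rows[:cut], rows[cut:]


def win_rate(profits):
    """Share of trades that closed with a strictly positive profit."""
    if not profits:
        raise ValueError("no trades to evaluate")
    return sum(1 for p in profits if p > 0) / len(profits)


# Made-up per-trade profits, oldest first, purely for illustration.
profits = [12.0, -5.0, 3.5, -7.0, 8.0, -2.0, 4.0, -1.0, 6.0]
in_sample, out_of_sample = chronological_split(profits)
print(len(in_sample), len(out_of_sample))  # 6 3
print(win_rate(out_of_sample))
```

Keeping the split chronological matters: the out-of-sample part must lie strictly after everything the optimizer was allowed to see.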

 
revers45:
And this thread is a good, clear example of the fact that machine learning in trading is only a theory...

Indeed, in trading there can be no theory in principle, or rather the theory says that it is impossible to earn: the efficient market, everything already priced in, the exchange mechanism, and so on, so that trading is like playing roulette. But statistics and machine learning, which have recently become more accessible thanks to various math packages and libraries, let not only scientists but also ordinary traders, after a week of tinkering in RStudio or Matlab, actually see WHY things are so sad with standard TA.

If ML in trading is "only a theory", which is on the whole partly true, then TA is not even a theory but utter nonsense, like astrology or voodoo.

But many here know that it is still possible to make money. The market is efficient not simply by the will of God, but because some people have learned to obtain and process information better than most. In my opinion the most significant obstacle for a trader is the illusion that this business is simple, as if it were a clerk getting paid for his signature. More than once on this forum something like "you do not need to build a hadron collider to trade" has been said...

But it turns out you do...

 
Dr.Trader:

I follow your research; it is very informative, thank you for posting it. But it seems to me that although you solve such complex problems successfully, you skip the preparatory tasks, and that spoils the result. Namely, you neglect predictor selection.

Certainly it is a local result. There is no way to diversify: no time, no resources... This is simply what GBM gives on my predictors.

The point is not to draw conclusions from the overfitted part of the experiment. If this local result validates successfully, I will be satisfied.

The regression quality validates successfully. The trained models show a prediction quality that is meaningfully different from zero, with no model selection problems.

With the expected payoff (MO) of trading it is more complicated. I have not shown everything... There are subsamples (symbol-target) whose median MO on validation is greater than zero... but the goal is to take the tail of patterns correlated with validation in order to raise the MO. But that could be chance, too...

At the end, a committee will be assembled and validated on yet another sample from the future. Via Monte Carlo, of course.

As for the predictors, that is complicated too, and a long explanation... But the point is that each of the 99 models for a symbol-target subsample uses its own unique set of 6 predictors. This gives a nice variety of models (and they also learn from different data), and overall each of the 114 predictors is used somewhere.
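The post does not explain how each model's 6 predictors are chosen, so the following is only an illustration of the bookkeeping: random sampling stands in for the real selection procedure. It draws 99 distinct 6-predictor sets out of 114 and counts how many predictors end up used at least once. All names and the seed are hypothetical.

```python
import random
from math import comb


def draw_unique_subsets(n_predictors=114, subset_size=6, n_models=99, seed=0):
    """Draw n_models distinct predictor-index subsets (random stand-in)."""
    if n_models > comb(n_predictors, subset_size):
        raise ValueError("not enough distinct subsets for that many models")
    rng = random.Random(seed)
    subsets = []
    seen = set()
    while len(subsets) < n_models:
        # Sort so that the same set drawn in another order counts as a repeat.
        candidate = tuple(sorted(rng.sample(range(n_predictors), subset_size)))
        if candidate not in seen:
            seen.add(candidate)
            subsets.append(candidate)
    return subsets


def coverage(subsets):
    """Number of distinct predictors that appear in at least one subset."""
    return len({index for subset in subsets for index in subset})


subsets = draw_unique_subsets()
print(len(subsets), coverage(subsets))
```

With purely random draws, full coverage of all 114 predictors is not guaranteed, so the count is worth checking rather than assuming.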

And please do not forget about optimistic model selection. Any choice has to be validated additionally. I.e., I do not understand how you chose this picture: by the best result in-sample (IS) or out-of-sample (OS)? It is just a question. But many "research" results are guilty of having no answer to it.
 

Please advise where I can find an advisor (robot) that opens a trade at a given time and then closes it at a given time.

For example, it opens a trade at 12:59 and closes it at 13:59 regardless of the result, whether profit or loss.
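The logic being asked for is small enough to sketch. The following is not an MQL Expert Advisor and calls no broker API. It is a plain Python illustration of the decision rule, with hypothetical function and parameter names, assuming the open time is earlier than the close time on the same day.

```python
from datetime import time


def scheduled_action(now, open_at, close_at, position_open):
    """Decide what a time-scheduled robot should do at clock time `now`.

    Returns "open", "close" or "hold". Profit and loss are deliberately
    ignored: only the clock and the position state matter.
    """
    if open_at >= close_at:
        raise ValueError("open_at must be earlier than close_at")
    if position_open and now >= close_at:
        return "close"
    if not position_open and open_at <= now < close_at:
        return "open"
    return "hold"


open_at, close_at = time(12, 59), time(13, 59)
print(scheduled_action(time(12, 59), open_at, close_at, position_open=False))  # open
print(scheduled_action(time(13, 30), open_at, close_at, position_open=True))   # hold
print(scheduled_action(time(13, 59), open_at, close_at, position_open=True))   # close
```

A window is used instead of an exact-minute match because ticks may not arrive at precisely 12:59. A real robot would also need a flag so that it opens at most one trade per day.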
