Machine learning in trading: theory, models, practice and algo-trading - page 1035

 
Roffild:

I was answering the question "Why do I need Spark?"

You were arguing for the need for someone else's system, but to show what you personally can do with your library, answer my specific question; it is a task for entry-level skills. https://www.mql5.com/ru/forum/86386/page1033#comment_8211170

In fact, the template contains signals from a simple moving-average crossover strategy (EMA 9 and EMA 14), with a little noise added to increase profitability.)

I am attaching a complete answer template: solutions with the initial signals, overlaid indicators, and a visual tester run of one of the Expert Advisors trained on these signals.

I am posting EA_EURUSD_H1_NN, based on a neural network, and EA_EURUSD_H1_RF, based on random forests.

The Expert Advisors were tested on EURUSD H1 on the MetaQuotes-Demo server; the corresponding test charts are below.

...neural network


...random forests

On both charts the training period is marked, i.e., the period covered by the training signals; see the template.

Machine learning in trading: theory and practice (trading and beyond)
  • 2018.07.29
  • www.mql5.com
Good day everyone, I know there are machine learning and statistics enthusiasts on the forum...
 
Ivan Negreshniy:

You were arguing for the need for someone else's system, but to show what you personally can do with your library, answer my specific question; it is a task for entry-level skills. https://www.mql5.com/ru/forum/86386/page1033#comment_8211170

In fact, the template contains signals from a simple moving-average crossover strategy (EMA 9 and EMA 14), with a little noise added to increase profitability.)

I am attaching a complete answer template: solutions with the initial signals, overlaid indicators, and a visual tester run of one of the Expert Advisors trained on these signals.

I am posting EA_EURUSD_H1_NN, based on a neural network, and EA_EURUSD_H1_RF, based on random forests.

The Expert Advisors were tested on EURUSD H1 on the MetaQuotes-Demo server; the corresponding test charts are below.

...neural network


...random forests

On both charts the training period is marked, i.e., the period covered by the training signals; see the template.

Is the NN that secret network of yours?

And the features are exactly the same?

 
Maxim Dmitrievsky:

Is the NN that secret network of yours? The differences are big.

And the features are exactly the same?

Well, yes, but the point is something else: let's agree on formats and exchange information on ML for trading, otherwise we will make no progress, only flood, like a conversation between the blind and the mute.

PS: In both EAs the features are calculated from OHLC bars, and their number and calculation formulas are identical.

 
Ivan Negreshniy:

Well, yes, but the point is something else: let's agree on formats and exchange information on ML for trading, otherwise we will make no progress, only flood, like a conversation between the blind and the mute.

PS: In both EAs the features are calculated from OHLC bars, and their number and calculation formulas are identical.

You can join our chat room; we discuss strategies there, and everything can be agreed on there. It is divided into many topics, depending on who is interested in what.

Important or secret topics are closed to outsiders.

 
Maxim Dmitrievsky:

Spark itself has been clear for a long time; that is not what I asked about. I asked about the idea. This Spark-based approach looks pulled out of thin air, given the inefficient way of training and the computing power it requires.

The same can be done through optimization in the MT5 cloud without scaffolding. I don't know what the output is or whether there is any profit, but in theory there isn't, and this algorithm will always fail because of overfitting.

IMHO

There is an opinion that a constructed model should always return 0 or 1.

But what if we treat the value returned by the model as an indicator? Evaluating such a model by MSE and the like won't do any good. But if you apply it with the levels buy > 0.75 and sell < 0.25, you will make a good profit.

The idea itself: feed data from several different indicators into a random forest and obtain one super-indicator.

The search for the grail requires testing non-standard ideas.
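
The thresholding idea can be sketched as follows. This is a minimal Python sketch under stated assumptions: the model output is taken to be a single number in [0, 1] read as the "buy" share, and the function name `signal_from_output` and the "flat" state are hypothetical; only the 0.75 and 0.25 levels come from the post.

```python
def signal_from_output(p_buy, buy_level=0.75, sell_level=0.25):
    """Map a model output in [0, 1] to a trading signal.

    Outputs between the two levels produce no trade ("flat").
    """
    if p_buy > buy_level:
        return "buy"
    if p_buy < sell_level:
        return "sell"
    return "flat"


# Hypothetical model outputs for five consecutive bars.
outputs = [0.81, 0.60, 0.50, 0.24, 0.75]
signals = [signal_from_output(p) for p in outputs]
```

With both levels set to 0.5 the neutral zone disappears, and every output except exactly 0.5 produces a trade.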

 
Roffild:

There is an opinion that a constructed model should always return 0 or 1.

But what if we treat the value returned by the model as an indicator? Evaluating such a model by MSE and the like won't do any good. But if you apply it with the levels buy > 0.75 and sell < 0.25, you will make a good profit.

The idea itself: feed data from several different indicators into a random forest and obtain one super-indicator.

The search for the grail requires testing non-standard ideas.

A forest does not output class-membership probabilities, so these inequalities are nonsense.

Above or below 0.5 and that's it; there is no other way. There is also the question of which is better: binarized features and outputs, or not.

You can split 0 to 100 into classes; it makes no difference, this is not an NN.
 
Maxim Dmitrievsky:

A forest does not output class-membership probabilities, so these inequalities are nonsense.

Above or below 0.5 and that's it; there is no other way. There is also the question of which is better: binarized features and outputs, or not.

You can split 0 to 100 into classes; it makes no difference, this is not an NN.

If there is no probability, then what do these lines do?
static void CDForest::DFProcess(CDecisionForest &df,double &x[],double &y[])
...
//--- calculation
   v=1.0/(double)df.m_ntrees;
   for(i_=0;i_<=df.m_nclasses-1;i_++)
      y[i_]=v*y[i_];
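
What the quoted lines compute can be illustrated like this. This is a minimal Python sketch, assuming each tree casts one vote for a class before the scaling step (the vote accumulation is elided by the `...` above, so that part is an assumption); `forest_class_shares` is a hypothetical name, not an ALGLIB function.

```python
def forest_class_shares(tree_votes, n_classes):
    """Turn per-tree class votes into a vector of class shares.

    tree_votes: one predicted class index per tree.
    """
    y = [0.0] * n_classes
    for cls in tree_votes:
        y[cls] += 1.0              # assumed accumulation step: one vote per tree
    v = 1.0 / len(tree_votes)      # same scaling as v=1.0/ntrees in the quoted code
    return [v * count for count in y]


# Hypothetical forest of 4 trees, 2 classes: one tree votes class 0, three vote class 1.
shares = forest_class_shares([0, 1, 1, 1], n_classes=2)
```

The result is the share of trees voting for each class, which is why it sums to 1 and can be compared against levels other than 0.5.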
 
Roffild:
If there is no probability, then what do these lines do?

Oh, right.

The result of all the classification algorithms included in the ALGLIB package is not the class to which the object belongs, but a vector of conditional probabilities.

But that is little consolation. There will be fewer signals, and performance will not necessarily improve. Mine, for example, did not, so I now use a 0.5 threshold everywhere.

What matters much more is that the errors on the train and OOB sets are comparable.

 
Maxim Dmitrievsky:

Oh, right.

The result of all the classification algorithms included in the ALGLIB package is not the class to which the object belongs, but a vector of conditional probabilities.

But that is little consolation. There will be fewer signals, and performance will not necessarily improve. Mine, for example, did not, so I now use a 0.5 threshold everywhere.

What matters much more is that the errors on the train and OOB sets are comparable.

But these are peculiarities of modified algorithms.

AlgLib implements the classic random forest without modifications. It is the same in Spark.

Setting the threshold to 0.5 = fitting the data to the result.

P.S. Even the parameters of random forest generation are different.

 
Roffild:

But these are peculiarities of modified algorithms.

AlgLib implements the classic random forest without modifications. It is the same in Spark.

Setting the threshold to 0.5 = fitting the data to the result.

P.S. Even the random forest generation parameters are different...

I think I have AlgLib too )

Here's a description; I don't know how "classic" it is:

http://alglib.sources.ru/dataanalysis/decisionforest.php
