Discussion of article "Advanced resampling and selection of CatBoost models by brute-force method"

 

New article Advanced resampling and selection of CatBoost models by brute-force method has been published:

This article describes one of the possible approaches to data transformation aimed at improving the generalizability of the model, and also discusses sampling and selection of CatBoost models.

A simple random sampling of labels used in the previous article has some disadvantages, such as:

  • Classes can be imbalanced. Suppose that the market was mainly growing during the training period, while the general population (the entire history of quotes) contains both ups and downs. In this case, naive sampling will create more buy labels and fewer sell labels. Labels of one class will then prevail over the other, so the model will learn to predict buy deals more often than sell deals, which may not hold for new data.

  • Autocorrelation of features and labels. If random sampling is used, labels of the same class follow one another, while the features themselves (for example, increments) change only slightly. This can be illustrated by training a regression model: autocorrelation will be observed in the model residuals, which can lead to an overestimated model and overfitting. This situation is shown below:


Model 1 has autocorrelation of residuals, which can be compared to model overfitting on certain market properties (for example, related to the volatility of training data), while other patterns are not taken into account. Model 2 has residuals with the same variance (on average), which indicates that the model covered more information or other dependencies were found (in addition to the correlation of neighboring samples).
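
Both disadvantages can be checked numerically before training. The sketch below is not the article's code: it uses a synthetic random walk with upward drift as a stand-in for quotes, a fixed labeling horizon, and an assumed coding of 0 for buy and 1 for sell. The names make_prices, naive_labels and lag1_autocorr are hypothetical helpers introduced only for this illustration.

```python
import random

def make_prices(n, drift, seed=0):
    """Synthetic random walk with drift (assumed stand-in for real quotes)."""
    rng = random.Random(seed)
    prices = [100.0]
    for _ in range(n - 1):
        prices.append(prices[-1] + drift + rng.gauss(0.0, 1.0))
    return prices

def naive_labels(prices, horizon=5):
    """Assumed coding: 0 (buy) if the price is higher `horizon` bars ahead, else 1 (sell)."""
    return [0 if prices[i + horizon] > prices[i] else 1
            for i in range(len(prices) - horizon)]

def lag1_autocorr(xs):
    """Sample lag-1 autocorrelation of a numeric sequence."""
    n = len(xs)
    mean = sum(xs) / n
    denom = sum((x - mean) ** 2 for x in xs)
    num = sum((xs[i] - mean) * (xs[i + 1] - mean) for i in range(n - 1))
    return num / denom

prices = make_prices(2000, drift=0.2)
labels = naive_labels(prices)

# Share of buy labels: a value well above 0.5 signals class imbalance.
buy_share = labels.count(0) / len(labels)

# Lag-1 autocorrelation of labels: a clearly positive value means
# labels of the same class tend to follow one another.
label_autocorr = lag1_autocorr(labels)

print(f"buy share: {buy_share:.2f}, label lag-1 autocorrelation: {label_autocorr:.2f}")
```

On a series that drifts upward, the buy share should come out above one half, and because neighboring labels are built from overlapping price windows, their lag-1 autocorrelation should be clearly positive. The same lag1_autocorr helper can be applied to the residuals of a regression model to compare cases like Model 1 and Model 2.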

Author: Maxim Dmitrievsky

 

Hi,

Thanks for the article. I tried it, but somehow I don't get the nice equity curve in the MT5 backtest (even for the training period) that is shown in Python (see below). When I backtest with the EURUSD EA from your article, it works. How can I solve this?


 
konorti:

Hi,

Thanks for the article. I tried it, but somehow I don't get the nice equity curve in the MT5 backtest (even for the training period) that is shown in Python (see below). When I backtest with the EURUSD EA from your article, it works. How can I solve this?

Hi, maybe the problem is with MARKUP in the custom tester for USDJPY, so the results are different.

 
Maxim Dmitrievsky:

Hi, maybe the problem is with MARKUP in the custom tester for USDJPY, so the results are different.

Thanks. I tried different MARKUP values (higher and lower) and different timeframes, but without much success. I saw some good results on the 4H timeframe for USDJPY, but not with other forex pairs, and I repeated the test several times without success. Is it possible to filter the trades somehow, so that the EA is not always in the market and trades only on strong signals?
 

Hi Maxim,

The current article is okay, but limited computing power and curve fitting are the biggest concerns with such traditional methods, so I usually stay away from testing such approaches.

Would you be interested in writing an article on implementing "MuZero" from DeepMind in Forex?

https://deepmind.com/blog/article/muzero-mastering-go-chess-shogi-and-atari-without-rules

https://medium.com/applied-data-science/how-to-build-your-own-muzero-in-python-f77d5718061a

I am asking you because I am a basic-level MQL5 programmer, and it could take me a long time to write from scratch something that you could probably do easily.

Please let me know your thoughts.


I will define what the following should mean in forex terms. Could you then convert it to MQL5 code?

  • The value: how good is the current position?
  • The policy: which action is the best to take?
  • The reward: how good was the last action?

Thanks.

 
I spent a lot of time on this, and I finally get what you are doing. Because the labels for the ML algorithm are imbalanced, you use a Gaussian mixture model to simulate how the prices are generated, then sample from that model, so that you can train a better ML algorithm.
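
The fit-then-sample step described above can be sketched in one dimension. This is a toy illustration under stated assumptions, not the article's implementation: it fits a two-component Gaussian mixture to a single synthetic feature with a hand-written EM loop and then draws new points from the fitted model. The function names fit_gmm_1d and sample_gmm_1d, the synthetic data, and all parameter values are hypothetical.

```python
import math
import random

def normal_pdf(x, mu, sigma):
    """Density of N(mu, sigma^2) at x."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))

def fit_gmm_1d(data, n_iter=100):
    """Fit a two-component 1-D Gaussian mixture with plain EM."""
    lo, hi = min(data), max(data)
    # Simple initialization: means at the quartile points of the data range.
    mus = [lo + 0.25 * (hi - lo), lo + 0.75 * (hi - lo)]
    sigmas = [(hi - lo) / 4.0, (hi - lo) / 4.0]
    weights = [0.5, 0.5]
    for _ in range(n_iter):
        # E-step: responsibility of component 0 for every point.
        resp0 = []
        for x in data:
            p0 = weights[0] * normal_pdf(x, mus[0], sigmas[0])
            p1 = weights[1] * normal_pdf(x, mus[1], sigmas[1])
            resp0.append(p0 / (p0 + p1 + 1e-300))
        # M-step: re-estimate weight, mean and spread of each component.
        for k in (0, 1):
            r = resp0 if k == 0 else [1.0 - v for v in resp0]
            total = sum(r)
            mus[k] = sum(ri * x for ri, x in zip(r, data)) / total
            var = sum(ri * (x - mus[k]) ** 2 for ri, x in zip(r, data)) / total
            sigmas[k] = max(math.sqrt(var), 1e-6)
            weights[k] = total / len(data)
    return weights, mus, sigmas

def sample_gmm_1d(weights, mus, sigmas, n, rng):
    """Draw n new points: pick a component by weight, then sample from it."""
    out = []
    for _ in range(n):
        k = 0 if rng.random() < weights[0] else 1
        out.append(rng.gauss(mus[k], sigmas[k]))
    return out

rng = random.Random(1)
# Synthetic feature with two regimes (assumed data, for illustration only).
data = [rng.gauss(-2.0, 0.5) for _ in range(300)] + \
       [rng.gauss(3.0, 1.0) for _ in range(700)]

weights, mus, sigmas = fit_gmm_1d(data)
synthetic = sample_gmm_1d(weights, mus, sigmas, 500, rng)
print("weights:", weights, "means:", mus)
```

In practice a library implementation such as scikit-learn's GaussianMixture, fitted on the features together with the labels, would likely be used instead of a hand-written loop; the sketch only shows the principle that new training points come from the fitted density rather than directly from the price history.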