Discussion of article "Advanced resampling and selection of CatBoost models by brute-force method" - page 12

[Deleted]  
Evgeni Gavrilovi:

I added the file paths: training prnew.csv and test prnews.csv

But the resulting R^2 is almost always above 0.9. Maybe look_back is set incorrectly, so the generated mqh file is wrong, and that is why the test in the terminal doesn't work.


https://colab.research.google.com/drive/1eeyRA5bGaFMfX1THnMsL5hwKmxBkqvqP


https://drive.google.com/file/d/1LIRhpk5iU_dYQbefZ-FFQM6XMV_cOh26/view?usp=sharing test data


https://drive.google.com/file/d/18RpJec9EGSCSknwaHsevgHcZuCeoOvP5/view?usp=sharing training data

I'll look at it later, I'm busy at work.

 
Maxim Dmitrievsky:

I'll look at it later, I'm busy at work.

Okay.

Here's that mqh file: https://drive.google.com/file/d/1UquXcaRJjIR2lxE81P8Pm2BWFQ9uM0N1/view?usp=sharing

The tester shows the following error: 2020.12.01 21:19:23.252 2020.08.03 00:05:00 array out of range in 'cat_model.mqh' (288,51)

 
Maxim Dmitrievsky:

Hi, Rasoul. Try to reduce the training set size. It can depend on different settings, but the key trick is that the smaller the training set, the better the generalisation on new data. In the next article I'll try to explain this effect.

Hi Maxim,

I changed the periods of the sets to the following:

1. Training Set: from 2018.01.01 to 2019.01.01
This is only used to train the GMM.

2. Validation Set: from 2019.01.01 to 2020.01.01
This set is used in the brute-force algorithm to find the best model.

3. Test Set: from 2020.01.01 to 2021.01.01
This set is only used to test the best model obtained from the brute-force algorithm.

Below is a typical result of running the script:

I am attaching the code, so you can take a look at it to find a possible mistake.
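A minimal sketch of the three-way split described in this post, using only the standard library. The half-open intervals and the synthetic daily series are assumptions made for illustration; the post does not say which set a bar stamped exactly on a boundary date belongs to, and the attached script may split the data differently.

```python
from datetime import datetime, timedelta

# Period boundaries taken from the post above.
TRAIN_START = datetime(2018, 1, 1)
VAL_START = datetime(2019, 1, 1)
TEST_START = datetime(2020, 1, 1)
TEST_STOP = datetime(2021, 1, 1)


def split_by_date(rows):
    """Split (timestamp, value) rows into train/validation/test lists.

    Each interval is half-open [start, end), so a boundary bar lands in
    exactly one set (an assumption, not something the post specifies).
    """
    train = [r for r in rows if TRAIN_START <= r[0] < VAL_START]
    val = [r for r in rows if VAL_START <= r[0] < TEST_START]
    test = [r for r in rows if TEST_START <= r[0] < TEST_STOP]
    return train, val, test


# Hypothetical daily series standing in for real quotes.
rows = [(TRAIN_START + timedelta(days=i), float(i)) for i in range(3 * 365 + 1)]
train, val, test = split_by_date(rows)
print(len(train), len(val), len(test))
```

Keeping the sets disjoint this way means the test period is never seen while training the GMM or while selecting the model.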

[Deleted]  
Rasoul Mojtahedzadeh:

Hi Maxim,

I changed the periods of the sets to the following:

1. Training Set: from 2018.01.01 to 2019.01.01
This is only used to train the GMM.

2. Validation Set: from 2019.01.01 to 2020.01.01
This set is used in the brute-force algorithm to find the best model.

3. Test Set: from 2020.01.01 to 2021.01.01
This set is only used to test the best model obtained from the brute-force algorithm.

Below is a typical result of running the script:

I am attaching the code, so you can take a look at it to find a possible mistake.

Sometimes you just need to change the training interval and the settings; then the model can capture dependencies better. For example:

LOOK_BACK = 1
MA_PERIODS = [15, 25, 55, 100, 150, 200, 250, 300]

SYMBOL = 'EURUSD'
MARKUP = 0.00010
TIMEFRAME = mt5.TIMEFRAME_H1
START_DATE = datetime(2018, 9, 1)
VSTART_DATE = datetime(2019, 3, 1)
TSTART_DATE = datetime(2019, 7, 1)
STOP_DATE = datetime(2021, 1, 1)
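To make the role of LOOK_BACK and MA_PERIODS more concrete, here is a rough sketch of how such settings could translate into a feature vector. It assumes each feature is the close price minus a simple moving average, repeated for LOOK_BACK lags; the helper names `rolling_mean` and `build_features` and the synthetic prices are hypothetical, and the actual feature code in the article may differ.

```python
LOOK_BACK = 1
MA_PERIODS = [15, 25, 55, 100, 150, 200, 250, 300]


def rolling_mean(values, period):
    """Simple moving average; None until `period` values are available."""
    out = []
    for i in range(len(values)):
        if i + 1 < period:
            out.append(None)
        else:
            window = values[i + 1 - period:i + 1]
            out.append(sum(window) / period)
    return out


def build_features(close, ma_periods=MA_PERIODS, look_back=LOOK_BACK):
    """Return one feature row per bar that has a full history.

    Assumed feature: close minus its moving average, for every period
    and every lag in range(look_back).
    """
    columns = []
    for period in ma_periods:
        ma = rolling_mean(close, period)
        diff = [None if m is None else c - m for c, m in zip(close, ma)]
        for lag in range(look_back):
            # Shift the column by `lag` bars so row i sees the value at i - lag.
            columns.append([None] * lag + diff[:len(diff) - lag])
    rows = [list(r) for r in zip(*columns)]
    return [r for r in rows if None not in r]


# Hypothetical price series standing in for EURUSD H1 closes.
close = [1.10 + 0.0001 * (i % 50) for i in range(400)]
features = build_features(close)
print(len(features), "rows,", len(features[0]), "features per row")
```

Under this assumption the feature vector has len(MA_PERIODS) * LOOK_BACK entries, so changing either setting changes the input size that any code consuming the model has to match.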


[Deleted]  
Evgeni Gavrilovi:

Okay.

Here's that mqh file: https://drive.google.com/file/d/1UquXcaRJjIR2lxE81P8Pm2BWFQ9uM0N1/view?usp=sharing

The tester shows the following error: 2020.12.01 21:19:23.252 2020.08.03 00:05:00 array out of range in 'cat_model.mqh' (288,51)

I suspect you are using the version of the bot from the previous article. The bot in this article is different. Check it; there should be no such error.

An R^2 of 0.9 is good; I often get that too.
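For readers unsure what an R^2 of 0.9 means here, the sketch below assumes it measures how closely the cumulative balance curve follows a straight line, with the sign flipped when the line slopes downward. That interpretation and the helper name `r2_of_balance` are assumptions for illustration, not a quote of the article's code.

```python
def r2_of_balance(balance):
    """R^2 of a least-squares line fitted to a balance curve.

    Returns a negative value when the fitted slope is negative
    (an assumed convention, so losing curves score below zero).
    """
    n = len(balance)
    if n < 2:
        return 0.0
    mean_x = (n - 1) / 2
    mean_y = sum(balance) / n
    sxx = sum((x - mean_x) ** 2 for x in range(n))
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(balance))
    ss_tot = sum((y - mean_y) ** 2 for y in balance)
    if ss_tot == 0:
        return 0.0  # flat curve: nothing to explain
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ss_res = sum((y - (intercept + slope * x)) ** 2
                 for x, y in enumerate(balance))
    r2 = 1 - ss_res / ss_tot
    return r2 if slope >= 0 else -r2


# Hypothetical balance curves.
print(r2_of_balance([0, 1, 2, 3, 4]))     # steady growth
print(r2_of_balance([0, 2, 1, 3, 2, 4]))  # growth with drawdowns
print(r2_of_balance([4, 3, 2, 1, 0]))     # steady loss
```

A value near 1 on the training or validation period only says the curve was smooth there; it says nothing by itself about the unseen test period.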

 
Maxim Dmitrievsky:

Sometimes you just need to change the training interval and the settings; then the model can capture dependencies better. For example:


Thanks for your quick reply!

Looks good with your settings! :)

Best regards,

Rasoul

[Deleted]  
Rasoul Mojtahedzadeh:

Thanks for your quick reply!

Looks good with your settings! :)

Best regards,

Rasoul

In my experience, look_back just needs to be set to 1, with a wider variety of MA periods. Sometimes the training periods need to be changed as well.

Sometimes the number of clusters in the GMM needs to be changed from 75 to something else, and so on )

Maybe better features are needed instead of MAs, but I don't know exactly which ones; that takes experimenting.

 
Maxim Dmitrievsky:

I suspect you are using the version of the bot from the previous article. The bot in this article is different. Check it; there should be no such error.

An R^2 of 0.9 is good; I often get that too.

Only the new version has the brute_force function, but that is not the point: the generated mqh file gives an "array out of range" error, which makes it impossible to test the bot with the high R^2.

[Deleted]  
Evgeni Gavrilovi:

Only the new version has the brute_force function, but that is not the point: the generated mqh file gives an "array out of range" error, which makes it impossible to test the bot with the high R^2.

I'm talking about the EA file that you compile.

 
Maxim Dmitrievsky:

I'm talking about the EA file you're compiling.

Yes, that's it.

It contains:

#include <MT4Orders.mqh>
#include <Trade\AccountInfo.mqh>
#include <cat_model.mqh>

And most importantly, when the mqh is loaded directly from the Jupyter notebook, everything works fine, which surprised me.