Machine learning in trading: theory, models, practice and algo-trading - page 2107

 
elibrarius:
It should. Class balancing is needed for neural networks (NN); trees can cope without it.

Well, they don't always cope. I've written about this before.

 

I think I gave away a profitable Expert Advisor (training approach) in my article:

The chart shows the model's financial result at the end of each month when the first model is trained on 12 months and the history of each new month is then added to the training set. The instrument is the spliced Si futures contract on USDRUB_TOM.
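The retraining scheme described above (start with 12 months, then add one month at a time) can be sketched as an expanding window. This is only an illustration: the month labels are made up, the function name `expanding_window_splits` is hypothetical, and evaluating each model on the month that follows its training window is an assumption, not something the post states.

```python
def expanding_window_splits(months, initial_train=12):
    """Yield (train_months, test_month) pairs for an expanding training window."""
    for end in range(initial_train, len(months)):
        # Train on everything up to `end`, evaluate on the next month (assumed).
        yield months[:end], months[end]


# Hypothetical month labels: 12 months of 2019 plus the first 3 months of 2020.
months = [f"2019-{m:02d}" for m in range(1, 13)] + [f"2020-{m:02d}" for m in range(1, 4)]

splits = list(expanding_window_splits(months))
for train, test in splits:
    print(f"train on {len(train)} months ({train[0]}..{train[-1]}), evaluate on {test}")
```

Each step reuses all earlier history, so the training set only grows and the model never sees data from the month it is evaluated on.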

 
Aleksey Vyazmikin:

Well, they don't always cope. I wrote about this earlier.

I think increasing the depth of the trees will help just as much as balancing.
 
Aleksey Vyazmikin:

Well, yes, it essentially adds noise to the predictor values. This may shift the quantization boundaries and enlarge the regions allocated to ones, but in theory adding duplicates should have the same effect. My only assumption is that CatBoost removes duplicates before training starts (this needs to be verified); if so, then yes, it is an option.

More likely, quantization will cancel out that noise. If a column has 10000 different values, quantizing it into 255 quanta puts about 40 different values into each quantum on average. Another example: if there were originally 1000 examples and adding noise turns them into 10000, which are then quantized to 255 different quanta/values, then adding the noise is, in my opinion, unnecessary work.


I looked at the code recently and did not see any removal of duplicates. Rather the opposite: 40 different examples are turned into duplicates by merging them into 1 quantum.
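The arithmetic above can be checked with a small sketch. It is not CatBoost's code: it uses plain equal-count (quantile) borders, the helper names `quantile_borders` and `quantize` are hypothetical, and the noise level and the "9 noisy copies per example" augmentation are assumptions chosen to reproduce the 1000 to 10000 example.

```python
import bisect
import random


def quantile_borders(values, n_bins):
    """Return n_bins - 1 borders that split the values into roughly equal-count bins."""
    ordered = sorted(values)
    n = len(ordered)
    return [ordered[(i * n) // n_bins] for i in range(1, n_bins)]


def quantize(values, borders):
    """Map each value to the index of the bin (quantum) it falls into."""
    return [bisect.bisect_right(borders, v) for v in values]


random.seed(0)
original = [random.gauss(0.0, 1.0) for _ in range(1000)]
# Assumed augmentation: 9 noisy copies of every example, 10000 rows in total.
augmented = original + [v + random.gauss(0.0, 0.01) for v in original for _ in range(9)]

borders = quantile_borders(augmented, 255)
codes = quantize(augmented, borders)

distinct_raw = len(set(augmented))
distinct_codes = len(set(codes))
print(distinct_raw, "distinct raw values ->", distinct_codes, "distinct quantized values")
print("average raw values per quantum:", round(distinct_raw / distinct_codes, 1))
```

Whatever the number of distinct raw values, the model only sees at most 255 codes per column, so rows that differ only by small noise usually end up identical after quantization.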

 
elibrarius:
I think increasing the depth of the trees will help just as much as balancing.

You can try increasing the depth as well. You should also lower the learning rate at the same time; that also improves the result on unbalanced samples.

elibrarius:

More likely, quantization will cancel out that noise. If a column has 10000 different values, quantizing it into 255 quanta puts about 40 different values into each quantum on average. Another example: if there were originally 1000 examples and adding noise turns them into 10000, which are then quantized to 255 different quanta/values, then adding the noise is, in my opinion, unnecessary work.

Different quantization methods are used there, including ones that take into account how densely the objects are packed within the range.
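To illustrate why the border-placement method matters, here is a minimal sketch comparing equal-width borders with equal-count (density-aware) borders on a skewed feature. It is not CatBoost's implementation; CatBoost selects its own method through the `feature_border_type` parameter, while the helpers and the exponential toy feature below are hypothetical.

```python
import bisect
import random


def uniform_borders(values, n_bins):
    """Equal-width borders between the minimum and maximum value."""
    lo, hi = min(values), max(values)
    step = (hi - lo) / n_bins
    return [lo + step * i for i in range(1, n_bins)]


def quantile_borders(values, n_bins):
    """Equal-count borders: dense regions get more, narrower bins."""
    ordered = sorted(values)
    n = len(ordered)
    return [ordered[(i * n) // n_bins] for i in range(1, n_bins)]


def bin_counts(values, borders):
    """Count how many values fall into each bin defined by the borders."""
    counts = [0] * (len(borders) + 1)
    for v in values:
        counts[bisect.bisect_right(borders, v)] += 1
    return counts


random.seed(1)
# Skewed, hypothetical feature: most objects crowd near zero, a few lie far away.
feature = [random.expovariate(1.0) for _ in range(10000)]

uniform = bin_counts(feature, uniform_borders(feature, 16))
quantile = bin_counts(feature, quantile_borders(feature, 16))
print("uniform  :", uniform)
print("quantile :", quantile)
```

With equal-width borders most objects land in the first few bins and the tail bins are nearly empty; with equal-count borders every bin holds about the same number of objects, so the crowded region is resolved more finely.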

elibrarius:

I looked at the code recently and did not see any removal of duplicates. Rather the opposite: 40 different examples are turned into duplicates by merging them into 1 quantum.

If you found the quantization process (border setting) in the code, could you post that code? There must be functions for it there.

 

What does increasing the depth have to do with it?

You have a large point cloud of one class and a few samples of the other somewhere off to the side (or maybe even inside it), which will never be picked up.

The second class should be inflated to a sane size, or one-class classification algorithms should be used.
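One simple way to "inflate" the rare class is random oversampling: duplicating minority rows until the classes are the same size. The sketch below shows only that plain technique on a made-up toy sample; the function name `oversample_minority` is hypothetical, and the post does not say which inflation method its author actually has in mind.

```python
import random
from collections import Counter


def oversample_minority(rows, labels, seed=0):
    """Duplicate randomly chosen rows of smaller classes until all classes are equal in size."""
    rng = random.Random(seed)
    counts = Counter(labels)
    majority_size = max(counts.values())
    out_rows, out_labels = list(rows), list(labels)
    for cls, count in counts.items():
        pool = [r for r, y in zip(rows, labels) if y == cls]
        # The majority class needs 0 extra rows, so it is left unchanged.
        for _ in range(majority_size - count):
            out_rows.append(rng.choice(pool))
            out_labels.append(cls)
    return out_rows, out_labels


# Hypothetical toy sample: 95 zeros and 5 ones.
rows = [[float(i), float(i % 7)] for i in range(100)]
labels = [1 if i % 20 == 0 else 0 for i in range(100)]

balanced_rows, balanced_labels = oversample_minority(rows, labels)
print(Counter(labels), "->", Counter(balanced_labels))
```

Oversampling is normally applied to the training part only, so that validation and test results still reflect the real class proportions.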

 
Maxim Dmitrievsky:

What does increasing the depth have to do with it?

You have a large point cloud of one class and a few samples of the other somewhere off to the side (or maybe even inside it), which will never be picked up.

The second class needs to be inflated to a sane size.

Increasing the depth helps isolate regions with a small number of samples in the leaves. The catch is that the share of leaves with zeros may stay the same, and subsequent trees will then bury those ones again. When training on such samples, you can see Recall drop to zero in the middle of training and then climb back to a few percent.

Can you inflate it if I give you a sample? If the method works, I'll think about how to implement it in MT5.

 
Aleksey Vyazmikin:

Increasing the depth helps isolate regions with a small number of samples in the leaves. The catch is that the share of leaves with zeros may stay the same, and subsequent trees will then bury those ones again. When training on such samples, you can see Recall drop to zero in the middle of training and then climb back to a few percent.

Can you inflate it if I give you a sample? If the method works, I'll think about how best to implement it in MT5.

I can. All that talk about leaves and so on is nonsense. The classes must be balanced.
 
Maxim Dmitrievsky:
I can. All that talk about leaves and so on is nonsense. The classes must be balanced.

Here's a sample, split into 3 parts. As I understand it, only train.csv needs to be modified?

The target column is "Target_100". The last 4 columns are not used in training (you can use the date column there for orientation); they are needed to build the balance.

File from Mail.ru Cloud (cloud.mail.ru)
 
Aleksey Vyazmikin:

I think I gave away a profitable Expert Advisor (training approach) in my article:

The chart shows the model's financial result at the end of each month when the first model is trained on 12 months and the history of each new month is then added to the training set. The instrument is the spliced Si futures contract on USDRUB_TOM.

With a profitable one, the balance goes UP at a constant angle,

or geometrically, if profits are reinvested.
