Machine learning in trading: theory, models, practice and algo-trading - page 2106

 
Vladimir Perervenko:

Where to?

You know, not to minimize RMSE or whatever, but to plug your fitness function in there.

 
Vladimir Perervenko:

How do you do that?

I just make a prediction with the model 500 points ahead.

A model of 4 sine waves is very simple to extrapolate; in fact it is a linear prediction.
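A minimal sketch of what such a forecast could look like (not the poster's actual code; four_sines and every constant here are illustrative), assuming the "4 sine waves" model is a sum of four sinusoids fitted by least squares. Once the frequencies are fixed, the model is linear in the remaining coefficients, which is presumably why it amounts to a linear prediction:

import numpy as np
from scipy.optimize import curve_fit

# Sum of four sinusoids plus an offset: 13 free parameters in total.
def four_sines(t, *p):
    y = p[0]
    for i in range(4):
        a, f, ph = p[1 + 3 * i : 4 + 3 * i]
        y = y + a * np.sin(2.0 * np.pi * f * t + ph)
    return y

t = np.arange(1000, dtype=float)
prices = np.cumsum(np.random.randn(1000))            # stand-in for real quotes

# Rough initial guess: offset, then (amplitude, frequency, phase) x 4
p0 = [0.0, 1.0, 0.002, 0.0, 1.0, 0.005, 0.0, 1.0, 0.01, 0.0, 1.0, 0.02, 0.0]
params, _ = curve_fit(four_sines, t, prices, p0=p0, maxfev=20000)

# Extrapolate the fitted model 500 points ahead.
t_future = np.arange(1000, 1500, dtype=float)
forecast = four_sines(t_future, *params)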

 
mytarmailS:

I deleted it; I thought no one was interested. I can send you the code, but you'll need to translate it into a readable form.

By the way, I ran into the instability of the annealing method. I don't even know how to work with it: the results are very unstable, the parameters jump around a lot...


I've settled on the following approach:

First I randomly initialize the starting point,

then, when some solution is found, I save it.

Then I run it again, but starting from the parameters of the solution I've already found, and so on...
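A minimal sketch of this restart scheme, assuming SciPy's dual_annealing as the annealer and a made-up quadratic objective standing in for the trading fitness function discussed above:

import numpy as np
from scipy.optimize import dual_annealing

def fitness(x):
    # Stand-in objective; in practice this would be the trading fitness.
    return np.sum((x - 1.234) ** 2)

bounds = [(-10.0, 10.0)] * 5
rng = np.random.default_rng(0)

x0 = rng.uniform(-10.0, 10.0, size=5)    # random starting point
best = None
for _ in range(5):                       # repeated runs...
    res = dual_annealing(fitness, bounds, x0=x0, maxiter=200)
    if best is None or res.fun < best.fun:
        best = res
    x0 = best.x                          # ...warm-started from the best solution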

Please send it over.

Annealing is unstable. Use rgenoud. Tested, reliable.

Models use a loss function. Write your own, and if the model allows you to plug in your own loss function, try it.
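Since CatBoost comes up below: it is one of the libraries that accepts a user-defined objective, passed as a Python object with a calc_ders_range method returning per-sample first and second derivatives. A sketch with a made-up asymmetric squared loss (the class name and the factor k are illustrative; sample weights are ignored for brevity):

from catboost import CatBoostRegressor
import numpy as np

class AsymmetricObjective:
    # Quadratic loss that penalizes underestimation k times harder.
    # CatBoost maximizes the objective, so the derivatives are taken
    # with respect to the value being maximized.
    def __init__(self, k=3.0):
        self.k = k

    def calc_ders_range(self, approxes, targets, weights):
        result = []
        for approx, target in zip(approxes, targets):
            w = self.k if target > approx else 1.0
            result.append((w * (target - approx), -w))   # (der1, der2)
        return result

X = np.random.randn(500, 4)
y = X @ np.array([0.5, -0.2, 0.1, 0.3]) + 0.1 * np.random.randn(500)

model = CatBoostRegressor(loss_function=AsymmetricObjective(),
                          eval_metric='RMSE', iterations=200, verbose=False)
model.fit(X, y)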

 
mytarmailS:

I just make a prediction with the obtained model 500 points ahead,

but I think I'll trade only the first 1-2 deals,

and I need to learn how to find good-quality parameters.

 
Maxim Dmitrievsky:

It is possible to use two differently directed models.

I tried to train it separately on my basic strategy, but the results were worse; I think it's because of the unbalanced sample: too many zeros are produced and the training is dominated by them.

I want to try another option: teach the direction with a separate model. Then the first model is trained on volatility and the second on its direction. But again, the sample needs to be large.
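A rough sketch of that two-model split, with made-up features and targets (y_active flags a trade-worthy move, y_dir its direction; training the direction model only on active rows is an assumption about the setup), using CatBoost classifiers:

from catboost import CatBoostClassifier
import numpy as np

X = np.random.randn(2000, 10)
y_active = (np.abs(X[:, 0]) > 1.0).astype(int)   # 1 if a move happened
y_dir = (X[:, 1] > 0).astype(int)                # 1 up, 0 down

# First model: is there a move worth trading at all ("volatility")?
activity_model = CatBoostClassifier(iterations=200, verbose=False)
activity_model.fit(X, y_active)

# Second model: direction, trained only on the rows where a move happened.
mask = y_active == 1
direction_model = CatBoostClassifier(iterations=200, verbose=False)
direction_model.fit(X[mask], y_dir[mask])

# Trade only when the first model fires; the second one sets the direction.
p_active = activity_model.predict_proba(X)[:, 1]
signal = np.where(p_active > 0.5, 2 * direction_model.predict(X) - 1, 0)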

 
Aleksey Vyazmikin:

I tried to train it separately on my basic strategy, but the results were worse; I think it's because of the unbalanced sample: too many zeros are produced and the training is dominated by them.

I want to try another option: teach the direction with a separate model. Then the first model is trained on volatility and the second on its direction. But again, the sample needs to be large.

For unbalanced classes you can use oversampling. I've tried both 2 and 3 models; there is essentially no difference.
 
Maxim Dmitrievsky:
For unbalanced classes you can use oversampling. I've tried both 2 and 3 models; there is essentially no difference.

I.e. duplicate the rows with target "1"? I tried it; the result on CatBoost hardly changed at all. Probably some noise needs to be added.

 
Aleksey Vyazmikin:

I.e. duplicate the rows with target "1"? I tried it; the result on CatBoost hardly changed at all. Probably some noise needs to be added.

Don't duplicate. Google oversampling techniques such as SMOTE. I also don't train with a large imbalance; after oversampling it's fine.
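A minimal sketch with the imbalanced-learn implementation of SMOTE, on made-up data with roughly 5% ones:

import numpy as np
from collections import Counter
from imblearn.over_sampling import SMOTE

X = np.random.randn(1000, 5)
y = (np.random.rand(1000) < 0.05).astype(int)    # ~5% ones: a strong imbalance
print(Counter(y))

# SMOTE synthesizes new minority rows by interpolating between a minority
# sample and its nearest minority-class neighbours, instead of duplicating.
X_res, y_res = SMOTE(random_state=42).fit_resample(X, y)
print(Counter(y_res))                            # classes are now balanced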
 
Aleksey Vyazmikin:

I.e. duplicate the rows with target "1"? I tried it; the result on CatBoost hardly changed at all. Probably some noise needs to be added.

That's how it should be. Class balancing is necessary for neural networks; trees will do just fine without it.
 
Maxim Dmitrievsky:
Don't duplicate. Google oversampling techniques such as SMOTE. I also don't train with a large imbalance; after oversampling it's fine.

Well yes, it's essentially adding noise to the predictor values. This could affect the quantization bounds by enlarging the selected areas with ones, but in theory the same effect should come from adding plain duplicates. The only thing I can assume is that duplicates are dropped by the CatBoost algorithm before training starts (this needs to be verified); if so, then yes, it's an option.
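A sketch of that jittered duplication in plain NumPy (the helper name and the noise scale are made up; whether CatBoost actually drops exact duplicates is the open question above):

import numpy as np

def oversample_with_jitter(X, y, minority=1, factor=5, scale=0.01, seed=0):
    # Add (factor - 1) extra copies of each minority row, with small Gaussian
    # noise so the copies are not exact duplicates and can shift the borders.
    rng = np.random.default_rng(seed)
    idx = np.where(y == minority)[0]
    reps = np.repeat(idx, factor - 1)
    noise = rng.normal(0.0, scale * X.std(axis=0), size=(len(reps), X.shape[1]))
    X_new = np.vstack([X, X[reps] + noise])
    y_new = np.concatenate([y, y[reps]])
    return X_new, y_new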
