Machine learning in trading: theory, models, practice and algo-trading - page 1292

 
Grail:

Volumes help to predict the change of state from trend to flat, but not "without difficulty". In general, predicting the "trend / flat" state is not much more accurate than predicting the direction of the next increment per unit of time: somewhere around 57% accuracy. The incredible numbers mentioned earlier are clearly the result of an error.

what are those numbers?

 

After all, machine learning is a strange and unpredictable business. Continuing my debugging work with CatBoost, I got a model that performs like this (training + test + exam):

There are not many trades (346) over 2014-2019, but the drawdown over that whole period was 1299, which is less than 10%. Of course, there was strong growth in 2014 that may not happen again, but after that the curve is quite smooth.
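For anyone who wants to check a drawdown figure like the one above against their own tester report, here is a minimal sketch of how maximum drawdown can be computed from an equity curve. The function name `max_drawdown` and the sample equity values are hypothetical and are not taken from the model in this post.

```python
from typing import Sequence, Tuple


def max_drawdown(equity: Sequence[float]) -> Tuple[float, float]:
    """Return (largest absolute drop, largest relative drop) from a running peak."""
    peak = equity[0]
    worst_abs = 0.0
    worst_rel = 0.0
    for value in equity:
        peak = max(peak, value)          # highest equity seen so far
        drop = peak - value              # distance below that peak
        if drop > worst_abs:
            worst_abs = drop
        if peak > 0 and drop / peak > worst_rel:
            worst_rel = drop / peak
    return worst_abs, worst_rel


# Made-up equity curve, for illustration only.
equity = [10000.0, 10800.0, 10200.0, 11500.0, 10900.0, 12400.0]
abs_dd, rel_dd = max_drawdown(equity)
print(f"max drawdown: {abs_dd:.0f} ({rel_dd:.1%} of the peak)")
```

The absolute and the relative maximum can occur at different points of the curve, which is why the sketch tracks them separately.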

Below is the graph for the exam sample only (a conditional exam, because that sample is smaller than the test one)

But I'm not posting this just to show graphs, which are nothing unusual here. What I want to say is that I was very surprised when I looked inside the model: it uses only 4 predictors out of 38!

TimeH - time in hours

DonProcVisota_M15 - Relative width of the Donchian channel on M15

LastBarPeresekD_Down_M15 - Number of bars since the Donchian channel was last crossed

BB_PeresekN_Total_M1 - Number of times the price has crossed iDelta levels during the last x bars

Of course, I have a large number of predictors in my sample; I split them into groups and then seed them, and it all fits my theory that splitting a sample by the greedy principle is not always effective: it is just a method that guarantees nothing.

These are the kinds of models I want to collect and pool.

 
Aleksey Vyazmikin:


Of course, I have a large number of predictors in my sample; I split them into groups and then seed them, and it all fits my theory that splitting a sample by the greedy principle is not always effective: it is just a method that guarantees nothing.

These are the kinds of models I want to collect and pool.

Quite expectedly, most of the predictors are, in fact, noise or correlated with each other.

And what is "seeding"? Yandex only brings up torrent seeding.
 
elibrarius:
Quite expectedly, most predictors are, in fact, noise or correlated with each other.

And what is "seeding"? Yandex only brings up torrent seeding.

The point is not that they are noise, but that some predictors overlap others: the relationships that form are what matter, and they have to be generated.

"Seeding" is, of course, a term I invented for myself: I apply the --random-seed flag with a specific numeric value. True, I don't know what range this value can take, but I see that it has a significant effect on training, and this controlled randomization suits me fine.

 
Hello, guys. How do I find the indicator that plots the equity chart from the results of a strategy test in the tester? I can't find it, though I remember there was one. If anyone has it handy, please send it my way. Thank you!!!
 
Aleksey Vyazmikin:

"Seeding" is, of course, a term I invented for myself: I apply the --random-seed flag with a specific numeric value. True, I don't know what range this value can take, but I see that it has a significant effect on training, and this controlled randomization suits me fine.

It fixes the randomness. Usually this is used to make results reproducible across restarts.
Ideally it should not affect the result much. Otherwise you end up fitting to one particular random draw, i.e. you get one more parameter (with a significant effect) that has to be optimized.
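The two points above (a fixed seed gives reproducibility, and the result should not depend strongly on which seed was fixed) can be checked with a seed sweep. Below is a minimal sketch on a toy randomized learner written with the standard library only. It is not CatBoost, and `make_data`, `fit_random_stump` and the synthetic data are hypothetical stand-ins. With a real model the same loop would wrap the training run and vary only the seed.

```python
import random
import statistics


def make_data(n, rng):
    """Synthetic rows: 5 features, label weakly driven by feature 0."""
    rows = []
    for _ in range(n):
        x = [rng.gauss(0.0, 1.0) for _ in range(5)]
        y = 1 if x[0] + rng.gauss(0.0, 2.0) > 0 else 0
        rows.append((x, y))
    return rows


def accuracy(model, rows):
    """Share of rows where the rule 'feature > threshold' matches the label."""
    feature, threshold = model
    return sum((x[feature] > threshold) == (y == 1) for x, y in rows) / len(rows)


def fit_random_stump(train, seed, n_candidates=3):
    """Keep the best of a few randomly drawn (feature, threshold) rules."""
    rng = random.Random(seed)            # all training randomness comes from the seed
    best_model, best_acc = None, -1.0
    for _ in range(n_candidates):
        candidate = (rng.randrange(5), rng.uniform(-1.0, 1.0))
        acc = accuracy(candidate, train)
        if acc > best_acc:
            best_model, best_acc = candidate, acc
    return best_model


data_rng = random.Random(0)              # the data is fixed, only the training seed varies
train = make_data(500, data_rng)
test = make_data(500, data_rng)

scores = {seed: accuracy(fit_random_stump(train, seed), test) for seed in range(20)}
repeat = accuracy(fit_random_stump(train, 7), test)   # same seed, second run
spread = max(scores.values()) - min(scores.values())

print(f"seed 7 reproduced exactly: {repeat == scores[7]}")
print(f"mean accuracy {statistics.mean(scores.values()):.3f}, spread across seeds {spread:.3f}")
```

A large spread across seeds suggests that picking the best seed is itself a form of fitting, which is the concern raised above.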
 
Renat Akhtyamov:

what kind of numbers?

I thought I saw someone above say that trends/flats are predicted with almost 90% accuracy; it seemed to be someone's grandson or apprentice

 
Grail:

I thought I saw someone above say that trends/flats are predicted with almost 90% accuracy; it seemed to be someone's grandson or apprentice

Yes, it is 100% certain that after a flat there will be a trend. What is there to predict?
 
Grail:

I thought I saw someone above say that trends/flats are predicted with almost 90% accuracy; it seemed to be someone's grandson or apprentice

Ahhhh

Well, if there are no ticks, the market is probably flat, 100%

and if there are a lot of ticks, then it's not a flat
 
elibrarius:
It fixes the randomness. Usually this is used to make results reproducible across restarts.
Ideally it should not affect the result much. Otherwise you end up fitting to one particular random draw, i.e. you get one more parameter (with a significant effect) that has to be optimized.

Yes, I need it to reproduce the result later and generate results in general.

It is just not entirely clear to me how it works. As I understand it, this parameter controls the randomness in evaluating candidate splits when the best one is selected, but I can't find the details anywhere.

As for fitting... We have to assume that everything is potentially a fit; all we can do is check how stable the relationships are over time and monitor how well they keep working. For example, that model consists of 4 trees, each only 4 levels deep, i.e. with so few combinations the fit here is very efficient, and so it may reflect a genuine regularity rather than just describe the sample.
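To put a number on "the small number of combinations", here is a rough count of what a model of 4 trees of depth 4 can hold. This is a sketch under one assumption not stated in the post: that the model uses CatBoost's default symmetric (oblivious) trees, where every node on the same level shares a single split condition. The function names are hypothetical, and the second function shows ordinary binary trees of the same depth for comparison.

```python
def symmetric_tree_capacity(n_trees: int, depth: int) -> dict:
    """Symmetric trees: one shared (feature, threshold) condition per level."""
    return {
        "split_conditions": n_trees * depth,
        "leaf_values": n_trees * 2 ** depth,
    }


def ordinary_tree_capacity(n_trees: int, depth: int) -> dict:
    """Fully grown ordinary binary trees: every internal node has its own split."""
    return {
        "split_conditions": n_trees * (2 ** depth - 1),
        "leaf_values": n_trees * 2 ** depth,
    }


symmetric = symmetric_tree_capacity(n_trees=4, depth=4)
ordinary = ordinary_tree_capacity(n_trees=4, depth=4)
print("symmetric:", symmetric)
print("ordinary: ", ordinary)
```

Under that assumption the whole model is described by 16 split conditions and 64 leaf values, which is very little freedom to spread over 346 trades and six years.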
