Discussion of article "Advanced resampling and selection of CatBoost models by brute-force method" - page 4
is not to look for patterns in the future but to look for dependencies in a series. The order is not important: you can search in the middle and test on the data before and after it, and it won't change anything.
It's so simple to understand that it doesn't require further explanation.
The advantage is that a pattern, once found, may fade over time. In this case, learning from recent data is preferable.
It's not that easy. It always seems that the closer, the truer. That is a substitution of concepts. Actually, the same goes for the task of finding patterns.
This is not an abstract series. There are obvious "dependencies" (the same word, but with a different meaning here) running from left to right (from the past to the future), but not vice versa. There are hardly any scientific publications on quote forecasting that run their tests on past data.
If the features had a linear trend or any other time dependence, that would be correct. The model from the article does not take time into account in any way, so the order of the samples is not important.
And if you look at more recent econometric approaches such as the bootstrap or neural networks, the sequences are shuffled there, i.e. there are no time dependencies.
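To make the bootstrap remark concrete, here is a minimal sketch on a toy table. It is only an illustration: the names `rows` and `bootstrap_sample` are hypothetical and do not come from the article. Rows are drawn with replacement, so the chronological order of the original series plays no part in what a model without time features would see.

```python
import random

def bootstrap_sample(rows, rng):
    """Draw len(rows) rows with replacement; the original order is discarded."""
    return [rng.choice(rows) for _ in rows]

# Toy dataset: (time index, feature value, label). The time index is kept only
# to show that the resample ignores it; a model would get (feature, label).
rows = [(t, round(0.1 * t, 1), t % 2) for t in range(10)]

rng = random.Random(42)  # fixed seed so the run is repeatable
sample = bootstrap_sample(rows, rng)

# Time indices of the resampled rows: typically out of chronological order,
# with some rows repeated and others missing.
time_indices = [t for t, _, _ in sample]
print(time_indices)
```

Whether discarding the order is acceptable for quotes is exactly what is being argued in this thread; the sketch only shows what the resampling does mechanically.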
First of all, the source data, including the class labels, must be clustered.
I think it leads to peeking.
Run it on a demo account with a signal for a month to test it.
This is not an abstract series. There are obvious "dependencies" (the same word, but with a different meaning here) running from left to right (from the past to the future), but not vice versa. There are hardly any scientific publications on quote forecasting that run their tests on past data.
I haven't come across any on forecasting, but there is research: 13 years of minute data, about 4 million points, on the S&P index from 1984 to 1996. That was the beginning of econophysics. They demonstrated its non-stationarity, the presence of a random walk, and its similarity to physical processes.
I think it leads to peeking.
Run it on a demo account with a signal for a month to test it.
The bot source is attached, you can test it.
There is no peeking.
Another point.
You choose the model that gives the best result on the test out of 50 random training runs. That could be called fitting to the test, and it may not be as good on new data.
It would be better to average all 50 models.
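A toy sketch of the two options, picking the best of 50 on the test versus averaging all 50. Everything here is an assumption made for illustration: the "models" are noisy threshold rules rather than CatBoost models, and `make_noisy_model`, `make_data`, `accuracy` and `average_proba` are hypothetical helpers, not code from the article.

```python
import random

def make_noisy_model(rng):
    """Hypothetical stand-in for one trained model: a noisy threshold rule."""
    threshold = 0.5 + rng.uniform(-0.2, 0.2)

    def predict_proba(x):
        # Probability of class 1, clipped to [0, 1].
        return min(1.0, max(0.0, 0.5 + (x - threshold)))

    return predict_proba

def make_data(rng, n):
    """Toy labelled data: label is 1 when x > 0.5, with 20% label noise."""
    data = []
    for _ in range(n):
        x = rng.random()
        y = int(x > 0.5)
        if rng.random() < 0.2:
            y = 1 - y
        data.append((x, y))
    return data

def accuracy(model, data):
    hits = sum((model(x) >= 0.5) == bool(y) for x, y in data)
    return hits / len(data)

def average_proba(models, x):
    """Average the class-1 probability over all models."""
    return sum(m(x) for m in models) / len(models)

rng = random.Random(0)
models = [make_noisy_model(rng) for _ in range(50)]
test_data = make_data(rng, 200)  # data used to pick the "best" model
new_data = make_data(rng, 200)   # data no model was selected on

# Option 1: keep the single model with the best test score.
best = max(models, key=lambda m: accuracy(m, test_data))

# Option 2: average the predictions of all 50 models.
def ensemble(x):
    return average_proba(models, x)

print("best-of-50 on test:", accuracy(best, test_data))
print("best-of-50 on new :", accuracy(best, new_data))
print("average on new    :", accuracy(ensemble, new_data))
```

The score of the selected model on `test_data` is optimistic by construction, because that data was used for the selection. Only the numbers on `new_data` say anything about either option.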
Thought about it some more. I agree.
I run another test on earlier data, an independent one. If the result is bad, I throw the model away.
For example, training uses 2 months of data and the model is selected over a one-year period. Then comes an independent test over 5-10 years.
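The split described above could be sketched roughly like this. The layout is an assumption: the post does not say whether the one-year selection period overlaps the training window, so here it is placed immediately before it, with the independent test before both. The end date and the `split_by_date` helper are hypothetical.

```python
from datetime import date, timedelta

def split_by_date(rows, train_start, select_start, test_start):
    """Split (date, value) rows into three windows going back in time.

    Assumed layout, oldest to newest:
      [test_start, select_start)   independent test on earlier data
      [select_start, train_start)  model selection period
      [train_start, ...]           training window (most recent data)
    """
    train = [r for r in rows if r[0] >= train_start]
    select = [r for r in rows if select_start <= r[0] < train_start]
    test = [r for r in rows if test_start <= r[0] < select_start]
    return train, select, test

# Hypothetical daily series covering 8 years up to an arbitrary end date.
end = date(2020, 12, 31)
rows = [(end - timedelta(days=i), float(i)) for i in range(365 * 8)]

train_start = end - timedelta(days=60)                # about 2 months of training
select_start = train_start - timedelta(days=365)      # one-year selection period
test_start = select_start - timedelta(days=365 * 5)   # 5-year independent test

train, select, test = split_by_date(rows, train_start, select_start, test_start)
print(len(train), len(select), len(test))
```

The windows do not overlap, so a model that looks good on the selection period can still be rejected on the earlier, untouched period.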
I outlined the approach in the article, but there is still room for improvement.
I don't see the point of averaging. I run another test on earlier data, an independent one. If the result is bad, I throw the model away.
Does the worst model lose money? And the average one?
It varies.