Machine learning in trading: theory, models, practice and algo-trading - page 3678
Interesting channel
https://www.youtube.com/watch?v=WxerKohm2pI
It's a kind of simple feature selection. With clustering.
Picked up by the forest. Bi-forest, we'll call it))))
If anyone uses cross-validation or walk-forward, there is a way to solve the look-ahead (peeking) problem that works better than the embargo period recommended by Prado.
Save the opening time and closing time for each trade.
Then, when selecting rows, suppose the next test period starts at row 2000. Remember the opening time of that row. When reading the past history for labeling, simply skip the rows whose closing time is later than the opening time of row 2000, i.e. trades that closed in the future relative to row 2000.
I did this after yet another occasion when I forgot to change the embargo and started getting beautiful OOS results.
Pros: like the embargo, it gets rid of peeking, but it keeps the data right up to the start of the test instead of cutting off a reserve chosen by guesswork, as the embargo does.
I couldn't understand what was written.
The trade in row 2000 opened, for example, at 2024-01-18 00:00:00 and closed at 2024-01-19 12:33.
The trade in row 1999 opened at 2024-01-17 23:58 and closed at 2024-01-18 16:00 (later than the opening of the trade in row 2000), in profit or in loss. By training on it you know the future 16 hours ahead (you could open 1000 trades in the same direction and look like a big winner, but only on the test), so it cannot be used for training when testing with cross-validation.
The trade in row 1988, for example, closed at 2024-01-17 23:44, before row 2000 opened. It can be used for training: no peeking.
This technique is specific to cross-validation, where a large block of data is taken and cut into pieces whose OOS parts are then glued together. If you train on these chunks directly from the tester, for example once a week, then at any point in time you only have closed trades anyway and there is no need to filter them out.
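The filter described above can be sketched roughly as follows. This is only a minimal illustration, not the poster's actual code: the `Trade` tuple and the `purge_training_rows` helper are hypothetical names, and the opening time of row 1988 is invented for the example (the post gives only its closing time).

```python
from datetime import datetime
from typing import NamedTuple


class Trade(NamedTuple):
    row: int
    opened: datetime
    closed: datetime


def purge_training_rows(history, test_open):
    """Keep only trades that were already closed when the test block opens."""
    return [t for t in history if t.closed <= test_open]


history = [
    # Opening time of row 1988 is an assumption; only its close is given in the post.
    Trade(1988, datetime(2024, 1, 17, 20, 0), datetime(2024, 1, 17, 23, 44)),
    Trade(1999, datetime(2024, 1, 17, 23, 58), datetime(2024, 1, 18, 16, 0)),
]
test_open = datetime(2024, 1, 18, 0, 0)  # opening time of row 2000

usable = purge_training_rows(history, test_open)
usable_rows = [t.row for t in usable]
print(usable_rows)  # [1988]
```

Row 1999 is dropped because it closes 16 hours after the test block opens; row 1988 stays because it closed before that moment.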
Read more about the embargo period here: http://web.archive.org/web/20210413104803/https://dou.ua/lenta/articles/ml-vs-financial-math/ (the site is no longer available, but the link survives in the web archive).
Cross-validation should also be done correctly
It's not enough to just test the performance of algorithms on some piece of data "in the future". Standard K-Fold cross-validation will literally be predicting the "past" (some folds train on data that comes later than the test fold), while specialised cross-validation for time series breaks the IID assumption, especially if the features are computed with a lag.
There are several ways to improve cross-validation: for example, alternate training and testing windows chronologically with "gaps" in between. All to keep the samples as independent as possible (example in the illustration below). The second method is combinatorial cross-validation, which you can read more about in Dr Lopez de Prado's book.
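A rough sketch of the first idea, chronological train and test windows separated by a gap. The function name `gapped_walk_forward` and all the sizes are assumptions made for illustration; this is a sliding-window variant, not the exact scheme from the quoted article.

```python
def gapped_walk_forward(n_rows, train_size, test_size, gap):
    """Yield (train, test) index ranges in chronological order.

    The `gap` rows between the end of train and the start of test
    are used for neither training nor testing.
    """
    start = 0
    while start + train_size + gap + test_size <= n_rows:
        train_end = start + train_size
        test_start = train_end + gap
        yield range(start, train_end), range(test_start, test_start + test_size)
        start += test_size  # slide forward by one test window


splits = list(gapped_walk_forward(n_rows=20, train_size=8, test_size=4, gap=2))
for train, test in splits:
    print(f"train {train.start}..{train.stop - 1}  test {test.start}..{test.stop - 1}")
# train 0..7  test 10..13
# train 4..11  test 14..17
```

If scikit-learn is available, `TimeSeriesSplit` accepts a `gap` argument that does a similar job with an expanding training window.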
Or you can just look into cross-validation and find out that for time series the split is done differently. You don't need Prado for that; it's a well-known classic.
How? I'm interested in alternatives. For time series, the main thing is not to shuffle the rows, as is done in the classics.
I use walk-forward myself, but cross-validation can be useful as well.
I can't agree with that at all: only random row sampling.
And that is not the problem at all.
Classically, we take a file, and crucially a ready-made file in the sense that all predictors are already calculated, and divide it into parts, maybe even by cross-validation. How it is divided doesn't matter, because real trading will be different: a bar arrives and new predictor values are calculated from the new prices.
Therefore, the main thing is that the results obtained when moving forward one bar at a time should match, or almost match, the results on the pre-prepared file. That is, test in the tester or in any imitation of it, even a primitive one; the main thing is one bar at a time.
For some reason we believe that this new bar will NOT break all the patterns that we found, checked, tested and specified the offset for, but only on a previously prepared file. Why?
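As a primitive imitation of such a check, one can compute the same predictor twice, once over the whole prepared file and once bar by bar, and compare. The sketch below is only an assumption-laden toy: the predictor (price divided by the maximum) and both function names are made up to show how a file-level calculation can silently use future bars.

```python
def scaled_by_file_max(prices):
    """Batch version: the maximum is taken over the whole file, future bars included."""
    top = max(prices)
    return [p / top for p in prices]


def scaled_bar_by_bar(prices):
    """Live version: at each bar only the prices seen so far are known."""
    out = []
    running_max = float("-inf")
    for p in prices:
        running_max = max(running_max, p)
        out.append(p / running_max)
    return out


prices = [100.0, 110.0, 105.0, 120.0]
batch = scaled_by_file_max(prices)
live = scaled_bar_by_bar(prices)

# Bars where the prepared file disagrees with what live trading would have seen.
mismatches = [i for i, (b, l) in enumerate(zip(batch, live)) if abs(b - l) > 1e-12]
print(mismatches)  # [0, 1, 2]
```

Only the last bar matches, because that is the only bar at which the live calculation already knows the file-wide maximum.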
A simpler example.
There is advice everywhere that predictors should be scaled, for example, into the interval [0:1]. The formula for mapping to an interval is based on the min and max values of the predictor. Will the new predictor value ALWAYS be in the interval [min : max]?
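The question can be illustrated with a few lines of code. This is a minimal sketch with invented numbers; `fit_min_max` and `scale` are hypothetical helpers, not functions from any particular library.

```python
def fit_min_max(values):
    """Remember the range seen in the training data."""
    return min(values), max(values)


def scale(value, lo, hi):
    """Map a value to [0, 1] using a previously fitted range."""
    return (value - lo) / (hi - lo)


train = [10.0, 12.0, 15.0, 20.0]
lo, hi = fit_min_max(train)

scaled_train = [scale(v, lo, hi) for v in train]
print(scaled_train)  # [0.0, 0.2, 0.5, 1.0]

new_value = 25.0  # a new bar above anything seen in training
scaled_new = scale(new_value, lo, hi)
print(scaled_new)  # 1.5
```

The new value lands outside [0, 1], so the model receives an input from a range it never saw during training.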
The bot I posted in the group works without normalisation (just prices or an MA as input), and the model does not get stuck in one position (0 or 1) on new data.
It can also work on increments, but then it needs many times more features, otherwise it generalises worse.
CV is not used in the evaluation, and in general almost nothing from the classics is used :)