Machine learning in trading: theory, models, practice and algo-trading - page 3163
I found another problem.
In general, the fit is not only to the size of the window but also to the start of the window. Small offsets make a big difference in the result. There are no strong features; everything sits at the edge of 50/50 ± 1-2%. I had found a good variant with training once a week on 5000 rows of M5 (about 3.5 weeks) and then decided to shift all the data by 300 rows - as if training not on Saturdays but on Tuesdays. As a result, the model went from profitable to unprofitable on OOS.
These new 300 rows (about 8% of the total) brought out other features and other splits, which turned out better for the slightly changed data.
I repeated the 300-row shift with 50,000 rows. That would seem to be only 0.8% new rows, but the changes on OOS are still significant, though not as strong as with 5000 rows.
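The experiment above can be sketched as a simple robustness check. This is a minimal illustration, not the poster's actual code: the data here is synthetic (hypothetical features and near-50/50 labels standing in for M5 bars), and a scikit-learn random forest stands in for whatever tree model was actually used:

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score

rng = np.random.default_rng(0)

# Hypothetical stand-in for 6000 M5 bars: 10 features, ~50/50 labels,
# as described in the post (real features/labels would go here).
n = 6000
X = rng.normal(size=(n, 10))
y = rng.integers(0, 2, size=n)

def oos_accuracy(start, train_len=5000, test_len=500):
    """Train on a window starting at `start`, score on the bars right after it."""
    tr = slice(start, start + train_len)
    te = slice(start + train_len, start + train_len + test_len)
    model = RandomForestClassifier(n_estimators=100, random_state=0)
    model.fit(X[tr], y[tr])
    return accuracy_score(y[te], model.predict(X[te]))

# Shift the training window by 300 rows ("Saturday" vs. "Tuesday") and
# compare OOS accuracy; a robust setup should barely move.
acc_base = oos_accuracy(0)
acc_shift = oos_accuracy(300)
print(f"base OOS: {acc_base:.3f}  shifted OOS: {acc_shift:.3f}")
```

On real data, a large gap between the two OOS scores is exactly the window-start sensitivity described above.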
This seems to be a common problem for trees - lack of robustness.
There is a faint hope that some improvement is possible by moving to more elaborate (in terms of mathematical statistics) split rules - something like the "difference trees" I linked an article about recently, or the chi-square statistics from CHAID.
Of course, this is no panacea, and it is not a fact that these specific split rules will work for us at all. But it is an example that split rules can and should be treated creatively.
The main idea to take from mathematical statistics is to stop tree growth when a critical p-value is reached, not for some arbitrary reason.
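The CHAID-style idea - keep a split only while it passes a chi-square test - can be sketched as follows. This is a minimal illustration under assumed conditions (binary labels, a numeric feature, a single candidate threshold), not an actual CHAID implementation:

```python
import math
import numpy as np

def chi2_p_value(table):
    """p-value of a 1-d.o.f. chi-square test on a 2x2 contingency table."""
    row = table.sum(axis=1, keepdims=True)
    col = table.sum(axis=0, keepdims=True)
    expected = row * col / table.sum()
    stat = ((table - expected) ** 2 / expected).sum()
    return math.erfc(math.sqrt(stat / 2))  # survival function of chi2(1)

def split_is_significant(x, y, threshold, alpha=0.05):
    """CHAID-style criterion: keep a split on x <= threshold only if the
    class counts on the two sides differ significantly (p < alpha)."""
    left, right = y[x <= threshold], y[x > threshold]
    table = np.array([
        [np.sum(left == 0), np.sum(left == 1)],
        [np.sum(right == 0), np.sum(right == 1)],
    ], dtype=float)
    if table.min() == 0:          # degenerate split, reject it
        return False
    return chi2_p_value(table) < alpha

rng = np.random.default_rng(1)
x = rng.normal(size=2000)
y_signal = ((x > 0) ^ (rng.random(2000) < 0.1)).astype(int)  # real pattern
y_noise = rng.integers(0, 2, size=2000)                      # pure noise

print(split_is_significant(x, y_signal, 0.0))   # True: the split survives
print(split_is_significant(x, y_noise, 0.0))    # almost surely False
```

Growing a node only while `split_is_significant` holds is the "stop at a critical p-value" rule: noise splits are pruned before they can overfit small shifts of the window.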
What model?
A tree-based one.
An interesting article about trees and reinforcement learning in them:
https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4760114/
============================
main idea
2.2 Motivation
In short, the proposed reinforcement learning tree (RLT) model is a traditional random forest with a special way of selecting split variables and suppressing noise variables. These features are enabled by implementing a reinforcement learning mechanism at each internal node. Let us first consider a checkerboard example demonstrating the impact of reinforcement learning: assume that X ~ Uniform[0, 1]^p and E(Y | X) = I{ I(X^(1) > 0.5) = I(X^(2) > 0.5) }, so that p1 = 2 and p2 = p - 2. The difficulty in estimating this structure with ordinary random forests is that neither of the two strong variables shows a marginal effect. The immediate reward, i.e. the reduction in prediction error, from splitting on either of these two variables is asymptotically identical to the reward obtained by splitting on any of the noise variables. Hence, when p is relatively large, it is unlikely that either X^(1) or X^(2) will be chosen as the split variable. However, if we knew in advance that splitting on X^(1) or X^(2) would yield significant future benefits for later splits, we could confidently force a split on either variable regardless of the immediate reward.
=========================
And, accordingly, the R package:
https://cran.r-project.org/web/packages/RLT/RLT.pdf
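The checkerboard motivation from the quoted excerpt is easy to verify numerically. A minimal sketch (synthetic data only; Gini gain used as the "immediate reward", which is the standard CART criterion rather than anything specific to RLT):

```python
import numpy as np

rng = np.random.default_rng(42)
n, p = 20000, 10
X = rng.random((n, p))                                   # X ~ Uniform[0,1]^p
y = ((X[:, 0] > 0.5) == (X[:, 1] > 0.5)).astype(int)     # checkerboard target

def gini(v):
    q = v.mean()
    return 2 * q * (1 - q)

def split_gain(Xs, ys, j, t=0.5):
    """Immediate reward (Gini reduction) of the split X^(j) <= t."""
    left, right = ys[Xs[:, j] <= t], ys[Xs[:, j] > t]
    w = len(left) / len(ys)
    return gini(ys) - (w * gini(left) + (1 - w) * gini(right))

# Marginally, the strong variable X^(1) looks exactly like noise (X^(6)):
print(split_gain(X, y, 0), split_gain(X, y, 5))   # both ~0

# But once X^(1) has been split on, X^(2) separates the classes perfectly:
mask = X[:, 0] <= 0.5
print(split_gain(X[mask], y[mask], 1))            # ~0.5, the full reward
```

That is the "future benefit" the RLT mechanism tries to detect: the payoff of X^(1) is invisible at the node where it is chosen and only materializes one level deeper.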
A tree-based one.
What's the exact name? Or is it homemade?
I have been using different "wooden" models for many years and have never seen anything like this.
I can force it, but I don't know which feature to force the split on: X1, X2, or X157.
You need to find a coreset that contains a pattern and train only on it. It can be any piece of the chart, and it is found by enumeration. Otherwise the noise does not let the model concentrate. Coresets - small representative subsamples - are the current trend. It's fairly simple and it yields results.
How to search? Go through all the chunks (e.g. 100 chunks of 5000 rows each) and see how well a model trained on each chunk predicts the other 500,000 rows?
Yeah, and you can pull random samples instead of consecutive chunks - that's more correct.
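The enumeration described above can be sketched like this. Everything here is hypothetical: synthetic data, a scikit-learn random forest as the model, small sizes so it runs quickly; candidates include both consecutive chunks and the random subsamples suggested in the reply:

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score

rng = np.random.default_rng(7)

# Hypothetical full history: 20,000 rows, 8 features (real data goes here).
n = 20000
X = rng.normal(size=(n, 8))
y = rng.integers(0, 2, size=n)

def score_subset(idx):
    """Train on the candidate subset, validate on everything outside it."""
    mask = np.zeros(n, dtype=bool)
    mask[idx] = True
    model = RandomForestClassifier(n_estimators=50, random_state=0)
    model.fit(X[mask], y[mask])
    return accuracy_score(y[~mask], model.predict(X[~mask]))

# Candidates: consecutive chunks, plus random subsamples of the same size.
candidates = [np.arange(s, s + 5000) for s in range(0, n - 5000, 5000)]
candidates += [rng.choice(n, size=5000, replace=False) for _ in range(3)]

scores = [score_subset(idx) for idx in candidates]
best = int(np.argmax(scores))
print(f"best candidate #{best}, out-of-subset accuracy {scores[best]:.3f}")
```

The candidate whose model generalizes best to the rows outside it plays the role of the coreset; on pure-noise data, as here, all scores hover near 0.5, which is itself a useful sanity check.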