Machine learning in trading: theory, models, practice and algo-trading - page 1383

I use subtraction (P[i] - P[0]) rather than division (P[i] / P[0]), i.e. the absolute price change instead of the relative one. Beforehand I remove outliers (1% by count from each tail, the largest and the smallest values).
Does division give any advantage? I'm currently using a random forest, which needs neither normalization nor scaling.
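For reference, a minimal numpy sketch of that preprocessing; the function name and the trim fraction are illustrative, not taken from the post, and prices is assumed to be a numpy array:

import numpy as np

def absolute_changes_trimmed(prices, trim=0.01):
    # Absolute change relative to the first price: P[i] - P[0]
    changes = prices[1:] - prices[0]
    # Drop roughly 1% of values from each tail (largest and smallest)
    lo, hi = np.quantile(changes, [trim, 1.0 - trim])
    return changes[(changes >= lo) & (changes <= hi)]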
You are doing a shift, not a scaling.
That is, division has no particular advantage over subtraction. The exception is the logarithm, which makes the data look more like a normal distribution, as Alexei Nikolaev said. In other words, the density between points changes, but not their order. But I see no advantage here either: the tree will simply place its splits at different levels, i.e. it adjusts to any distribution, since a tree is essentially a simple memorizer.
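To make the point concrete, a small sketch with toy prices (plain numpy, the values are made up) showing that the three transforms differ in location and scale but not in ordering, which is what a tree's splits depend on:

import numpy as np

p = np.array([100.0, 101.5, 99.8, 103.2, 102.1])   # toy prices

abs_change = p - p[0]           # subtraction: a shift of location only
rel_change = p / p[0]           # division: rescaling by the base price
log_change = np.log(p / p[0])   # log: compresses large moves

# log is strictly monotone, so the ordering of values is identical,
# and a tree can realize the same partition with different thresholds:
assert (np.argsort(rel_change) == np.argsort(log_change)).all()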
What if you take the price as a percentage of the last 100 prices? Is that wrong?
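One plausible reading of that suggestion, sketched with numpy; the window of 100 and the min-max form are assumptions, not a confirmed recipe:

import numpy as np

def pct_in_window(prices, window=100):
    # Position of the current price inside the min-max range
    # of the last `window` prices, expressed in percent
    out = np.full(len(prices), np.nan)
    for i in range(window - 1, len(prices)):
        w = prices[i - window + 1 : i + 1]
        lo, hi = w.min(), w.max()
        out[i] = 100.0 * (prices[i] - lo) / (hi - lo) if hi > lo else 50.0
    return out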
The chart must not be lost at all; increments are not suitable as features in any form (except perhaps their sum, specially selected, which should also take time into account).
The chart should be divided into levels: vertical slices of equal height (or perhaps unequal), with each slice normalized to a range, i.e. an analogue of ordinary price levels. If instead we normalize the whole series, or normalize in a sliding window, we again lose very important information.
But dividing into levels creates another problem: the price can sit right at a boundary, or, when you take many recent prices for training, some of them fall into one "level" while the rest fall into another. I have not yet figured out how to handle this.
I may need to mirror the levels relative to each other to make transitions between them painless; a rough sketch follows below.
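A rough sketch of the level idea under stated assumptions: fixed-height horizontal bands, the price rescaled to [0, 1] inside its band, and every other band mirrored as suggested above so the feature stays continuous across band boundaries. band_height and the function name are illustrative:

import numpy as np

def level_features(prices, band_height):
    # Index of the horizontal price band ("level") each price falls into
    band = np.floor(prices / band_height).astype(int)
    # Position of the price inside its band, rescaled to [0, 1]
    pos = (prices - band * band_height) / band_height
    # "Mirroring" every other band: odd bands are flipped,
    # so the value is continuous when the price crosses a boundary
    pos_mirrored = np.where(band % 2 == 1, 1.0 - pos, pos)
    return band, pos_mirrored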
Naturally, the untransformed chart will still be the most informative. So be very careful with any transformation: the model's quality will degrade in proportion to your ignorance. Hence all the tales about a 50% error being normal, and other nonsense; the model simply learns nothing from such "features".
It sounds complicated; I don't understand why, and I really don't understand how to do it.
Yuri is doing fine with simple increments too:
# relative change of the price j bars back vs. the current bar, scaled by 1000
x.append((SD.history[i-j][c.c]/SD.history[i][c.c]-1)*1000)
The result will be as follows: the return with the biggest lag will get the biggest importance (more variance, hence more information gain), and that return contains the variance of all the other features.
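A toy illustration of that effect on synthetic random-walk prices (numpy only; the ratio has the same form as the snippet above, though the direction of indexing differs from MT-style history, which does not affect the variance):

import numpy as np

rng = np.random.default_rng(0)
# Toy random-walk prices, just to illustrate the variance effect
prices = 100.0 * np.exp(np.cumsum(0.001 * rng.standard_normal(5000)))

for j in (1, 5, 20, 100):
    r = (prices[j:] / prices[:-j] - 1) * 1000
    print(j, round(r.var(), 3))
# Variance grows roughly in proportion to the lag, so impurity-based
# importances in a forest tend to single out the longest-lag return.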
It can't be, because it can never be.
Imagine the situation.
The price in the market mostly reflects the balance of supply and demand at different historical moments.
You take a limited, normalized slice of history as a feature, and it reflects only the current situation.
Different historical moments are merged by your model into one normalized, faceless flow (all market situations are equated with one another), which no longer contains any historical sequence or any notion of a fair price.
You are left with a pile of identical, template-like normalized patterns which overlap as the depth of history grows, leading to a 50/50 error. You teach the model nothing; you throw away the most important information during preprocessing, because you do everything incorrectly, i.e. by following books written for completely different tasks and describing completely different processes.
A quote is not a signal; it is a completely different kind of process, and you cannot work with it that way. Yuri, for example, is a radiophysicist, so he has nothing else to go on.
With this kind of training you take time into account, but not price. The price level (above/below) in the market is even more important than time, because it reflects the balance of supply and demand, which in turn reflects all the basic market information.
It's a pity that you can't "Like" this.