Machine learning in trading: theory, models, practice and algo-trading - page 1383

 
elibrarius:

I use subtraction (P[i] - P[0]) rather than division (P[i] / P[0]), i.e. the absolute price change rather than the relative one. Beforehand I remove outliers (the largest and smallest 1% of values).

Does division give any advantages? I'm currently using a forest that doesn't need normalization and scaling.

You do the shift, but not the scaling.
 
Yuriy Asaulenko:
You do the shift, but not the scaling.
Yes. Trees/forests don't need scaling either.


That is, division has no particular advantage over subtraction. The exception is taking the logarithm, which makes the data closer to a normal distribution, as Alexei Nikolaev said. In other words, the density between points changes, but not their order. I don't see any advantage in that case either: the tree will simply do its split at a different level, i.e. it adjusts to any distribution, since a tree is essentially a simple memorizer.
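For illustration, here is a minimal sketch of the three transforms being compared (absolute difference, relative change, log return); the variable names and the use of numpy are mine, not from the thread.

import numpy as np

def window_features(prices: np.ndarray):
    # Express a price window relative to its first value in three ways
    p0 = prices[0]
    absolute = prices - p0           # subtraction: absolute change, keeps the original scale
    relative = prices / p0 - 1.0     # division: relative (percentage) change
    log_ret = np.log(prices / p0)    # logarithm: closer to a normal distribution for many series
    return absolute, relative, log_ret

# All three are monotone transforms of the price, so the order of points is identical;
# only the spacing (density) differs, which is why a tree just splits at a different level.
p = np.array([1.1050, 1.1062, 1.1041, 1.1078])
a, r, l = window_features(p)
print(np.argsort(a), np.argsort(r), np.argsort(l))  # identical orderings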

 
Take the price as a percentage of the last 100 prices. Or is that wrong?
 
Evgeniy Chumakov:
Take the price as a percentage of the last 100 prices. Or is that wrong?
This is similar to division.
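One possible reading of "price as a percentage of the last 100 prices" is the position of the current price inside the min/max range of the previous 100 bars; the window length and the min/max scaling here are my assumptions, the post gives no formula (dividing by the window mean would be another reading).

import numpy as np

def percent_of_window(prices: np.ndarray, n: int = 100) -> np.ndarray:
    # Position of each price inside the range of the previous n prices, in percent
    out = np.full(len(prices), np.nan)
    for i in range(n, len(prices)):
        window = prices[i - n:i]
        lo, hi = window.min(), window.max()
        out[i] = 100.0 * (prices[i] - lo) / (hi - lo) if hi > lo else 50.0
    return out

Either way it is effectively a division by past prices, which is why it is "similar to division".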
 

The chart itself should not be lost at all; increments are not suitable as features in any form (except perhaps a specially selected number of them, which should also account for time).

The chart should be divided vertically into levels, into equal pieces (or perhaps unequal ones), and each piece should be normalized to a range, i.e. an analogue of the usual price levels. If instead we normalize the entire series, or normalize in a sliding window, we will again lose very important information.

But dividing into levels creates another problem: when the price sits near a boundary, or when you take many recent prices for training and some of them fall in one "level" while the rest fall in another. I haven't figured out how to handle this yet.

I may need to mirror the levels relative to each other so that transitions between them are painless. A rough sketch of the level idea follows below.
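The level idea is only sketched in the post, so the following is just one possible reading of it: split the price axis into fixed vertical bands and scale each price within its band, so the band index keeps the "where is the price" information that a sliding-window normalization throws away. The fixed band width and the [0, 1) scaling are my assumptions.

import numpy as np

def level_normalize(prices: np.ndarray, band: float):
    # Which vertical price band each point falls into, plus its position inside that band
    level = np.floor(prices / band).astype(int)    # the "price level" feature
    within = (prices - level * band) / band        # position inside the band, in [0, 1)
    return level, within

# The boundary problem described above shows up when the price hovers at a band edge:
# the level index flips between neighbours, which is what the "mirroring" idea tries to soften.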

Naturally, the untransformed chart will still be the most informative. So you should be very careful with any transformations: the model's quality will degrade in proportion to your ignorance. Hence all the tales about a 50% error being normal and other such nonsense; the model simply learns nothing from such "features".

 

Sounds complicated. It's not clear why, and it's really not clear how to do it.

Yuri does well with simple increments, too.

 

x.append((SD.history[i-j][c.c] / SD.history[i][c.c] - 1) * 1000)  # relative price change between bar i-j and bar i, scaled by 1000

This makes no sense: each subsequent feature contains half of the useful information of the previous one, i.e. (1) they correlate strongly with each other, and (2) the feature with the largest lag contains all the variance contained in the previous features, so they provide no information gain.

The result will be that the importance of the return with the largest lag comes out highest (more variance, more information gain), and that return already contains all the variance of the other features.
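A quick numeric sketch of this argument on synthetic random-walk prices (my own construction, not from the thread): returns taken against the same current bar overlap, neighbouring lags correlate strongly, and the largest-lag return contains the shorter ones as components.

import numpy as np

rng = np.random.default_rng(0)
prices = 100 + np.cumsum(rng.normal(0, 1, 5000))    # synthetic random-walk price series

lags = [1, 2, 4, 8, 16]
idx = np.arange(max(lags), len(prices))
# Same form of feature as in the snippet above: relative change between bar i-j and bar i
feats = np.column_stack([prices[idx - j] / prices[idx] - 1 for j in lags])
print(np.round(np.corrcoef(feats.T), 2))            # neighbouring lags show high pairwise correlation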

 
elibrarius:

Yuri does well with simple increments, too.

It can't be, because it can never be.

 
elibrarius:

It's not clear why, and it's really not clear how to do it.

Imagine the situation.

The market price mostly reflects the balance of supply and demand at different historical moments.

You feed a limited, normalized slice of history into the model as a feature, and it reflects only the current situation.

Your model merges different historical moments into one normalized, faceless flow (all market situations are equated with each other), which no longer contains any historical sequence or a fair price.

You are left with a pile of identical, template-like normalized patterns that overlap as the history depth increases, leading to a 50/50 error. You don't teach the model anything; you throw out the most important information during preprocessing, because you do everything according to books written for entirely different tasks and describing entirely different processes.

A quote is not a signal; it is a completely different kind of process, and you can't work with it that way. Yuri is a radiophysicist, for example, so he has nothing else left to work with.

With this kind of training you take time into account, but not price. The price level (higher/lower) in the market is even more important than time, because it reflects the supply/demand balance, which carries all the basic market information.
 
Maxim Dmitrievsky:

Imagine the situation. The price in the market mostly reflects the balance of supply and demand at different historical moments [...]

It's a pity that you can't "Like" it.
