Machine learning in trading: theory, models, practice and algo-trading - page 1488

 
Aleksey Vyazmikin:

We have to assume that all we can do is develop an algorithm for the best fit of the data, because we do not know the future and there are many possible variations even for the available predictor values. If we are lucky, we will find a pattern that continues to exist for some time. That is why it is important to search for such a pattern using specific criteria, and logic suggests that, at a minimum, it should be a pattern that occurs throughout the entire sample.

There is one pattern in the market that will always exist - time cycles and periods: trading session, day, week, ..., and their half-periods. These cycles are indestructible in principle; they form a complex temporal structure and determine the sample size to work with. Once the behavior of price within this hierarchical structure has been identified, the trading system will always work.

 
The market is not fractal in itself. It has self-similarity properties only within time periods, forming certain structures inside them. The tick volume, or any other sample size, cannot be chosen arbitrarily - it must be a specific value that matches the nested time cycles.
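A minimal sketch of what "letting the period dictate the sample size" could look like, assuming minute data in a pandas DataFrame with a DatetimeIndex and a 'price' column (both illustrative, not from the thread): the same series is aggregated over nested time cycles so that the sample boundaries come from the day and week periods rather than from an arbitrary choice.

```python
# Sketch: aggregate one price series over nested time cycles (day -> week);
# the synthetic minute data and column name are assumptions for illustration.
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
idx = pd.date_range("2024-01-01", periods=7 * 24 * 60, freq="min")
ticks = pd.DataFrame({"price": 100 + rng.normal(size=len(idx)).cumsum()}, index=idx)

daily = ticks["price"].resample("1D").ohlc()    # day cycle
weekly = ticks["price"].resample("1W").ohlc()   # week cycle, built from the days

print(daily)
print(weekly)
```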
 
Aleksey Vyazmikin:

Standard algorithms are designed to work with stationary phenomena and closed systems, so any information there is considered a priori useful; it is not evaluated in terms of randomness, only in terms of its usefulness for the task (classification by target). But we have a lot of noise, and I suggested a logical way to combat it.

I.e. the uniformity of successful trades in the training area?
Everything is fine there, since the fitting is done precisely on the training data, down to 0% error.

I suppose it should be done by regularizing/coarsening the model, reducing its depth or using other methods, and stopping, for example, at 20% error on the training set.
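For what it's worth, a minimal sketch of that kind of coarsening with a scikit-learn decision tree (the dataset, the depths and the 20% threshold are all assumptions, not the poster's setup): the depth is increased only until the training error falls to roughly 20%, instead of fitting down to 0%.

```python
# Sketch: stop growing the tree once the training error is "good enough".
from sklearn.datasets import make_classification
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=5000, n_features=20, random_state=0)

for depth in range(1, 20):
    model = DecisionTreeClassifier(max_depth=depth, random_state=0).fit(X, y)
    train_error = 1.0 - model.score(X, y)
    if train_error <= 0.20:      # stop before the fit becomes too exact
        break

print(f"chosen depth = {depth}, training error = {train_error:.2%}")
```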

I think there is only one way: after each candidate version of a node is added, run all the data through the resulting part of the tree and analyze the balance line.

Number of node versions = (number of features * number of nodes in the tree * 3 (if split by quartiles)) * number of trees

It will take a very long time to compute - I'm afraid even longer than the NS.
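A rough sketch of what scoring a candidate split by the balance line could look like, plus a back-of-envelope evaluation count from the formula above; the per-trade results, the quartile thresholds and all the numbers are assumptions for illustration, not the actual tree-building code being discussed.

```python
# Sketch: score a candidate split by the balance line it produces,
# rather than by classification error alone.
import numpy as np

def balance_score(returns):
    """Final profit of the balance line, penalized by maximum drawdown."""
    balance = np.cumsum(returns)
    drawdown = np.max(np.maximum.accumulate(balance) - balance)
    return balance[-1] - drawdown

def evaluate_split(feature_values, threshold, returns):
    """Trade only the samples that fall on the 'allowed' side of the split."""
    mask = feature_values > threshold
    return balance_score(returns[mask]) if mask.any() else -np.inf

rng = np.random.default_rng(0)
returns = rng.normal(0.0, 1.0, size=1000)   # per-trade P/L (synthetic)
feature = rng.normal(0.0, 1.0, size=1000)   # one candidate feature (synthetic)

# Candidate thresholds at the quartiles, matching the "3 per feature" count.
for q in (0.25, 0.5, 0.75):
    thr = np.quantile(feature, q)
    print(q, round(evaluate_split(feature, thr, returns), 2))

# Back-of-envelope count of candidate evaluations from the formula above:
# (features * nodes per tree * 3 quartile splits) * trees
print(100 * 63 * 3 * 500)   # e.g. 100 features, depth-6 tree (~63 nodes), 500 trees
```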

 
Alexander_K:

There is one pattern in the market that will always exist - time cycles and periods: trading session, day, week, ..., and their half-periods. These cycles are indestructible in principle; they form a complex temporal structure and determine the sample size to work with. Once the behavior of price within this hierarchical structure has been identified, the trading system will always work.

I do not deny the importance of time, but it is not enough to build a model - we need other variables that influence the price.

 
Aleksey Vyazmikin:

I do not deny the importance of time, but it is not enough to build a model - we need other variables that influence the price.

It is enough.

It is within the time cycles that the Grail sits. The structure in one time cycle is part of the structure in another.

If you work with sample sizes that correspond to different, strictly defined time periods, these nested structures are in plain view.

Can't the NS handle it? I did it in my TS without a neural network.

 
elibrarius:

I.e. the uniformity of successful trades in the training area?

Personally, I assess the financial result for each year (currently 5 years), taking into account drawdown and recovery factor, as well as other evaluation criteria. At the moment I do not even look at the classification metric, because it is a trend strategy, and even with 35% correct classification there can be a profit at the end of the year (or another period).
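As an illustration only, a minimal sketch of that kind of yearly evaluation (profit, maximum drawdown, and recovery factor taken as profit divided by maximum drawdown) on synthetic daily trade results; the data and names are assumptions, not the poster's actual report.

```python
# Sketch: per-year profit, maximum drawdown and recovery factor
# from a synthetic daily P/L series.
import numpy as np
import pandas as pd

rng = np.random.default_rng(1)
dates = pd.date_range("2015-01-01", "2019-12-31", freq="D")
trades = pd.Series(rng.normal(0.5, 10.0, len(dates)), index=dates)  # daily P/L

def yearly_stats(pl):
    balance = pl.cumsum()
    drawdown = (balance.cummax() - balance).max()
    profit = balance.iloc[-1]
    recovery = profit / drawdown if drawdown > 0 else np.inf
    return pd.Series({"profit": profit, "max_drawdown": drawdown,
                      "recovery_factor": recovery})

print(trades.groupby(trades.index.year).apply(yearly_stats))
```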

elibrarius:


Everything is fine there, since the fitting is done precisely on the training data, down to 0% error.

The question is how many trees are used for this and, essentially, how much memory the model has. One tree with a depth of 6 splits cannot produce that kind of fit...


elibrarius:

I suppose it should be done by regularizing/coarsening the model, reducing its depth or using other methods, and stopping, for example, at 20% error on the training set.

I already use restrictions on splits and completeness, and yes, they should be applied during training.


elibrarius:

I think there is only one way: after each candidate version of a node is added, run all the data through the resulting part of the tree and analyze the balance line.

Number of node versions = (number of features * number of nodes in the tree * 3 (if split by quartiles)) * number of trees

It will take a very long time to compute - I'm afraid even longer than the NS.

It will be more efficient, which matters more, and in the end it will produce models that are better suited to the market.

At the moment I spend about 15 days on the calculation - I get about 800 unique leaves, and on average only 8 of them (half of which are similar to each other) show a stable result across time intervals (and checking them also takes quite a bit of machine time). I.e. even slowing the calculation down by a factor of 100 (800/8) would still give a comparable result.
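A toy sketch of the stability filter described here - keeping only the leaves whose result is positive in every time interval. The per-leaf yearly results below are synthetic, and the real selection criteria in the thread are richer (drawdown, recovery factor, etc.).

```python
# Sketch: filter leaves (rules) by requiring a positive result in every year.
import numpy as np
import pandas as pd

rng = np.random.default_rng(2)
years = [2015, 2016, 2017, 2018, 2019]
results = pd.DataFrame(rng.normal(5, 50, size=(800, len(years))),
                       index=[f"leaf_{i}" for i in range(800)],
                       columns=years)          # per-leaf profit per year (synthetic)

stable = results[(results > 0).all(axis=1)]    # profitable in every interval
print(f"{len(stable)} of {len(results)} leaves are stable across all years")
```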

 
Alexander_K:

It is enough.

It is within the time cycles that the Grail sits. The structure in one time cycle is part of the structure in another.

If you work with sample sizes that correspond to different, strictly defined time periods, these nested structures are in plain view.

Can't the NS handle it? I did it in my TS without a neural network.

I do not get a Grail from this, although I work with structures and fractal self-similarity, i.e. the nesting of time across different timeframes. It is not enough; perhaps I have not figured everything out yet.

The NS is a tool; the human brain may or may not find a solution faster and more accurately...

 
Aleksey Vyazmikin:

Personally, I assess the financial result for each year (currently 5 years), taking into account drawdown and recovery factor, as well as other evaluation criteria. At the moment I do not even look at the classification metric, because it is a trend strategy, and even with 35% correct classification there can be a profit at the end of the year (or another period).

The question is how many trees are used for this and, essentially, how much memory the model has. One tree with a depth of 6 splits cannot produce that kind of fit...


I already use restrictions on splits and completeness, and yes, they should be applied during training.


It will be more efficient, which matters more, and in the end it will produce models that are better suited to the market.

At the moment I spend about 15 days on the calculation - I get about 800 unique leaves, and on average only 8 of them (half of which are similar to each other) show a stable result across time intervals (and checking them also takes quite a bit of machine time). I.e. even slowing the calculation down by a factor of 100 (800/8) would still give a comparable result.

It looks like you are doing walk-forward testing.
Me too, but by hand. I think this is the best way to evaluate models.

I haven't found a model that is stable over time yet. Shifting six months or a year forward or backward, the models already start to perform poorly or lose money - even when newly trained on the same features and with the same model parameters. I.e. the importance of the features changes as well.
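For reference, a minimal walk-forward sketch: train on one window, test on the following window, then roll both forward. The window sizes, the synthetic data and the tree model are assumptions purely for illustration.

```python
# Sketch: walk-forward evaluation with fixed train/test windows.
import numpy as np
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(3)
X = rng.normal(size=(3000, 10))
y = (X[:, 0] + rng.normal(scale=0.5, size=3000) > 0).astype(int)

train_len, test_len = 1000, 250
for start in range(0, len(X) - train_len - test_len + 1, test_len):
    tr = slice(start, start + train_len)
    te = slice(start + train_len, start + train_len + test_len)
    model = DecisionTreeClassifier(max_depth=4, random_state=0).fit(X[tr], y[tr])
    print(f"window {start}: out-of-sample accuracy = {model.score(X[te], y[te]):.2f}")
```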

 
elibrarius:

It looks like you are doing walk-forward testing.
Me too, but by hand. I think this is the best way to evaluate models.

I haven't found a model that is stable over time yet. Shifting six months or a year forward or backward, the models already start to perform poorly or lose money - even when newly trained on the same features and with the same model parameters. I.e. the importance of the features changes as well.

That is why it is necessary to take all of this into account during training and to make splits based, if not on the balance, then at least on an estimate of the classification accuracy probability. The questionable part should simply go either to a trade ban or to an extreme probability - 99% - so that it can be filtered out when the model is applied.
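A minimal sketch of filtering doubtful signals when applying a model, with an arbitrary confidence threshold standing in for the "push it to an extreme probability" idea; the model, the synthetic data and the 0.8 threshold are assumptions for illustration only.

```python
# Sketch: refuse to trade cases where the model's class probability is not
# confident enough; -1 marks the filtered-out (banned) signals.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier

X, y = make_classification(n_samples=2000, n_features=20, random_state=0)
model = GradientBoostingClassifier().fit(X[:1500], y[:1500])

proba = model.predict_proba(X[1500:])[:, 1]
threshold = 0.8                                   # trade only confident signals
signals = np.where(proba >= threshold, 1,         # take the trade
          np.where(proba <= 1 - threshold, 0,     # take the opposite side
                   -1))                           # no trade (filtered out)
print(f"filtered out {np.mean(signals == -1):.0%} of doubtful cases")
```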

 
Aleksey Vyazmikin:

That is why it is necessary to take all of this into account during training and to make splits based, if not on the balance, then at least on an estimate of the classification accuracy probability. The questionable part should simply go either to a trade ban or to an extreme probability - 99% - so that it can be filtered out when the model is applied.

The splits are made based on the classification probability. More precisely, not on probability but on classification error, because on the training set everything is known, and we have not a probability but an exact evaluation.
Although there are different separation criteria, i.e. measures of impurity (of the left or right subsample).
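For completeness, the textbook versions of two such criteria - classification error and Gini impurity - computed for a candidate split from the class counts of the left and right subsamples; the counts are made up for illustration.

```python
# Sketch: weighted misclassification error and Gini impurity of a split.
import numpy as np

def misclassification(counts):
    p = counts / counts.sum()
    return 1.0 - p.max()

def gini(counts):
    p = counts / counts.sum()
    return 1.0 - np.sum(p ** 2)

left = np.array([40, 10])    # class counts on the left side of the split
right = np.array([15, 35])   # class counts on the right side
n = left.sum() + right.sum()

for name, f in (("error", misclassification), ("gini", gini)):
    weighted = (left.sum() * f(left) + right.sum() * f(right)) / n
    print(name, round(weighted, 3))
```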