Machine learning in trading: theory, models, practice and algo-trading - page 2563

 
Maxim Dmitrievsky #:

Then you need to be more specific.

I am thinking of writing an article where I will tell you in more detail what I do.

Here I wanted to discuss similar approaches, but it turned out that there was no interest.

In brief, here is what I do in stages:

1. Using CatBoost, I save different types of quantum tables with different numbers of "quanta" (forced pre-splits) - see the sketch after this list.

2. I analyze each quantum with a script for stability and for the predictive ability of the indicator:

2.1 Passing a threshold on completeness (recall) and accuracy (precision) over the entire sample.

2.2 Evaluating the stability of the deviation of the quantum's target metrics from the sample-wide target across sections of the sample - I take 7 measurement points and sift out by RMS (standard deviation).

3. I select the best quanta from all tables for each predictor, taking into account that they must not overlap in the value range where quantization took place.

4. I create a new sample (in two variants - combined across all quanta, and without) where each quantum predictor gives a signal of 0 or 1 (sketched further below).

5. I exclude predictors that give an almost identical signal in the sample.

6. Training the model.
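A minimal sketch of how steps 1-2.1 could look in Python, assuming a feature matrix X and a binary target y. border_count and get_borders() exist in recent CatBoost versions; the per-quantum statistics helper is my illustrative reading of the procedure, not the actual code:

import numpy as np
from catboost import CatBoostClassifier

def build_quantum_tables(X, y, border_counts=(16, 32, 64, 128)):
    """Step 1: save several quantization tables with different numbers of quanta."""
    tables = {}
    for bc in border_counts:
        model = CatBoostClassifier(iterations=50, border_count=bc, verbose=False)
        model.fit(X, y)
        tables[bc] = model.get_borders()  # {feature index: list of border values}
    return tables

def quantum_stats(x, y, lo, hi):
    """Step 2.1: recall/precision of one quantum [lo, hi) over the whole sample."""
    inside = (x >= lo) & (x < hi)
    if not inside.any():
        return 0.0, 0.0
    recall = y[inside].sum() / max(y.sum(), 1)  # share of all positives captured
    precision = y[inside].mean()                # positive rate inside the quantum
    return recall, precision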

If after point 5 we additionally run a robustness check on the test and exam samples, and select only those predictors that show a satisfactory result, the training results improve considerably. This is a kind of cheat, and whether it is worth using is a matter of experimentation. My assumption is that the longer the indicators remain stable, the more likely they are to stay stable.

If you have questions about a particular step, ask - I'll try to give more information.

P.S. You can also just save the selected quantum table, exclude inefficient predictors, and train on the regular sample - this also improves learning.
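And a matching sketch for steps 4-5; the similarity threshold of 0.95 is an assumption for illustration only:

import numpy as np

def binarize_quanta(X, selected):
    """Step 4: each selected quantum (feature index, lo, hi) becomes a 0/1 predictor."""
    cols = [((X[:, i] >= lo) & (X[:, i] < hi)).astype(int) for i, lo, hi in selected]
    return np.column_stack(cols)

def drop_similar(B, max_agreement=0.95):
    """Step 5: drop columns that agree with an already kept column too often."""
    kept = []
    for j in range(B.shape[1]):
        if all((B[:, j] == B[:, k]).mean() < max_agreement for k in kept):
            kept.append(j)
    return B[:, kept]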
 
Aleksey Vyazmikin #:

I'm thinking of writing an article where I'll tell you in more detail what I'm doing.

Here I wanted to discuss similar approaches, but it turned out that there was no interest.

In brief, here is what I do in stages:

1. Using CatBoost, I save different types of quantum tables with different numbers of "quanta" (forced pre-splits).

2. I analyze each quantum with a script for stability and for the predictive ability of the indicator:

2.1 Passing a threshold on completeness (recall) and accuracy (precision) over the entire sample.

2.2 Evaluating the stability of the deviation of the quantum's target metrics from the sample-wide target across sections of the sample - I take 7 measurement points and sift out by RMS (standard deviation).

3. I select the best quanta from all tables for each predictor, taking into account that they must not overlap in the value range where quantization took place.

4. I create a new sample (in two variants - combined across all quanta, and without) where each quantum predictor gives a signal of 0 or 1.

5. I exclude predictors that give an almost identical signal in the sample.

6. Training the model.

If after point 5 we additionally run a robustness check on the test and exam samples, and select only those predictors that show a satisfactory result, the training results improve considerably. This is a kind of cheat, and whether it is worth using is a matter of experimentation. My premise is that the longer the indicators remain stable, the more likely they are to stay stable.

If there are questions on a particular step, ask - I will try to give more information.

P.S. You can also just save the selected quantum table, exclude inefficient predictors, and train on the regular sample - this also improves learning.

What are quantum tables? Tree partitioning tables? I've never done anything like that.

Better to write an article with examples.
 
Maxim Dmitrievsky #:

What are quantum tables? Tree partitioning tables? I've never done anything like that.

Better to write an article with examples.

Quantum tables are a partitioning of a predictor into borders/ranges, which are then used in training. I have written about this here many times before.
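For illustration only (the numbers are made up), such a table for a single predictor is just an ordered list of borders:

import numpy as np

x = np.array([-1.7, 0.2, 0.9, 2.4, 3.1])  # raw predictor values
borders = [0.0, 1.0, 2.5]                 # the "quantum table" for this predictor
quanta = np.digitize(x, borders)          # -> [0, 1, 1, 2, 3], a range index per value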

 
Aleksey Vyazmikin #:

Quantum tables are a partitioning of a predictor into borders/ranges, which are then used in training. I have written about this here many times before.

Ah, I see. It seems that feature quantization is used only to speed up learning. Or is it a rigid one? I'm just a proponent of the classic approach, plus a few twists of my own.
 
Vladimir Baskakov #:
You guys haven't shown anything useful yet, just blabber. Nerds

Keep watching.

 
Aleksey Vyazmikin #:

Quantum tables are a partitioning of a predictor into borders/ranges, which are then used in training. I have written about this here many times before.

The whole point is what we quantize, how we quantize, and for what purpose.

 
Aleksey Nikolayev #:

The whole point is what we quantize, how we quantize, and for what purpose.

I once tried quantization based on the monotonicity of equity plots, where a predictor is used instead of time on the horizontal axis. I did not see anything particularly good.
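A rough sketch of one way to read that idea, assuming per-trade returns ret and a predictor value pred for each trade; the slope smoothing window is an arbitrary choice:

import numpy as np

def equity_by_predictor(pred, ret):
    """Equity curve drawn over sorted predictor values instead of time."""
    order = np.argsort(pred)
    return pred[order], np.cumsum(ret[order])

def monotonicity_borders(pred_sorted, equity, window=50):
    """Candidate quantization borders where the local equity slope changes sign."""
    slope = np.convolve(np.diff(equity), np.ones(window) / window, mode="same")
    flips = np.where(np.diff(np.sign(slope)) != 0)[0]
    return pred_sorted[flips]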

 
Has anyone tried to apply the Monty Hall paradox to trading/decision making?
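The paradox itself (switching doors wins about 2/3 of the time) is easy to verify by simulation; whether it maps onto trading decisions is the open question:

import random

def monty_hall(trials=100_000):
    stay_wins = switch_wins = 0
    for _ in range(trials):
        car = random.randrange(3)
        pick = random.randrange(3)
        # The host opens a goat door that is neither the pick nor the car.
        opened = next(d for d in range(3) if d != pick and d != car)
        switched = next(d for d in range(3) if d != pick and d != opened)
        stay_wins += (pick == car)
        switch_wins += (switched == car)
    return stay_wins / trials, switch_wins / trials  # ~0.33 vs ~0.67

print(monty_hall())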
 
Maxim Dmitrievsky #:
Ah, I see. It seems that feature quantization is used only to speed up learning. Or is it a rigid one? I'm just a proponent of the classic approach, plus a few twists of my own.

Speeding up learning is one of the advantages, but there is also the effect of aggregating similar predictor states. Roughly speaking, I treat each partition segment as a separate binary predictor, which makes it possible to remove noise from the underlying predictor.

In addition to improving learning, this lets me reduce the number of trees in the model that give similar results, and thus reduce the model's noise.

I am also experimenting with rigid grid tables - where the partitioning is based not on the data but on predefined criteria, for example Fibonacci levels...
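A sketch of such a rigid table, with borders placed at classic Fibonacci retracement ratios of a fixed price range; the range and prices are made-up examples:

import numpy as np

def fibonacci_borders(lo, hi):
    """A data-independent grid: borders at retracement ratios of a price range."""
    ratios = [0.236, 0.382, 0.5, 0.618, 0.786]
    return [lo + r * (hi - lo) for r in ratios]

prices = np.array([1.0600, 1.0950, 1.1200, 1.1480])  # toy price values
borders = fibonacci_borders(lo=1.0500, hi=1.1500)    # e.g. a swing low and high
quanta = np.digitize(prices, borders)                # same encoding as a learned table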

 
Aleksey Nikolayev #:

The whole point is what we quantize, how we quantize, and for what purpose.

That's what I wrote - the purpose is to identify a stable pattern that gives a statistical advantage in a particular region. And we quantize predictors - any predictors.

And "how" to do it is an open question - so far only a search of pre-made tables made by empirical assumptions or statistical partitioning algorithm CatBoost.

In the figure there are 3 "quanta" - most likely it is the middle range that has some kind of statistical advantage.
