Machine learning in trading: theory, models, practice and algo-trading - page 3504

 
Aleksey Vyazmikin #:

You can read my articles on this topic.

At the very beginning of your article you write:

"In this article we will not consider the possibility of applying quantisation to trained neural networks in order to reduce their size, as I have no experience in this matter at the moment."

 
Maxim Dmitrievsky #:

At the very beginning of your article you write:

"In this article we will not consider the possibility of applying quantisation to trained neural networks in order to reduce their size, as I have no experience in this matter at the moment."

That's right - only theoretical knowledge, which is why I didn't go into the peculiarities of the methods used in NNs, but considered the general case, which is used in NNs as well.

 
Aleksey Vyazmikin #:

That's right - only theoretical knowledge, which is why I didn't go into the peculiarities of the methods used in NNs, but considered the general case, which is used in NNs as well.

I don't know of any other applications of quantisation in ML.

 
Aleksey Nikolayev #:

Imho, we can speak of a dual classification problem here. At the first stage - ordinary binary (possibly ternary) classification with a decision tree.

At the second stage - classification of the leaves obtained at the first stage by how resistant they are to change over time, possibly via regression, since probabilities are mentioned.
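
As an illustration only, a minimal sketch of this two-stage idea under my own assumptions: scikit-learn's DecisionTreeClassifier, a chronologically ordered NumPy sample with 0/1 labels, and a made-up helper name leaf_stability. The first stage is an ordinary classification tree; the second stage scores each leaf by how much its positive-class rate drifts across time chunks.

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

def leaf_stability(X, y, n_chunks=4, **tree_kwargs):
    """Stage 1: fit an ordinary classification tree.
    Stage 2: score every leaf by the spread of its positive-class rate
    across consecutive (chronological) chunks of the sample."""
    tree = DecisionTreeClassifier(**tree_kwargs).fit(X, y)
    leaf_ids = tree.apply(X)                                # leaf index of every training sample
    chunks = np.array_split(np.arange(len(y)), n_chunks)    # time-ordered chunks

    scores = {}
    for leaf in np.unique(leaf_ids):
        rates = [y[idx][leaf_ids[idx] == leaf].mean()
                 for idx in chunks if (leaf_ids[idx] == leaf).any()]
        # small spread = the leaf behaves the same way over time
        scores[leaf] = float(np.std(rates)) if len(rates) == n_chunks else float("inf")
    return tree, scores
```

A regression on per-leaf statistics, as suggested in the quote, could replace the plain spread; this is just the simplest stand-in for "resistance to change over time".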

Leaves are different :) It may be hard to follow me, as I'm working on one task, then another - they are similar, but still different.

The point is that we don't build the tree by the greedy principle, choosing from all possible split variants; instead we evaluate each split on historical data for the stability of the resulting pattern. This narrows down the candidates, and from them we choose by some criterion - not necessarily a greedy one. In the end we have a closing split - a range of the predictor, from and to. Everything selected at each iteration is saved and evaluated.

We collect statistics on which predictors, and with which ranges, took part in splitting most often - this is how the quantum table is selected (formed). With this table we then train in CatBoost. An alternative is to binarise the sample and train only on the selected segments, but standard training methods struggle there because of the high sparsity of the data.

We can then gather statistics on how each selected quantum segment behaves on new data - whether more of its class shows up there or not, relative to the average value in the sample (hence the mention of probability). Tests have shown that the less data is left for estimation, the fewer quantum segments keep their probability shift (in the same direction). The challenge is to keep the percentage of such quantum segments high in subsequent iterations, since the probability of selecting a correct split depends on it.

Experiments show that when building a tree the order in which predictors are used for splitting is critical, which means the greedy approach will rarely give the optimal solution.
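
A rough sketch of the selection step described above, under my own simplifying assumptions: candidate "quantum segments" are just quantile ranges of one predictor, stability means the segment's class-rate shift keeps the same sign relative to the average in every historical sub-period, and how the surviving ranges are then handed to CatBoost as a quantisation table is left out. All names are illustrative.

```python
import numpy as np

def stable_segments(x, y, n_borders=16, n_periods=4):
    """Keep only those value ranges ('quantum segments') of predictor x whose
    positive-class rate is shifted away from the sample average in the same
    direction in every chronological sub-period."""
    borders = np.unique(np.quantile(x, np.linspace(0.0, 1.0, n_borders + 1)))
    periods = np.array_split(np.arange(len(y)), n_periods)   # time-ordered sub-periods
    base_rate = y.mean()                                      # average class rate over the whole sample

    kept = []
    for lo, hi in zip(borders[:-1], borders[1:]):
        in_seg = (x >= lo) & (x < hi)
        if not in_seg.any():
            continue
        full_shift = y[in_seg].mean() - base_rate             # class-rate shift on the full history
        stable = full_shift != 0.0 and all(
            in_seg[idx].any()
            and np.sign(y[idx][in_seg[idx]].mean() - y[idx].mean()) == np.sign(full_shift)
            for idx in periods
        )
        if stable:
            kept.append((float(lo), float(hi), float(full_shift)))
    return kept   # candidate ranges for a quantisation (border) table
```

The surviving ranges would then play the role of the "quantum table" mentioned above, or of the selected segments in the binarised sample.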

 
Maxim Dmitrievsky #:

I don't know of any other applications of quantisation in ML.

CatBoost, and other gradient boostings...

 
Aleksey Vyazmikin #:

CatBoost, and other gradient boostings...

Naturally, so that the trees are not infinite
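
Just to illustrate why boosting libraries quantise features at all (my own toy example, with arbitrary numbers): a continuous predictor with hundreds of thousands of unique values is reduced to a small fixed set of candidate split borders, so the tree builder never has to enumerate every unique value.

```python
import numpy as np

x = np.random.randn(500_000)                               # continuous feature, ~500k unique values

n_borders = 254                                            # a common default border count in boosting libraries
borders = np.quantile(x, np.linspace(0, 1, n_borders + 2)[1:-1])
bins = np.digitize(x, borders)                             # every value becomes a small bin index

# a boosted tree now only has to try len(borders) thresholds per feature,
# not one threshold per unique raw value
print(f"{len(np.unique(x))} unique values -> {len(borders)} candidate splits")
```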

 
Maxim Dmitrievsky #:

Naturally, so that the trees are not infinite

Well, it's good to be reminded that it's not only used in NNs.

 
Aleksey Vyazmikin #:

Well, it's good to be reminded that it's not only used in NNs.

You don't see the analogy?

You've hit a terminological impasse again. I don't like breaking the orderly structure in my head for the sake of your crooked definitions - or misinterpretations.

 
Aleksey Vyazmikin #:

Leaves are different :) It may be hard to follow me, as I'm working on one task, then another - they are similar, but still different.

The point is that we don't build the tree by the greedy principle, choosing from all possible split variants; instead we evaluate each split on historical data for the stability of the resulting pattern. This narrows down the candidates, and from them we choose by some criterion - not necessarily a greedy one. In the end we have a closing split - a range of the predictor, from and to. Everything selected at each iteration is saved and evaluated.

We collect statistics on which predictors, and with which ranges, took part in splitting most often - this is how the quantum table is selected (formed). With this table we then train in CatBoost. An alternative is to binarise the sample and train only on the selected segments, but standard training methods struggle there because of the high sparsity of the data.

We can then gather statistics on how each selected quantum segment behaves on new data - whether more of its class shows up there or not, relative to the average value in the sample (hence the mention of probability). Tests have shown that the less data is left for estimation, the fewer quantum segments keep their probability shift (in the same direction). The challenge is to keep the percentage of such quantum segments high in subsequent iterations, since the probability of selecting a correct split depends on it.

Experiments show that when building a tree the order in which predictors are used for splitting is critical, which means the greedy approach will rarely give the optimal solution.

If we build a tree, all we have are leaves and nothing but leaves) Okay, there are branches. But in the end they are made of leaves!)

 
Maxim Dmitrievsky #:

You don't see the analogy?

In NNs, quantisation is often used to reduce the size of layers, including by changing data types after quantisation. I haven't done that, which is why I didn't write about it.
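
For completeness, a minimal sketch (my own toy example, not from the article) of that kind of quantisation: one float32 weight matrix is stored as int8 plus a single scale, which is where the roughly 4x size reduction and the data-type change come from.

```python
import numpy as np

w = np.random.randn(256, 256).astype(np.float32)           # a layer's float32 weights

scale = np.abs(w).max() / 127.0                            # one symmetric scale for the whole tensor
w_int8 = np.clip(np.round(w / scale), -127, 127).astype(np.int8)   # stored at 1 byte per weight

w_restored = w_int8.astype(np.float32) * scale             # dequantised values used at inference
print("size, bytes:", w.nbytes, "->", w_int8.nbytes)
print("max abs error:", float(np.abs(w - w_restored).max()))
```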

In general, I don't understand what you're getting at - I've already described my attitude and vision, and agreed that it's possible to try clustering in my algorithm. I was working on a clustering tree for that purpose two months ago - for now the project is on pause.

Or what is this all about?
