Machine learning in trading: theory, models, practice and algo-trading - page 2254

 
Maxim Dmitrievsky:

I wasn't thinking much, just a guess based on poking around.

GMM will give you features that the dog hasn't seen before, but similar to the ones it has seen. In the reverse transformation it might have some effect, I guess. It will add noise.

This is a guess.
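If it helps, here is a minimal sketch of that augmentation idea: fit a Gaussian to the seen features and sample new, similar points from it. A real GMM would mix several such Gaussians (e.g. sklearn's GaussianMixture); everything here (the toy data, the single component, the noise scale) is an assumption for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy "seen" feature matrix: 500 samples, 3 correlated features.
X = rng.normal(size=(500, 3)) @ np.array([[1.0, 0.3, 0.0],
                                          [0.0, 1.0, 0.5],
                                          [0.0, 0.0, 1.0]])

# Single-Gaussian stand-in for one fitted GMM component:
# estimate mean and covariance, then draw new samples from it.
mu = X.mean(axis=0)
cov = np.cov(X, rowvar=False)
X_new = rng.multivariate_normal(mu, cov, size=200)

# Optionally add a little extra noise, as suggested above.
X_new += rng.normal(scale=0.05, size=X_new.shape)

print(X_new.shape)  # (200, 3): new points, similar to but not copies of X
```

The sampled rows follow the same covariance structure as the training features, which is the "similar but unseen" effect described above.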

I'm getting a little confused here...

Anyway, PCA is linear; it doesn't distort anything. If you have all the components, you can reconstruct what you decomposed without loss.

 
mytarmailS:

I'm getting a little confused here...

Anyway, PCA is linear; it doesn't distort anything. If you have all the components, you can reconstruct what you decomposed without loss.

So the pitfall is somewhere else. PCA works well on images; on quotes it's worse, although faster.

Well, that's understandable... pictures and numbers are easy to predict, but the market is non-stationary. Your PCA doesn't help: the components stop being relevant when volatility changes, or something else.

Just like digital filters ))

 
Maxim Dmitrievsky:

Your PCA doesn't help: the components stop being relevant when volatility changes

I don't know what you mean, but...

If you add up all the PCA components on new data, you get the same price back tick by tick, so... I don't know what you mean by relevance.
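That losslessness claim is easy to check numerically. A minimal sketch, with a toy random-walk "price" matrix and PCA done by hand via SVD rather than through any particular library:

```python
import numpy as np

rng = np.random.default_rng(1)
prices = np.cumsum(rng.normal(size=(100, 8)), axis=0)  # toy quote matrix

# PCA via SVD: center, decompose, then rebuild from ALL components.
mean = prices.mean(axis=0)
Xc = prices - mean
U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
components = Xc @ Vt.T            # scores on all 8 components
reconstructed = components @ Vt + mean

print(np.allclose(reconstructed, prices))  # True: full PCA is lossless
```

Losing information only happens when you drop components; keeping all of them is just a rotation, which is why the reconstruction matches tick by tick.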

 
mytarmailS:

I don't know what you mean, but...

If you add up all the PCA components on new data, you get the same price back tick by tick, so... I don't know what you mean by relevance.

Anyway, let's talk about the dog later, I'm sleepy. )

The encoders didn't pan out empirically.

 
Maxim Dmitrievsky:

Anyway, let's talk about the dog later, I'm sleepy. )

The encoders didn't pan out empirically.

ok

 
mytarmailS:

You'll be the first

I'm watching a 2019 course on Bayesian methods; there are some interesting ideas, but the formulas slow down understanding. Here's a question: has anyone tried modern approaches with Bayesian logic? The lecturer essentially argues that any ML without Bayesian probability estimation is just curve fitting.


Speaking of fitting, I'm increasingly inclined to conclude that CatBoost models degrade on samples outside of training because of the unrepresentativeness of the sample and the way the model is built. The point is that in the classic setup the trees are symmetric and there is no pruning, which can produce leaves that hold very little data yet still receive a non-trivial weight. If such a split is spurious, then on out-of-training samples where many examples fall into the faulty leaf, the results will be significantly distorted. And there could be thousands of such leaves. If the sample were representative, there would be no problem, because the leaf weight would be adequate and consistent with the nature of the data distribution (entropy). It's worth trying to suppress leaves with few examples by zeroing out their weights.

The idea is that the model would respond only to data it actually has some notion of, rather than issuing an "if this is right, then that is wrong" kind of judgement, as it does now.
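The zeroing proposal above is not a CatBoost option I know of; as a sketch, here is what it would look like applied to hypothetical per-leaf statistics (all numbers below are made up for illustration):

```python
import numpy as np

# Hypothetical per-leaf statistics for one symmetric tree of depth 6:
# 2**6 = 64 leaves, each with a weight (value) and a training-example count.
rng = np.random.default_rng(2)
leaf_weights = rng.normal(scale=0.1, size=64)
leaf_counts = rng.poisson(lam=156, size=64)   # avg ~156 examples per leaf
leaf_counts[:5] = [1, 2, 3, 0, 4]             # force a few starved leaves

# The proposed fix: zero out ("shunt") leaves backed by too few examples,
# so the model stays silent on patterns it barely saw in training.
min_examples = 10
pruned = np.where(leaf_counts < min_examples, 0.0, leaf_weights)

print((pruned == 0.0).sum(), "leaves silenced")
```

In a real model you would apply the same mask to every tree's leaf values before summing their contributions, so under-supported leaves simply add nothing to the prediction.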
 
Aleksey Vyazmikin:

I'm watching a 2019 course on Bayesian methods; there are some interesting ideas, but the formulas slow down understanding. Here's a question: has anyone tried modern approaches with Bayesian logic? The lecturer essentially argues that any ML without Bayesian probability estimation is just curve fitting.

Speaking of fitting, I'm increasingly inclined to conclude that CatBoost models degrade on samples outside of training because of the unrepresentativeness of the sample and the way the model is built. The point is that in the classic setup the trees are symmetric and there is no pruning, which can produce leaves that hold very little data yet still receive a non-trivial weight. If such a split is spurious, then on out-of-training samples where many examples fall into the faulty leaf, the results will be significantly distorted. And there could be thousands of such leaves. If the sample were representative, there would be no problem, because the leaf weight would be adequate and consistent with the nature of the data distribution (entropy). It's worth trying to suppress leaves with few examples by zeroing out their weights.

The idea is that the model would respond only to data it actually has some notion of, rather than issuing an "if this is right, then that is wrong" kind of judgement, as it does now.

Representativeness is an important prerequisite.

Does CatBoost, when splitting, actually produce leaves with few examples? The recommended depth there is 6, which gives 2^6 = 64 leaves, i.e. a leaf holds on average 1/64th of the rows of the whole sample. With at least 10,000 training rows that's about 156 examples per leaf on average. That seems fairly representative to me.

Although if the trees are forced to be symmetric, there might be some distortion. How small were the leaves you saw, and on how many training rows?
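The arithmetic above, spelled out:

```python
depth = 6
rows = 10_000
leaves = 2 ** depth            # a symmetric tree of depth 6 has 64 leaves
avg_per_leaf = rows / leaves   # average examples landing in each leaf
print(leaves, round(avg_per_leaf))  # 64 156
```

The caveat is that this is only the average; a skewed split can still leave one leaf with a handful of rows while its sibling takes the rest.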

 
elibrarius:

Representativeness is an important condition.

Does CatBoost, when splitting, actually produce leaves with few examples? The recommended depth there is 6, which gives 2^6 = 64 leaves, i.e. a leaf holds on average 1/64th of the rows of the whole sample. With at least 10,000 training rows that's about 156 examples per leaf on average. That seems fairly representative to me.

Although if the trees are forced to be symmetric, there might be some distortion. How small were the leaves you saw, and on how many training rows?

I don't have exact numbers right now; this is just a guess. I'd need to go back to my old code; I think I could collect such statistics there, I don't remember. You're right that the average doesn't look scary, but that doesn't mean some individual leaves won't end up with very few examples.

We can see that the extreme-probability margins on the training sample and on the test sample usually differ significantly. I assume the reason is precisely the leaves with few examples: such leaves are simply rarely activated on the test sample.

 

Here is a visualization of tree-leaf activation statistics from one of the old models.

The y axis is the leaf number, the x axis is the sample row. The color shows the absolute value of the leaf's weight coefficient.

You can see that even here there are rare leaf activations, which suggests the assumption is reasonable; this is the exam sample.
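The rare-activation count behind such a picture can be sketched on a synthetic activation matrix (the sizes, firing probabilities and the threshold below are assumptions, not values from the actual model):

```python
import numpy as np

rng = np.random.default_rng(3)

# Toy activation matrix: rows = leaves, columns = sample rows,
# entries = |leaf weight| when the leaf fires, 0 otherwise.
n_leaves, n_rows = 200, 1000
fire_prob = rng.uniform(0.001, 0.05, size=n_leaves)   # some leaves fire rarely
fires = rng.random((n_leaves, n_rows)) < fire_prob[:, None]
activations = fires * np.abs(rng.normal(size=(n_leaves, n_rows)))

# Count activations per leaf and flag the rarely firing ones.
per_leaf = fires.sum(axis=1)
rare = per_leaf < 5
print(rare.sum(), "leaves fired fewer than 5 times")
```

Comparing the `rare` mask computed on the train sample against the one computed on the exam sample would show whether the same leaves stay rare, which is the question raised below.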


 
Aleksey Vyazmikin:

Here is a visualization of tree-leaf activation statistics from one of the old models.

The y axis is the leaf number, the x axis is the sample row. The color shows the absolute value of the leaf's weight coefficient.

You can see that even here there are rare leaf activations, which suggests the assumption is reasonable; this is the exam sample.


Rare activation on the exam sample more likely means that the market has changed and what happened often on the train sample has stopped happening. It doesn't necessarily mean there were few activations on the train sample either.