Machine learning in trading: theory, models, practice and algo-trading - page 2727

 
Aleksey Nikolayev #:

What we get is a posteriori analysis of an already trained model. I would like to supplement it with a priori analysis at the stage of training-sample selection.

I think so too. For simplicity I decided to use the last completed zigzag vertex, but I would like something more elaborate.
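(A minimal sketch of what cutting the sample at the last completed zigzag vertex could look like; the price array and the reversal threshold are illustrative placeholders, not the poster's actual implementation.)

```python
import numpy as np

def last_completed_zigzag_vertex(prices: np.ndarray, threshold: float = 0.005) -> int:
    """Index of the last zigzag extremum that is already confirmed,
    i.e. price has retraced from it by at least `threshold` (relative move)."""
    last_vertex = 0      # last confirmed extremum
    candidate = 0        # current candidate extremum
    direction = 0        # +1 = tracking a top, -1 = tracking a bottom, 0 = undecided
    for i in range(1, len(prices)):
        if direction >= 0 and prices[i] > prices[candidate]:
            candidate = i                                   # new higher high
        elif direction <= 0 and prices[i] < prices[candidate]:
            candidate = i                                   # new lower low
        retrace = abs(prices[i] - prices[candidate]) / prices[candidate]
        if retrace >= threshold:                            # reversal confirms the vertex
            last_vertex = candidate
            direction = -1 if prices[candidate] > prices[i] else 1
            candidate = i
    return last_vertex

# e.g. train only on data up to the last confirmed vertex:
# v = last_completed_zigzag_vertex(close)
# X_train, y_train = X[:v], y[:v]
```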

I start by pulling out only the working pieces from new data and apply a filter in the form of a second model, so that it works on both old and new data; then I check it on other new data, as in the article.

It's also a kind of fitting, but one based on the model's errors. In effect we select at least those variants that the model is able to classify well, so there is something in them besides randomness (at least on training, validation and some further validation).
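(A rough sketch of the "second model as a filter" idea: train the main model, then train a second model to predict where the main one is right, and act only where the filter is confident. The model choice, the confidence threshold and the -1 "do not trade" convention are all hypothetical, not the poster's actual pipeline.)

```python
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier

def fit_main_and_filter(X_train, y_train, X_val, y_val):
    main = GradientBoostingClassifier().fit(X_train, y_train)
    # label each validation sample by whether the main model classified it correctly
    correct = (main.predict(X_val) == y_val).astype(int)
    filt = GradientBoostingClassifier().fit(X_val, correct)   # assumes both 0s and 1s occur
    return main, filt

def predict_filtered(main, filt, X_new, min_conf=0.6):
    ok = filt.predict_proba(X_new)[:, 1] >= min_conf   # keep only samples the filter trusts
    signal = main.predict(X_new)
    signal[~ok] = -1                                    # -1 = "do not trade"
    return signal
```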

If you do want to put something in a priori, it probably makes sense to take some long-running monitoring; it will give at least some adequate markup. Then pick the features to match.


I've come up with a new "spammer" of features and targets (it looks like it should be informative, and it is compared to the usual random sampling). But there are a few variants; I haven't tested them yet.

 
Aleksey Vyazmikin #:

Haven't you tried it experimentally? After all, according to your theoretical approach to this question, once the sample grows past some critical size, patterns that are already old and no longer work start arriving in it, so training should deteriorate qualitatively, and on new data the results should get worse as the sample is increased.

You probably realise that training over a large number of variants of history length, at a large number of points in time, is an absolutely immense computational task. Even if by some miracle you manage to collect all these statistics, there will still be the question of meaningfully systematising this pile of information. Surely a different history length will turn out to be optimal at each moment. How do you interpret that and, most importantly, how do you extrapolate it into the future?

I would rather go the other way round: come up with some heuristic that drastically reduces the number of candidate training-history lengths (literally down to a few variants).
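(One way to read the "few variants" idea in code: walk forward with a handful of candidate history lengths and keep a running score for each, instead of scanning every possible length. The model, the candidate lengths and the step size below are hypothetical placeholders, not anyone's actual setup.)

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score

def score_history_lengths(X, y, lengths=(500, 2000, 8000), test_size=250):
    scores = {n: [] for n in lengths}
    for t in range(max(lengths), len(X) - test_size, test_size):   # walk-forward steps
        X_test, y_test = X[t:t + test_size], y[t:t + test_size]
        for n in lengths:
            model = LogisticRegression(max_iter=1000)
            model.fit(X[t - n:t], y[t - n:t])                      # train on the last n observations only
            scores[n].append(accuracy_score(y_test, model.predict(X_test)))
    return {n: float(np.mean(s)) for n, s in scores.items()}       # average out-of-sample score per length
```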

 
Maxim Dmitrievsky #:

I start by pulling out only the working pieces from new data and apply a filter in the form of a second model, so that it works on both old and new data; then I check it on other new data, as in the article.

It's also a kind of fitting, but one based on the model's errors. In effect we select at least those variants that the model is able to classify well, so there is something in them besides randomness (at least on training, validation and some further validation).

If you do want to put something in a priori, it probably makes sense to take some long-running monitoring; it will give at least some adequate markup. Then pick the features to match.


I've come up with a new "spammer" of features and targets (it looks like it should be informative, and it is compared to the usual random sampling). But there are a few variants; I haven't tested them yet.

I'll have to think about it. I don't really understand how to translate it into my own notions and concepts.

 
Aleksey Nikolayev #:

I'll have to think about it. I don't really understand how to translate it into my own notions and concepts.

Also, switching from ticks to bars reduces the predictive power a lot.

but it removes potential conflicts with the DCs (dealing centres) :)

 
Maxim Dmitrievsky #:

Also, switching from ticks to bars reduces the predictive power a lot.

but it removes potential conflicts with the DCs (dealing centres) :)

By the way, this is also an important practical and interesting theoretical question. It can be formulated as the dependence of the real bid-ask spread on volume (liquidity, volatility): fit the corresponding regression, compare forex with exchange-traded instruments, and so on. Another thing is that it is only interesting for those whose TS trades large volumes)
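(One simple way to formalise the spread-vs-volume question: a log-log regression of the realised spread on volume and volatility. The column names and the data source are hypothetical; this is just an illustration of the kind of regression meant above.)

```python
import numpy as np
import pandas as pd
import statsmodels.api as sm

def spread_regression(df: pd.DataFrame):
    # df is assumed to hold per-bar columns: 'spread', 'volume', 'volatility'
    X = sm.add_constant(np.log(df[["volume", "volatility"]]))
    y = np.log(df["spread"])
    model = sm.OLS(y, X).fit()
    return model            # compare fitted coefficients for forex vs exchange instruments
```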

 
Aleksey Nikolayev #:

You probably realise that training over a large number of variants of history length, at a large number of points in time, is an absolutely immense computational task. Even if by some miracle you manage to collect all these statistics, there will still be the question of meaningfully systematising this pile of information. Surely a different history length will turn out to be optimal at each moment. How do you interpret that and, most importantly, how do you extrapolate it into the future?

I would rather go the other way round: come up with some heuristic that drastically reduces the number of candidate training-history lengths (literally down to a few variants).

The problem with the experiment is solvable; I've done something similar.

Back then I came to the idea that I should dig towards methods of estimating the comparability of samples. But I couldn't implement it: I didn't understand the formula.

 
Aleksey Vyazmikin #:

The problem with the experiment is solvable; I've done something similar.

It is technically quite solvable, probably. The question is how to interpret the results of such an experiment.

Aleksey Vyazmikin #:

Back then I came to the idea that I should dig towards methods of estimating the comparability of samples. But I couldn't implement it: I didn't understand the formula.

Mathematical statistics has plenty of tests for checking the homogeneity of samples, for example. If, of course, I understand your terminology correctly.
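(A concrete example of such a homogeneity check: the two-sample Kolmogorov-Smirnov test applied feature by feature to the old and the new part of the sample. The split and the significance level are illustrative.)

```python
import numpy as np
from scipy.stats import ks_2samp

def feature_homogeneity(X_old: np.ndarray, X_new: np.ndarray, alpha: float = 0.05):
    """For each feature, test whether its distribution differs between the two sample parts."""
    results = {}
    for j in range(X_old.shape[1]):
        stat, p = ks_2samp(X_old[:, j], X_new[:, j])
        results[j] = {"ks_stat": stat, "p_value": p, "shifted": p < alpha}
    return results
```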

 
Aleksey Nikolayev #:

By the way, this is also an important practical and interesting theoretical question. It can be formulated as the dependence of the real bid-ask spread on volume (liquidity, volatility): fit the corresponding regression, compare forex with exchange-traded instruments, and so on. Another thing is that it is only interesting for those whose TS trades large volumes)

Oh, it's such a mess that nothing is clear: where they get these quotes with volumes from, what suppliers there are, whether they exist at all, and so on. In the end, even if it works out, such a toxic TS will get banned like all the others built on similar principles. Or you run around different venues with a hat and collect whatever falls into it before you get the boot.

TSs with trade durations of an hour and longer are welcome, possibly on several instruments; those don't seem to bother anyone much in terms of toxicity. But they are hard to make, which is probably exactly why they don't bother anyone.
 
Maxim Dmitrievsky #:
Oh, it's such a mess that nothing is clear: where they get these quotes with volumes from, what suppliers there are, whether they exist at all, and so on. In the end, even if it works out, such a toxic TS will get banned like all the others built on similar principles.

I think fxsaber wrote that the problems start at some large turnover. Perhaps your TSs have fallen victim to being too popular with signal copiers)

 
Aleksey Nikolayev #:

I think fxsaber wrote that the problems start at some large turnover. Perhaps your TSs have fallen victim to being too popular with signal copiers)

When the amount gets to about 10k and keeps growing, they start to pay attention. But more often the limits are much more modest 😀

There is the option of doing it on crypto, but there is no MetaTrader there.

It's still kind of funny to read the suckers who think that with a cool TS you can buy yourself a Boeing and an island.

Although Saber should definitely have his own island, who else but him?