Machine learning in trading: theory, models, practice and algo-trading - page 2564

 
Aleksey Vyazmikin #:

That's what I wrote: the goal is to identify a stable pattern that gives a statistical advantage in a particular region. And we quantize the predictors - any predictors.

And "how" to do it better is an open question - so far only by trying prepared tables made by empirical assumptions or statistical partitioning of the CatBoost algorithm.

In the figure there are 3 "quanta" - most likely the middle range is the one selected, as it has some statistical advantage.
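For illustration, quantization here just means splitting a predictor's range into a few intervals and checking each interval for a statistical edge. A minimal sketch with pandas (the synthetic data and the choice of 8 quantile bins are placeholder assumptions, not the prepared tables or CatBoost partitioning mentioned above):

```python
import numpy as np
import pandas as pd

# Illustrative only: bin one predictor into 8 equal-frequency "quanta"
# and check the target rate inside each bin for a statistical edge.
rng = np.random.default_rng(0)
predictor = rng.normal(size=5000)
target = (rng.random(5000) < 0.5 + 0.05 * np.tanh(predictor)).astype(int)

quanta = pd.qcut(predictor, q=8, labels=False)  # quantile-based split
edge = pd.Series(target).groupby(quanta).mean()
print(edge)  # bins whose mean deviates notably from 0.5 are candidates
```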

As I see it, the problem is the collinearity (correlation) of almost all the predictors. There is also a combinatorial problem: if there are many predictors, there may be too many quanta. It is probably worth reducing dimensionality with PCA or PLS first.
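A minimal sketch of that PCA step, assuming scikit-learn (the random matrix X and the 95% explained-variance cutoff are illustrative placeholders, not anyone's actual pipeline):

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

# Illustrative only: X is an (n_samples, n_predictors) matrix of raw predictors.
rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 50))

# Standardize first, since PCA is sensitive to scale.
X_std = StandardScaler().fit_transform(X)

# Keep enough components to explain 95% of the variance;
# the retained components are uncorrelated by construction.
pca = PCA(n_components=0.95)
X_reduced = pca.fit_transform(X_std)
print(X_reduced.shape, pca.explained_variance_ratio_.sum())
```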

 
mytarmailS #:
Has anyone tried to apply the Monty Hall paradox to trading/decision making?

The whole point of the paradox is that the problem is not fully formalized mathematically. The answer differs depending on how the full formalization is carried out.

As for usefulness - none, except as an instructive example that a single real phenomenon can have different mathematical models that give different answers.

 
Funny thing. I am computing the Hurst exponent on ticks and getting values very different from 0.5 at the spread scale, and the larger the time scale, the closer the Hurst gets to 0.5. I made a primitive moving-average system and started substituting periods of 10, 100, 1000, 10000. All of them have roughly the same expected payoff. That's what an efficient market looks like.
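For reference, a minimal sketch of one common Hurst estimator, based on the scaling of lagged differences (a generic method, not necessarily the one used above; the random-walk series is a synthetic placeholder):

```python
import numpy as np

def hurst_exponent(series, max_lag=100):
    """Estimate the Hurst exponent from how lagged differences scale.

    For a series with Hurst exponent H, std(x[t+lag] - x[t]) ~ lag**H,
    so H is the slope of log(std) against log(lag). H near 0.5 means a
    random walk (efficient market), H > 0.5 trending, H < 0.5 mean-reverting.
    """
    lags = np.arange(2, max_lag)
    tau = [np.std(series[lag:] - series[:-lag]) for lag in lags]
    slope, _ = np.polyfit(np.log(lags), np.log(tau), 1)
    return slope

# Synthetic random walk: the estimate should come out close to 0.5.
rng = np.random.default_rng(0)
prices = np.cumsum(rng.normal(size=10_000))
print(hurst_exponent(prices))
```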
 
Aleksey Nikolayev #:

As I see it, the problem is the collinearity (correlation) of almost all the predictors. There is also a combinatorial problem: if there are many predictors, there may be too many quanta. It is probably worth reducing dimensionality with PCA or PLS first.

I wrote above that I exclude predictors that give a similar signal on the sample, i.e. the correlation between the quantized predictors decreases, although I use my own method of grouping and choosing the best result from a group of similar ones.

As for the combinatorial problem - where exactly do you see it? In the training sample? If so, then in theory it is possible, and it probably makes sense to apply PCA there, but not before the final sample is ready. In practice I have not yet encountered such a problem; on the contrary, there are fewer predictors than in the initial sample.
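A minimal sketch of the general idea of excluding predictors with similar signals - a plain greedy correlation filter; the grouping-and-selection method mentioned above is not shown here, so the 0.9 threshold and the data are illustrative assumptions:

```python
import numpy as np
import pandas as pd

def drop_correlated(df: pd.DataFrame, threshold: float = 0.9) -> pd.DataFrame:
    """Greedily keep a column only if its absolute correlation with every
    already-kept column stays below the threshold."""
    corr = df.corr().abs()
    keep = []
    for col in df.columns:
        if all(corr.loc[col, kept] < threshold for kept in keep):
            keep.append(col)
    return df[keep]

# Illustrative usage: "b" duplicates "a" up to noise and gets dropped.
rng = np.random.default_rng(0)
a = rng.normal(size=500)
df = pd.DataFrame({"a": a,
                   "b": a + 0.01 * rng.normal(size=500),
                   "c": rng.normal(size=500)})
print(drop_correlated(df).columns.tolist())  # ['a', 'c']
```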

 
Aleksey Nikolayev #:

The whole point of the paradox is that the problem is not fully formalized mathematically. The answer differs depending on how the full formalization is carried out.

As for usefulness - none, except as an instructive example that a single real phenomenon can have different mathematical models that give different answers.

How so?

Here's an article with the code.

As well as a million other implementations.

Everything is formalized mathematically, or do I not understand something?
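For reference, a minimal Monte Carlo sketch of the standard formalization - the one where the host always opens a non-chosen door hiding a goat. As the replies below point out, that assumption is exactly the part the original wording leaves unstated:

```python
import random

def monty_hall(trials: int = 100_000):
    """Simulate the standard formalization: the host always opens a
    non-chosen door with a goat behind it, then offers a switch."""
    stay_wins = switch_wins = 0
    for _ in range(trials):
        car = random.randrange(3)
        pick = random.randrange(3)
        # The host opens a door that is neither the pick nor the car.
        opened = next(d for d in range(3) if d != pick and d != car)
        # Switching means taking the remaining unopened door.
        switched = next(d for d in range(3) if d != pick and d != opened)
        stay_wins += (pick == car)
        switch_wins += (switched == car)
    return stay_wins / trials, switch_wins / trials

print(monty_hall())  # roughly (0.333, 0.667) under this formalization
```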

 
mytarmailS #:

How so?

Here's an article with the code.

As well as a million other implementations.

Everything is formalized mathematically, or do I not understand something?

It's like the two-flasks problem: the conditions given are incomplete, and the answer depends on how you fill in the missing conditions.

 
mytarmailS #:

How so?

Here's an article with the code.

As well as a million other implementations.

Everything is formalized mathematically, or do I not understand something?

Look at the wiki: it talks about the incorrectness of the initial formulation and says, not very clearly, that it can be made correct in different ways. The point of the paradox is precisely that intuition fills in what is left unstated differently for different people. A purely psychological effect.

 
Rorschach #:

It's like the two-flasks problem: the conditions given are incomplete, and the answer depends on how you fill in the missing conditions.

I don't get it, but let's chalk it up to my illiteracy...

So what's the point of this Hurst, in a nutshell?

I read your post and built this Hurst - now what do I do with it?


 
Aleksey Vyazmikin #:

I wrote above that I exclude predictors that give a similar signal on the sample, i.e. the correlation between the quantized predictors decreases, although I use my own method of grouping and choosing the best result from a group of similar ones.

As for the combinatorial problem - where exactly do you see it? In the training sample? If so, then in theory it is possible, and it probably makes sense to apply PCA there, but not before the final sample is ready. In practice I have not yet encountered such a problem; on the contrary, there are fewer predictors than in the initial sample.

Well, if we divide each predictor into just two pieces and look at all possible rules that include one half of each predictor, then there will be 2^N different such pieces, where N is the number of predictors. Now each such piece can either be taken or discarded, which gives 2^(2^N) variants. This is a huge number even for small N.
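To put numbers on that growth (just the arithmetic from the post):

```python
# 2^N "pieces" (one half chosen per predictor) and 2^(2^N) ways to
# take or discard each piece, following the counts in the post above.
for n in range(1, 6):
    pieces = 2 ** n
    variants = 2 ** pieces
    print(f"N={n}: pieces={pieces}, variants=2^{pieces} = {variants:,}")
# Already at N=5 there are 2^32, about 4.3 billion, variants.
```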

 
Aleksey Nikolayev #:

Well, if we divide each predictor into just two pieces and look at all possible rules that include one half of each predictor, then there will be 2^N different such pieces, where N is the number of predictors. Now each such piece can either be taken or discarded, which gives 2^(2^N) variants. This is a huge number even for small N.

First discard, and then combine.
