Machine learning in trading: theory, models, practice and algo-trading - page 3665

 
Maxim Dmitrievsky #:
This is a preprocessing function. Clusters do not participate in trading.
Well, any higher split in the tree can be called preprocessing, because it leads to the most successful leaf (in your case cluster).
 
And then this leaf (in your case, a cluster) can be split further to deepen the learning.
 
Forester #:
Well, any higher split in the tree can be called preprocessing, because it leads to the most successful leaf (in your case, a cluster).
A tree builds its boundaries greedily, for better classification. That is not the same as saying that examples with different labels should be spread as far apart as possible in the feature space.

A better analogy would be the principle of determining boundaries between classes using the SVM method, where the boundary with the maximum distance to the classes is sought.
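To make the "maximum distance to the classes" idea concrete, here is a minimal sketch that only measures the geometric margin of two candidate boundaries on made-up 2D points. It does not train an SVM or a tree; the points, the function name and both boundaries are assumptions chosen for illustration.

```python
import math

def geometric_margin(points, labels, w, b):
    """Smallest signed distance from any point to the line w.x + b = 0.

    A positive result means every point lies on its correct side.
    """
    norm = math.hypot(w[0], w[1])
    return min(
        y * (w[0] * x1 + w[1] * x2 + b) / norm
        for (x1, x2), y in zip(points, labels)
    )

# Made-up toy data: two points per class.
points = [(2, 0), (3, 1), (0, 1), (1, 2)]
labels = [+1, +1, -1, -1]

# Tree-like boundary: a single axis-aligned split at x1 = 1.5.
axis_margin = geometric_margin(points, labels, w=(1, 0), b=-1.5)

# Oblique boundary x1 - x2 - 1 = 0, which one tree split cannot represent.
oblique_margin = geometric_margin(points, labels, w=(1, -1), b=-1)

print(f"axis-aligned split margin: {axis_margin:.3f}")
print(f"oblique boundary margin:   {oblique_margin:.3f}")
```

Both boundaries separate the toy classes without errors, but the oblique one keeps the nearest point further away. An SVM searches for the boundary that maximises exactly this quantity, while a greedy split only looks at impurity along one feature at a time.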
 
Maxim Dmitrievsky #:
It builds boundaries greedily, for better classification. That is not the same as saying that examples should be spread as far apart as possible in the feature space.
Yes - tree greediness is not a very useful thing for noisy data. I would say not for better classification, but for fast classification.
 
Forester #:
Yes - tree greediness is not a very useful thing for noisy data. I would say not for better classification, but for fast classification.
Well, look at how the support vector method works. But it will still be incomplete, because you still need to filter the examples and move some of them into a separate "do not trade" group. Otherwise they introduce uncertainty. That way you can extract some alpha.
 
The only problem there is that the clusters also drift on new data :) What you labeled is already outdated on new data. That's why you shouldn't make too many of them; 2-10 is enough. The fewer the clusters, the less they drift, or at least the slower.

Finding the balance, that is, picking the number of clusters, is not that hard.

I just picture it in my head as moving images, and there it's kind of obvious :)
 
Maxim Dmitrievsky #:
The only problem there is that the clusters also drift on new data :) What you labeled is already outdated on new data. That's why you shouldn't make too many of them; 2-10 is enough. The fewer the clusters, the less they drift, or at least the slower.

Finding the balance, that is, picking the number of clusters, is not that hard.

I just picture it in my head as moving images, and there it's kind of obvious :)

It's the same with leaves. I, for example, can only trade leaves with 10% error on the training set. On OOS the error is already approaching 40-50%, or the leaves even start losing money.

I haven't counted how many leaves get selected, but I think not many.
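A minimal sketch of that leaf selection, assuming each sample already carries the id of the leaf it fell into. The function names, the 10% threshold parameter and all the data below are made up for illustration; a real tree library would supply the leaf ids.

```python
from collections import defaultdict

def leaf_error_rates(leaf_ids, y_true, y_pred):
    """Return {leaf_id: (error_rate, n_samples)} for one data set."""
    wrong = defaultdict(int)
    total = defaultdict(int)
    for leaf, yt, yp in zip(leaf_ids, y_true, y_pred):
        total[leaf] += 1
        wrong[leaf] += int(yt != yp)
    return {leaf: (wrong[leaf] / total[leaf], total[leaf]) for leaf in total}

def select_leaves(train_stats, max_error=0.10):
    """Keep only the leaves whose training error is at most max_error."""
    return {leaf for leaf, (err, _) in train_stats.items() if err <= max_error}

# Made-up training data: leaf 0 has 10% error, leaf 1 has 40% error.
train_leaf = [0] * 10 + [1] * 10
train_true = [1] * 20
train_pred = [1] * 9 + [0] + [1] * 6 + [0] * 4
train_stats = leaf_error_rates(train_leaf, train_true, train_pred)
selected = select_leaves(train_stats)

# Made-up OOS data: the selected leaf degrades to 50% error.
oos_leaf = [0, 0, 0, 0, 1, 1]
oos_true = [1, 0, 1, 0, 1, 1]
oos_pred = [1, 1, 1, 1, 1, 1]
oos_stats = leaf_error_rates(oos_leaf, oos_true, oos_pred)

for leaf in sorted(selected):
    print(leaf, "train:", train_stats[leaf], "oos:", oos_stats.get(leaf))
```

The toy numbers are chosen only to mimic the 10% to 40-50% degradation mentioned above; they are not results from any real model.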

 
Forester #:

It's the same with leaves. I, for example, can only trade leaves with 10% error on the training set. On OOS the error is already approaching 40-50%, or the leaves even start losing money.

I haven't counted how many leaves get selected, but I think not many.

Except that in a tree the examples sit right next to examples of the opposite class, whereas with this approach the distance between classes is greater.
 
I tried a new labeling based on ZigZag (ZZ), and then I simply filter the trades with ML.
Almost all variants lose money without filtering once spread + commission + swap are included, but ML can sometimes rescue them. Here is an example (the charts show only OOS from walk-forward testing):
Without filtering: 66k trades


With ML filtering: 5k trades, i.e. over 90% filtered out


But it's still bad. Profit on the training set is 7 pts per trade, but on OOS it averages 0.7 pts per trade. Profitability has dropped very badly (( What about other ML researchers: does it drop this badly for you too? Tenfold?
Slippage of just 1 pt would turn this into a losing system.
But you can see that ML still works.
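The arithmetic behind those statements, as a quick check. The trade counts are the rounded figures quoted above, and the 1 pt slippage is treated as a flat cost per trade, which is a simplifying assumption.

```python
trades_unfiltered = 66_000   # rounded figure from the post
trades_filtered = 5_000      # rounded figure from the post

share_filtered_out = 1 - trades_filtered / trades_unfiltered

train_pts_per_trade = 7.0
oos_pts_per_trade = 0.7
degradation = train_pts_per_trade / oos_pts_per_trade

slippage_pts = 1.0           # assumed flat cost per trade
net_pts_per_trade = oos_pts_per_trade - slippage_pts

print(f"filtered out:        {share_filtered_out:.1%}")
print(f"train/OOS ratio:     {degradation:.1f}x")
print(f"net after slippage:  {net_pts_per_trade:+.1f} pts per trade")
```

So the OOS edge is about a tenth of the training edge, and a 1 pt slippage pushes the average result below zero.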

In general, there is nothing more to catch with ZigZag. The charts above are labeled for ZZ moves of 50 pts and up. It makes little sense to go lower: the baseline loss there is even larger and ML is unlikely to rescue it. The most interesting results were at 200 pts. If we go to strong moves above 500 pts, there will be few trades and the sample will become unrepresentative, i.e. there will be little confidence in it on new data.
I'm thinking about what else to investigate... maybe try MA crossings...
What other algorithms do people use for labeling targets?
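For anyone who wants to reproduce the idea, here is a minimal sketch of ZigZag-style labeling with a minimum swing size in points. This is not the code used for the charts above; the function names, the use of a single price series instead of highs and lows, and the 1/0/None label convention are all assumptions.

```python
def zigzag_pivots(prices, min_move):
    """Return [(index, kind)] pivots, kind being 'low' or 'high'.

    A pivot is confirmed once price retraces min_move from the running extreme.
    """
    pivots = []
    direction = 0            # 0 undecided, +1 up leg, -1 down leg
    hi_idx = lo_idx = ext_idx = 0
    for i in range(1, len(prices)):
        p = prices[i]
        if direction == 0:
            if p > prices[hi_idx]:
                hi_idx = i
            if p < prices[lo_idx]:
                lo_idx = i
            if prices[hi_idx] - prices[lo_idx] >= min_move:
                if hi_idx > lo_idx:          # low came first: up leg
                    pivots.append((lo_idx, "low"))
                    direction, ext_idx = 1, hi_idx
                else:                        # high came first: down leg
                    pivots.append((hi_idx, "high"))
                    direction, ext_idx = -1, lo_idx
        elif direction == 1:
            if p > prices[ext_idx]:
                ext_idx = i
            elif prices[ext_idx] - p >= min_move:
                pivots.append((ext_idx, "high"))
                direction, ext_idx = -1, i
        else:
            if p < prices[ext_idx]:
                ext_idx = i
            elif p - prices[ext_idx] >= min_move:
                pivots.append((ext_idx, "low"))
                direction, ext_idx = 1, i
    return pivots

def zigzag_labels(prices, min_move):
    """Label bars 1 (buy) on up legs, 0 (sell) on down legs, None if unknown."""
    pivots = zigzag_pivots(prices, min_move)
    labels = [None] * len(prices)
    for (start, kind), (end, _) in zip(pivots, pivots[1:]):
        side = 1 if kind == "low" else 0
        for i in range(start, end):
            labels[i] = side
    return labels

# Made-up prices in points, with a 50 pt minimum swing.
prices = [100, 90, 60, 80, 130, 120, 70, 75, 125]
print(zigzag_pivots(prices, 50))
print(zigzag_labels(prices, 50))
```

Bars before the first pivot and after the last confirmed one stay unlabeled. The labels look into the future by construction, so they are usable only as training targets, not as features.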

Maxim Dmitrievsky #:
You said 30 pts per trade. Those must be strong moves in the target labeling? How strong (200, 500, 1000...)?
 
I label by trade duration: from 1 to 15 bars, chosen at random.
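A rough sketch of what random-duration labeling could look like. Only the 1 to 15 bar range comes from the description above; the buy/sell rule (label 1 if price is higher after the chosen number of bars), the function name and the seed parameter are assumptions.

```python
import random

def random_horizon_labels(prices, min_bars=1, max_bars=15, seed=None):
    """Return [(bar_index, horizon, label)] with a random horizon per bar.

    Assumed rule: label 1 (buy) if price is higher after `horizon` bars,
    otherwise 0 (sell). The last max_bars bars are skipped because their
    future price may be unavailable.
    """
    rng = random.Random(seed)
    rows = []
    for i in range(len(prices) - max_bars):
        horizon = rng.randint(min_bars, max_bars)   # inclusive on both ends
        label = 1 if prices[i + horizon] > prices[i] else 0
        rows.append((i, horizon, label))
    return rows

# Made-up, steadily rising prices: every label should come out as 1.
prices = [100 + k for k in range(40)]
rows = random_horizon_labels(prices, seed=42)
print(rows[:5])
```

Fixing the seed makes the labeling reproducible, which helps when comparing models trained on the same targets.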

With filtering, the curve clearly goes up and to the right. I need to study this further.