Machine learning in trading: theory, models, practice and algo-trading - page 3610

 
СанСаныч Фоменко #:

What are "profitable trades"? What is that probability? 0.5, 0.6 ....

Where is the threshold that formalises the notion of a "profitable trade"? The problem is that the calculated threshold must remain a "threshold" at least at the next prediction step.

I can't write that much, or the moderators and the local skufs will again decide that I don't let anyone else speak :) and my hands aren't made of iron. I'm not Artemy Lebedev.

Profitable means it brings profit after the cluster labels have been fixed in the marked-up dataset, before training. Those are what we are discussing; that is, we are not even touching training yet.

If you can prove that playing with probabilities improves the overall result, taking OOS into account, that will be a formal answer to why it is better to check groups of clusters rather than one cluster at a time.
 
Maxim Dmitrievsky #:
Profitable means it brings profit after the cluster labels have been fixed in the marked-up dataset, before training. Those are what we are discussing; that is, we are not even touching training yet.

If you can prove that playing with probabilities improves the overall result, taking OOS into account, that will be a formal answer to why it is better to check groups of clusters rather than one cluster at a time.

Let us first try to group the observations into clusters and identify the clusters with the fewest contradictions.

A contradiction is when trades within the same cluster have different labels {0;1}, which confuses the final model during training.

The clusters with the fewest contradictions can be corrected: assigning the same label to every observation in a cluster removes the contradictions. The label is determined by the average label value within the cluster: if it is greater than 0.5, all labels become 1, and vice versa.
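
For illustration, a minimal sketch of such a contradiction measure in Python; the column names 'cluster' and 'label' are hypothetical placeholders for the cluster id and the original 0/1 trade label of each observation.

import pandas as pd

def contradiction_per_cluster(df: pd.DataFrame) -> pd.Series:
    """Share of minority labels in each cluster; 0 means no contradictions."""
    p = df.groupby('cluster')['label'].mean()          # share of 1-labels per cluster
    return pd.concat([p, 1 - p], axis=1).min(axis=1)   # minority share = contradiction

# Candidate clusters for correction are those with the smallest values:
# contradiction_per_cluster(df).nsmallest(10)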

Since there are many clusters, there will be both buy and sell clusters. The more clusters there are, the more balanced the dataset is.

The second model will separate the selected clusters (trade) from those discarded for having the most uncertainty (do not trade).

Then we will measure the errors of both models and test the whole construction on new data, with different numbers of clusters.

At the very end, we will try to group the clusters so as to maximise the final probability of a profitable trade, according to probability theory, and compare these probabilities with the final results (trading, OOS included). If the result turns out to depend on such a probability estimate, we will be able to mark up datasets more meaningfully.
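
As a rough illustration of that estimate, one way it might be computed is below, assuming the "final probability of a profitable trade" for a group of clusters is simply the share of profitable labels among the trades that fall into that group (law of total probability over the clusters); the column names 'cluster' and 'label' are again hypothetical.

import itertools
import pandas as pd

def group_win_probability(df: pd.DataFrame, clusters: tuple) -> float:
    """P(profit | trade falls into one of the given clusters)."""
    sub = df[df['cluster'].isin(clusters)]
    return sub['label'].mean()

def best_group(df: pd.DataFrame, group_size: int = 3) -> tuple:
    """Brute-force search over small groups; feasible only for a modest number of clusters."""
    ids = df['cluster'].unique()
    return max(itertools.combinations(ids, group_size),
               key=lambda g: group_win_probability(df, g))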


The re-labelling of the dataset, without yet taking probabilities into account, then looks roughly like this (Python):
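
A minimal sketch, not the original code: it assumes KMeans for the clustering, hypothetical column names (all non-'label' columns are numeric features), and an arbitrary contradiction threshold; it also produces the meta-labels (trade / do not trade) for the second model described above.

import pandas as pd
from sklearn.cluster import KMeans

def relabel_dataset(df: pd.DataFrame, n_clusters: int = 20,
                    max_contradiction: float = 0.3) -> pd.DataFrame:
    """Cluster the features, fix contradictory labels, and build meta-labels."""
    df = df.copy()
    feature_cols = [c for c in df.columns if c != 'label']
    df['cluster'] = KMeans(n_clusters=n_clusters, n_init=10).fit_predict(df[feature_cols])

    p = df.groupby('cluster')['label'].mean()               # share of 1-labels per cluster
    contradiction = pd.concat([p, 1 - p], axis=1).min(axis=1)
    good = contradiction[contradiction <= max_contradiction].index

    df['label_main'] = df['cluster'].map((p > 0.5).astype(int))   # majority label per cluster
    df['label_meta'] = df['cluster'].isin(good).astype(int)       # 1 = trade, 0 = do not trade
    return df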



To be continued.... :)

 

First test with random settings. OOS to the right of the vertical dotted line.

Algo works, no errors. Further tests with different settings (probably tomorrow).

Learning errors for the main and meta models:

>>> models[-1][1].get_best_score()
{'learn': {'Accuracy': 0.6983374882369451, 'Logloss': 0.5674870408082323}, 'validation': {'Accuracy': 0.6428351309707242, 'Logloss': 0.5995984001032943}}
>>> models[-1][2].get_best_score()
{'learn': {'Logloss': 0.17194942035963293, 'F1': 0.775644666590935}, 'validation': {'Logloss': 0.17439066040537665, 'F1': 0.7693925848014725}}
>>> 

It is a bit unclear why the second model failed to separate the bad clusters from the good ones without error, even though they are linearly separable.

The second model's error is evaluated via F1 because its classes can be highly imbalanced.
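
For context, scores of the kind printed above could be obtained along these lines; the random arrays below merely stand in for the real features and the main/meta labels, and all parameters are placeholders.

import numpy as np
from catboost import CatBoostClassifier

rng = np.random.default_rng(0)
X_train, X_val = rng.normal(size=(1000, 10)), rng.normal(size=(300, 10))
y_main_train, y_main_val = rng.integers(0, 2, 1000), rng.integers(0, 2, 300)
y_meta_train, y_meta_val = rng.integers(0, 2, 1000), rng.integers(0, 2, 300)

# main model: trade direction inside the selected clusters
model_main = CatBoostClassifier(iterations=200, eval_metric='Accuracy', verbose=False)
model_main.fit(X_train, y_main_train, eval_set=(X_val, y_main_val))

# meta model: trade / do not trade; F1 because the classes may be imbalanced
model_meta = CatBoostClassifier(iterations=200, eval_metric='F1', verbose=False)
model_meta.fit(X_train, y_meta_train, eval_set=(X_val, y_meta_val))

print(model_main.get_best_score())   # {'learn': {...}, 'validation': {...}}
print(model_meta.get_best_score())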

Another example with a different number of clusters. It is strange that some forum members cannot do this, because it is so simple and there is even code for it.


 

Thus, model overfitting is removed by removing inconsistencies in the markup. The features are not touched here at all. For a better fit, a different theory is needed.

The tastiest part (about probabilities) will come later. The tests need to be prepared first.

You can also do some triggering ... It works straight away, without optimisable parameters/hyperparameters.

 
Maxim Dmitrievsky #:

The OOS is to the right of the vertical dotted line.

Unfortunately, the shorter the OOS, the higher the probability of a failure.
 
fxsaber #:
Unfortunately, the shorter the OOS, the higher the probability of a failure.

A function for self-checking for failure.
More often than not, the probability of failure grows in proportion to the number of optimisation passes.
 
Maxim Dmitrievsky #:
A function for self-checking for failure.

Constructive.

 
Maxim Dmitrievsky #:
More often than not, the probability of failure grows in proportion to the number of optimisation passes.
Multiple testing error.

Mathematically, that is to be expected.
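
A toy illustration of that effect: if a single random pass has probability p of looking good on the OOS purely by chance, then among N independent passes at least one will "succeed" with probability 1 - (1 - p)^N, which approaches 1 quickly as N grows.

p = 0.05                      # chance that one random pass looks good on OOS
for n in (1, 10, 100, 1000):  # number of optimisation passes
    print(n, round(1 - (1 - p) ** n, 3))
# 1 0.05, 10 0.401, 100 0.994, 1000 1.0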
 
Maxim Dmitrievsky #:
So nothing was optimised/tweaked.

There was no training?

 
fxsaber #:

There was no training?

Training on marked-up data is not optimisation of the TS. It is training a ready-made TS on ready-made examples.

Optimising a TS is "go I know not where, bring I know not what". We have gone there many times and brought back all sorts of things.

In general, I would not like to continue this topic, because someone does not understand how training differs from optimisation. There are more interesting topics.