Machine learning in trading: theory, models, practice and algo-trading - page 3610
What are "profitable trades"? What is that probability? 0.5, 0.6, ...?
Where is the threshold that should formalise the notion of a "profitable trade"? The problem is that the calculated threshold must remain a valid threshold at least at the next prediction step.
Let us first try to group the observations into clusters and identify those clusters with the fewest contradictions.
Contradictions are when transactions within the same cluster have different labels {0;1}, which creates confusion when training the final model.
The clusters with the fewest contradictions can be corrected: by assigning the same label to all observations within a cluster, the contradictions are removed. The label is determined by the average value of the labels within the cluster: if it is greater than 0.5, all labels become 1, and vice versa.
Since there are many clusters, there will be both buy and sell clusters. The more clusters there are, the more balanced the dataset is.
The second model will separate the selected clusters (trade) from the discarded ones, which have the most uncertainty (do not trade); one possible construction of its targets is sketched below.
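The post does not show how the second model's targets are built, so this is only a minimal sketch of one plausible reading: a cluster is kept (trade) if its label purity is high enough, otherwise it is discarded (do not trade). The 'clusters'/'labels' column names and the min_purity threshold are my assumptions, not the author's code.

```python
import pandas as pd

def meta_labels(ds: pd.DataFrame, min_purity: float = 0.8) -> pd.Series:
    """Targets for the second (meta) model: 1 = trade, 0 = do not trade."""
    # purity: 0 = labels inside the cluster are a coin toss, 1 = no contradictions
    mean = ds.groupby('clusters')['labels'].transform('mean')
    purity = (mean - 0.5).abs() * 2.0
    # keep only clusters whose labels are consistent enough
    return (purity >= min_purity).astype(int)
```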
Then we will measure the errors of both models and test the whole construction on new data, with different numbers of clusters.
At the very end, we will try to group the clusters so as to maximise the final probability of a profitable trade, according to probability theory, and compare these probabilities with the final results (trading, taking the OOS into account). If the result turns out to depend on this probability estimate, we can produce more meaningful markup of the datasets.
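As a rough illustration of that estimate: by the law of total probability, the final win rate when trading only the selected clusters is the frequency-weighted average of the per-cluster win rates. The sketch below assumes the original (pre-relabelling) labels are still in the 'labels' column; the function name and column names are hypothetical.

```python
import pandas as pd

def expected_win_rate(ds: pd.DataFrame, selected: list) -> float:
    """P(win) = sum over selected clusters c of P(c | trade) * P(win | c)."""
    traded = ds[ds['clusters'].isin(selected)]
    weights = traded['clusters'].value_counts(normalize=True)  # P(c | trade)
    p = traded.groupby('clusters')['labels'].mean()            # mean of ORIGINAL labels
    # the trade direction is chosen by majority, so the per-cluster
    # win rate is max(p, 1 - p)
    win = p.where(p >= 0.5, 1.0 - p)
    return float((weights * win).sum())                        # pandas aligns by cluster id
```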
The relabelling of the dataset without taking probabilities into account will then look like this (Python):
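The original snippet is not reproduced here, so below is a minimal reconstruction of the step described above, assuming a pandas DataFrame with a binary 'labels' column and KMeans for clustering; the function name and defaults are mine.

```python
import pandas as pd
from sklearn.cluster import KMeans

def relabel_by_clusters(dataset: pd.DataFrame, n_clusters: int = 50) -> pd.DataFrame:
    ds = dataset.copy()
    features = ds.drop(columns=['labels'])

    # group the observations into clusters
    km = KMeans(n_clusters=n_clusters, n_init=10)
    ds['clusters'] = km.fit_predict(features)

    # majority vote inside each cluster removes the contradictions:
    # mean label > 0.5 -> all labels become 1, otherwise all 0
    mean_labels = ds.groupby('clusters')['labels'].transform('mean')
    ds['labels'] = (mean_labels > 0.5).astype(int)
    return ds.drop(columns=['clusters'])
```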
To be continued.... :)
First test with random settings. OOS to the right of the vertical dotted line.
The algo works, no errors. Further tests with different settings (probably tomorrow).
Learning errors for the main and meta models:
It is a bit puzzling that the second model could not separate the bad clusters from the good ones without error, even though they are linearly separable.
The error of the second model is evaluated via F1, because the classes can be highly unbalanced; for example:
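A toy illustration of the metric with scikit-learn's f1_score (the data here is made up, just to show the call):

```python
from sklearn.metrics import f1_score

# meta labels: 1 = trade, 0 = do not trade (classes are unbalanced)
y_true = [1, 1, 1, 1, 1, 1, 1, 1, 0, 0]
y_pred = [1, 1, 1, 1, 1, 1, 1, 0, 0, 1]
print('meta model error:', 1.0 - f1_score(y_true, y_pred))
```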
Another example with a different number of clusters. It is strange that some forum members cannot do this, since it is so simple and there is even code for it.
Thus, model overfitting is removed by removing inconsistencies in the markup. The features are not touched here at all. For a better fit, a different theory is needed.
The tastiest part (about probabilities) will come later. We need to prepare the tests.
You can also do some triggering ... It works straight away, with no optimisable parameters/hyperparameters.
The OOS is to the right of the vertical dashed line.
Unfortunately, the shorter the OOS, the higher the probability of a breakdown.
A function for self-checking for a breakdown:
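The actual function is not shown in the post, so this is only a guess at the idea: compare the slope of the OOS part of the equity curve with the in-sample slope and flag a break if OOS growth stalls. The linear-fit test and the threshold are my assumptions.

```python
import numpy as np

def is_broken(equity: np.ndarray, oos_start: int, min_slope_ratio: float = 0.25) -> bool:
    # slope of the in-sample part of the equity curve
    is_slope = np.polyfit(np.arange(oos_start), equity[:oos_start], 1)[0]
    # slope of the OOS part
    oos = equity[oos_start:]
    oos_slope = np.polyfit(np.arange(len(oos)), oos, 1)[0]
    # flag a breakdown if OOS grows much more slowly than in-sample (or falls)
    return bool(oos_slope < min_slope_ratio * is_slope)
```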
Constructive.
So nothing was optimised/tweaked.
There was no training?