Machine learning in trading: theory, models, practice and algo-trading - page 3612

Your method makes it easier to train the model: there is less noise, so training is easier. It is simpler for the same tree to identify a cluster in which 100% of the examples are 1, rather than 60%.
But by relabelling 40% of the examples into the other class, you add noise to the trading itself: you put a 1 where a 0 belongs.
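As a rough illustration of the point (my own toy example, not code from this thread): a tree gets a fully confident leaf from a 100%-pure cluster, but only ~60% confidence from a 60/40 one.

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 2))
in_cluster = X[:, 0] > 1.0            # a region the tree can easily isolate

for purity in (1.0, 0.6):             # 100% vs 60% of labels = 1 inside it
    y = np.zeros(len(X))
    y[in_cluster] = (rng.random(in_cluster.sum()) < purity).astype(float)
    tree = DecisionTreeClassifier(max_depth=3, random_state=0).fit(X, y)
    # mean predicted probability of class 1 inside the cluster
    print(purity, tree.predict_proba(X[in_cluster])[:, 1].mean().round(2))
```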
But your charts look stable and promising enough on the OOS.
Do you plan to run a signal? Even without a subscription, just to see what happens live, on real out-of-sample data.
Are we gonna Google it?
I don't want signals, I want a hedge fund. Or just a lot of money on my card.
)))))))
Like a grown-up.
To see the full horror of what is going on in the labelling, I printed the number of elements in each cluster and the deviation of the label mean from 0.5, in descending order:
What follows is not the complete list. There are only 3 clusters with a deviation >= 0.1, i.e. where the share of buy and sell labels departs from 50/50 by at least 10 percentage points.
In the remaining clusters the opposite labels occur with a probability of almost 50%.
That is hell for training a model, because it cannot be sure of anything.
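A minimal sketch of such a printout, assuming the labelled dataset is a pandas DataFrame with a binary 'labels' column (0/1 trade direction) and a 'clusters' column from k-means — both column names and the toy data are my assumptions, not the author's code:

```python
import numpy as np
import pandas as pd

# Toy stand-in data: 10,000 samples over 500 clusters with noisy 0/1 labels.
rng = np.random.default_rng(0)
dataset = pd.DataFrame({
    'clusters': rng.integers(0, 500, 10_000),
    'labels':   rng.integers(0, 2, 10_000).astype(float),
})

def cluster_stats(df: pd.DataFrame) -> pd.DataFrame:
    """Per-cluster sample count and deviation of the label mean from 0.5,
    sorted in descending order of deviation."""
    stats = df.groupby('clusters')['labels'].agg(count='count', mean='mean')
    stats['deviation'] = (stats['mean'] - 0.5).abs()
    return stats.sort_values('deviation', ascending=False)

print(cluster_stats(dataset).head(10))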
Let's see what kind of trading we get if we select, fix and trade just 3 of these clusters:
So-so: sitting around picking at these scraps, even though they do come out positive on the OOS.
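One way that selection could look, continuing the hypothetical cluster_stats() sketch above (the 2.0 "do not trade" marker is my assumption, not the thread's convention):

```python
stats = cluster_stats(dataset)
top3 = stats.head(3).index            # the 3 "purest" clusters

# Keep original labels inside the chosen clusters, mark the rest as
# "do not trade" (2.0 is an assumed skip marker).
dataset['labels_fixed'] = dataset['labels'].where(
    dataset['clusters'].isin(top3), other=2.0)
```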
But if you take a whole batch of clusters above a certain probability threshold and fix them:
Already better, at a threshold of 0.03. I also prepared a bit of code for the main purpose: to calculate correct combinations of clusters and to see the horror that goes on in the labelling.
Do you have the full code, from data loading through model testing?
I'll post it later, when I finish with the probabilities.
Maybe as an article.
And clearly, the more clusters you split into, the more good ones you can find, but the number of samples in each will be small.
In fact, most datasets are rubbish. Often even fixing them doesn't help, or too few trades remain.
In the example I split into 500 clusters, and the best ones are printed out. Sorting through and combining them by hand makes no sense.
A function to output this information and to fix clusters by a threshold, rather than by a fixed number of best clusters:
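The original function isn't reproduced on this page; below is a minimal sketch of what it could look like, under the same assumed DataFrame layout as above ('labels' in {0, 1}, 'clusters' from k-means). Interpreting "fixing" as snapping a cluster's labels to its majority class, and using 2.0 as the "skip" marker, are my assumptions, not the author's code.

```python
import pandas as pd

def fix_clusters_by_threshold(df: pd.DataFrame,
                              threshold: float = 0.03) -> pd.DataFrame:
    """Print per-cluster stats and 'fix' every cluster whose label-mean
    deviation from 0.5 is at least `threshold`, by snapping its labels to
    the cluster's majority class. All other samples get the 2.0 marker."""
    out = df.copy()
    stats = out.groupby('clusters')['labels'].agg(count='count', mean='mean')
    stats['deviation'] = (stats['mean'] - 0.5).abs()
    print(stats.sort_values('deviation', ascending=False))

    good = stats.index[stats['deviation'] >= threshold]
    majority = (stats.loc[good, 'mean'] > 0.5).astype(float)  # cluster -> 0/1

    in_good = out['clusters'].isin(good)
    out.loc[in_good, 'labels'] = out.loc[in_good, 'clusters'].map(majority)
    out.loc[~in_good, 'labels'] = 2.0   # samples excluded from trading
    return out

# usage, e.g.: fixed = fix_clusters_by_threshold(dataset, threshold=0.03)
```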
I guess it can be rewritten into other languages via ChatGPT.