Machine learning in trading: theory, models, practice and algo-trading - page 2553

 
Dmytryi Nazarchuk #:
Logistic regression
A random forest / boosting regression will be more accurate.
 
elibrarius #:

No, it's a database of memorized history...

It was a rhetorical question:))

 
mytarmailS #:

It was a rhetorical question:)))

I know you're up to speed :))
It was for newcomers to ML, in case they drop in here.

 
Vladimir Perervenko #:

There are three options for handling noise samples: delete them, relabel them (fix the markup), or separate them into a class of their own. In my experience, about 25% of the sample is "noise". The quality improvement is around 5%, depending on the models and the data preparation. I use it sometimes.

There is one more problem when using predictors: their drift. It needs to be detected and taken into account both in testing and in operation. Attached is a translation of an article on the subject (look for others on the net); there is also the drifter package, and it is not the only one. The point is that when selecting predictors you need to consider not only their importance but also their drift: discard or transform high-drift predictors, and take low drift into account (correct for it) in testing and operation.

Good luck

What do you mean, "use it sometimes"?

Either there is a pipeline that has proven itself, or it is just idle speculation.

Theoretically, making noise a separate class does not improve the model (the noise stays inside the model and does not go anywhere).
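For what it's worth, the "separate class" option can be prototyped with a simple rule. The k-NN disagreement criterion below is my own illustration, not anyone's method from this thread: a sample is flagged as likely noise when most of its nearest neighbours carry a different label.

```python
import numpy as np

def flag_noise_knn(X, y, k=5, threshold=0.5):
    """Flag samples whose k nearest neighbours mostly disagree with their label.

    Returns a boolean mask: True = likely noise. Brute-force O(n^2) distances,
    fine for a sketch; use a KD-tree for real data.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    # pairwise squared Euclidean distances
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=2)
    np.fill_diagonal(d2, np.inf)          # exclude the point itself
    nn = np.argsort(d2, axis=1)[:, :k]    # indices of the k nearest neighbours
    disagree = (y[nn] != y[:, None]).mean(axis=1)
    return disagree > threshold

# toy example: two clean clusters plus one mislabeled point
X = np.array([[0.0], [0.1], [0.2], [5.0], [5.1], [0.15]])
y = np.array([0,      0,     0,     1,     1,     1])   # last label is wrong
mask = flag_noise_knn(X, y, k=3)
print(mask)   # only the mislabeled point near the 0-cluster is flagged
```

The flagged samples can then be deleted, relabeled, or given their own class label, matching the three options mentioned above.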

As for drift: that is the basics, the bias-variance tradeoff.
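A sketch of the drift check discussed above, using the Population Stability Index (PSI), a common drift measure. This is an illustration in plain NumPy, not the drifter package, and the thresholds are the usual rules of thumb.

```python
import numpy as np

def psi(expected, actual, bins=10):
    """Population Stability Index between a training-period feature
    (expected) and a test/live-period feature (actual).

    Rule of thumb: PSI < 0.1 stable, 0.1-0.25 moderate drift,
    > 0.25 strong drift (a candidate for dropping or transforming).
    """
    edges = np.quantile(expected, np.linspace(0, 1, bins + 1))
    edges[0], edges[-1] = -np.inf, np.inf              # catch out-of-range values
    e = np.histogram(expected, bins=edges)[0] / len(expected)
    a = np.histogram(actual, bins=edges)[0] / len(actual)
    e, a = np.clip(e, 1e-6, None), np.clip(a, 1e-6, None)  # avoid log(0)
    return float(np.sum((a - e) * np.log(a / e)))

rng = np.random.default_rng(0)
train_feat = rng.normal(0.0, 1.0, 5000)
stable     = rng.normal(0.0, 1.0, 5000)   # same distribution as training
drifted    = rng.normal(1.0, 1.0, 5000)   # mean shifted by one sigma
print(psi(train_feat, stable))    # well under 0.1
print(psi(train_feat, drifted))   # well over 0.25
```

Running this per predictor on a rolling window is one way to decide which predictors to discard or correct, as suggested above.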
 
Maxim Dmitrievsky #:
Have you tried predicting the distribution of future quotes?
 
mytarmailS #:
Have you tried predicting the distribution of future quotes?

I did something like that, but I don't understand what it's for.

Remember, I clustered future chunks of fixed length and predicted the cluster number. Each cluster had its own distribution and its own strategy. It worked on the training set, but fails on new data if you do it head-on.
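A minimal sketch of that scheme on synthetic data; the toy k-means, window lengths, and alignment below are my own assumptions, not the original code.

```python
import numpy as np

def kmeans(X, k, iters=50, seed=0):
    """Tiny k-means, enough for a sketch."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), k, replace=False)]
    for _ in range(iters):
        labels = np.argmin(((X[:, None] - centers[None]) ** 2).sum(-1), axis=1)
        for j in range(k):
            if (labels == j).any():
                centers[j] = X[labels == j].mean(axis=0)
    return labels, centers

H = 50                                        # length of the "future chunk"
rng = np.random.default_rng(1)
price = np.cumsum(rng.normal(0, 1, 2000))     # synthetic price series
rets = np.diff(price)

# future windows of H returns starting at each bar; labels[t] = cluster of rets[t:t+H]
windows = np.lib.stride_tricks.sliding_window_view(rets, H)
labels, _ = kmeans(windows, k=4)

# supervised dataset: features = last L returns up to bar t,
# target = cluster number of the NEXT H returns
L = 10
X = np.lib.stride_tricks.sliding_window_view(rets[:len(labels)], L)
y = labels[L:]                                # X[i] = rets[i:i+L] -> labels[i+L]
X = X[:len(y)]
print(X.shape, y.shape)                       # (1940, 10) (1940,)
```

On real data, the failure mode described above would show up exactly here: the classifier fits the training set's feature-to-cluster mapping, which need not hold out of sample.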
 
Maxim Dmitrievsky #:

I did something like that, but I don't understand what it's for.

Remember, I clustered future chunks of fixed length and predicted the cluster number. Each cluster had its own distribution and its own strategy. It worked on the training set, but fails on new data if you do it head-on.

I remember...

I have a slightly different idea...

If the distribution of future quotes can be forecast with decent quality for, say, 50 candles ahead, then from that distribution we can generate several thousand series and train the model on them; in theory, the model should then work adequately on the new 50 candles...
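That idea could be sketched roughly as follows. Everything here is a hypothetical stand-in: in particular, the "forecast distribution" is just an empirical return distribution, since forecasting it well is exactly the hard part being proposed.

```python
import numpy as np

rng = np.random.default_rng(42)

# stand-in for the "forecast distribution of future quotes":
# an empirical distribution of one-candle returns
forecast_returns = rng.normal(0.0, 0.002, 10_000)

H = 50            # candles ahead
n_series = 3000   # "several thousand series"
start = 1.1000    # last known price

# bootstrap H-step price paths from the forecast distribution
draws = rng.choice(forecast_returns, size=(n_series, H))
paths = start * np.cumprod(1.0 + draws, axis=1)

print(paths.shape)   # (3000, 50): a training set of simulated futures
```

The model would then be trained on these simulated paths instead of the single realized history, which is the whole point of the proposal above.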

 
I drop by this thread periodically: the same faces, the same discussion of models. Maybe someone has something to show?
 
Farkhat Guzairov #:
I drop by this thread periodically: the same faces, the same discussion of models. Maybe someone has something to show?
This is not a problem to be solved here.