Machine learning in trading: theory, models, practice and algo-trading - page 99

 
DAFomenko:

Classification is not a panacea or a grail-making tool.

The first benefit of applying classification is that tools get applied to problems those tools were designed for. For example, the idea of applying spectral analysis to financial markets has been discussed many times; it seems like an excellent tool, but it was built for other kinds of objects, and yet it keeps being proposed again.

Second. Classification is quite applicable to financial markets, though there are many pitfalls, as written above. But with classification we can focus on the main problem - overfitting (retraining) of the trading system (TS). What could be more important? It is not pleasant, of course, to be deprived of the illusion of having a favorite grail, but here is the choice: happiness is good, but truth is better.

Third. Classification poses the question quite specifically: what exactly are we predicting? Compare this with TA. We take indicators. It is always bar [1]; the current bar is not used. What does that mean on H1? We are using information that is an hour stale to forecast market entries! And that is the best case.

It is completely different with classification. You take the current value of the target variable and match it against yesterday's raw data, i.e. shift the target by one or more bars relative to the predictors. When you use a model fitted to such data, you genuinely predict the future each time the next bar arrives.
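A minimal sketch of that target shift in Python, assuming pandas and an illustrative one-bar horizon (the column names and toy prices are made up for illustration):

```python
import numpy as np
import pandas as pd

# Toy close series; in practice this would be your H1 bar data.
bars = pd.DataFrame({"close": [1.10, 1.12, 1.11, 1.15, 1.14, 1.16]})

# Feature: information available at bar t (here, the last bar's return).
bars["return_1"] = bars["close"].pct_change()

# Forward return: what happens AFTER bar t. shift(-1) pulls the next
# bar's close back to row t, so (feature[t], target[t]) pairs today's
# data with tomorrow's outcome.
bars["fwd_return"] = bars["close"].shift(-1) / bars["close"] - 1
bars["target"] = np.where(bars["fwd_return"] > 0, 1, 0)

# First row has no return yet; last row has no known outcome yet.
train = bars.dropna(subset=["return_1", "fwd_return"])
print(train[["return_1", "target"]])
```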

PS.

If you are going to use it to predict sharp market movements (news), you will succeed only if you can construct a suitable target variable, and that is a big problem even in much simpler cases.

I subscribe to all of this.

I don't know about spectral analysis - I've never used it.

Second. Classification is quite applicable to financial markets, though there are many pitfalls, as written above. But with classification we can focus on the main problem - overfitting (retraining) of the trading system (TS). What could be more important? It is not pleasant, of course, to be deprived of the illusion of having a favorite grail, but here is the choice: happiness is good, but truth is better.

Exactly! We really only have one problem - overfitting. And it weighs on everyone. The flip side is underfitting (and poor results everywhere).

I've posted some nice graphs for you here, Monte Carlo included. Basically, I've come to the conclusion that I ended up fitting to the out-of-sample segment even without training the model(s) on it. I can get models that pass the out-of-sample well, but the problem is that until I have seen the out-of-sample data, I can NOT pick the model that will work. That's the pity.
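One standard remedy for the selection problem described here is a three-way chronological split: pick the model on a validation segment and keep a final test segment completely untouched until the end. A minimal sketch, with placeholder data and sklearn's GradientBoostingClassifier standing in for whatever models are actually used:

```python
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.metrics import accuracy_score

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 5))                           # placeholder features
y = (X[:, 0] + rng.normal(size=1000) > 0).astype(int)    # placeholder target

# Chronological split: train -> validation (model selection) -> test.
X_tr, y_tr = X[:600], y[:600]
X_val, y_val = X[600:800], y[600:800]
X_te, y_te = X[800:], y[800:]

# Train several candidates, select on the VALIDATION segment only.
candidates = [GradientBoostingClassifier(random_state=s).fit(X_tr, y_tr)
              for s in range(10)]
best = max(candidates, key=lambda m: accuracy_score(y_val, m.predict(X_val)))

# The test segment is looked at exactly once, after selection is frozen.
print("final out-of-sample accuracy:", accuracy_score(y_te, best.predict(X_te)))
```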

 
Alexey Burnakov:


Exactly! We really only have one problem - overfitting.

As for me, the problem lies elsewhere entirely, but oh well.......
 
mytarmailS:
As for me, the problem lies elsewhere entirely, but oh well.......
It's just that "elsewhere" involves a lot of things. And when the data, predictors, and models are ready and the design of the experiment is set up, all that remains is to check whether the model overfits or not - and it does tend to overfit. (Purely my experience.)
 
Yuri Evseenkov:

What am I, a doctor? Here is Sanych writing:

"Here we are discussing classification-based predictions, which do not take into account the previous state when predicting the next bar. Predictions (forecasts) based on classification are predictions based on patterns. And if there was news in the past that caused a change that does NOT follow from previous values (not extrapolated), then the classification will catch that change as such and if there is a similar change in the future (not exactly the same, but similar) it will be recognized and a correct prediction will be made. "

So I think we should dig in this direction: "the classification will catch that change as such".

You are absolutely right from the start. Finally some sensible people have appeared in this thread. Yes, classification judges a pattern to be true or false, or says "I don't know", as Reshetov suggested. And if that reaction is identical to the one seen in training, the network will draw the correct conclusion. So that's how it is....
 
Mihail Marchukajtes:
Finally some sensible people have appeared in this thread.
Have you thought this through well?
 
mytarmailS:
Have you thought this through well?
I always do. I can't think badly :-)
 
Mihail Marchukajtes:
I always do. I can't think badly :-)

No way.

 
Alexey Burnakov:

I've posted some nice graphs for you here, Monte Carlo included. Basically, I've come to the conclusion that I ended up fitting to the out-of-sample segment even without training the model(s) on it. I can get models that pass the out-of-sample well, but the problem is that until I have seen the out-of-sample data, I can NOT pick the model that will work. That's the pity.

Have you tried a committee? If gbm is trained several times with the same parameters on the same data, the result on new data will be slightly different each time. If you pick a single model at random, you may get lucky and the trading will go well - or not; there's no way to guess. Instead, train dozens (hundreds?) of models, and take as the final result whatever the majority of the models predict.

For example, the following chart: on the left is a simulation of trading results for 100 individual models. You can see that by taking just one model to trade, you have almost a 50% chance of losing.
On the right is trading on the decisions of a committee of these same models - the randomness is gone, and the curve rises almost steadily.
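A minimal sketch of such a committee in Python - the poster used R's gbm, so sklearn's GradientBoostingClassifier stands in here, and the data is a placeholder. Only the seed differs between members (subsampling makes the seed matter), and the final call is a majority vote:

```python
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier

rng = np.random.default_rng(42)
X_train = rng.normal(size=(500, 5))            # placeholder training data
y_train = (X_train[:, 0] > 0).astype(int)
X_new = rng.normal(size=(20, 5))               # "new" bars to trade on

# Same parameters, same data - only the random seed differs, and
# subsample < 1 makes training stochastic, so each member answers
# slightly differently on new data.
committee = [
    GradientBoostingClassifier(subsample=0.7, random_state=seed).fit(X_train, y_train)
    for seed in range(100)
]

# Majority vote: trade in the direction most models predict.
votes = np.array([m.predict(X_new) for m in committee])   # shape (100, 20)
decision = (votes.mean(axis=0) > 0.5).astype(int)
print(decision)
```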

 
mytarmailS:

Experiment is the criterion of truth - don't think, but do.

I personally think that spectral analysis is more promising, but that's for me personally...

Why not think it over first? Even a wolf thinks first about whether to chase a skinny hare or not - sometimes it costs more energy than the prey will replace.
 
DAFomenko:

The first benefit of applying classification is that tools get applied to problems those tools were designed for. For example, the idea of applying spectral analysis to financial markets has been discussed many times; it seems like an excellent tool, but it was built for other kinds of objects, and yet it keeps being proposed again.

Second. Classification is quite applicable to financial markets, though there are many pitfalls, as written above. But with classification we can focus on the main problem - overfitting (retraining) of the trading system (TS). What could be more important? It is not pleasant, of course, to be deprived of the illusion of having a favorite grail, but here is the choice: happiness is good, but truth is better.

Third. Classification poses the question quite specifically: what exactly are we predicting? Compare this with TA. We take indicators. It is always bar [1]; the current bar is not used. What does that mean on H1? We are using information that is an hour stale to forecast market entries! And that is the best case.

It is completely different with classification. You take the current value of the target variable and match it against yesterday's raw data, i.e. shift the target by one or more bars relative to the predictors. When you use a model fitted to such data, you genuinely predict the future each time the next bar arrives.

If you are going to use it to predict sharp market movements (news), you will succeed only if you can construct a suitable target variable, and that is a big problem even in much simpler cases.

Are you related to Sanych?

Yes, I am. A naive Bayes classifier, the kind that filters spam - will it work here or not?

And about the news - no way! On some news items the model will overfit into every crevice, and then some. I gave examples.
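For what it's worth, the naive Bayes question above is cheap to test on lagged returns. A minimal sketch, assuming sklearn's GaussianNB and placeholder data (nothing here comes from the thread):

```python
import numpy as np
from sklearn.naive_bayes import GaussianNB
from sklearn.metrics import accuracy_score

rng = np.random.default_rng(1)
closes = np.cumsum(rng.normal(size=2000)) + 100.0   # placeholder price series
returns = np.diff(closes) / closes[:-1]

# Features: the last 3 bar returns; target: direction of the next bar.
lags = 3
X = np.column_stack([returns[i:len(returns) - lags + i] for i in range(lags)])
y = (returns[lags:] > 0).astype(int)

# Chronological split - no shuffling for time series.
split = int(0.7 * len(X))
model = GaussianNB().fit(X[:split], y[:split])
print("out-of-sample accuracy:", accuracy_score(y[split:], model.predict(X[split:])))
```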
