Machine learning in trading: theory, models, practice and algo-trading - page 789

 

Do I understand correctly that auto.arima calculates everything by itself, and I only need to load the quotes?

I checked on several sites, and the ARIMA(0,1,0) model comes up everywhere.
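For context, ARIMA(0,1,0) simply says the price series is a random walk: its first difference is white noise, so the best one-step forecast is the last observed price. A minimal numpy sketch on simulated quotes (not real data):

```python
import numpy as np

# ARIMA(0,1,0) is a random walk: the first difference of the series is
# modeled as white noise, so the one-step forecast is just the last price.
rng = np.random.default_rng(0)
steps = rng.normal(0.0, 0.001, size=1000)   # simulated daily increments
prices = 1.10 + np.cumsum(steps)            # simulated quote series

forecast = prices[-1]                       # ARIMA(0,1,0) point forecast

# Sanity check: the differenced series shows no meaningful lag-1 autocorrelation
d = np.diff(prices)
lag1_autocorr = float(np.corrcoef(d[:-1], d[1:])[0, 1])
```

That is also why automatic selection often lands on (0,1,0) for raw price series: once the series is differenced, there is little autocorrelation structure left to model.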

 
Anatolii Zainchkovskii:

Here's a picture of the situation: just because you saw this drop three bars ago does not mean that you can take it into account now.


It's a nice picture, I'll use it as an example...

If there is no error in the forecast, how can we end up in the red if we can see that there will be a fall? Even when the zero bar closes with the opposite sign, we still know a fall is coming. I hope there are no questions about this picture..... We chose the window at random, but we forecast every bar in that window, so when the third bar becomes the zero bar and we see a big difference in the forecast, we sell. And when we see only a small difference (in absolute terms) in the forecast on the first bar, we already know on which bar there will be a significant rise. There are just two misunderstandings causing the problem. The first: I explain myself while omitting points I consider obvious. The second: you do not understand because you lack the experience. But it is normal to ask questions, so let's continue...

 
Mihail Marchukajtes:

Nice picture, I'll use it as an example... [...]

I think that in order to trade profitably you need a forecast; the NS itself should not think about losses. If there is a forecast, we trade it; if a new one appears, we trade the new one. All of this is great with 100% accurate forecasting, but what do we do when the forecast is only, say, 90% accurate? Those 10% of wrong forecasts can set us back a long way. That is where we need to understand what to do with a wrong forecast when the deal is already open.

 

I think I'll get home late today, but I've been wanting to write a long post for a few days now, so I'll do it now and see how it goes. But first, a small aside.

I will indeed put the written part of the article on the blog and link to it. Instead of that article I then began writing another one, "Methodological Guidelines for Working in the Field of Machine Learning". One day I was walking around town and thought it would be nice to have a methodology, like in universities, that lays out, without any padding, the basic rules of what you can and cannot do. Well then, as they say, we'll see. And now to the subject of the long post. The whole point of it.....

This month I've made a breakthrough, and you can see it clearly: some acknowledge it, some dispute it, but nobody asks how. I have had Reshetov's optimizer for a long time, yet I only started getting good models when Doc sent me the R commands with explanations, and the next day I got test results I had never seen in 15 years. The optimizer does a good job, but it is not about the optimizer. I'm 100% sure it is not the only optimizer in the world that does the job well; of course it isn't. Most R packages work just as well as it does. So what's the problem? Why are the results so poor, and why is the whole core of the machine learning thread still searching and finding nothing? The answer is simple: you make mistakes at one stage or another of preparing the model. Logically, you think you are doing the right thing with this or that predictor or transformation, but you make some small mistake while believing you are right, and you are not.

That's why I started talking about regression: to figure out what you can and cannot do. I ran into my own misconception in the following way. Logically, a model with more inputs and a longer polynomial should be smarter and more parametric, but practice showed the opposite: models with a minimal number of inputs earned more than models with more inputs. This is exactly the case where you think you are right by logic, but in practice it turns out you are not.
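The point about small models beating large ones can be reproduced in a toy setting. The sketch below is entirely hypothetical data with plain least squares (not the author's optimizer): it fits one model on only the informative inputs and one on all inputs, and compares out-of-sample error.

```python
import numpy as np

rng = np.random.default_rng(1)
n_train, n_test = 60, 200
# The target depends on only 2 of the 40 candidate inputs; the rest are noise.
X = rng.normal(size=(n_train + n_test, 40))
y = X[:, 0] - 0.5 * X[:, 1] + rng.normal(0.0, 1.0, size=n_train + n_test)
Xtr, Xte = X[:n_train], X[n_train:]
ytr, yte = y[:n_train], y[n_train:]

def ols_test_mse(cols):
    """Fit OLS on the training slice, return mean squared error on the test slice."""
    coef, *_ = np.linalg.lstsq(Xtr[:, cols], ytr, rcond=None)
    resid = yte - Xte[:, cols] @ coef
    return float(np.mean(resid ** 2))

mse_small = ols_test_mse(list(range(2)))    # only the 2 informative inputs
mse_big = ols_test_mse(list(range(40)))     # all 40 inputs, mostly noise
```

With 60 training rows and 40 inputs, the big model chases noise in the training set, so its test error comes out worse despite the better in-sample fit.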

But the problem of machine learning turns out to lie neither in the method of obtaining the model nor in applying some super-secret transformations. The main Achilles' heel is something completely different, and I can explain it with an example; it will be understandable, and I will smash Maxim's pictures at the same time.

Suppose we created a system for getting a regression or classification model. And we believe that we didn't make any gross errors during the design. Suppose.

We have a training file. We run the optimization on it 10 times and get 10 models. The hardest question for me up to that point was: how do you choose the model that is neither overtrained nor undertrained but adequate to the market? That question is the Achilles' heel. Even if you have built an AI system and made some mistakes in it, that does not mean your system CANNOT GENERATE a generalizing model.

The quality of an AI system is determined precisely by the ratio of adequate models to the total number of optimizations. Say one system out of 100 optimizations gives only one usable model, while another out of the same 100 optimizations gives 20 usable models. Clearly the second system is better than the first, since it yields more correct models for the same number of optimizations. For example, Reshetov's optimizer gives one or two suitable models out of four (as a rule I build no more than four). Sometimes four is not enough; no problem, on the fifth, sixth or tenth optimization it will give a model adequate to the market. And now the most interesting part: how to identify and find this model. I have found a way. I do roughly the following: I generate a training file and perform four training runs. Then I evaluate these models and choose the adequate one, and for this I need only the training section, within which there is a validation or test section used during training. I leave a small out-of-sample (OOS) portion of 3-4 signals to finally make sure this is the one, and then I put it to work. That is why model selection is one of the most important questions when preparing the TS. To be continued.
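The selection procedure described above — several optimization runs, scoring each resulting model on a validation slice inside the training period, then a final check on 3-4 held-back OOS signals — can be sketched as follows. Everything here (the threshold "model", the data, the slice sizes) is a hypothetical stand-in for whatever the optimizer actually produces:

```python
import numpy as np

rng = np.random.default_rng(42)

# Toy data: one input, noisy binary labels (a stand-in for a trading signal)
n = 120
x = rng.normal(size=n)
y = (x + rng.normal(0.0, 0.8, size=n) > 0).astype(int)

train, valid, oos = slice(0, 80), slice(80, 116), slice(116, 120)  # 4 OOS signals

def fit_model(seed):
    """One 'optimization run': fit a threshold on a bootstrap of the training slice."""
    r = np.random.default_rng(seed)
    boot = r.choice(80, size=60, replace=True)      # resample the training slice
    cands = np.linspace(-1.0, 1.0, 41)              # crude grid search
    accs = [np.mean((x[boot] > c) == y[boot]) for c in cands]
    return float(cands[int(np.argmax(accs))])

def accuracy(threshold, sl):
    pred = (x[sl] > threshold).astype(int)
    return float(np.mean(pred == y[sl]))

models = [fit_model(s) for s in range(10)]              # 10 optimization runs
val_scores = [accuracy(th, valid) for th in models]     # score on validation slice
best = models[int(np.argmax(val_scores))]               # pick the adequate model

oos_acc = accuracy(best, oos)   # final sanity check on the 4 held-back signals
```

The design choice is the same as in the post: the validation slice picks the model, and the tiny OOS tail is touched only once, for the final confirmation before going live.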

 
forexman77:

Do I understand correctly that AUTO ARIMA calculates everything by itself, I only need to load quotes.

I checked the ARIMA(0,1,0) model in several places.

auto.arima {forecast}
> y <- dd$OPEN
> auto.arima(y)
Series: y 
ARIMA(3,1,5) 

Coefficients:
         ar1     ar2      ar3      ma1      ma2     ma3      ma4      ma5
      0.3956  0.4421  -0.6151  -0.4159  -0.4165  0.6288  -0.0257  -0.0515
s.e.  0.0904  0.0701   0.0827   0.0905   0.0708  0.0797   0.0105   0.0115

sigma^2 estimated as 3.406e-06:  log likelihood=66279.3
AIC=-132540.6   AICc=-132540.6   BIC=-132473
 

Damn it, I wrote and wrote and the forum glitched, so read it as it is. I don't have the strength to rewrite it...


 
Mihail Marchukajtes:

I think I'll be home late today, but I've been wanting to write a long post for a few days now. [...]

I can't read all this nonsense anymore, you've killed me,

along with the rest of the captains of obviousness and gendarmes of clarity.

I'm out of here )

 
Maxim Dmitrievsky:

I can't read all this crazy nonsense anymore, you're killing me. [...]

He himself wrote that he was drunk and hadn't slept for two nights. He wanted to "talk" ).

Mihail Marchukajtes:
In general, Michael, you should sleep at night.

 
Maxim Dmitrievsky:

I can't read all this crazy nonsense anymore, you're killing me. [...]

Max found a grail for sure ))

 
Maxim Dmitrievsky:

the topic has gone beyond the bounds of reasonableness - some have long since ossified and fixated on overtraining and "packages"

some were clueless about ML and still are

It's not that the topic has gone anywhere - it's just diffuse. There is no regular moderation. It's a sort of heap, even worse than my thread.

But we have to wait for the man with the Grail to come to the forefront.

We are waiting.
