
Overtraining is a well-established and quite specific term. You not only substitute it with something else, but also do not explain what you understand by it.
The manner of communication reminds me very much of Sulton ).
Model training is when the model extracts the essence of the process, "patterns" in the local jargon, that hold not only in the training sample but also outside it.
Overtraining (overfitting) is when the model starts to pick up randomness that does not exist outside the training sample, and because of this the out-of-sample error differs greatly from the training error.
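A minimal sketch of that gap (my own toy example in R, not from the original post): an over-flexible model fitted to pure noise scores well on the training half and noticeably worse on the held-out half, which is exactly the symptom described above.

```r
# Toy illustration of overtraining: the target is pure noise, so there is nothing
# real to learn, yet a 25th-degree polynomial still "finds" structure in the
# training half, and that structure does not carry over to the held-out half.
set.seed(1)
d <- data.frame(x = runif(200), y = rnorm(200))
train <- d[1:100, ]
test  <- d[101:200, ]

fit <- lm(y ~ poly(x, 25), data = train)   # deliberately over-flexible model

mse <- function(actual, pred) mean((actual - pred)^2)
mse(train$y, predict(fit, train))          # training error: systematically small
mse(test$y,  predict(fit, test))           # out-of-sample error: noticeably larger
```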
Many people on this forum have faced overtraining (maybe without realising it), as it is very easy to get an overtrained EA with the help of optimisation in the tester.
But this is all philosophy.
The reality is in the specific skill, in the tools that are used in this process.
With my article and book I am not trying to solve the problem of overtraining, but only to open the door to the world of predictive models, i.e. models that are taught to predict well out into the future. If the esteemed gpwr had run Rattle, after spending a couple of hours his questions would have had much more substance, instead of demonstrating that he didn't bother to read my article yet has an opinion about its content. But most importantly, Rattle covers the entire process of building very complex models, including estimating out-of-sample performance, the list of significant predictors, the number of patterns found and so on. I still use Rattle, although the models I actually use are different. Just to figure something out, to test an idea: 20-30 minutes, and the development direction can change radically.
A very limited goal.
On top of the article and book I offer paid services. It is up to everyone to decide whether I have the necessary qualifications to perform the advertised list of works. And I, before taking on a particular order, decide whether the customer is able to understand the result of my work.
1. Form a rather large set of predictors, for example 50 of them, over 15,000 bars.
2. Using one of the above algorithms, select predictors on these 15,000 bars; we usually get 15 to 20 that are used in model building in more than 20% of cases.
3. Then take a smaller window, for example 2,000 bars, and start moving it one bar at a time, selecting the significant predictors from the 20 (out of 50) selected earlier.
4. The specific list of significant predictors changes all the time.
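A rough sketch of steps 1-4 in R (my own interpretation, not the author's code): here the randomForest variable importance stands in for "one of the above algorithms", and the data are synthetic placeholders.

```r
# Steps 1-4 sketched with the randomForest package; the importance measure is
# only a stand-in for "one of the above algorithms", and the data are synthetic.
library(randomForest)
set.seed(42)

n_bars <- 15000
predictors <- as.data.frame(matrix(rnorm(n_bars * 50), ncol = 50,
                                   dimnames = list(NULL, paste0("p", 1:50))))
target <- factor(ifelse(rowSums(predictors[, 1:5]) + rnorm(n_bars) > 0, "buy", "sell"))

# Step 2: select predictors on the full 15,000 bars
rf_full <- randomForest(x = predictors, y = target, ntree = 500, importance = TRUE)
imp     <- importance(rf_full, type = 1)          # mean decrease in accuracy
top20   <- rownames(imp)[order(imp[, 1], decreasing = TRUE)][1:20]

# Steps 3-4: slide a 2,000-bar window and re-select significant predictors
# from the 20 chosen above
window <- 2000
step   <- 1          # one bar at a time, as described (increase for a quick trial)
selected <- list()
for (start in seq(1, n_bars - window + 1, by = step)) {
  idx  <- start:(start + window - 1)
  rf_w <- randomForest(x = predictors[idx, top20], y = target[idx],
                       ntree = 200, importance = TRUE)
  imp_w <- importance(rf_w, type = 1)
  selected[[length(selected) + 1]] <- rownames(imp_w)[order(imp_w[, 1], decreasing = TRUE)][1:10]
}
# The contents of `selected` change from window to window (step 4).
```

With a step of one bar this means roughly 13,000 forest fits, so in practice a larger step is used for trial runs; the point of the sketch is only the structure of the procedure.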
1. How? Do you need 60 years of daily history?
2. You are looking into the future, honourable one. You select 15-20 predictors on the whole history, and then check them on the "untrained" sample of the same 15000 bars? If someone told me today what 15-20 predictors will be "sampled" in the future, I wouldn't need anything else to become a billionaire.
I use MatLab. There are lots of different toolboxes in there. So you don't need to assume that I don't know your methods. Nor should you assume that I have some superficial or dilettante approach (which is exactly what I think of yours). Everyone can use different packages and toolboxes, but not everyone can understand their essence. And you don't need to advertise your services and book to me. My problems are much more complicated than the definition of overtraining. I have 10 thousand economic indicators as predictors, and it is very difficult to choose the ones that influence the market. If I go through each predictor individually, I will miss situations like my example above, where the target series does not correlate with one of the predictors, yet that predictor enters the model of the target series. If I go through all possible combinations of predictors, there will never be enough time. Even a search over pairs of predictors out of 10,000 takes more than a week. So far I have a biological self-pruning network for selecting N relevant variables (it takes a long time to explain; you need some background to understand it). This network is faster than an exhaustive search over all combinations of predictors, but still slow. So if you have a brilliant idea how to determine that x1 is part of the model of y by looking only at y and x1 in my example, I'll give you a gold medal.
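For what it's worth, the "more than a week" figure for a pairwise search is easy to sanity-check; the cost per fit below is purely an assumed number for illustration.

```r
# Rough arithmetic behind the pairwise-search complaint: 10,000 predictors give
# about 50 million pairs, so even a fast fit per pair adds up to days of compute.
n_predictors    <- 10000
n_pairs         <- choose(n_predictors, 2)   # 49,995,000 pairs
seconds_per_fit <- 0.01                      # assumed cost of fitting/scoring one pair
n_pairs * seconds_per_fit / 86400            # roughly 5.8 days of uninterrupted compute
```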
1. How's that? 60 years of daily history needed?
Let's not exaggerate.
I work in forex. I predict trends, and I am quite satisfied with trends that have 50-100 pip reversals. I don't need daily history for that. In my examples this is H1; 18,000 bars is about three years.
2. You look into the future, honoured one. Do you select 15-20 predictors on the whole history and then test them on an "untrained" sample of the same 15000 bars?
I have already explained that. It would be highly desirable for you to read what I am explaining for you personally. I do not look into the future. Performance is always measured out of sample. Rattle simply does not offer the other possibility you suggest, even if I wanted to use it.
I have problems far more complex than defining overtraining. I have 10 thousand economic indicators-predictors, and how to choose those influencing the market is very difficult. If you go through each predictor individually,
There is no exhaustive search here. Random forests work best when the number of predictors is measured in thousands; I once saw a figure of 50 thousand. On my data the figures are as follows: 90 predictors, 10,000 rows (bars), and a model training time of about a minute on one core. As a result you get: class labels, probabilities of the class labels, and the significance of each predictor in building the model.
This network is faster than a search of all possible combinations of predictors,
There is no predictor search as such. The algorithm works as follows. There is a parameter: the number of predictors considered in a tree node when the classification decision is made. You can set it yourself; the default is the square root of the total number of predictors (with my maximum of 90 predictors that is about 9). So roughly 9 predictors are used in each node, but for every node they are chosen at random from the full set of 90. Through cross-validation the most significant predictors are eventually selected and used in the final tree construction.
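A sketch of the quantities mentioned in the last two posts, using the randomForest package in R on synthetic data of the same shape (10,000 rows, 90 predictors); the model is only a stand-in, not the author's actual setup.

```r
# randomForest on 10,000 rows x 90 predictors: the default mtry = floor(sqrt(90)) = 9
# predictors are tried at each node, drawn at random from all 90; the fit returns
# class labels, class probabilities and a significance score for every predictor.
library(randomForest)
set.seed(7)

X <- as.data.frame(matrix(rnorm(10000 * 90), ncol = 90,
                          dimnames = list(NULL, paste0("x", 1:90))))
y <- factor(ifelse(X$x1 + X$x2 - X$x3 + rnorm(10000) > 0, "up", "down"))

floor(sqrt(ncol(X)))                      # 9: the default mtry for classification

rf <- randomForest(x = X, y = y, ntree = 500,
                   mtry = floor(sqrt(ncol(X))), importance = TRUE)

head(predict(rf, X))                      # class labels
head(predict(rf, X, type = "prob"))       # probabilities of the class labels
head(importance(rf, type = 1))            # significance of each predictor
```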
PS.
Why don't you use R? If you want a paid package, then SAS...
MatLab is not among the specialised statistical packages at all.
Please explain why you go to the trouble of selecting predictors at all. After all, modern algorithms are capable of processing thousands of predictors, even if most of them are random, and the models remain reliable.
Unfortunately, that's not the case. At least not for me.
I am not aware of any predictor selection algorithm (and I know several dozen selection algorithms) after whose application overtraining of the model would be excluded.
The old rule of statistics still applies: "Garbage in, garbage out".
I don't know, I tested my programme and it seems to be fine.
If I may quote from my writing on another resource: "We tested the correctness of the programme on the data taken from here:
http://archive.ics.uci.edu/ml/index.html
(Site of the Centre for Machine Learning and Intelligent Systems).
Each training example had a length of 10,000 features, 3,000 of which were random in nature and were added specifically to test the quality of classification. In total, 100 examples were used in the training sample, which is undoubtedly very few, but we could not find more. For testing we used another 100 examples.
(Link to the archive with the original data).
The recognition accuracy was 75% on unfamiliar data. We are convinced that with more data to train on, we could significantly increase the accuracy of the predictions." End of quote.
P.S.: Training took about 2 minutes on a PC with an 8-core processor and 8 GB of memory, because the training time of the algorithm does not grow exponentially with the number of predictors, and uninformative predictors are automatically left unused.
If anyone is interested, here is the link cybercortex.blogspot.ru (don't consider it an advertisement:)
You have given a great and quite typical example on which to show the problem.
1. On the surface lies the fact that, unlike your example, all financial series are so-called time series, in which the order of values matters. Therefore models in financial markets, unlike models in medicine, must take this nuance into account.
2. But there is a much more serious circumstance, and it is directly related to the topic of your question.
In medicine, deciding the question "sick or not sick" is diagnosis, and diagnosis is half of all medicine. A great many people research, justify and look for "predictors" (in our terminology) which, according to those researchers, are relevant to the verdict "sick or not sick". We see nothing of the sort in forex. In the stock market, research on the relationship between economic causes and the direction of movement is common, but none of it applies to intraday intervals.
Therefore, when building machine learning models on intraday intervals, formal mathematical quantities such as indicators are used. And, as it turns out, it is very easy to include all kinds of rubbish in the model, and the rule "garbage in, garbage out" starts to work.
Including rubbish predictors in the model, i.e. predictors that have a weak influence on the target variable, leads to overtraining of the model, where everything is fine on the training sample but problems arise outside it.
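A toy illustration of that effect (mine, not from the post above): a short training sample plus a pile of purely random "indicator" columns gives a model that looks excellent in-sample and falls apart out of sample.

```r
# "Garbage in, garbage out" in miniature: only `signal` actually drives the target,
# the other 50 columns are noise, yet the fitted model memorises them and the
# in-sample fit looks far better than the out-of-sample one.
set.seed(3)
n_rubbish <- 50

make_data <- function(n) {
  signal  <- rnorm(n)
  rubbish <- matrix(rnorm(n * n_rubbish), ncol = n_rubbish)
  data.frame(y = signal + rnorm(n), signal = signal, rubbish)
}

train <- make_data(60)     # deliberately short training sample
test  <- make_data(500)

fit <- lm(y ~ ., data = train)            # rubbish predictors included in the model

cor(train$y, predict(fit, train))^2       # in-sample R^2: very high
cor(test$y,  predict(fit, test))^2        # out-of-sample R^2: much lower
```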
1. "All financial series belong to the so-called time series, in which the order of values is important." - nobody denies this and this order is not violated, even though it is a time series. You, having trained the model on the prices P1, P2, P3...Pn, do not change their order when testing on Out Of Samples or in real use.
2. I agree with you on one thing: if the input is 100% rubbish predictors, we get 100% rubbish at the output. That is obvious and nobody argues with it. All I am saying is that there are algorithms for which culling the data does not matter, because they give good out-of-sample results with any share of rubbish data short of 100%, since the rubbish data is de facto not used. It is also important to distinguish between algorithms for which dimensionality reduction is critical, for example via principal component analysis or autoencoders, and algorithms that are insensitive to the dimensionality of the data.
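A minimal sketch of the dimensionality-reduction route mentioned above (principal components feeding a model that is sensitive to the number of inputs); the data, the component count and the downstream model are placeholders of mine.

```r
# Reduce 200 raw predictors to their first 20 principal components before
# fitting a model that does not cope well with high-dimensional input.
set.seed(11)
X <- matrix(rnorm(1000 * 200), ncol = 200)
y <- factor(X[, 1] + X[, 2] + rnorm(1000) > 0)

pca    <- prcomp(X, scale. = TRUE)              # principal component analysis
n_comp <- 20                                    # placeholder choice of components
Z      <- pca$x[, 1:n_comp]

model <- glm(y ~ ., data = data.frame(y = y, Z), family = binomial)

# New data must go through the same rotation:
# Z_new <- predict(pca, newdata = X_new)[, 1:n_comp]
```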
"In the stock market, research on the relationship between economic causes and direction of movement is common, but none of this applies to intraday intervals." - Well, it does, and it does apply to intraday intervals, such as the release of Non-Farm Payrolls.
3. of course I understand you, everyone earns as he can, but have you ever implemented any machine learning algorithms yourself? I am convinced that in order to understand how an algorithm works, you need to write it yourself from scratch. Believe me, in this case you will discover things that are not written about in books. And even seemingly obvious elements that seemed easy before, actually work differently than you thought:) Regards.