Over-optimization (curve fitting)

 

Hello Traders,

How do you solve the problem of over-optimization?

I tried the MT5 tester with forward testing.
The in-sample data was 2 years and the out-of-sample data was 3 years: 2 of them were defined in the tester as the forward period, and on the last year I tested the final forward results.

When I do that, I find that settings which look good both in the backtesting optimization results and in the forward test become horrible as soon as I test them again on new out-of-sample data!

So what are your methods for solving this problem?

Thanks.
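
To make the protocol above concrete, here is a minimal Python sketch of such a split; the dates, window lengths and function name are illustrative assumptions, not MT5 tester settings:

```python
# Minimal sketch of the split described above: 2 years in-sample for
# optimization, 2 years forward, and 1 final hold-out year the chosen
# settings have never seen. Dates and names are illustrative only.
from datetime import date

def walk_forward_windows(start: date, end: date):
    """Split [start, end) into in-sample / forward / hold-out windows."""
    def plus_years(d: date, n: int) -> date:
        return d.replace(year=d.year + n)

    in_sample = (start, plus_years(start, 2))          # optimize here
    forward   = (in_sample[1], plus_years(start, 4))   # tester's forward period
    hold_out  = (forward[1], end)                      # final unseen year
    return in_sample, forward, hold_out

for name, (a, b) in zip(("in-sample", "forward", "hold-out"),
                        walk_forward_windows(date(2015, 1, 1), date(2020, 1, 1))):
    print(f"{name:9s}: {a} .. {b}")
```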

Testing trading strategies on real ticks
www.mql5.com
The article provides the results of testing a simple trading strategy in three modes: "1 minute OHLC" using only Open, High, Low and Close prices of minute bars; detailed modeling in "Every tick" mode, as well as the most accurate "Every tick based on real ticks" mode applying actual historical data. Comparing the results allows us to assess...
 

How many types of exit do you have in your trading system?

A basic trading system with just TP and SL is more prone to curve fitting.
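
If it helps, here is a rough Python sketch of what "more than one type of exit" can look like; all names and thresholds below are invented example values, not a recommended setup:

```python
# Illustrative sketch: several independent exit conditions instead of a
# single TP/SL pair. All thresholds below are arbitrary example values.
from dataclasses import dataclass
from datetime import datetime, timedelta

@dataclass
class Position:
    opened_at: datetime
    take_profit: float
    stop_loss: float

@dataclass
class Bar:
    close: float
    sma_fast: float
    sma_slow: float

def exit_reason(p: Position, bar: Bar, now: datetime):
    """Return which exit fires first, if any."""
    if bar.close >= p.take_profit:
        return "take profit"
    if bar.close <= p.stop_loss:
        return "stop loss"
    if now - p.opened_at >= timedelta(days=5):   # time-based exit
        return "time stop"
    if bar.sma_fast < bar.sma_slow:              # signal/trend-reversal exit
        return "signal exit"
    return None

pos = Position(opened_at=datetime(2024, 1, 1), take_profit=1.10, stop_loss=1.05)
bar = Bar(close=1.06, sma_fast=1.055, sma_slow=1.060)
print(exit_reason(pos, bar, now=datetime(2024, 1, 9)))   # -> "time stop"
```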

 

My suggestion is to use as much data as possible and as few oscillators as possible, like 1 or 2.

Attempt to keep the logic simple.

 
Abdelrahman Shehata:

Hello Traders,

How do you solve the problem of over-optimization?

I tried the MT5 tester with forward testing.
The in-sample data was 2 years and the out-of-sample data was 3 years: 2 of them were defined in the tester as the forward period, and on the last year I tested the final forward results.

When I do that, I find that settings which look good both in the backtesting optimization results and in the forward test become horrible as soon as I test them again on new out-of-sample data!

So what are your methods for solving this problem?

Thanks.

Hi, I test strategies on the last 10 years of data or even more. Also, from my research results, the profit factor gives you some clue: more than 3 can mean the strategy is fit to history. Regards, Greg
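
For reference, the profit factor is simply gross profit divided by gross loss; a quick Python sketch of the check Greg describes (the trade results are invented, and the threshold of 3 is his heuristic):

```python
# Profit factor = gross profit / gross loss over closed trades. Per the
# post above, values beyond ~3 in a backtest are a warning sign. The
# trade P/L values below are invented examples.
def profit_factor(trade_results):
    gross_profit = sum(r for r in trade_results if r > 0)
    gross_loss = -sum(r for r in trade_results if r < 0)
    return float("inf") if gross_loss == 0 else gross_profit / gross_loss

trades = [120.0, -40.0, 85.0, -30.0, 60.0]
pf = profit_factor(trades)
print(f"profit factor = {pf:.2f}")   # ~3.79 here: suspiciously high
```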

 
It always depends on how much history you backtest.
3 years is more likely to over-optimize than 10 years.
 

I'd say if you achieve profit in backtest only by optimization, the strategy is worthless. Optimization only makes sense on a strategy that is already generating profit.

The more optimizable parameters you have, the more room you give for these parameters to fit to the data. So keep them to a minimum, max 2-3 parameters.

Also, there is no general rule for how much history to test. Longer-term strategies, like those using the daily chart, need more, because there are only some 300 days in a year, while shorter time frames like the minute chart require less, because there are thousands of minutes in a year.

Also, flip the *forward* test around: optimize on recent history and, for the *forward* test to verify results, use the period BEFORE the optimized period, when trading conditions were worse (higher spreads).
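
In code terms, the reversed split could look like this minimal Python sketch (all dates are placeholders):

```python
# Sketch of the reversed validation described above: optimize on recent
# history, then verify on the OLDER period with worse trading conditions
# (higher spreads). All dates are placeholders.
from datetime import date

history_start, history_end = date(2010, 1, 1), date(2020, 1, 1)
split = date(2016, 1, 1)

optimize_window = (split, history_end)      # optimize on recent history
validate_window = (history_start, split)    # verify on the earlier, harsher period

print("optimize on:", optimize_window[0], "..", optimize_window[1])
print("validate on:", validate_window[0], "..", validate_window[1])
```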

 

This is a very interesting and important topic (@Enrique Dangeroux: many good points, totally agree).

In order not to be fooled by overfitting, we (1) need to be able to detect the presence of overfitting and (2) take measures against it.

The first part is easier. You already did it by observing a significant mismatch between backtesting and forward testing.

For the second part, I think we need to take a close look at what overfitting exactly is. To my understanding it involves dependence on overly specific rules that worked in the past but don't necessarily work in the future. For example, the case where a 50-period SMA leads to glorious results, but in the future only a 53-period SMA makes the magic happen, whereas the original 50-period setting suddenly "stops working". So how can we take action?

This is my personal assessment and not engraved in stone, so just some ideas:

1. FEWER SETTINGS. If we perform a genetic optimization with 15 input parameters that can each take 100 different values, we end up with an insanely high number of possible parameter combinations (100^15 = 10^30), and the probability that the best combination is in fact a random outlier is much higher, because anything truly significant with reproducible good trading results should also reveal itself under slightly less detailed and rigorous settings.

2. CRITERIA AS RANGES INSTEAD OF SPECIFIC NUMBERS. This part requires some coding work, because the built-in genetic optimizer can only return specific numbers as results, but we could for example optimize for the best mean and multiples of its standard deviation of an entry/exit criterion instead of a single number, and then also trade accordingly, i.e. for example "enter a trade if price is within a range of x +/- n standard deviations and my indicator is within y +/- m standard deviations..." instead of something like "enter if price exactly crosses the 17-period MA and RSI is >78.15" ... you get the idea (see the first sketch after this list).

3. ASSESS DISTRIBUTION GRAPHS OF EA PARAMETERS VS. FINAL BALANCE RESULT. What I mean by that: is there a steady correlation (it doesn't need to be linear) between a parameter and the outcome, e.g. something like "the outcome steadily improves as the parameter increases up to a peak at x and then worsens again upon any further increase", versus a chaotic distribution with a seemingly random relationship between parameter value and outcome? (See the second sketch after this list.)

4. USE OF PARAMETER POPULATIONS INSTEAD OF FIXED PARAMETERS. This part is the hardest to put into code, but in my personal view the best trick against overfitting. What I mean by that: create your own genetic optimization function instead of using the built-in optimizer. The problem with the built-in algorithm is that it only comes up with one overly specific parameter combination, i.e. only exact values. Those may have been the best in the past, but the future may be different (it most definitely will be).

However, another way to optimize is to start with a random population of parameter combinations and apply the "survival of the fittest" method to them: "kill" the weakest performers (according to continuously adapting "fitness" scores for each combination) and replace them by creating new combinations ("specimens") through recombination ("cross-breeding") of combinations selected from the better-performing part of the population (edit: let me add that the built-in algorithm probably works in a similar way, but what we get is a results table related to one specific testing period instead of a continuous process). The advantage of a custom genetic optimization is that the optimization doesn't end where the backtest ends, because in realtime trading we still have this population (saved to a file) to choose the parameters from, instead of relying on just one (best) specific setting, and this population continues to evolve with every new winner or loser. This is a bit like repeating the whole optimization process after each trade. Who does that? Nobody.

One might argue that repetitive reassessment of the "fittest" might actually lead to more overfitting. This isn't true, because at any given time the current "fittest" performer isn't per se any less or more overfitted than the result of the standard method with a single genetic optimization. Both methods come with the important flaw that they rely purely on the past, yet it can still be assumed that if there is any value to the past at all, the more recent past should have higher validity. But the real difference with custom genetic optimization is in the variability of the parameters and continuous adaptation (market periods!). Just like with biological evolution, the population retains a certain amount of "diversity". (A bare-bones sketch follows at the end of this post.)
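
To make point 2 concrete, here is a minimal Python sketch of a range-based entry condition; all means, deviations and multipliers are invented placeholders, not optimized values:

```python
# Sketch for point 2: entry criteria as ranges (mean +/- n standard
# deviations) instead of exact numbers. All values are made up.
def in_range(value, mean, stdev, n):
    return mean - n * stdev <= value <= mean + n * stdev

def entry_signal(price, rsi):
    # The means/deviations would come from optimization; these are placeholders.
    price_ok = in_range(price, mean=1.1050, stdev=0.0020, n=1.5)
    rsi_ok = in_range(rsi, mean=70.0, stdev=5.0, n=1.0)
    return price_ok and rsi_ok

print(entry_signal(price=1.1062, rsi=73.0))   # True: both inside their bands
```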
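
And for point 3, a rough Python sketch of one way to quantify "steady vs. chaotic": compare outcomes of neighbouring parameter values (the result data below is invented; in practice it would come from the tester's optimization report):

```python
# Sketch for point 3: a robust parameter shows small outcome changes
# between neighbouring values; a chaotic one jumps around randomly.
# The (parameter, final_balance) pairs below are invented example data.
def roughness(results):
    """Mean absolute jump in outcome between adjacent parameter values."""
    outcomes = [balance for _, balance in sorted(results)]
    jumps = [abs(b - a) for a, b in zip(outcomes, outcomes[1:])]
    return sum(jumps) / len(jumps)

smooth  = [(p, 1000 + 50 * p - 2 * p * p) for p in range(10)]       # steady improvement
chaotic = [(p, 1000 + (p * 7919) % 500 - 250) for p in range(10)]   # pseudo-random
print("smooth :", roughness(smooth))    # small jumps -> trustworthy shape
print("chaotic:", roughness(chaotic))   # large jumps -> likely curve fitting
```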

And yes, I know... especially option 4 is a bit more complicated and surely isn't for everyone, but I guess this still doesn't invalidate the other options.

Also, this is a subjective analysis and I may be missing some points. Any other ideas / input / critique is much appreciated.
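
As announced in point 4, here is a bare-bones Python sketch of the population idea; everything in it (parameter names, fitness decay, replacement rule) is a simplified assumption for illustration, not Chris70's actual implementation:

```python
# Bare-bones sketch of point 4: a living population of parameter sets,
# fitness updated after every closed trade, weakest member replaced by
# recombining two of the stronger ones. Simplified illustration only.
import random

def random_params():
    return {"ma_period": random.randint(5, 200),
            "rsi_level": random.uniform(50.0, 90.0)}

def crossover(a, b):
    # Child takes each parameter from one of the two parents at random.
    return {key: random.choice((a[key], b[key])) for key in a}

population = [{"params": random_params(), "fitness": 0.0} for _ in range(20)]

def on_trade_closed(specimen, profit, decay=0.9):
    """Recency-weighted fitness update, then cull and re-breed."""
    specimen["fitness"] = decay * specimen["fitness"] + profit
    population.sort(key=lambda s: s["fitness"], reverse=True)
    parents = population[: len(population) // 2]       # the better half
    population[-1] = {"params": crossover(random.choice(parents)["params"],
                                          random.choice(parents)["params"]),
                      "fitness": 0.0}

# Instead of one fixed setting, each new trade draws its parameters
# from the evolving population (which would be persisted to a file):
active = random.choice(population)
on_trade_closed(active, profit=35.0)
print(len(population), "specimens, best fitness:", population[0]["fitness"])
```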

 
Chris70:

4. USE OF PARAMETER POPULATIONS INSTEAD OF FIXED PARAMETERS. ...

The built-in Genetic Algorithm works this way; that's what a GA is. For the continuous process, it can be coded and totally automated using the built-in GA. The only drawback is that there is no MQL API for the Strategy Tester, so you need to be creative. So there is no need to create custom genetic code, at least not for the reason invoked here.

 
Chris70:
...

I think the main point to avoid curve fitting is your point 3 (in relation to 1). Preliminary checks should be done even before considering an optimization (whatever method is used), the goal being to know if the strategy can be optimized at all. Chaos can't really be optimized, and trying will always lead to curve fitting.

 
Alain Verleyen:

For the continuous process, it can be coded and totally automated using the built-in GA. The only drawback is there is no mql API for the Strategy Tester, so you need to be creative. So there is no need to create a custom genetic code, at least for the reason invoked here.

If an automated review of the optimization is implemented, then great! But who actually goes down that route in practice? If you do - perfect!


All this genetic stuff really was to be interpreted as a side note only; it's by no means necessary, it's just one aspect of this whole overfitting thing and I guess the majority here won't care about it. For my personal needs, I coded a genetic algorithm a while back, so the mentioned possibility of using self-updating "populations" of parameter combinations instead of one single combination was more about sharing an idea than asking a question.

I should mention that I personally didn't try to create a workaround with the built-in GA, so I can only guess what's possible. Still, I suppose(!) that it also requires some serious coding work (and skills), and although it may be possible to create a continuous process, I suspect some problems: doesn't it require a complete repetition of the whole optimization every time? Also, the built-in GA performs its "selection" in chunks of "generations" of multiple whole test-period passes, instead of taking action based on a realtime adaptation of the fitness ranking on a transaction-based schedule, so the approach is a little different (pros and cons). In practice, via custom-made solutions it's possible to get a ranking update within milliseconds, whereas a complete new optimization via the built-in GA might take >24h, depending on the number of parameters. I don't know if there is a shortcut to avoid that; if yes, this would be interesting! Again... all this is only a side note - probably irrelevant to most here.

 

Let me add another point to the list that may help a little:

5. LOOKING AT THE EARLY GENERATIONS, TOO. By this I mean that a truly robust strategy can't be overthrown that easily by slightly "less perfect" parameters, so during the early generations of a genetic optimization (far left part of the results graph) we should see a good balance between some minor/medium losers and some at least moderate winners (with regard to the final balance), instead of 95% net losers or many near-bankruptcy passes. If the okay-ish and good passes are almost all located on the right part of the graph, then we are most likely dealing with outliers due to overfitting and therefore with results nowhere close to being reproducible.
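
As a quick sanity check along these lines, one could export the pass results in chronological order and measure the winner share among the earliest passes; a Python sketch with invented numbers:

```python
# Sketch for point 5: among the earliest optimization passes, a robust
# strategy should already show a fair share of moderate winners. The
# pass balances below are invented example numbers.
def early_winner_share(pass_balances, initial_deposit, early_fraction=0.25):
    early = pass_balances[: max(1, int(len(pass_balances) * early_fraction))]
    winners = sum(1 for b in early if b > initial_deposit)
    return winners / len(early)

balances = [9800, 10400, 9100, 11200, 10050, 8700, 12500, 13900]  # chronological
share = early_winner_share(balances, initial_deposit=10000)
print(f"winners among early passes: {share:.0%}")   # 50% here; ~5% would be a red flag
```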

Reason: