Over-optimization (curve fitting) - page 2

 
Chris70:

This is a very interesting and important topic (@Enrique Dangeroux: many good points, totally agree).

In order not to be fooled by overfitting, we (1) need to be able to detect the presence of overfitting and (2) take measures against it.

The first part is easier. You already did it by observing a significant mismatch between backtesting and forward testing.

For the second part I think we need to take a close look at what overfitting exactly is. To my understanding it involves dependence on overly specific rules that worked in the past but don't necessarily work in the future. For example the case where a 50-period SMA leads to glorious results, but in the future only a 53-period SMA makes the magic happen, whereas the 50-period example suddenly "stops working". So how can we take action?

This is my personal assessment and not engraved in stone, so just some ideas:

1. LESS SETTINGS. If we perform a genetic optimization with 15 input parameters that can each take 100 different values, we end up with an insanely high number of possible parameter combinations (100^15 = 10^30), and the probability that the best combination is in fact a random outlier is much higher. Something truly significant with reproducible good trading results should also reveal itself with less detailed and rigorous settings.

2. CRITERIA AS RANGES INSTEAD OF SPECIFIC NUMBERS. This part requires some coding work, because the built-in genetic optimizer can only return specific numbers as results. But we could, for example, optimize for the best mean and multiples of its standard deviation of an entry/exit criterion instead of a single number, and then also trade accordingly, i.e. "enter a trade if price is within a range of x +/- n standard deviations and my indicator is within y +/- m standard deviations..." instead of something like "enter if price exactly crosses the 17-period MA and RSI is >78.15"... you get the idea.
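
To make the idea concrete, here is a minimal Python sketch of such a range-based entry condition. The window data, the n/m multipliers and the use of RSI are placeholder assumptions for illustration, not a concrete strategy:

```python
import statistics

def in_range(value, center, tolerance):
    """True if value falls within center +/- tolerance."""
    return abs(value - center) <= tolerance

def entry_signal(price, rsi, prices_window, rsi_window, n=2.0, m=1.5):
    """Range-based entry: both price and the (hypothetical) RSI reading
    must sit inside a band of n / m standard deviations around their
    recent means - a spectrum instead of a sharp cut-off."""
    price_mean = statistics.mean(prices_window)
    price_sd = statistics.stdev(prices_window)
    rsi_mean = statistics.mean(rsi_window)
    rsi_sd = statistics.stdev(rsi_window)
    return (in_range(price, price_mean, n * price_sd)
            and in_range(rsi, rsi_mean, m * rsi_sd))
```

The widths n and m would then be what the optimizer searches for, instead of a single exact threshold.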

3. ASSESS DISTRIBUTION GRAPHS OF EA PARAMETERS VS. FINAL BALANCE RESULT. What I mean by that: is there a steady correlation (it doesn't need to be linear) between a parameter and the outcome, e.g. something like "the outcome steadily improves as the parameter increases up to a peak at x and then worsens again upon any further increase", versus a chaotic distribution with a seemingly random relationship between parameter value and outcome?
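
The "steady vs. chaotic" impression can even be quantified with a crude sketch like the following (the exact metric is my own assumption, not a standard measure): sort the backtest outcomes by parameter value and compare the average jump between neighbouring outcomes to the overall spread.

```python
import statistics

def smoothness_ratio(param_values, outcomes):
    """Sort backtest outcomes by parameter value and compare the average
    jump between neighbouring outcomes to the overall spread.
    A ratio well below 1 hints at a steady parameter/outcome relation;
    a ratio near or above 1 looks like noise - a red flag for overfitting."""
    pairs = sorted(zip(param_values, outcomes))
    sorted_outcomes = [o for _, o in pairs]
    jumps = [abs(b - a) for a, b in zip(sorted_outcomes, sorted_outcomes[1:])]
    return statistics.mean(jumps) / statistics.stdev(sorted_outcomes)
```

A parameter whose outcome curve rises to a peak and falls again will score much lower than one whose outcomes jump around randomly.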

4. USE OF PARAMETER POPULATIONS INSTEAD OF FIXED PARAMETERS. This is the hardest part to put into code, but in my personal view the best trick against overfitting. What do I mean by that: create your own genetic optimization function instead of using the built-in optimizer. The problem with the built-in algorithm is that it only comes up with one overly specific parameter combination, i.e. only exact values. Those may have been the best in the past, but the future may be different (it most definitely will be).

Another way to optimize is to start with a random population of parameter combinations and apply the "survival of the fittest" method to them: "kill" the weakest performers (according to continuously adapting "fitness" scores for each combination) and replace them by creating new combinations ("specimens") through recombination ("cross-breeding") of combinations selected from the better performing parts of the population. (Edit: let me add that the built-in algorithm probably works in a similar way, but what we get is a results table related to one specific testing period instead of a continuous process.)

The advantage of a custom genetic optimization is that the optimization doesn't end where the backtest ends: in realtime trading we still have this population (saved to a file) to choose the parameters from, instead of relying on just one (best) specific setting, and this population continues to evolve with every new winner or loser. This is a bit like repeating the whole optimization process after each trade. Who does that? Nobody. One might argue that repetitive reassessment of the "fittest" might actually lead to more overfitting. This isn't true, because at any given time the current "fittest" performer isn't per se any more or less overfitted than the result of the standard method with a single genetic optimization.

Both methods come with the important flaw that they rely purely on the past, yet it can still be assumed that if the past has any value at all, the more recent past should have higher validity. But the real difference with custom genetic optimization lies in the variability of the parameters and the continuous adaptation (market periods!). Just like with biological evolution, the population retains a certain amount of "diversity".
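
A stripped-down Python sketch of such a continuously evolving population. The two strategy parameters (`ma_period`, `band_width`), their bounds and the fitness weighting are purely illustrative assumptions:

```python
import random

def random_params():
    # Hypothetical two-parameter strategy: an MA period and a band width.
    return {"ma_period": random.randint(5, 200),
            "band_width": random.uniform(0.5, 5.0)}

def crossover(a, b):
    # "Cross-breeding": each gene is taken from one of the two parents.
    return {k: random.choice((a[k], b[k])) for k in a}

def mutate(p, rate=0.1):
    # Small random changes keep diversity in the population.
    q = dict(p)
    if random.random() < rate:
        q["ma_period"] = max(5, min(200, q["ma_period"] + random.randint(-10, 10)))
    if random.random() < rate:
        q["band_width"] = max(0.5, min(5.0, q["band_width"] + random.uniform(-0.5, 0.5)))
    return q

class EvolvingPopulation:
    """Keeps a population of parameter sets whose fitness scores adapt
    continuously: after every closed trade the used specimen's score is
    updated (exponentially weighted), the weakest specimen is "killed"
    and replaced by a cross-bred child of two better performers."""

    def __init__(self, size=20, alpha=0.3):
        self.alpha = alpha  # weight of the newest trade result
        self.pop = [{"params": random_params(), "fitness": 0.0}
                    for _ in range(size)]

    def pick(self):
        # Trade with the currently fittest specimen.
        return max(self.pop, key=lambda s: s["fitness"])

    def report_trade(self, specimen, profit):
        # Update the specimen's fitness with the newest result...
        specimen["fitness"] = ((1 - self.alpha) * specimen["fitness"]
                               + self.alpha * profit)
        # ...then replace the weakest specimen by a child bred from
        # two specimens of the better half of the population.
        self.pop.sort(key=lambda s: s["fitness"])
        parents = random.sample(self.pop[len(self.pop) // 2:], 2)
        child = mutate(crossover(parents[0]["params"], parents[1]["params"]))
        self.pop[0] = {"params": child, "fitness": 0.0}
```

In this sketch `pick()` always trades the current best; in practice one would occasionally trade a random specimen so that fresh children get a chance to prove themselves, and the population would be saved to a file between sessions, as described above.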

And yes, I know... especially option 4 is a bit more complicated and surely isn't for everyone, but I guess this still doesn't invalidate the other options.

Also, this is a subjective analysis and I may be missing some points. Any other ideas / input / critique is much appreciated.

Hi,

Your findings are interesting and actually coincide with my thinking.

You said: optimize for the best mean and multiples of its standard deviation of an entry/exit criterion instead of a single number and then also trade accordingly, i.e. for example "entry trade if price is within a range of x +/- n standard deviations and my indicator is within y +/- m standard deviations..."

Can you explain this in more detail? Calculating the standard deviation of what range exactly?

 
cemal:

Can you explain this in more detail? Calculating the standard deviation of what range exactly?

This really depends on the individual strategy (and it also isn't always possible!). Standard deviations (over n periods) make sense if a value shows a changing amount of fluctuation (volatility) over time. This is especially true for the price itself, so just to give an example - if it makes sense for the strategy - one might use SD-based trend zones (trend line +/- x*SD) instead of sharp-cutoff trend lines, meaning that you have a spectrum of prices instead of a single cut-off price that fulfills your given criterion. Or we could add SD-based error tolerances to oscillators like RSI or Stochastic. With other examples that show a more stable fluctuation, simple fixed ranges (hence optimizing for the width of the range) could be used instead of standard deviation multiples. The general thought is that overfitting usually results from being overly specific ("black and white"). In this context I may add:
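
As an illustration of such an SD-based trend zone, here is a minimal Python sketch: a least-squares trend line over the last closes with a band of +/- x standard deviations of the residuals around it (the data and the x multiplier are placeholders):

```python
import statistics

def trend_zone(prices, x=2.0):
    """Fit a straight trend line through the closes (least squares) and
    build a zone of +/- x standard deviations of the residuals around it.
    Returns (projected trend value for the next bar, half-width of the zone)."""
    n = len(prices)
    t = list(range(n))
    t_mean = statistics.mean(t)
    p_mean = statistics.mean(prices)
    slope = (sum((ti - t_mean) * (pi - p_mean) for ti, pi in zip(t, prices))
             / sum((ti - t_mean) ** 2 for ti in t))
    intercept = p_mean - slope * t_mean
    residuals = [pi - (intercept + slope * ti) for ti, pi in zip(t, prices)]
    half_width = x * statistics.stdev(residuals)
    return intercept + slope * n, half_width

def price_in_zone(price, prices, x=2.0):
    """True if price lies inside the trend zone rather than
    exactly on a sharp cut-off line."""
    center, half = trend_zone(prices, x)
    return abs(price - center) <= half
```

Optimizing for x (the zone width) then replaces optimizing for one exact trigger price.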

5. A GOOD AGENT IS HARD TO KILL. ;-) What's true for secret agents is also true for Expert Advisors: if we change all parameters by a small amount (let's say 10%), this shouldn't be able to turn a well performing backtest into a complete loser. Apart from using this method to test a strategy's robustness, we can also allow for such ranges in the first place (and optimize for the center and the deviation tolerance of parameters that fulfill the criteria of e.g. an indicator result, instead of optimizing for one specific cut-off).
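
This robustness check is easy to automate. A small Python sketch (the `backtest` callable is a stand-in for whatever tester you use, and the `max_drop` threshold is an arbitrary assumption): shift every parameter by -10%, 0 or +10% in all combinations and verify that no combination destroys the result.

```python
import itertools

def perturbations(params, pct=0.10):
    """Yield every parameter set where each value is shifted
    by -pct, 0 or +pct of itself."""
    keys = list(params)
    for signs in itertools.product((-1, 0, 1), repeat=len(keys)):
        yield {k: params[k] * (1 + s * pct) for k, s in zip(keys, signs)}

def is_robust(params, backtest, pct=0.10, max_drop=0.5):
    """A good agent is hard to kill: no +/-10% shift of the parameters
    should drop the backtest result below half of the original.
    `backtest` is a user-supplied function params -> final profit."""
    base = backtest(params)
    return all(backtest(p) >= max_drop * base for p in perturbations(params, pct))
```

A strategy whose profit surface is a broad hill passes; one whose profit exists only at one exact parameter value fails.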

 

So you measure the SD-based trend zones (trend line +/- x*SD) of price (calculating the amount of fluctuation/volatility) together with the SD-based trend lines of oscillators like RSI or Stochastic, correct?

I do similar things in a different way. Basically I do a similarity analysis (measuring the fluctuation/volatility) and a performance (portfolio) analysis and then check how this similarity relates to the profit.

 

I was just giving these examples for the sake of easy understanding. I wasn't suggesting a concrete strategy or anything that I apply myself exactly like that.

The essential message was about trying to use ranges instead of specific single values where it's possible and makes sense (sometimes it doesn't!). Then you optimize for the location and width of the range.

This way you allow for a limited amount of "gray spectrum" instead of black-and-white criteria that may have worked out perfectly in the past, but probably won't work exactly alike in the future.

How exactly it is implemented will differ between strategies. But that was not my point here. I was just mentioning this as part of a general list of tricks against curve fitting.

Whatever you call it - similarity / proximity / ranges - I guess we're talking about the same.

 
Thanks everyone for your help!
I will take your ideas into account while testing.
 
cemal:

So you measure the SD-based trend zones (trend line +/- x*SD) of price (calculating the amount of fluctuation/volatility) together with the SD-based trend lines of oscillators like RSI or Stochastic, correct?

I do similar things in a different way. Basically I do a similarity analysis (measuring the fluctuation/volatility) and a performance (portfolio) analysis and then check how this similarity relates to the profit.

Think of it like this: your strategy is a car engine.
If the engine generates a speed of 100 km/h all the time and it never changes - it's just 100 - it won't always work. It will do very well on a straight road, but when it comes to a small curved road it will just crash!
But if the car accelerates from 0 to 100 km/h in 4 seconds, can stop from 100 to 0 in 5 seconds, and the handling is awesome, it will perform superbly on almost any road!
So the point is: your system must have adaptive parts in it.
