Is MT4 optimization a random and inaccurate way to test multiple inputs?

 

Hello!

I have been an MQL4 developer for well over a year now and have worked on around 200 different coding projects. I usually use the MT4 Strategy Tester to quickly see how the EAs I develop react to market movement and whether they follow the rules I have set for them.

I have recently started developing my own system and am now at a point where I'd like to find the best possible input setup for a certain currency pair before I move on with the development. The easiest way would be to use the MT4 Strategy Tester's optimization feature, which (in theory) loops the test through every combination of the input values I give it.

What I noticed is pretty frustrating: when I take the exact inputs the optimization feature used for a given pass and run them as a regular backtest (same inputs, same time period and of course the same currency pair), I get a totally different result.

For example, I have attached two images showing the optimization result and the result I got when I used the same inputs in a regular backtest (I had to attach the images since they were unreadable when I inserted them directly). You can see that during the optimization, the 10th pass got a profit of $33835 with a profit factor of 1.45. When I used the same setup for a regular test, the profit was $5470 and the profit factor 1.13.

Now I get that with 90% (or even with 99%) modelling quality the results aren't very precise anyway. I also used the current spread, but since the strategy uses long-term entries on the H1 chart, it really shouldn't matter if the spread fluctuates by a tick or two.

Now to my question: is MT4's optimization feature using some sort of different tick data for its tests, or is it just producing the results at random? How come the test results are so different?

 
The first thing to check is the spread. Which spread was actually used? "Current spread" means the backtester takes the spread at the moment the test is run. You should use a fixed value if you want to compare results.

How do you know the spread fluctuated by only 1 or 2 ticks? Even if that's the case, you could be surprised by how much influence such a difference has.

Check your log to see which spread was used. Or try again with a fixed value.
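
To put a rough number on how much a small spread difference can matter, here is a back-of-the-envelope sketch. Every figure in it (trade count, lot size, point value) is made up for illustration and is not taken from Robert's tests. It also only counts the direct cost; a wider spread can additionally change which stops and entries get triggered, which this ignores.

```python
def extra_spread_cost(trades, lots, extra_points, point_value_per_lot):
    """Direct extra cost, in account currency, of paying `extra_points`
    more spread on every trade."""
    return trades * lots * extra_points * point_value_per_lot


# Hypothetical figures: 300 trades of 1.0 lot, 2 extra points of spread,
# 1.0 unit of account currency per point per lot.
cost = extra_spread_cost(trades=300, lots=1.0, extra_points=2,
                         point_value_per_lot=1.0)
print(cost)
```

With these assumed numbers the difference is already 600 in account currency before any change in trade behaviour is considered.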

 

Hi Robert 

The Strategy Tester has been discussed at length over many forum topics and, in my personal opinion, is only useful for testing EA logic, and not as an indicator of performance.

In your particular case, you are using "current spread" in your input options. This means you will get different results every time you run the tester. Your best bet, if you still want to persist with backtesting results, is to work out your broker's average spread on the pair you are testing and set it manually. That is the only way you will get repeatable results.
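
As a rough sketch of "work out the average spread": assuming you have logged the spread in points at regular intervals (in MQL4 the current value is available from `MarketInfo(Symbol(), MODE_SPREAD)`), you can average the samples and enter the rounded result in the tester's Spread field. The sample values below are hypothetical.

```python
from statistics import mean


def average_spread_points(samples):
    """Return the mean of the sampled spreads, rounded to whole points."""
    if not samples:
        raise ValueError("need at least one spread sample")
    return round(mean(samples))


# Hypothetical samples, in points, logged over a trading week.
samples = [12, 14, 11, 15, 13, 30, 12, 13]
fixed_spread = average_spread_points(samples)
print(fixed_spread)
```

Note how the single spike of 30 pulls the average up; whether you want the mean or something more robust like the median is a judgment call.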

p.s. LOL, Alain beat me to it while I was typing my reply :D

 

Hello!

Thank you for the prompt reply guys, really appreciate it.

I gave it a go with a fixed spread and used 2 inputs instead of 3 for the optimization. The results were similar to when I used a floating spread.

Compared with the same test run as a regular backtest, the optimizer got:

 - 2 times more profit

 - 79 more trades

 - a profit factor better by .02

 - a 7% bigger drawdown

I do get that results achieved in the strategy tester mean very little compared to results achieved in live trading, but I'm curious why the optimizer gets such different results using the same data. I even tried both the visual mode and the "Skip to" method, but the results are still totally different from what the optimizer got.

Is there a possibility the optimizer runs through the historical data differently?

 
Hello Robert. 

I think you have to test 2 cases.

1. Whether the optimizer returns the same results twice (I recall it does).
2. Whether the tester returns the same results twice.

My guess is that some kind of randomization is applied to your trades' "Slippage" setting during the Tester run.
If your slippage is set to 0, then the above does not apply.
 
There is never slippage while backtesting.
 
Hello Lorentzos!

Thank you for the thought. The thing with the backtester is that when you run the same test twice in a row, it mostly uses the cache to simply mirror the prior results of the same test, even when re-optimizing. Nonetheless, I have also run tests with the same settings after removing the cache files. The results vary a bit, but that's understandable. What is totally unacceptable is the optimizer producing results that differ by nearly a factor of 2 from the same test run as a regular backtest.

 

Hello everybody!

First off thank you for participating in the discussion. It has been a pleasure sharing ideas with you guys.

Today I was approached by a helpful person who gave me a new idea on how to tackle the issue. Apparently I could implement a self-calibration function in the EA that would be called every now and then and would loop through the historic bars itself to find the best possible input combination during the live run. That way the EA could keep itself up to date, so to speak, and I wouldn't have to spend time figuring out the best possible setup in the inaccurate backtester.

Before I start putting time into developing such a calibration mechanism, has anyone worked with this sort of system before and could point me to a sample article or piece of code?

I appreciate all your time thus far, it has been very eye-opening and helpful! 
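
For anyone reading along, here is a minimal sketch of what such a self-calibration loop could look like. It is plain Python rather than MQL4, and the scoring rule (hold long for one bar whenever the close is above its simple moving average) is a toy stand-in I made up, not Robert's strategy. The names `score_period` and `calibrate` and the price list are hypothetical.

```python
def sma(values, period, end):
    """Simple moving average of the `period` values ending at index `end`."""
    window = values[end - period + 1 : end + 1]
    return sum(window) / period


def score_period(closes, period):
    """Toy score: sum of next-bar changes on bars where close > SMA(period)."""
    total = 0.0
    for i in range(period - 1, len(closes) - 1):
        if closes[i] > sma(closes, period, i):
            total += closes[i + 1] - closes[i]
    return total


def calibrate(closes, candidates):
    """Return the candidate period with the highest toy score on `closes`."""
    usable = [p for p in candidates if p < len(closes)]
    if not usable:
        raise ValueError("not enough bars for any candidate period")
    return max(usable, key=lambda p: score_period(closes, p))


# Hypothetical recent closes; a live EA would read these from chart history.
closes = [1.10, 1.11, 1.12, 1.11, 1.13, 1.14, 1.13, 1.15, 1.16, 1.15, 1.17]
print(calibrate(closes, [2, 3, 5]))
```

The EA would call something like `calibrate` on a schedule (say, once per day) and swap in the returned value. Keep in mind this is still an optimization over past data, just done inside the EA instead of in the tester.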

 

Not only this, it is also possible to loop over all available symbols in your EA to pick the best symbol, i.e. the one offering the highest-odds trades.

This is not possible in the tester, so that fact alone brings great advantages because now it is possible to scan and calculate for example,

The symbol with the highest volume throughput, or

The symbol with the lowest spread, or

The symbol that is currently making the largest trending move,

And many more, it usually results in a combination of these elements.
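
A rough sketch of the ranking step, in Python for brevity. It assumes the EA has already collected one snapshot per symbol (in MQL4 the symbol list can be walked with `SymbolsTotal()` and `SymbolName()`); the field names and all numbers below are invented for illustration.

```python
# Hypothetical per-symbol snapshots gathered by the EA.
snapshots = {
    "EURUSD": {"spread": 12, "volume": 5200, "move_pct": 0.28},
    "GBPUSD": {"spread": 18, "volume": 4100, "move_pct": 0.45},
    "USDJPY": {"spread": 14, "volume": 6100, "move_pct": 0.19},
}


def best_symbol(snapshots, metric, lowest=False):
    """Return the symbol with the highest (or lowest) value of `metric`."""
    chooser = min if lowest else max
    return chooser(snapshots, key=lambda s: snapshots[s][metric])


print(best_symbol(snapshots, "spread", lowest=True))  # lowest spread
print(best_symbol(snapshots, "volume"))               # highest volume
print(best_symbol(snapshots, "move_pct"))             # largest move
```

In practice you would combine the metrics into a single score rather than rank on one at a time, but how to weight them depends on the strategy.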

 
Amen to that.
And you can have your EA trade on the top-performing symbol and timeframe combinations.
 
Maybe you can start broad and narrow down to the best possible settings.
I'll use the example of a period:
Test 10 20 30 40 50
Best on 30
Now test 27 to 33
Best on 33
Now test 31 to 35
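
The coarse-to-fine idea above can be sketched roughly like this. It is a variant, not the exact steps listed: after each pass it rescans one step either side of the best value with a step a third as large. The `score` function is a placeholder for whatever metric you optimize; the one used in the example is artificial and simply peaks at 33.

```python
def refine_search(score, low, high, step):
    """Scan low..high in `step` increments, then repeatedly rescan around
    the best value with a smaller step until the step reaches 1."""
    best = max(range(low, high + 1, step), key=score)
    while step > 1:
        new_step = max(1, step // 3)
        lo = max(low, best - step)
        hi = min(high, best + step)
        best = max(range(lo, hi + 1, new_step), key=score)
        step = new_step
    return best


# Artificial score with a single peak at period 33.
def score(period):
    return -(period - 33) ** 2


print(refine_search(score, low=10, high=50, step=10))
```

This only works well when the score changes fairly smoothly with the input. A narrow peak that falls between the coarse steps can be missed entirely.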