My EA passes the strategy tester but fails forward testing.

 

The strategy that I am currently developing performs very well in the strategy tester, but when I do forward testing it fails.

How can I shorten the gap between strategy tester data and forward testing?

 
Isaac Uriel Arenas Caldera:

The strategy that I am currently developing performs very well in the strategy tester, but when I do forward testing it fails.

How can I shorten the gap between strategy tester data and forward testing?

It's true, the results of the backtest and forward test will most likely be different. As the common saying goes: "Past performance does not guarantee future results."

Some of the main causes of these gaps are:

1. Overfitting: Input parameters that have been tuned until the results look “pretty”, but only on one particular stretch of historical data.
2. Real Market Conditions: Backtests often use fixed spreads, whereas in live markets, spreads can widen drastically (during news or low liquidity) and there is slippage that is not present in the simulation.
3. Look-ahead Bias: A coding error that makes it appear as if the EA "knows" future prices (peeking at subsequent data) before execution.
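To make point 3 concrete, here is a minimal Python sketch (not MQL5, and not anyone's actual EA). The closing prices and the `pnl` helper are made up for illustration. The same "go long after an up bar" rule is scored twice: once with the signal taken from the very bar being traded (peeking), once with the signal taken from the previous, already closed bar.

```python
# Hypothetical closing prices, one per bar.
closes = [100, 101, 100, 102, 101, 103, 102, 104]

def pnl(closes, lag):
    """Total P/L of going long over bar i (close[i-1] -> close[i])
    whenever the signal bar closed higher than the bar before it.

    lag=0: the signal bar is bar i itself, so the rule already "knows"
           how the traded bar ends (look-ahead bias).
    lag=1: the signal bar is bar i-1, which is closed when bar i opens.
    """
    total = 0
    for i in range(2, len(closes)):
        s = i - lag
        if closes[s] > closes[s - 1]:
            total += closes[i] - closes[i - 1]
    return total

print("with look-ahead:   ", pnl(closes, lag=0))
print("without look-ahead:", pnl(closes, lag=1))
```

On this toy series the peeking version only ever trades the bars that go up, while the honest version loses. A tester result that looks too clean is worth checking for this kind of indexing mistake.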

My suggestions to narrow the gap:

1. Use a Long Time Range: Perform a backtest of at least 2-3 years (for example from January 1, 2024 to the present) to see the resilience of the strategy in various market conditions.
2. Every Tick Based on Real Ticks: This is mandatory, so that the price simulation is close to actual market movements.
3. Test on a Cent Account: If it passes the rigorous testing, start on a small real account (Cent) to see the real execution of the broker.

Additional Message:
When you start live, don't rush to judge your EA as being wrong when a loss occurs, as long as the loss is still within the drawdown limit of the backtest results. Likewise, when you make a profit, don't immediately feel that the EA is perfect. Keep monitoring long-term performance.
 
Bambang Christianto #:
1. Use a Long Time Range: Perform a backtest of at least 2-3 years (for example from January 1, 2024 to the present) to see the resilience of the strategy in various market conditions.

The appropriate length of a backtest depends on the trading frequency of the EA being tested. For example, I use a minimum of 10 years for a swing trading strategy that trades 2 or 3 times per week. A scalping strategy that trades over 100 times per month reaches a comparable number of trades in far less time. I regard the test sample size as the total number of trades, not necessarily the total number of months or years.
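As a rough back-of-the-envelope check of that idea, here is a small Python sketch. The function name `years_needed` and the choice to use the swing strategy's trade count as the target sample are assumptions for illustration only, not a rule.

```python
def years_needed(target_trades, trades_per_month):
    """Years of history needed to collect target_trades at a given frequency."""
    if trades_per_month <= 0:
        raise ValueError("trades_per_month must be positive")
    return target_trades / trades_per_month / 12.0

# Swing strategy: about 2.5 trades per week, tested over 10 years.
swing_sample = 2.5 * 52 * 10

# Scalper: 100 trades per month, aiming for the same number of trades.
scalper_years = years_needed(round(swing_sample), 100)

print(f"swing sample over 10 years: {swing_sample:.0f} trades")
print(f"scalper reaches that sample in about {scalper_years:.1f} years")
```

This only counts trades. A short period can still miss whole market regimes, so the number is a lower bound on the test length rather than a target.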

Bambang Christianto #:
2. Every Tick Based on Real Ticks: This is mandatory, so that the price simulation is close to actual market movements.

Testing on real ticks is mandatory for EAs that trade on every tick. In contrast, EAs that trade only on OHLC data can more efficiently be tested for longer periods of time on OHLC prices only. The issues of live spread and slippage can be mitigated by knowing/researching your broker-dealer's average spread and average slippage, and entering that information into the Tester. There are free utilities published in the CodeBase for collecting such data.

 
Ryan L Johnson #:

Testing on real ticks is mandatory for EAs that trade on every tick. In contrast, EAs that trade only on OHLC data can more efficiently be tested for longer periods of time on OHLC prices only. The issues of live spread and slippage can be mitigated by knowing/researching your broker-dealer's average spread and average slippage, and entering that information into the Tester. There are free utilities published in the CodeBase for collecting such data.

I've experimented with this, but didn't know that there were utilities in the CodeBase.

 
Bambang Christianto #:
My suggestions to narrow the gap:

1. Use a Long Time Range: Perform a backtest of at least 2-3 years (for example from January 1, 2024 to the present) to see the resilience of the strategy in various market conditions.
2. Every Tick Based on Real Ticks: This is mandatory, so that the price simulation is close to actual market movements.
3. Test on a Cent Account: If it passes the rigorous testing, start on a small real account (Cent) to see the real execution of the broker.

Additional Message:
When you start live, don't rush to judge your EA as being wrong when a loss occurs, as long as the loss is still within the drawdown limit of the backtest results. Likewise, when you make a profit, don't immediately feel that the EA is perfect. Keep monitoring long-term performance.

I'll experiment with this.


Thx!

 
Isaac Uriel Arenas Caldera:

The strategy that I am currently developing performs very well in the strategy tester, but when I do forward testing it fails.

How can I shorten the gap between strategy tester data and forward testing?

A strategy that performs well in backtesting but fails in forward testing is usually overfitted to historical data. To reduce the gap:

  1. Use out-of-sample and walk-forward testing.
  2. Include realistic spread, slippage, commissions, and execution delay.
  3. Avoid over-optimization; simpler strategies are usually more robust.
  4. Test across different market conditions.
  5. Focus on real market logic, not only historical profit curves.

Also, backtest data will never be identical to real market data. Historical tick data is often compressed or averaged, while real markets may generate far more ticks inside each candle.

Many small price movements are lost in historical data, which is why forward testing is essential before going live.
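The walk-forward testing in point 1 can be sketched roughly as follows. This is plain Python that only shows how the history is split. The function name and the window sizes are assumptions, and the optimization and validation runs themselves would still be done in the tester.

```python
def walk_forward_windows(n_bars, train, test):
    """Split bar indices 0..n_bars-1 into rolling (in-sample, out-of-sample) pairs.

    Each out-of-sample window starts right where its in-sample window ends,
    and the pair slides forward by `test` bars per step.
    """
    if train <= 0 or test <= 0:
        raise ValueError("train and test must be positive")
    windows = []
    start = 0
    while start + train + test <= n_bars:
        in_sample = range(start, start + train)
        out_sample = range(start + train, start + train + test)
        windows.append((in_sample, out_sample))
        start += test
    return windows

windows = walk_forward_windows(n_bars=1000, train=400, test=200)
for ins, oos in windows:
    print(f"optimize on bars {ins.start}-{ins.stop - 1}, "
          f"validate on bars {oos.start}-{oos.stop - 1}")
```

The point of the split is that parameters are only ever judged on bars they were not tuned on. If the out-of-sample segments look much worse than the in-sample ones, that is the overfitting showing up before any live money is involved.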

 
Good points. I’d add one more check: separate the strategy logic from the execution assumptions and test them in two steps. First verify the entry/exit edge on out-of-sample data, then rerun with realistic spread, commission, and delay assumptions. If the edge disappears only after costs are added, the issue is probably that the expected move is too small for live conditions rather than a coding problem. If you want, I can help you outline a simple walk-forward test matrix for this.
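A minimal sketch of that two-step check in Python. The per-trade results and the cost figures are hypothetical numbers chosen only to show the effect, and `expectancy` and `net_of_costs` are illustrative helper names.

```python
def expectancy(trade_pips):
    """Average result per trade, in pips."""
    return sum(trade_pips) / len(trade_pips)

def net_of_costs(trade_pips, spread_pips, commission_pips, slippage_pips):
    """Subtract a flat round-trip cost from every trade."""
    cost = spread_pips + commission_pips + slippage_pips
    return [p - cost for p in trade_pips]

# Step 1: raw entry/exit edge on out-of-sample trades (hypothetical results).
gross = [4.0, -3.0, 5.0, -2.0, 3.0, -3.0, 4.0, 2.0]

# Step 2: the same trades with assumed live costs applied.
net = net_of_costs(gross, spread_pips=1.0, commission_pips=0.3, slippage_pips=0.4)

print(f"gross expectancy: {expectancy(gross):+.2f} pips per trade")
print(f"net expectancy:   {expectancy(net):+.2f} pips per trade")
```

Here the gross edge is positive but smaller than the assumed cost per trade, so it turns negative once costs are included. That is the "too small for live conditions" case, and no amount of debugging the code will fix it.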
 
Hesham Ahmed Kamal Barakat #:
Good points. I’d add one more check: separate the strategy logic from the execution assumptions and test them in two steps. First verify the entry/exit edge on out-of-sample data, then rerun with realistic spread, commission, and delay assumptions. If the edge disappears only after costs are added, the issue is probably that the expected move is too small for live conditions rather than a coding problem. If you want, I can help you outline a simple walk-forward test matrix for this.

That's a really good point!

The first draft of any EA that I code is basically "blind" other than entries and exits. If my test of that first draft EA doesn't approach a profit factor of 2.0 (or other impressive stats), I don't bother to refine that first draft. By the time that weekends, overnights, hours, spread, commission, swaps, and slippage are factored into a subsequent draft, a weak first draft simply isn't worth the effort, because of the "disappearing edge" that you mention.

Put simply, I don't spend time trying to resurrect a junk EA. Also, I really can't even recall the last time that I used the Optimizer.

 
Definitely recommend Monte Carlo testing. It's so easy to overfit backtested strategies.
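One simple form of this is to reshuffle the order of the backtest trades many times and look at the spread of drawdowns. Below is a small Python sketch of that idea. The trade list is hypothetical, and trade-order shuffling is only one of several Monte Carlo variants.

```python
import random

def max_drawdown(pnl):
    """Largest peak-to-trough drop of the cumulative P/L curve.
    The starting balance (0) counts as the first peak."""
    equity = peak = worst = 0.0
    for p in pnl:
        equity += p
        peak = max(peak, equity)
        worst = max(worst, peak - equity)
    return worst

def monte_carlo_drawdowns(pnl, runs=1000, seed=42):
    """Max drawdown of `runs` random reorderings of the same trades, sorted."""
    rng = random.Random(seed)
    results = []
    for _ in range(runs):
        shuffled = list(pnl)
        rng.shuffle(shuffled)
        results.append(max_drawdown(shuffled))
    return sorted(results)

# Hypothetical per-trade results taken from a backtest report.
trades = [50, -30, 40, -30, 60, -30, -30, 45, 55, -30, 35, -30]

dd = monte_carlo_drawdowns(trades)
print("drawdown in backtest order:", max_drawdown(trades))
print("median shuffled drawdown:  ", dd[len(dd) // 2])
print("95th percentile drawdown:  ", dd[int(0.95 * len(dd))])
```

If the drawdown in the original trade order sits near the lucky end of the shuffled distribution, the backtest equity curve probably flatters the strategy, and the live drawdown limit should be set from the distribution instead.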
 
The first thing I would check is whether the strategy is overfitted to the tester conditions. Use real tick data, realistic spread/commission/slippage, and test the same execution rules you use live.
Also compare every forward trade with the tester: entry time, price, spread, SL/TP placement and modification logic. The gap usually comes from over-optimization, different broker conditions, spread spikes, or execution assumptions that look fine in the tester but fail live.
If it only works on the optimized backtest and not on fresh forward data, the edge is probably not robust enough yet.
 
That tester/live gap is usually a mix of execution and data assumptions rather than a broken idea. I would compare three things first: spread at entry, slippage on exits, and whether the tick history matches the broker session you trade on. If the live account only breaks when spread widens or fills slip, the edge may be too sensitive to real costs. Have you logged the first 20 to 50 live fills against the tester fills to see where the divergence starts?
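A rough way to do that comparison once the fills are logged, sketched in Python. The fill records, the field names and the 5-digit point size are all assumptions for illustration. Real data would come from the live account history and the tester report for the same signals.

```python
from statistics import mean

POINT = 0.00001  # assumed point size for a 5-digit FX symbol

# Hypothetical matched fills: the same signal in the tester and on the live account.
fills = [
    {"side": "buy",  "tester": 1.10010, "live": 1.10013, "live_spread": 12},
    {"side": "sell", "tester": 1.10250, "live": 1.10248, "live_spread": 9},
    {"side": "buy",  "tester": 1.09980, "live": 1.09980, "live_spread": 8},
    {"side": "sell", "tester": 1.10100, "live": 1.10093, "live_spread": 31},
]

def adverse_slippage(fill):
    """Points lost versus the tester fill; positive means the live fill was worse."""
    diff = (fill["live"] - fill["tester"]) / POINT
    signed = diff if fill["side"] == "buy" else -diff
    return round(signed, 1)

slips = [adverse_slippage(f) for f in fills]
worst = max(fills, key=adverse_slippage)

print("slippage per fill (points):", slips)
print("average slippage (points): ", mean(slips))
print("worst fill had a live spread of", worst["live_spread"], "points")
```

Sorting the log by slippage or by spread at entry usually shows quickly whether the divergence is spread across all trades or concentrated in a few fills during wide-spread moments.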