How do you detect when your EA stops matching its backtest?

Michael Schouten 2026.04.14 17:36 #11

This thread evolved way beyond my original CUSUM question — and that's exactly what I was hoping for.

The progression from output-level monitoring (CUSUM on equity) → input-level filtering (Enrique's outlier cap) → generalized external filtering (fxsaber's BestInterval with any indicator) is a clean framework.

Key takeaway for me: the filter doesn't need to live inside the strategy logic. Decoupling detection from execution opens up a much wider design space. Appreciate the insights from everyone here.

Fully connected neural layer Return jump Forced test stop: TesterStop

[Deleted] 2026.04.14 17:43 #12

michael schouten:

I've been running live EAs for a while and kept running into the same blind spot: by the time I notice performance degraded, I'm already 10-15 trades deep into the drawdown. Looking at the equity curve doesn't catch it early — too noisy.

What I've started doing is comparing live trades against the backtest distribution statistically, trade by trade, instead of waiting for monthly review:

For each closed trade, compute the pip outcome
Maintain a rolling CUSUM (cumulative sum of deviations from backtest mean R)
Alert when CUSUM crosses a threshold calibrated to the backtest's own variance

The math is basically Page's CUSUM test — standard SPC stuff, but I haven't seen much discussion of applying it to EA monitoring specifically. Most people either (a) eyeball equity, or (b) wait for X losing trades in a row, which is way too late.

Two things I'm still figuring out:

Threshold calibration — I'm using 4σ but it feels arbitrary. Has anyone tuned this for forex specifically? The tail behaviour is non-gaussian enough that standard SPC assumptions feel shaky.
Regime changes vs. genuine strategy decay — CUSUM fires on both. Any ideas how to tell them apart without waiting weeks?

Curious how others here handle this. Do you monitor per-trade deviation, or something else entirely

Your approach is actually quite solid. Using CUSUM on trade outcomes is a much better early warning system than watching the equity curve, which is usually too noisy to detect small but persistent drift.

Regarding threshold calibration: instead of assuming something like 4σ, it's better to calibrate the threshold using the backtest trade distribution itself. One simple method is to run a Monte Carlo simulation on the backtest trades (shuffle or bootstrap the trade list thousands of times) and apply the same CUSUM logic to those sequences. Then measure how often the alarm triggers. This lets you pick a threshold based on a desired false-alarm rate (for example 1% or 5%) rather than relying on Gaussian assumptions, which rarely hold for trading returns.

Also, it's usually better to measure outcomes in R (profit divided by risk per trade) instead of raw pips, since that keeps the distribution more stable across different volatility conditions.

For distinguishing regime changes vs strategy decay, one practical approach is running multiple monitors at different horizons. For example a short window (~20 trades) and a longer window (~100+ trades). If only the short window fires, it's often just a temporary regime shift. If both short and long monitors start drifting, that’s a stronger indication the strategy edge may actually be degrading.

Another useful metric to monitor alongside trade results is MAE/MFE drift. If your maximum adverse excursion starts increasing compared to the backtest distribution while win rate stays similar, it often indicates a change in market conditions before the PnL degradation becomes obvious.

Testing Report - Algorithmic Trading Report - Trading How to Choose a

Michael Schouten 2026.04.14 17:58 #13

avantikajain jain #:
Your approach is actually quite solid. Using CUSUM on trade outcomes is a much better early warning system than watching the equity curve, which is usually too noisy to detect small but persistent drift.

Regarding threshold calibration: instead of assuming something like 4σ, it's better to calibrate the threshold using the backtest trade distribution itself. One simple method is to run a Monte Carlo simulation on the backtest trades (shuffle or bootstrap the trade list thousands of times) and apply the same CUSUM logic to those sequences. Then measure how often the alarm triggers. This lets you pick a threshold based on a desired false-alarm rate (for example 1% or 5%) rather than relying on Gaussian assumptions, which rarely hold for trading returns.

Also, it's usually better to measure outcomes in R (profit divided by risk per trade) instead of raw pips, since that keeps the distribution more stable across different volatility conditions.

For distinguishing regime changes vs strategy decay, one practical approach is running multiple monitors at different horizons. For example a short window (~20 trades) and a longer window (~100+ trades). If only the short window fires, it's often just a temporary regime shift. If both short and long monitors start drifting, that’s a stronger indication the strategy edge may actually be degrading.

Another useful metric to monitor alongside trade results is MAE/MFE drift. If your maximum adverse excursion starts increasing compared to the backtest distribution while win rate stays similar, it often indicates a change in market conditions before the PnL degradation becomes obvious.

Good additions. Monte Carlo on bootstrapped trade sequences for threshold calibration is cleaner than assuming normality — that's going into my next iteration.

The MAE/MFE drift point is sharp. Adverse excursion increasing before win rate drops is exactly the kind of leading indicator that CUSUM on PnL alone misses. That's monitoring the quality of the trade, not just the outcome.

Multi-horizon CUSUM (short vs long window) for separating regime shift from decay is practical. Short fires alone = weather. Both fire = climate change. Simple heuristic but effective.

Testing Report - Algorithmic How to Choose a Trading Report - Trading

Ryan L Johnson 2026.04.15 15:33 #14

Enrique Dangeroux #:
[T]he patterns found are structural and can last for years without any re-optmizmation.

I've recently found this to be absolutely true─almost by mistake. I am now wondering in amazement why this is the case. Despite my deep-dive research, all that I've found are ongoing philosophical debates. I'm thinking more like... lengthy historic price patterning is popular among institutional algorithmic traders, but I can't find anything to confirm that theory. Do you have any additional information as to why this works so well?

Forum on trading, automated trading systems and testing trading strategies

Indicators: Price prediction by Nearest Neighbor found by a weighted correlation coefficient

Ryan L Johnson, 2026.04.15 03:06

@Stanislav Korotky, Thank you for directing me to the documentation. The "first form" is highly useful for copying future buffer values─especially on custom charts.

@Vladimir, Thank you, albeit belatedly, for posting your source code of this indicator. I had no idea how valid such price patterns from decades ago could be today. I used it to catch 2 winning trades right out of the gate. One was circa 1999, and the other was circa 2005.

Here is the indicator code that I am using now (with the GV removed):

Search - Toolbars Articles on the development Return jump

Enrique Dangeroux 2026.04.15 16:31 #15

Ryan L Johnson #:
Do you have any additional information as to why this works so well?

Define this.

Most likely our view of what is a pattern do not match. I was talking in general sense. Specifially i am searching for relative change of price patterns in a tick stream, not patterns on charts.

ZigZag - Trend Indicators ZigZag - Trend Indicators How to Purchase an

Ryan L Johnson 2026.04.15 17:26 #16

Enrique Dangeroux #:

Define this.

Most likely our view of what is a pattern do not match. I was talking in general sense. Specifially i am searching for relative change of price patterns in a tick stream, not patterns on charts.

It seems to me that "this" is a price pattern─regardless of the specific price resolution studied. I'm presently riding another profitable pattern circa 1986. Correlation coefficients between the historic patterns and current price activity, thus far, are all above 96%.

I suppose that I just find it odd that historic price patterning works so well. Me having stumbled across a patterning indicator and then researched it, versus you having purposefully delved into a project, I thought that you had already done some research as to any underlying reason that price patterning works. That is all.

Recurrent neural networks Real and Generated Ticks Additional Features - Price

Enrique Dangeroux 2026.04.18 10:58 #17

Ryan L Johnson #:

It seems to me that "this" is a price pattern─regardless of the specific price resolution studied. I'm presently riding another profitable pattern circa 1986. Correlation coefficients between the historic patterns and current price activity, thus far, are all above 96%.

I suppose that I just find it odd that historic price patterning works so well. Me having stumbled across a patterning indicator and then researched it, versus you having purposefully delved into a project, I thought that you had already done some research as to any underlying reason that price patterning works. That is all.

Yes i have done research. Without it, it would not even be possible and I fully understand my method and model. But again, it is in no way related to what you call "historic price patterning". Perhaps pattern, in the context of my method is the wrong word but the language barier prevents me from coming up with a better one.

The only reason i use historical data is bceause future data does not exist and the sole use of the data is to align (curve fit) the model to market structure. Once aligned, it it does not change and as a result there is no dependancy on history. From the models perspective time does not exists.

I did take a look at your indicator but there are issues with it which makes it almost impossible to explain why it "works".

- Dependancy on history. The depth of the searchable history drastically impacts results.

- Outcome shifting with new data. Every time a new data point is added, the nearest neighbor for a given pattern can change entirely.

- Arbitrary pattern size. The chosen window size directly impacts the results.

There are probably more issues, but these 3 things makes it highly fragile. If you want to do more research, at the very least, anchor the history, limit the depth of the history to atleast not take into account new data points so you have a fixed point of reference instead of a continuesly changing reference.

Recurrent neural networks Problem statement Statistical analysis and fuzzy

Ryan L Johnson 2026.04.18 14:19 #18

Enrique Dangeroux #:
Perhaps pattern, in the context of my method is the wrong word but the language barier prevents me from coming up with a better one.

Got it.

Enrique Dangeroux #:
I did take a look at your indicator...

Just to give credit where credit is due, it is Vladimir's indicator. I merely updated the code to the current MQL5 language.

Enrique Dangeroux #:
If you want to do more research, at the very least, anchor the history, limit the depth of the history to atleast not take into account new data points so you have a fixed point of reference instead of a continuesly changing reference.

I'm actually content with the indicator's current functionality. The indicator scans all available price history which I like. Granted, I'm not likely to stray away from the Fiber (EURUSD) because it generally has the lengthiest quality historic data. Programmatically, I'm detecting when current price action is closely aligned with the sampled pattern─which I'm trading. Continuously updating to the pattern having the highest correlation coefficient is positively incorporated into my logic. On the flip side, it's rather obvious when current price breaks away from the sampled pattern. I suppose that the breakaways could be traded but, in my case, they trigger my dynamic exit. Of course, I have my usual fixed "safety stop" placed a "red bus" away from price just in case...

Data types and values Technical aspects of timeseries First program

How do you detect when your EA stops matching its backtest? - page 2