Building an Object-Oriented Z-Score Statistical Arbitrage Engine in MQL5


Amanda Vitoria De Paula Pereira

Introduction

When I build a simple mean‑reversion routine in MQL5, the instinct is to use standard oscillators (RSI, Stochastic) with fixed 70/30 thresholds. That works during flat markets, but cross‑pair tests revealed a structural flaw: in strong trends these oscillators can remain in the extreme zone for days or weeks, producing repeated false counter‑trend entries and avoidable drawdowns. The root cause is not mean reversion itself but the normalization: static levels ignore the instrument's current volatility. I needed a normalization that measures price deviation from its mean in units of standard deviation and that can be reused unchanged across visual indicators and automated logic. The Z‑Score fulfills this role — it gives an adaptive, volatility‑aware signal that can be implemented as a single, reusable calculation engine for both chart verification and trading.


The Formula in Simple Words

The core mechanism behind the Z-Score is surprisingly simple. We take the current price, subtract the simple moving average over a specific number of historical bars, and divide that result by the rolling standard deviation calculated over that identical lookback period.

The formula translates to: Z = (Price - Mean) / StdDev

If the resulting value is near zero, the price is fluctuating very close to its local historical average. When the Z-Score deviates and reaches extreme mathematical levels like 2.0 or -2.0, it points to a statistically significant anomaly. In a mean reversion strategy, these specific stretched moments can be considered viable candidates for a potential short or long entry.
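As a small worked example with made-up numbers, take a five-bar window of closes (10, 11, 12, 13, 14) and evaluate the latest close, 14. The standard deviation here divides by N, which is the convention the implementation later in this article uses:

```latex
\begin{aligned}
\mu    &= \frac{10 + 11 + 12 + 13 + 14}{5} = 12 \\
\sigma &= \sqrt{\frac{(-2)^2 + (-1)^2 + 0^2 + 1^2 + 2^2}{5}} = \sqrt{2} \approx 1.414 \\
Z      &= \frac{14 - 12}{\sqrt{2}} = \sqrt{2} \approx 1.414
\end{aligned}
```

The latest close sits about 1.4 standard deviations above the window mean, which is well inside a 2.0 band.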

Of course, it is extremely important to remember that financial markets do not behave like a perfect normal distribution. Extremes and unexpected breakouts occur much more frequently during high-impact news events and strong macroeconomic shifts than standard mathematical models predict. Therefore, I do not treat the Z-Score as a guaranteed reversal predictor. In my experience, it works better as a dynamic baseline tool to measure how unusual the current market movement appears relative to recent history.


The Z-Score Signal Contract

Before translating this mathematical architecture into MQL5 code, we must establish a strict operational agreement for our trading logic. This signal contract defines exactly how and when a statistical anomaly turns into a trade rule. For our blueprint, the entry threshold is set symmetrically at +2.5 and -2.5 sigma. A closed-bar reading at or above +2.5 marks the market as overvalued and a candidate for a short entry, while a reading at or below -2.5 marks it as undervalued and a candidate for a long entry.

To guarantee statistical validity and prevent the engine from reacting to temporary intraday noise, the execution contract dictates that calculations operate exclusively on fully closed bars. We enforce this constraint by using a fixed bar shift parameter of 1 throughout our execution functions. This means the algorithm ignores the unconfirmed fluctuations of the active candle (index 0) and bases its entries and exits purely on finalized historical data, establishing a clean and verifiable execution path.
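The contract can be written down as a small pure function before any market data is involved. The following is only a sketch: the names ENUM_Z_SIGNAL and ClassifyZScore are hypothetical and do not appear in the attached files, and the default threshold of 2.5 simply mirrors the value chosen above.

```mql5
// Hypothetical names, for illustration only.
enum ENUM_Z_SIGNAL
  {
   Z_SIGNAL_NONE,   // inside the band: no entry
   Z_SIGNAL_SELL,   // Z at or above +threshold: price stretched upward
   Z_SIGNAL_BUY     // Z at or below -threshold: price stretched downward
  };

// Maps a closed-bar Z-Score to an entry signal using a symmetric band.
ENUM_Z_SIGNAL ClassifyZScore(const double z, const double threshold = 2.5)
  {
   if(z >= threshold)
      return Z_SIGNAL_SELL;
   if(z <= -threshold)
      return Z_SIGNAL_BUY;
   return Z_SIGNAL_NONE;
  }
```

Keeping the rule separate from data access makes it easy to check by hand: a reading of 2.6 maps to a sell candidate, -2.5 to a buy candidate, and 1.0 to no signal.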


Architecture: Keeping the Math in a Separate Class

To avoid mixing complex statistical calculations directly with trade execution routing, I preferred to isolate the Z-Score engine inside a custom include file. Creating a dedicated class achieves two critical things for a developer. First, it keeps the indicator and the advisor deployment scripts completely clean and readable. Second, it allows us to use the exact same calculation logic for both visual chart analysis and automated trade execution without rewriting the foundation.

Let's start by defining the structure of our engine.

//+------------------------------------------------------------------+
//|                                                 ZScoreEngine.mqh |
//|                                  Copyright 2026, MetaQuotes Ltd. |
//+------------------------------------------------------------------+
#property copyright "Open Source"
#property version   "7.10"

//+------------------------------------------------------------------+
//| Class: CZScore                                                   |
//+------------------------------------------------------------------+
class CZScore
  {
private:
   int               m_period;            
   string            m_symbol;            
   ENUM_TIMEFRAMES   m_timeframe;         

   double            CalculateMean(const double &prices[]);
   double            CalculateStandardDeviation(const double &prices[], double mean);

public:
                     CZScore(string symbol, ENUM_TIMEFRAMES tf, int period);
                    ~CZScore(void);
                    
   double            GetZScore(int shift = 1); 
  };

The private section of the CZScore class strictly manages the internal environment variables. We need the lookback period, the target symbol, and the specific timeframe.

Next, we implement the constructor. The public constructor ensures that the initial parameters are safely assigned when the object is instantiated. I included built-in fallbacks here; for instance, if a developer mistakenly enters a lookback range smaller than 2, the class automatically overrides it to a safe default of 20 periods to prevent initialization errors.

//+------------------------------------------------------------------+
//| Constructor and Destructor                                       |
//+------------------------------------------------------------------+
CZScore::CZScore(string symbol, ENUM_TIMEFRAMES tf, int period)
  {
   m_symbol = (symbol == "") ? _Symbol : symbol;
   m_timeframe = tf;
   m_period = (period < 2) ? 20 : period; 
  }

CZScore::~CZScore(void)
  {
  }

Managing object lifecycles is a critical aspect of MQL5 programming. MQL5 has no garbage collector: an object created with the new operator lives until it is explicitly destroyed with delete. If the program is unloaded while such objects still exist, the terminal reports them in the log as undeleted objects and leaked memory. Reinitialization is the more damaging case. When the user switches timeframes, an Expert Advisor runs OnDeinit and then OnInit again while its global variables keep their values, so an OnInit that allocates without a matching delete in OnDeinit abandons one more object on every cycle. Pairing the allocation in OnInit with the release in OnDeinit, as the indicator and the Expert Advisor below do, ties the object's lifetime to the program's.


Implementing the Mathematical Functions

The first required internal calculation is the arithmetic mean. The implementation uses a simple linear loop. It iterates through the provided array, accumulates the historical closing prices, and divides the total sum by the available array length.

//+------------------------------------------------------------------+
//| Calculates the Arithmetic Mean                                   |
//+------------------------------------------------------------------+
double CZScore::CalculateMean(const double &prices[])
  {
   double sum = 0.0;
   int total = ArraySize(prices);
   
   for(int i = 0; i < total; i++)
     {
      sum += prices[i];
     }
     
   return (total > 0) ? (sum / total) : 0.0;
  }

Once the mean is established, the engine determines the rolling population variance and standard deviation. This requires a second pass over the data array: the algorithm subtracts the mean from each price and squares the difference so that positive and negative deviations do not cancel each other. The squared differences are summed and divided by the number of bars to get the variance, and its square root is the standard deviation.

//+------------------------------------------------------------------+
//| Calculates the Population Standard Deviation                     |
//+------------------------------------------------------------------+
double CZScore::CalculateStandardDeviation(const double &prices[], double mean)
  {
   double variance_sum = 0.0;
   int total = ArraySize(prices);
   
   for(int i = 0; i < total; i++)
     {
      double difference = prices[i] - mean;
      variance_sum += MathPow(difference, 2);
     }
     
   double variance = (total > 0) ? (variance_sum / total) : 0.0;
   return MathSqrt(variance);
  }

In classical statistics, the sample variance usually applies Bessel's correction, dividing by the number of observations minus one (N - 1). In this implementation, I deliberately divide by the total number of periods (N). We are not estimating the variance of a larger population from a random sample; we are describing a fixed window of the last N bars, and that window is the whole population of interest. For a fixed N the two conventions differ only by a constant factor close to 1, so the choice slightly rescales the Z-Score without changing which bars stand out. What matters is that the same convention is used wherever values are compared against a threshold.
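The size of the difference between the two conventions follows directly from their definitions. Writing the population form as sigma_N and the Bessel-corrected form as sigma_{N-1} (notation introduced here for illustration):

```latex
\sigma_{N-1} = \sigma_N \sqrt{\frac{N}{N-1}}
\qquad\Longrightarrow\qquad
Z_{N-1} = Z_N \sqrt{\frac{N-1}{N}}

% Example with N = 50:
\sqrt{\tfrac{49}{50}} \approx 0.990,
\qquad Z_N = 2.5 \;\Rightarrow\; Z_{N-1} \approx 2.475
```

With a 50-bar lookback the two Z-Scores differ by about one percent, so a threshold tuned under one convention carries over to the other almost unchanged.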


Retrieving Data: The MQL5 Array Trap

The public GetZScore method acts as the bridge between the chart data and our mathematical loops. It extracts historical bars via the CopyClose function.

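One of the most common traps in MQL5 is array indexing. By default, CopyClose fills the destination array in chronological order, so index 0 holds the oldest copied bar. To make the data read like a chart series, I call ArraySetAsSeries on the array first. This reverses the indexing: index 0 then holds the newest copied bar, which is the bar at the requested shift, and higher indexes move back in time. The mean and standard deviation do not depend on the order, but any code that reads a specific element does.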

//+------------------------------------------------------------------+
//| Orchestrates data extraction and returns the Z-Score             |
//+------------------------------------------------------------------+
double CZScore::GetZScore(int shift)
  {
   double prices[];
   ArraySetAsSeries(prices, true);
   
   if(CopyClose(m_symbol, m_timeframe, shift, m_period, prices) < m_period)
     {
      return 0.0; 
     }
     
   double current_close = iClose(m_symbol, m_timeframe, shift);
   double mean = CalculateMean(prices);
   double std_dev = CalculateStandardDeviation(prices, mean);
   
   if(std_dev == 0.0) return 0.0;
   
   return (current_close - mean) / std_dev;
  }

Checking the return value of CopyClose is essential. When an Expert Advisor or indicator is attached to a chart for the first time, the local terminal database may not yet contain the requested bars. The call then triggers a history download, and until the data arrives CopyClose returns -1 or fewer elements than requested. Computing on a partial window would bias both the mean and the standard deviation, so the method aborts the pass and returns a neutral 0.0. It also returns 0.0 when the standard deviation is exactly zero in a completely flat window, because a division by zero is a critical runtime error that would stop the program.
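One consequence of returning 0.0 on failure is that a caller cannot tell "no data" apart from a genuine return to the mean, and 0.0 is also the exit level the Expert Advisor uses later. A possible alternative, sketched here as a hypothetical free function named TryZScore (it is not part of the attached CZScore class), reports success through its return value and writes the result to an output parameter:

```mql5
// Hypothetical free function, not part of the article's class.
// Returns true and writes the Z-Score to 'z' only when it is well defined.
bool TryZScore(const string symbol, const ENUM_TIMEFRAMES tf,
               const int period, const int shift, double &z)
  {
   z = 0.0;
   if(period < 2 || shift < 0)
      return false;                      // invalid arguments

   double prices[];
   ArraySetAsSeries(prices, true);       // prices[0] = bar at 'shift'
   if(CopyClose(symbol, tf, shift, period, prices) < period)
      return false;                      // history not ready or copy error

   int    i;
   double sum = 0.0;
   for(i = 0; i < period; i++)
      sum += prices[i];
   double mean = sum / period;

   double sq = 0.0;
   for(i = 0; i < period; i++)
      sq += (prices[i] - mean) * (prices[i] - mean);
   double sd = MathSqrt(sq / period);    // population form, dividing by N

   if(sd == 0.0)
      return false;                      // flat window: Z is undefined

   z = (prices[0] - mean) / sd;
   return true;
  }
```

A caller would then skip the bar when the function returns false, for example `double z; if(!TryZScore(_Symbol, PERIOD_CURRENT, 50, 1, z)) return;`, instead of acting on a placeholder value.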


Indicator: Visualizing the Logic First

Before tying this calculation directly to automated market orders and risking capital, it is helpful to simply see how the metric behaves across historical data. Because the architecture is completely modular, we can instantiate the exact same CZScore class within a custom indicator.

//+------------------------------------------------------------------+
//|                                                   Ind_ZScore.mq5 |
//|                                  Copyright 2026, MetaQuotes Ltd. |
//+------------------------------------------------------------------+
#property copyright "Open Source"
#property version   "9.00"
#property indicator_separate_window
#property indicator_buffers 1
#property indicator_plots   1

//--- Plotting Configuration
#property indicator_label1  "Z-Score"
#property indicator_type1   DRAW_LINE
#property indicator_color1  clrDodgerBlue
#property indicator_style1  STYLE_SOLID
#property indicator_width1  2

#include "ZScoreEngine.mqh"

//--- Input Parameters
input int InpZScorePeriod = 50; // Lookback Period

//--- Buffers and Engine
double   ZScoreBuffer[];
CZScore *g_zscore_engine;

In the initialization block, we set up the index buffer and plot horizontal levels at 2.5, 0.0, and -2.5. These static lines serve as fast visual references for statistical extremes.

//+------------------------------------------------------------------+
//| Custom indicator initialization function                         |
//+------------------------------------------------------------------+
int OnInit()
  {
   SetIndexBuffer(0, ZScoreBuffer, INDICATOR_DATA);
   IndicatorSetInteger(INDICATOR_DIGITS, 2);
   
   //--- Add horizontal levels to represent statistical extremes
   IndicatorSetInteger(INDICATOR_LEVELS, 3);
   IndicatorSetDouble(INDICATOR_LEVELVALUE, 0, 2.5);
   IndicatorSetDouble(INDICATOR_LEVELVALUE, 1, 0.0);
   IndicatorSetDouble(INDICATOR_LEVELVALUE, 2, -2.5);
   
   g_zscore_engine = new CZScore(_Symbol, PERIOD_CURRENT, InpZScorePeriod);
   return(INIT_SUCCEEDED);
  }

void OnDeinit(const int reason)
  {
   if(CheckPointer(g_zscore_engine) == POINTER_DYNAMIC)
     {
      delete g_zscore_engine;
     }
  }

The iteration loop maps the rolling calculation directly onto the indicator window. To protect CPU efficiency, I use the prev_calculated parameter. Instead of recalculating the entire history on every tick, the indicator processes the full history once on the initial load and afterwards recalculates only the most recent bar plus any bars that have opened since the previous call.

//+------------------------------------------------------------------+
//| Custom indicator iteration function                              |
//+------------------------------------------------------------------+
int OnCalculate(const int rates_total,
                const int prev_calculated,
                const datetime &time[],
                const double &open[],
                const double &high[],
                const double &low[],
                const double &close[],
                const long &tick_volume[],
                const long &volume[],
                const int &spread[])
  {
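   if(rates_total < InpZScorePeriod) return 0;
   
   //--- First pass: blank the buffer so warm-up bars are not drawn with stale values
   if(prev_calculated == 0) ArrayInitialize(ZScoreBuffer, EMPTY_VALUE);
   
   //--- Later passes: revisit only the last bar and any newly opened bars
   int start = (prev_calculated > 0) ? prev_calculated - 1 : InpZScorePeriod;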
   
   for(int i = start; i < rates_total && !IsStopped(); i++)
     {
      int shift = rates_total - 1 - i;
      ZScoreBuffer[i] = g_zscore_engine->GetZScore(shift);
     }
     
   return(rates_total);
  }
//+------------------------------------------------------------------+
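Each GetZScore call in the loop above performs its own CopyClose request. When the indicator runs on the chart's own symbol and timeframe, the same window is already available in the close[] array passed to OnCalculate. The sketch below uses a hypothetical helper, ZScoreFromArray, and assumes the array is indexed chronologically (index 0 is the oldest bar); it is an illustration of the idea, not the code in the attached files.

```mql5
// Hypothetical helper: Z-Score of bar 'i' computed from a chronological
// array (index 0 = oldest), using the 'period' bars that end at 'i'.
// Returns EMPTY_VALUE when the window is incomplete or completely flat.
double ZScoreFromArray(const double &close[], const int i, const int period)
  {
   if(period < 2 || i < period - 1 || i >= ArraySize(close))
      return EMPTY_VALUE;

   int    k;
   double sum = 0.0;
   for(k = i - period + 1; k <= i; k++)
      sum += close[k];
   double mean = sum / period;

   double sq = 0.0;
   for(k = i - period + 1; k <= i; k++)
      sq += (close[k] - mean) * (close[k] - mean);
   double sd = MathSqrt(sq / period);    // population form, dividing by N

   return (sd > 0.0) ? (close[i] - mean) / sd : EMPTY_VALUE;
  }
```

The window covers the same bars that GetZScore reads for the matching shift, so both paths should produce the same value. The trade-off is that this variant only works for the chart's own data, while the class can target any symbol and timeframe.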


EA: Connecting the Trading Logic

Once it became clear that the calculation was rendering correctly across the chart timeline, the next step was to tie it to an Expert Advisor. When writing the execution logic, I had a choice: build a custom routing function on top of the native OrderSend function, or use the Standard Library. I opted for the Standard Library's CTrade class. Filling MqlTradeRequest structures and checking MqlTradeResult by hand takes a lot of repetitive validation and error-handling code. CTrade wraps this, keeping the logic focused on our statistical rules rather than low-level order management.

I kept the strategy parameters straightforward: enter at the statistical extremes and close the trade when the Z-Score returns to zero.

//+------------------------------------------------------------------+
//|                                          EA_ZScore_Reversion.mq5 |
//|                                  Copyright 2026, MetaQuotes Ltd. |
//+------------------------------------------------------------------+
#property copyright "Open Source"
#property version   "8.90"

#include <Trade\Trade.mqh>
#include "ZScoreEngine.mqh" 

//--- Input Parameters
input int    InpZScorePeriod = 50;   // Lookback period for Mean and StdDev
input double InpEntrySigma   = 2.5;  // Z-Score threshold for trade entry
input double InpLotSize      = 0.10; // Fixed execution volume

//--- Global Objects
CZScore *g_zscore_engine;
CTrade   g_trade;

int OnInit()
  {
   g_zscore_engine = new CZScore(_Symbol, PERIOD_CURRENT, InpZScorePeriod);
   return(INIT_SUCCEEDED);
  }

void OnDeinit(const int reason)
  {
   if(CheckPointer(g_zscore_engine) == POINTER_DYNAMIC)
     {
      delete g_zscore_engine;
     }
  }

A crucial part of a robust Expert Advisor is avoiding redundant calculations. The signal is defined on closed bars, so it cannot change between ticks of the same bar; recomputing it on every tick of a 1-minute chart only wastes CPU time and slows down the Strategy Tester. Therefore, I implemented a static datetime filter. It ensures the execution logic runs once per new bar and reads only finalized candles.

//+------------------------------------------------------------------+
//| Expert tick function with CPU optimization                       |
//+------------------------------------------------------------------+
void OnTick()
  {
   static datetime last_bar_time = 0;
   datetime current_bar_time = iTime(_Symbol, PERIOD_CURRENT, 0);
   
   if(current_bar_time != last_bar_time)
     {
      last_bar_time = current_bar_time;
      
      if(CheckPointer(g_zscore_engine) != POINTER_INVALID)
        {
         double current_zscore = g_zscore_engine->GetZScore();
         
         //--- Exit Logic: Close position if Z-Score reverts to the baseline mean (0.0)
         if(PositionSelect(_Symbol))
           {
            long pos_type = PositionGetInteger(POSITION_TYPE);
            
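            if(pos_type == POSITION_TYPE_BUY && current_zscore >= 0.0)
              {
               //--- Log the close only when the request was accepted
               if(g_trade.PositionClose(_Symbol))
                  Print("Z-Score reverted to mean. Buy closed.");
               else
                  Print("Buy close failed: ", g_trade.ResultRetcodeDescription());
              }
            else if(pos_type == POSITION_TYPE_SELL && current_zscore <= 0.0)
              {
               if(g_trade.PositionClose(_Symbol))
                  Print("Z-Score reverted to mean. Sell closed.");
               else
                  Print("Sell close failed: ", g_trade.ResultRetcodeDescription());
              }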
           }
         //--- Entry Logic: Open position on extreme statistical divergence
         else
           {
            if(current_zscore >= InpEntrySigma)
              {
               g_trade.Sell(InpLotSize, _Symbol, 0, 0, 0, "Z-Score Overvalued Reversion");
              }
            else if(current_zscore <= -InpEntrySigma)
              {
               g_trade.Buy(InpLotSize, _Symbol, 0, 0, 0, "Z-Score Undervalued Reversion");
              }
           }
        }
     }
  }
//+------------------------------------------------------------------+

When deploying code that evaluates open positions, developers must consider the account environment. The MetaTrader 5 platform supports two accounting models: Netting and Hedging. In a Netting model, a symbol can have only a single open position at any given moment; subsequent buy or sell commands modify the volume or close the existing exposure. In a Hedging environment, the terminal allows multiple independent positions to coexist on the same symbol. The current implementation relies on PositionSelect(_Symbol). This correctly identifies exposure in a Netting account. On a Hedging account, however, it selects only one position, the one with the lowest ticket number, so handling several positions requires replacing this section with a loop over tickets.
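For a Hedging account, the exit branch could iterate over all open positions instead of relying on PositionSelect. The sketch below is one possible shape, under stated assumptions: ClosePositionsByType is a hypothetical helper, and it filters by a magic number that the Expert Advisor would first have to assign with g_trade.SetExpertMagicNumber() in OnInit, which the article's code does not do.

```mql5
#include <Trade\Trade.mqh>

// Hypothetical helper: closes every position on 'symbol' that has the given
// type and magic number. Returns how many close requests were accepted.
int ClosePositionsByType(CTrade &trade, const string symbol,
                         const ENUM_POSITION_TYPE type, const long magic)
  {
   int closed = 0;
   // Walk the list backwards because closing a position shortens it
   for(int i = PositionsTotal() - 1; i >= 0; i--)
     {
      ulong ticket = PositionGetTicket(i);   // also selects the position
      if(ticket == 0)
         continue;
      if(PositionGetString(POSITION_SYMBOL) != symbol)
         continue;
      if(PositionGetInteger(POSITION_MAGIC) != magic)
         continue;
      if((ENUM_POSITION_TYPE)PositionGetInteger(POSITION_TYPE) != type)
         continue;
      if(trade.PositionClose(ticket))
         closed++;
     }
   return closed;
  }
```

In the exit logic, a call such as `ClosePositionsByType(g_trade, _Symbol, POSITION_TYPE_BUY, 12345)` would take the place of the single PositionClose(_Symbol) call, where 12345 stands for whatever magic number the Expert Advisor uses.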


The Backtesting Environment Setup

To validate the system, I didn't just throw it onto a random chart. I set up the MetaTrader 5 Strategy Tester specifically to match this architecture. Since my OnTick function explicitly uses a static datetime filter to only evaluate the logic when a new bar physically opens, I configured the tester to run on the 'Open prices only' modeling mode. Running an 'Every tick' model for an EA that only triggers on the bar open is a massive waste of testing time and CPU resources. I focused my initial tests on the EURUSD pair using the H1 timeframe, which generally provides a good balance between daily volatility and structural market noise.


Strategy Limitations and Optimization Paths

The configuration used in these initial tests is only a baseline, and it is clearly sensitive to the market regime. During prolonged trends, a simple mean-reversion approach can suffer drawdowns as price keeps stretching away from the local average before returning to it, and the EA as written sends its orders without a stop loss to cap that loss. Furthermore, I hardcoded a fixed 0.10 lot size to verify the execution path; a real-world deployment would need a position-sizing function that scales the volume with the account balance.
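As one possible starting point for dynamic sizing, the sketch below scales the volume linearly with the balance and then snaps it to the symbol's volume limits. Both function names and the "lots per 10,000 of balance" rule are assumptions made for illustration; a sizing rule based on a stop distance would be an equally valid choice.

```mql5
// Hypothetical pure helper: snaps a raw volume down to the broker's
// volume grid. Returns 0.0 when the result would be below the minimum.
double NormalizeVolume(const double raw, const double vmin,
                       const double vmax, const double vstep)
  {
   if(vstep <= 0.0 || vmin <= 0.0 || vmax < vmin)
      return 0.0;
   // Round down to a whole number of steps (small epsilon absorbs float noise)
   double lots = MathFloor(raw / vstep + 1e-9) * vstep;
   if(lots < vmin)
      return 0.0;            // too small to trade: the caller should skip
   if(lots > vmax)
      lots = vmax;
   return NormalizeDouble(lots, 8);
  }

// Hypothetical rule: 'lots_per_10k' lots for every 10,000 units of balance.
double BalanceScaledLots(const string symbol, const double lots_per_10k)
  {
   double raw = AccountInfoDouble(ACCOUNT_BALANCE) / 10000.0 * lots_per_10k;
   return NormalizeVolume(raw,
                          SymbolInfoDouble(symbol, SYMBOL_VOLUME_MIN),
                          SymbolInfoDouble(symbol, SYMBOL_VOLUME_MAX),
                          SymbolInfoDouble(symbol, SYMBOL_VOLUME_STEP));
  }
```

The EA would call `BalanceScaledLots(_Symbol, 0.10)` in place of InpLotSize and skip the entry when the result is 0.0.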


Conclusion

By the end of the article you have a reproducible, modular toolkit and a clear checklist for verification. The core math is isolated in ZScoreEngine.mqh (class CZScore with GetZScore()), which prevents code duplication and guarantees identical calculations in any consumer. Ind_ZScore.mq5 instantiates that engine for visual validation and plots horizontal reference levels; EA_ZScore_Reversion.mq5 reuses the same engine for trading with a simple entry-at-extreme / exit-on-zero rule. To reproduce results: compile the three files, attach the indicator to inspect extremes, then run the EA in the Strategy Tester on "Open prices only", which matches its once-per-bar evaluation, for example on H1 EURUSD. This layout makes it easy to iterate: change signal thresholds, add position sizing, implement volatility-based scaling (e.g., scale entries beyond 3σ), or extend the class to handle multi-pair arrays for statistical arbitrage, all without rewriting the normalization logic.


File Structure Table

| File Name | Description |
|---|---|
| ZScoreEngine.mqh | Source code for the object-oriented statistical class. |
| Ind_ZScore.mq5 | Source code for the custom visualization indicator. |
| EA_ZScore_Reversion.mq5 | Source code for the mean reversion Expert Advisor. |

Attached files:
ZScoreEngine.mqh (3.54 KB)
Ind_ZScore.mq5 (2.94 KB)