MetaTrader 5 Machine Learning Blueprint (Part 17): CPCV Backtesting — From Python Model to Tick-Level Evidence

MetaTrader 5 — Integration | 4 June 2026, 13:00


Patrick Murimi Njoroge

Introduction

The Python pipeline described in Parts 8 through 12 produces a trained model, a fitted probability calibrator, a feature specification, and an events DataFrame. These artifacts answer one question about the model: does it have edge on historical bar-level returns? They leave a second question unanswered: will that edge survive execution costs? Spread, slippage, commission, and swap are not abstract numbers; they are frictions that erode the theoretical advantage the model captured. A Sharpe ratio distribution computed from bar-by-bar P&L is a useful diagnostic. One computed from tick-level fills is the evidence base on which a deployment decision should be made.

This article builds the bridge between those two worlds. The pipeline exports its artifacts in Python-native formats (ONNX, pickle, parquet). MetaTrader 5's Strategy Tester consumes flat files (CSV, JSON). An export script translates between them. On the MQL5 side, an expert advisor loads the translated artifacts in OnInit(). It constructs features from bar data, runs ONNX inference, applies the calibration map, sizes positions using the logic from Part 10 and Part 11, and executes orders on the tick stream. The Strategy Tester's optimization mode then runs each of the φ[N, k] combinatorial paths as a separate agent, producing tick-accurate equity curves that Python collects and analyzes.

The result is a path Sharpe distribution and PBO audit computed from real tick fills. A deployment decision is based on three numbers from that distribution: the median path Sharpe (is the edge real after costs?), the path Sharpe standard deviation (is the performance stable across temporal configurations?), and the PBO (is strategy selection better than chance?).

This article is Part 17 of the MetaTrader 5 Machine Learning Blueprint series. Part 12 produced the calibrated model whose ONNX export is the primary input here. The Unified Validation Pipeline article defined the CPCV fold structure and PBO computation that this article's Strategy Tester orchestration reproduces.

What the Pipeline Exports and What MQL5 Needs

Pipeline Artifacts

When ModelDevelopmentPipeline.run(export_onnx=True, calibrate=True) completes, _save_all_artifacts() writes the following files to the versioned model directory:

	Python Artifact	Format	Contents
1.	model_*.onnx	ONNX	Full sklearn Pipeline (StandardScaler + classifier), converted via skl2onnx. The scaler is baked into the graph.
2.	calibrator_*.joblib	joblib	Fitted CalibratorCV.calibrator_: an IsotonicRegression or LogisticRegression depending on the method parameter.
3.	feature_names_*.pkl	cloudpickle	Ordered list of feature column names, matching the ONNX model's input tensor layout.
4.	events_*.parquet	parquet	Triple-barrier events with t1 (label end time), bin, tW, w. Used for CPCV fold boundary computation.
5.	config_*.json	JSON	Full training configuration: symbol, bar type, sizing parameters, HPO settings.

Of these formats, only ONNX is directly consumable by MQL5: MetaTrader's native OnnxCreate() loads the file as-is. Everything else requires translation. The calibrator must be decomposed into its breakpoint arrays. Feature specifications must be serialized as flat JSON. CPCV fold assignments must be precomputed and written as per-path CSV masks.

A Critical Constraint: The Scaler Is Baked In

The full sklearn Pipeline (StandardScaler + classifier) is exported as a single ONNX graph via convert_sklearn(model, ...). Therefore, the StandardScaler parameters are already part of the ONNX computation graph. MetaTrader 5 must pass raw feature values directly to OnnxRun(). Applying a manual z-score transformation before inference would double-scale the inputs and corrupt every prediction silently.

The feature specification JSON does export the training-set mean and standard deviation for each feature. These values are included for diagnostic validation — to confirm that the raw values computed in MQL5 match the expected distributional range from the Python pipeline — not for transformation. The BuildFeatureVector() function returns raw values only.
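The diagnostic use of the exported mean and standard deviation can be sketched in a few lines of Python. This is a minimal illustration, not the article's code: the helper name zscore_report, the feature names, the numbers, and the threshold of 6 standard deviations are all assumptions. The z-score is computed only to flag implausible raw values; it is never fed to the model.

```python
def zscore_report(raw_values, specs, max_abs_z=6.0):
    """Return (name, z) pairs whose raw value lies implausibly far from the training mean."""
    flagged = []
    for spec in specs:
        raw = raw_values[spec["index"]]
        std = spec["std"]
        # A zero std means the feature was constant in training; skip the division.
        z = 0.0 if std == 0 else (raw - spec["mean"]) / std
        if abs(z) > max_abs_z:
            flagged.append((spec["name"], round(z, 2)))
    return flagged


# Hypothetical spec entries and raw vectors, for illustration only
specs = [
    {"name": "rsi_14", "index": 0, "mean": 50.0, "std": 12.0},
    {"name": "atr_norm_14", "index": 1, "mean": 0.001, "std": 0.0004},
]
ok_vector = [55.0, 0.0012]
bad_vector = [55.0, 0.12]  # e.g. ATR that was not divided by the close

print(zscore_report(ok_vector, specs))
print(zscore_report(bad_vector, specs))
```

A flagged feature usually points to a unit or formula mismatch between the two implementations rather than to a genuine market extreme.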

The Implementation Contract

Python handles fold computation, artifact translation, and post-processing of results. MQL5 handles tick-accurate simulation of a single CPCV path per Strategy Tester pass. This division keeps each side doing what it does best. Python's CombinatorialPurgedCV generates the φ[N, k] path assignments; it understands purging, embargo, and combinatorial recombination. The Strategy Tester's built-in parallelization runs those assignments concurrently across CPU cores; it understands spread, slippage, swap, and commission.

A central design decision is to precompute the path-to-bar mapping in Python and export one mask file per path. This avoids exporting fold boundaries and reconstructing the combinatorial logic in MQL5. The EA's job reduces to a binary search: is this bar's timestamp in my path's mask file?
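The membership test that the EA performs can be prototyped in Python before it is ported. The sketch below assumes the mask is held as a sorted list of epoch seconds; the function name is_test_bar and the sample timestamps are illustrative, not taken from the pipeline.

```python
from bisect import bisect_left


def is_test_bar(sorted_mask, bar_time):
    """Binary search: True if bar_time is present in the sorted mask."""
    i = bisect_left(sorted_mask, bar_time)
    return i < len(sorted_mask) and sorted_mask[i] == bar_time


# Hypothetical hourly bar-open times (epoch seconds); one hour is missing
mask = sorted([1704067200, 1704070800, 1704078000])

print(is_test_bar(mask, 1704070800))  # bar belongs to this path
print(is_test_bar(mask, 1704074400))  # bar is outside this path
```

Because the lookup is exact-match, the mask timestamps and the bar-open times seen by the EA must use the same clock and the same bar alignment.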


Figure 1. 5-stage illustration of the Python-to-MQL5 translation architecture for CPCV backtesting

Stage 1: Python pipeline artifacts (model_*.onnx, calibrator_*.joblib, feature_names_*.pkl, events_*.parquet).
Stage 2: export_pipeline_artifacts.py translates each artifact to a flat file and precomputes CPCV path masks.
Stage 3: Common\Files\ml_artifacts\ receives the translated files, including one path_N.csv per combinatorial path.
Stage 4: Strategy Tester optimization mode runs CPCVBacktest.mq5 once per path (InpPathIndex 0→4 for N=6, k=2).
Stage 5: cpcv_postprocess.py collects per-path equity CSVs, computes the path Sharpe distribution, and runs the PBO audit.

The Export Script: Translating Artifacts to MQL5 Formats

The export script is the single point of translation between the Python pipeline and MetaTrader 5. It loads the model directory using load_from_path(), extracts each artifact, converts it to a flat-file format, and writes the results to the Common\Files\ml_artifacts\ directory where the EA expects to find them. Files written to Common\Files\ are accessible to both MQL5 and the Python process on the same machine.

Loading the Model Directory

from pathlib import Path
import json
import shutil
import numpy as np
import pandas as pd
from sklearn.isotonic import IsotonicRegression
from sklearn.linear_model import LogisticRegression
from afml.production.file_manager import ModelFileManager
from afml.cross_validation.combinatorial import CombinatorialPurgedCV

MODEL_DIR  = Path("./Models/my_strategy/EURUSD/.../a1b2c3d4")
MQL5_FILES = Path(r"C:\...\AppData\Roaming\MetaQuotes\Terminal\...\Common\Files")
OUT_DIR    = MQL5_FILES / "ml_artifacts"
OUT_DIR.mkdir(parents=True, exist_ok=True)
(OUT_DIR / "results").mkdir(exist_ok=True)

N_FOLDS = 6
K_TEST  = 2
# phi = C(6, 2) * 2 // 6 = 15 * 2 // 6 = 5 paths

mgr  = ModelFileManager()
arts = mgr.load_from_path(MODEL_DIR)

model         = arts["model"]          # sklearn Pipeline (scaler + classifier)
calibrator    = arts["calibrator"]     # IsotonicRegression or LogisticRegression
feature_names = arts["feature_names"]  # ordered list
events        = arts["events"]         # DataFrame with t1
config        = arts["config"]         # training config dict

Exporting the Calibrator

The calibrator is either an IsotonicRegression or a LogisticRegression, depending on the method parameter passed to CalibratorCV. Both must be decomposed into their numerical parameters and written as CSV or JSON so that MQL5 can reconstruct the mapping without Python-specific serialization formats.

For isotonic regression, the mapping is a monotone function defined by two parallel arrays: the x-breakpoints and the y-values. Scikit-learn exposes these as X_thresholds_ and y_thresholds_. For Platt scaling, the mapping is a sigmoid defined by the coefficient and intercept of the fitted LogisticRegression.

if isinstance(calibrator, IsotonicRegression):
    x_pts = calibrator.X_thresholds_
    y_pts = calibrator.y_thresholds_
    pd.DataFrame({"x": x_pts, "y": y_pts}).to_csv(
        OUT_DIR / "calibrator.csv", index=False
    )
    cal_meta = {"method": "isotonic", "n_breakpoints": len(x_pts)}
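elif isinstance(calibrator, LogisticRegression):
    A = float(calibrator.coef_[0, 0])
    B = float(calibrator.intercept_[0])
    cal_meta = {"method": "platt", "A": A, "B": B}
else:
    # Fail loudly instead of hitting a NameError on cal_meta below
    raise TypeError(f"Unsupported calibrator type: {type(calibrator).__name__}")

with open(OUT_DIR / "calibrator_meta.json", "w") as fh:
    json.dump(cal_meta, fh, indent=2)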

Exporting the Feature Specification

The ONNX model expects features in the exact column order recorded in feature_names. Any deviation in ordering, lookback, or computation type produces predictions that are numerically plausible but semantically wrong: the model applies learned weights for feature A to the value of feature B, with no error signal to flag the mismatch.

The export script writes a JSON file recording each feature's name, index, and the training-set normalization parameters (mean and standard deviation from the fitted preprocessor). These parameters are diagnostic only; the EA does not apply them before calling OnnxRun(). Two fields — type and lookback — are written as placeholders that the practitioner must fill in to match the actual feature engineering logic used during training.

# Extract normalization params from the fitted preprocessor
preprocessor = model.steps[0][1]   # (name, transformer) → transformer
has_mean  = hasattr(preprocessor, "mean_")
has_scale = hasattr(preprocessor, "scale_")

feature_specs = []
for i, name in enumerate(feature_names):
    spec = {
        "name": name,
        "index": i,
        "type": "RSI",   # placeholder: edit to match your feature set
        "lookback": 14,  # placeholder: edit to match your feature set
        "mean": float(preprocessor.mean_[i])  if has_mean  else 0.0,
        "std": float(preprocessor.scale_[i]) if has_scale else 1.0,
    }
    feature_specs.append(spec)

json.dump(feature_specs,
          open(OUT_DIR / "feature_spec.json", "w"), indent=2)
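Since the type and lookback fields are edited by hand, a quick sanity check before copying the file to the terminal can catch typos. The sketch below is one possible validator, not part of the attached scripts. The set of type strings is an assumption derived from the ENUM_FEAT_TYPE names; only "RSI" appears in the exported placeholder, so adjust the set to whatever spellings the EA's JSON loader actually accepts.

```python
import json

# Assumed spellings, mirroring ENUM_FEAT_TYPE without the FEAT_ prefix
ALLOWED_TYPES = {"RSI", "ATR_NORM", "LOG_RETURN", "MA_RATIO", "HIST_VOL"}


def validate_feature_spec(specs):
    """Return a list of human-readable problems; an empty list means the spec looks sane."""
    errors = []
    for pos, spec in enumerate(specs):
        name = spec.get("name", f"<entry {pos}>")
        if spec.get("index") != pos:
            errors.append(f"{name}: index {spec.get('index')} does not match position {pos}")
        if spec.get("type") not in ALLOWED_TYPES:
            errors.append(f"{name}: unknown type {spec.get('type')!r}")
        lookback = spec.get("lookback")
        if not isinstance(lookback, int) or lookback < 1:
            errors.append(f"{name}: lookback must be a positive integer, got {lookback!r}")
    return errors


# Hypothetical file content: the second entry has a bad type and a bad lookback
sample = json.loads(
    '[{"name": "rsi_14", "index": 0, "type": "RSI", "lookback": 14},'
    ' {"name": "vol_20", "index": 1, "type": "VOLATILITY", "lookback": 0}]'
)
problems = validate_feature_spec(sample)
for line in problems:
    print(line)
```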

Generating CPCV Path Masks

This is the most important export step. Each CPCV path must be represented as a set of bar timestamps that constitute that path's test set. The EA loads one mask file per Strategy Tester pass and trades only on bars whose timestamps appear in that file.

The number of reconstructed backtest paths is φ[N, k] = C(N, k) × k // N. For N=6, k=2: C(6, 2)=15 splits, φ = 15 × 2 // 6 = 5 paths. A common error is to confuse the number of splits (15) with the number of paths (5). CombinatorialPurgedCV.get_path_ids() returns an (n_splits, k) matrix. It maps each test fold in each split to a path index. The export script iterates over this structure and collects timestamps per path.
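The split-versus-path arithmetic can be checked without the pipeline. The sketch below counts splits and paths with the standard library and builds one common path assignment, in which the j-th time a fold is tested it is assigned to path j. This is an illustration under that assumption; CombinatorialPurgedCV.get_path_ids() may number paths differently, and the function names here are hypothetical.

```python
from itertools import combinations
from math import comb


def n_paths(n_folds, k_test):
    """phi[N, k] = C(N, k) * k // N."""
    return comb(n_folds, k_test) * k_test // n_folds


def assign_paths(n_folds, k_test):
    """Map (split index, test fold) -> path index."""
    times_tested = [0] * n_folds
    assignment = {}
    for split_idx, test_folds in enumerate(combinations(range(n_folds), k_test)):
        for fold in test_folds:
            assignment[(split_idx, fold)] = times_tested[fold]
            times_tested[fold] += 1
    return assignment


assignment = assign_paths(6, 2)

# Group folds by path: every path should contain each fold exactly once
paths = {}
for (split_idx, fold), path in assignment.items():
    paths.setdefault(path, []).append(fold)

print("splits:", comb(6, 2), "paths:", n_paths(6, 2))
for path, folds in sorted(paths.items()):
    print(path, sorted(folds))
```

Each path covers all six folds once, which is why a path can be treated as one complete backtest over the events span.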

cv = CombinatorialPurgedCV(
    n_folds=N_FOLDS, n_test_folds=K_TEST,
    t1=events["t1"], pct_embargo=0.01,
)

n_paths  = cv.n_test_paths    # 5 for N=6, k=2
path_ids = cv.get_path_ids()  # shape (n_splits, k)
X_dummy  = pd.DataFrame(np.zeros((len(events), 1)), index=events.index)

path_bars = {p: [] for p in range(n_paths)}

for split_idx, (_, test_lists) in enumerate(cv.split(X_dummy)):
    for fold_j, test_idx in enumerate(test_lists):
        path_id = path_ids[split_idx, fold_j]
        timestamps = events.index[test_idx]
        path_bars[path_id].extend(timestamps.tolist())

for path_id, timestamps in path_bars.items():
    pd.Series(sorted(set(timestamps)), name="timestamp").to_csv(
        OUT_DIR / f"path_{path_id}.csv", index=False
    )

meta = {"n_folds": N_FOLDS, "k_test": K_TEST,
        "n_paths": n_paths, "symbol": config.get("symbol", "")}
json.dump(meta, open(OUT_DIR / "cpcv_meta.json", "w"), indent=2)
print(f"Exported {n_paths} path masks — InpPathIndex: 0..{n_paths-1}")

After running this script, the Common\Files\ml_artifacts\ directory contains: the ONNX model file, calibrator.csv and calibrator_meta.json, feature_spec.json, five path_N.csv files (for N=6, k=2), and cpcv_meta.json.
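One detail worth checking is the timestamp text inside the path files. pandas writes timestamps in ISO style and appends a UTC offset when the index is timezone-aware, while bar times in the Strategy Tester are in broker server time. The sketch below shows one way to write an explicit "YYYY.MM.DD HH:MM:SS" string shifted to server time. It assumes naive timestamps are UTC; the helper names and the server_utc_offset_hours parameter are hypothetical, and the offset should be 0 if the events index is already in server time.

```python
from datetime import datetime, timedelta, timezone


def to_mql5_time(ts, server_utc_offset_hours=0):
    """Format a datetime as 'YYYY.MM.DD HH:MM:SS' in broker server time."""
    if ts.tzinfo is not None:
        # Normalize aware timestamps to naive UTC first
        ts = ts.astimezone(timezone.utc).replace(tzinfo=None)
    ts = ts + timedelta(hours=server_utc_offset_hours)
    return ts.strftime("%Y.%m.%d %H:%M:%S")


def mask_lines(timestamps, server_utc_offset_hours=0):
    """Header plus de-duplicated timestamps; zero-padded strings sort chronologically."""
    unique_sorted = sorted({to_mql5_time(t, server_utc_offset_hours) for t in timestamps})
    return ["timestamp"] + unique_sorted


# Hypothetical test-bar times with one duplicate, broker assumed at UTC+2
stamps = [datetime(2024, 1, 2, 1, 0), datetime(2024, 1, 2, 0, 0), datetime(2024, 1, 2, 1, 0)]
lines = mask_lines(stamps, server_utc_offset_hours=2)
print("\n".join(lines))
```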

Reproducing the Feature Pipeline in MQL5

Indicator Handles and the Feature Specification Struct

MQL5's indicator API is handle-based: iRSI(), iATR(), and iMA() each return an integer handle, not a value. Values are retrieved via CopyBuffer() referencing that handle. The EA creates all required handles once in OnInit() and releases them in OnDeinit(). FeatureEngine.mqh manages this lifecycle.

The feature type enumeration and per-feature specification struct are defined in FeatureEngine.mqh:

//+------------------------------------------------------------------+
//| ENUM_FEAT_TYPE: indicator type for each feature.                 |
//| Extend this enum and add a matching case in BuildFeatureVector() |
//| for any indicator type used in your Python feature engineering.  |
//+------------------------------------------------------------------+
enum ENUM_FEAT_TYPE
  {
   FEAT_RSI,        // RSI(period)
   FEAT_ATR_NORM,   // ATR(period) / Close
   FEAT_LOG_RETURN, // log(Close[1] / Close[1+period])
   FEAT_MA_RATIO,   // Close / SMA(period) - 1.0
   FEAT_HIST_VOL,   // rolling std-dev of log-returns over last period bars
  };

struct SFeatureSpec
  {
   string         name;      // column name from Python
   int            index;     // position in the ONNX input tensor
   ENUM_FEAT_TYPE type;      // computation type
   int            lookback;  // indicator window
   double         mean;      // training-set mean (diagnostic only)
   double         std_dev;   // training-set std  (diagnostic only)
  };

Feature Vector Construction

The BuildFeatureVector() function iterates over the spec array, retrieves each indicator value via CopyBuffer(), and writes the result as a raw float into the output array. No z-score transformation is applied; the StandardScaler is part of the ONNX graph. The bar index argument to all indicator and price functions is 1 (the most recently closed bar). Using bar index 0 introduces look-ahead bias at the tick level because the forming bar's close price is still changing.

//+------------------------------------------------------------------+
//| BuildFeatureVector: compute raw features for the closed bar.     |
//|                                                                  |
//| Returns raw (unscaled) float values.  The StandardScaler is      |
//| baked into the ONNX graph; passing scaled values would corrupt   |
//| inference results.                                               |
//+------------------------------------------------------------------+
bool BuildFeatureVector(float &features[])
  {
   if(!g_handles_valid || g_n_features == 0)
      return(false);
   if(ArrayResize(features, g_n_features) < 0)
      return(false);

   double buf[1];
   for(int i = 0; i < g_n_features; i++)
     {
      double raw = 0.0;
      switch(g_feat_specs[i].type)
        {
         case FEAT_RSI:
            if(CopyBuffer(g_rsi_handle[i], 0, 1, 1, buf) < 0)
               return(false);
            raw = buf[0];
            break;

         case FEAT_ATR_NORM:
           {
            if(CopyBuffer(g_atr_handle[i], 0, 1, 1, buf) < 0)
               return(false);
            double close1 = iClose(_Symbol, _Period, 1);
            raw = (close1 > 0) ? buf[0] / close1 : 0.0;
            break;
           }

         case FEAT_LOG_RETURN:
           {
            double c1 = iClose(_Symbol, _Period, 1);
            double c0 = iClose(_Symbol, _Period, g_feat_specs[i].lookback + 1);
            raw = (c1 > 0 && c0 > 0) ? MathLog(c1 / c0) : 0.0;
            break;
           }

         case FEAT_MA_RATIO:
           {
            if(CopyBuffer(g_ma_handle[i], 0, 1, 1, buf) < 0)
               return(false);
            double close1 = iClose(_Symbol, _Period, 1);
            raw = (buf[0] > 0) ? (close1 / buf[0]) - 1.0 : 0.0;
            break;
           }
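
         case FEAT_HIST_VOL:
           {
            //--- n log-returns need n + 1 closes, ending at the last closed bar
            int    n = g_feat_specs[i].lookback;
            double closes[];
            if(n < 2 || CopyClose(_Symbol, _Period, 1, n + 1, closes) != n + 1)
               return(false);
            double sum = 0.0, sum_sq = 0.0;
            for(int j = 1; j <= n; j++)
              {
               if(closes[j] <= 0 || closes[j - 1] <= 0)
                  return(false);
               double r = MathLog(closes[j] / closes[j - 1]);
               sum    += r;
               sum_sq += r * r;
              }
            //--- sample variance (ddof = 1); assumes the Python side uses the same convention
            double var = (sum_sq - sum * sum / n) / (n - 1);
            raw = (var > 0) ? MathSqrt(var) : 0.0;
            break;
           }

         default:
            //--- unknown type: fail instead of silently feeding 0.0 to the model
            return(false);
        }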
      features[i] = (float)raw;  // ONNX input is float32
     }
   return(true);
  }

A validation step that should not be skipped: after building the feature vector for the first bar of a backtest, log the raw values and compare them against the Python pipeline's output for the same bar. Off-by-one errors in lookback indexing are the most common source of silent prediction corruption. Comparing the first bar explicitly catches these errors before they propagate across thousands of inference calls.
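For the price-based features, the first-bar comparison can be done against a plain-Python reference that uses the same shift convention as MQL5 (shift 0 is the forming bar, shift 1 the last closed bar). The sketch below is an assumption-laden mirror of the FEAT_LOG_RETURN, FEAT_MA_RATIO, and FEAT_HIST_VOL definitions in the enum comments, with sample standard deviation (ddof = 1) assumed for the volatility; the close prices are made up.

```python
import math
import statistics


def close_at(closes, shift):
    """closes is ordered oldest to newest; shift follows MQL5 (0 = forming bar)."""
    return closes[-1 - shift]


def log_return(closes, lookback):
    """log(Close[1] / Close[1 + lookback])."""
    return math.log(close_at(closes, 1) / close_at(closes, 1 + lookback))


def ma_ratio(closes, period):
    """Close[1] / SMA(period) - 1, with the SMA ending at the last closed bar."""
    window = closes[-1 - period:-1]
    return close_at(closes, 1) / (sum(window) / period) - 1.0


def hist_vol(closes, period):
    """Sample std-dev of the last `period` log-returns ending at the last closed bar."""
    window = closes[-2 - period:-1]  # period + 1 closes
    rets = [math.log(b / a) for a, b in zip(window, window[1:])]
    return statistics.stdev(rets)


# Hypothetical closes; the last value is the still-forming bar and must be ignored
closes = [1.0, 1.1, 1.21, 1.331, 9.99]

print(log_return(closes, 1), ma_ratio(closes, 3), hist_vol(closes, 3))
```

If the MQL5 log differs from these values only by a one-bar shift, the lookback indexing is the first place to look.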

ONNX Inference and Calibration in MQL5

Loading and Running the ONNX Model

The model is loaded once in OnInit() and reused across all bars. The input tensor shape must match the feature count exactly. A critical requirement is that the ONNX input is a 2D tensor of shape (1, n_features), not a 1D array: MQL5's OnnxRun() requires that the array dimensions match the shape set by OnnxSetInputShape(). The OnTick() handler below shows where inference sits in the per-bar flow.

//+------------------------------------------------------------------+
//| OnTick: new-bar guard, mask check, inference, sizing, execution. |
//+------------------------------------------------------------------+
void OnTick()
  {
//--- New-bar guard
   datetime current = iTime(_Symbol, _Period, 0);
   if(current == g_last_bar_time)
      return;
   g_last_bar_time = current;

//--- Use the most recently CLOSED bar (index 1)
   datetime bar_time = iTime(_Symbol, _Period, 1);
   if(!IsTestBar(bar_time))
      return;

//--- Build raw feature vector (no z-score — scaler is baked into ONNX)
   float features[];
   if(!BuildFeatureVector(features))
      return;

//--- 2D input tensor required by OnnxRun()
   float input_data[1][FE_MAX_FEATURES];
   for(int i = 0; i < g_n_features; i++)
      input_data[0][i] = features[i];

   float output_data[1][2];
   if(!OnnxRun(g_onnx_handle, ONNX_DEFAULT, input_data, output_data))
      return;

   double raw_prob = (double)output_data[0][1];  // P(class=1)
   double cal_prob = ApplyCalibrator(raw_prob);

//--- Signal + Kelly sizing from Parts 10 and 11
   double signal     = GetSignal(cal_prob, 2);
   double kelly_m    = KellyMultiplier(cal_prob, InpPayoffRatio, InpKellyFraction);
   double final_size = signal * kelly_m;

   ExecuteOrder(final_size, bar_time);
  }

Applying the Calibrator

For isotonic regression, scikit-learn stores only the breakpoints at which the fitted function changes (X_thresholds_, y_thresholds_), and IsotonicRegression.predict() interpolates linearly between adjacent breakpoints. The mapping is flat wherever two neighbouring y-values are equal and ramps where they differ. To reproduce predict(), the EA locates the bracketing segment by binary search and interpolates linearly within it; returning the left-segment value y[i] would deviate from Python on every ramp. Inputs outside the breakpoint range are clamped to the end values, which assumes the calibrator was fitted with out_of_bounds="clip".

For Platt scaling, the calibrated value is the two-parameter sigmoid: 1 / (1 + exp(-(A × raw + B))). The parameters A and B are exported in calibrator_meta.json and read at OnInit().

//+------------------------------------------------------------------+
//| ApplyCalibrator: linear interpolation between breakpoints        |
//| (isotonic) or sigmoid (Platt).                                   |
//+------------------------------------------------------------------+
double ApplyCalibrator(double raw_prob)
  {
   if(g_cal_method == CAL_METHOD_ISOTONIC)
     {
      int n = ArraySize(g_cal_x);
      if(n == 0)                    return(raw_prob);
      if(raw_prob <= g_cal_x[0])    return(g_cal_y[0]);
      if(raw_prob >= g_cal_x[n-1])  return(g_cal_y[n-1]);
      int lo = 0, hi = n - 1;
      while(hi - lo > 1)
        {
         int mid = (lo + hi) / 2;
         if(g_cal_x[mid] <= raw_prob) lo = mid;
         else hi = mid;
        }
      //--- g_cal_x[lo] <= raw_prob < g_cal_x[hi]: interpolate within the segment
      double dx = g_cal_x[hi] - g_cal_x[lo];
      if(dx <= 0.0)                 return(g_cal_y[lo]);
      double t = (raw_prob - g_cal_x[lo]) / dx;
      return(g_cal_y[lo] + t * (g_cal_y[hi] - g_cal_y[lo]));
     }
//--- Platt scaling: sigmoid(A * x + B)
   return(1.0 / (1.0 + MathExp(-(g_platt_A * raw_prob + g_platt_B))));
  }
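Because a mismatch between the EA's lookup and scikit-learn's predict() produces no error, a parity check on the Python side is worth running once per exported calibrator. The sketch below implements both a left-segment step lookup and linear interpolation between breakpoints in plain Python, plus the Platt sigmoid, so that each can be compared against calibrator.predict() on a grid of raw probabilities. The breakpoint values are invented for illustration, and the clamping at both ends assumes out_of_bounds="clip".

```python
import math
from bisect import bisect_right


def isotonic_step(x, y, raw):
    """Left-segment lookup: value of the breakpoint at or below raw."""
    if raw <= x[0]:
        return y[0]
    if raw >= x[-1]:
        return y[-1]
    return y[bisect_right(x, raw) - 1]


def isotonic_linear(x, y, raw):
    """Linear interpolation between the two breakpoints that bracket raw."""
    if raw <= x[0]:
        return y[0]
    if raw >= x[-1]:
        return y[-1]
    hi = bisect_right(x, raw)
    lo = hi - 1
    t = (raw - x[lo]) / (x[hi] - x[lo])
    return y[lo] + t * (y[hi] - y[lo])


def platt(raw, a, b):
    """Two-parameter sigmoid: 1 / (1 + exp(-(a * raw + b)))."""
    return 1.0 / (1.0 + math.exp(-(a * raw + b)))


# Hypothetical breakpoints: flat between 0.2 and 0.4, ramp between 0.4 and 0.6
x_pts = [0.2, 0.4, 0.6]
y_pts = [0.1, 0.1, 0.5]

for raw in (0.3, 0.5):
    print(raw, isotonic_step(x_pts, y_pts, raw), isotonic_linear(x_pts, y_pts, raw))
```

The two lookups agree on the flat segment and differ on the ramp, so a test grid should include points strictly between breakpoints with different y-values.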

Implementing CPCV with the Strategy Tester

Path Mask Loading

Each Strategy Tester pass loads a single path mask file. Rather than using a hash map (which requires Generic/HashMap.mqh), the EA loads the timestamps into a sorted array and performs binary search for O(log n) lookup on each bar. For typical CPCV test windows of a few hundred to a few thousand bars, this is negligible overhead.

Files are opened with the FILE_COMMON flag, which resolves paths relative to the Common\Files\ folder shared by all terminals on the machine. This flag is required: without it, FileOpen() looks in the local MQL5\Files\ sandbox of the terminal (or of the testing agent), where the files written by the Python export script do not exist.

//+------------------------------------------------------------------+
//| LoadPathMask: load sorted timestamp array from path_N.csv.       |
//+------------------------------------------------------------------+
bool LoadPathMask(int path_index)
  {
   string fname = StringFormat(ARTIFACTS_DIR + "path_%d.csv", path_index);
   int fh = FileOpen(fname, FILE_READ | FILE_CSV | FILE_COMMON, ",");
   if(fh == INVALID_HANDLE)
     {
      PrintFormat("LoadPathMask: cannot open %s, error=%d", fname, GetLastError());
      return(false);
     }
   FileReadString(fh);   // skip header
   g_n_test_bars = 0;

   while(!FileIsEnding(fh))
     {
      string ts_str = FileReadString(fh);
      if(StringLen(ts_str) == 0)
         continue;
      if(ArrayResize(g_test_bars, g_n_test_bars + 1) < 0)
        {
         FileClose(fh);
         return(false);
        }
      g_test_bars[g_n_test_bars] = StringToTime(ts_str);
      g_n_test_bars++;
     }
   FileClose(fh);
   ArraySort(g_test_bars);  // sort for binary search
   PrintFormat("LoadPathMask: path %d — %d bars", path_index, g_n_test_bars);
   return(g_n_test_bars > 0);
  }

bool IsTestBar(datetime bar_time)
  {
   int lo = 0, hi = g_n_test_bars - 1;
   while(lo <= hi)
     {
      int mid = (lo + hi) / 2;
      if(g_test_bars[mid] == bar_time) return(true);
      if(g_test_bars[mid] < bar_time)  lo = mid + 1;
      else hi = mid - 1;
     }
   return(false);
  }

Order Execution

The ExecuteOrder() function handles three cases: opening a new position, reversing an existing position when the signal flips, and closing when the signal drops below the minimum threshold. Every lot size is normalized via NormalizeLot() before being passed to CTrade, and every order is guarded by a margin check using OrderCalcMargin().

//+------------------------------------------------------------------+
//| NormalizeLot: clamp to broker step and limits.                   |
//+------------------------------------------------------------------+
double NormalizeLot(string symbol, double raw_lot)
  {
   double step = SymbolInfoDouble(symbol, SYMBOL_VOLUME_STEP);
   double mn   = SymbolInfoDouble(symbol, SYMBOL_VOLUME_MIN);
   double mx   = SymbolInfoDouble(symbol, SYMBOL_VOLUME_MAX);
   double lot  = MathRound(raw_lot / step) * step;
   lot = MathMax(mn, MathMin(mx, lot));
   return(NormalizeDouble(lot, 2));
  }

//+------------------------------------------------------------------+
//| CheckMargin: return false if margin is insufficient.             |
//+------------------------------------------------------------------+
bool CheckMargin(string symbol, double lots, ENUM_ORDER_TYPE order_type)
  {
   MqlTick tick;
   if(!SymbolInfoTick(symbol, tick))
      return(false);
   double price  = (order_type == ORDER_TYPE_SELL) ? tick.bid : tick.ask;
   double margin = 0.0;
   double free   = AccountInfoDouble(ACCOUNT_MARGIN_FREE);
   if(!OrderCalcMargin(order_type, symbol, lots, price, margin))
      return(false);
   return(margin <= free);
  }
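NormalizeLot() above finishes with NormalizeDouble(lot, 2), which presumes a volume step of 0.01. For symbols with a different step, the number of decimals can be derived from the step itself. The Python sketch below illustrates that idea under stated assumptions: the function names are hypothetical, and the rounding uses floor(x + 0.5) to mimic MathRound() for positive values rather than Python's round-half-to-even.

```python
import math


def step_digits(step):
    """Decimals implied by the volume step: 0.01 -> 2, 0.001 -> 3, 1.0 -> 0."""
    return max(0, -int(math.floor(math.log10(step) + 1e-9)))


def normalize_lot(raw_lot, step, vol_min, vol_max):
    """Snap to the broker's volume step, clamp to limits, round to the step's precision."""
    if step <= 0:
        raise ValueError("volume step must be positive")
    steps = math.floor(raw_lot / step + 0.5)  # half-up rounding for positive volumes
    lot = min(vol_max, max(vol_min, steps * step))
    return round(lot, step_digits(step))


# Illustrative broker settings
print(normalize_lot(0.137, 0.01, 0.01, 100.0))
print(normalize_lot(1.2346, 0.001, 0.001, 50.0))
print(normalize_lot(250.0, 0.01, 0.01, 100.0))
```

Note that clamping to the minimum turns a near-zero request into a minimum-lot order, so the decision not to trade at all has to be taken before normalization.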

Parallelizing Across Paths

In optimization mode, the Strategy Tester runs each InpPathIndex value (0..φ − 1) as a separate pass and distributes the passes across the available agents. This is not a parameter search; it is a reuse of the tester's parallel infrastructure for path simulation. For N=6, k=2 (φ=5 paths), configure the optimizer as follows:

//--- Strategy Tester settings:
//---   Optimization:  Slow complete algorithm
//---   Model:         Every tick based on real ticks
//---   InpPathIndex:  from=0, to=4, step=1  (phi=5 for N=6, k=2)
//---   Custom criterion: OnTester() return value

//+------------------------------------------------------------------+
//| OnTester: return path Sharpe as optimization criterion.          |
//+------------------------------------------------------------------+
double OnTester()
  {
   return(ComputePathSharpe());
  }

//+------------------------------------------------------------------+
//| OnDeinit: close position, write equity CSV, release resources.   |
//+------------------------------------------------------------------+
void OnDeinit(const int reason)
  {
   ClosePosition();
   WritePathCSV(InpPathIndex);
   ReleaseIndicatorHandles();
   if(g_onnx_handle != INVALID_HANDLE)
      OnnxRelease(g_onnx_handle);
  }

The date range in the Strategy Tester must cover the full events span, not just the test folds. The EA internally skips bars whose timestamps do not appear in its path mask. A date range shorter than the events span causes the EA to miss test bars that fall outside the tester window.

Path Reporting and Python Post-processing

Collecting Path Results

After all passes complete, each path has written a CSV to Common\Files\ml_artifacts\results\path_N.csv containing the bar-level equity series for that path. The postprocessor reads these files, constructs the returns matrix, computes the path Sharpe distribution, and runs the PBO audit.

The compute_pbo() function requires a t1 series aligned with the returns matrix index. Because CPCV is symmetric, we pass a neutral t1 where each timestamp serves as its own end-time:

from pathlib import Path
import pandas as pd
import numpy as np
from afml.cross_validation.pbo import compute_pbo

results_dir = Path(r"C:\...\Common\Files\ml_artifacts\results")
n_paths = 5   # phi = 5 for N=6, k=2

path_series = []
for i in range(n_paths):
    df  = pd.read_csv(results_dir / f"path_{i}.csv", parse_dates=["timestamp"])
    eq  = df.set_index("timestamp")["equity"]
    ret = eq.pct_change().fillna(0)
    path_series.append(ret.rename(i))

returns_matrix = pd.concat(path_series, axis=1).fillna(0)

# Neutral t1 for symmetric CPCV (required parameter)
t1_neutral = pd.Series(returns_matrix.index, index=returns_matrix.index)

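# Path Sharpe distribution
# 252 annualizes daily returns; for intraday bars use the number of bars per year
PERIODS_PER_YEAR = 252
path_sharpes = returns_matrix.apply(
    lambda s: s.mean() / s.std() * np.sqrt(PERIODS_PER_YEAR)
    if s.std() > 1e-9 else 0
)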

# PBO audit
pbo_result = compute_pbo(returns_matrix, t1=t1_neutral, n_folds=8)

print(f"Median path Sharpe:  {path_sharpes.median():.3f}")
print(f"Path Sharpe std:     {path_sharpes.std():.3f}")
print(f"PBO:                 {pbo_result['pbo']:.4f}")
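The annualization factor deserves a second look, because the equity series is recorded per bar and the square-root-of-252 convention applies to daily returns. The stdlib-only sketch below shows how the factor changes with bar size. The function names, the 24-hour session, and the 252 trading days are assumptions for illustration; unlike the listing above, it does not prepend a zero return for the first bar.

```python
import math
import statistics


def returns_from_equity(equity):
    """Simple per-bar returns from an equity series."""
    return [b / a - 1.0 for a, b in zip(equity, equity[1:])]


def bars_per_year(bar_minutes, trading_days=252, hours_per_day=24):
    """Approximate number of bars per year for a given bar size."""
    return trading_days * hours_per_day * 60 / bar_minutes


def annualized_sharpe(returns, periods_per_year):
    """Mean over sample std-dev, scaled by sqrt(periods per year)."""
    if len(returns) < 2:
        return 0.0
    sd = statistics.stdev(returns)
    if sd < 1e-9:
        return 0.0
    return statistics.mean(returns) / sd * math.sqrt(periods_per_year)


# Hypothetical equity curve sampled on H1 bars
equity = [100.0, 101.0, 100.0, 102.0]
rets = returns_from_equity(equity)

daily_style = annualized_sharpe(rets, 252)
hourly_style = annualized_sharpe(rets, bars_per_year(60))
print(daily_style, hourly_style)
```

Whichever factor is chosen, it should be the same one used for the bar-level CPCV results, otherwise the two Sharpe distributions are not comparable.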

Interpreting the Distribution

Three questions determine whether the strategy is suitable for deployment:

Is the median path Sharpe positive after costs? A strategy whose bar-level CPCV looked promising but whose tick-level distribution centers near zero has an edge that execution costs consume entirely. The shift between the bar-level and tick-level distributions quantifies, in Sharpe units, what execution costs take from the strategy.
Is the distribution tight? A wide distribution signals fragility. The model's performance depends on which temporal configuration it faced, not on a stable structural pattern. A standard deviation above 0.5 warrants caution; above 1.0, the strategy is path-dependent rather than structurally sound.
Is the PBO below 0.5? PBO measures the probability that the best in-sample strategy configuration will underperform the median out-of-sample. A PBO near 0.5 means selection is no better than chance. A PBO below 0.15 is strong evidence against overfitting.
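The three questions above translate into a small, explicit decision rule. The sketch below encodes the thresholds named in this section (positive median, standard deviation below 0.5, PBO below 0.5); the function name and the returned dictionary layout are illustrative choices, not part of cpcv_postprocess.py.

```python
def passes_evidence_bar(median_sharpe, sharpe_std, pbo, max_std=0.5, max_pbo=0.5):
    """Return (overall verdict, per-criterion results) for the three deployment checks."""
    checks = {
        "median_sharpe_positive": median_sharpe > 0.0,
        "sharpe_std_below_limit": sharpe_std < max_std,
        "pbo_below_limit": pbo < max_pbo,
    }
    return all(checks.values()), checks


# Three-number summary reported for Figure 2
verdict, detail = passes_evidence_bar(0.71, 0.21, 0.11)
print(verdict, detail)

# A profitable but unstable configuration fails on dispersion alone
unstable_verdict, unstable_detail = passes_evidence_bar(0.30, 0.80, 0.20)
print(unstable_verdict, unstable_detail)
```

Returning the individual checks alongside the verdict makes it clear which criterion blocked a deployment.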


Figure 2. 2-panel illustration of the CPCV path Sharpe distribution from tick-level simulation (N=6, k=2, φ=5 paths)

Panel (a): Cumulative returns for each of the five combinatorial paths, computed from the bar-level equity series that each pass writes out in OnDeinit(). All five paths are profitable; the dispersion is low.
Panel (b): Path Sharpe ratios with the median marked as a dashed line. The three-number summary (median Sharpe: 0.71, std: 0.21, PBO: 0.11) satisfies all three deployment criteria.

Practical Walkthrough

The end-to-end procedure from trained model to deployment decision consists of seven steps.

Run the pipeline with ONNX export and calibration enabled:

model, features, metrics, config = pipeline.run(
    calibrate=True, export_onnx=True
)

Run the export script. This produces the five MQL5-consumable artifacts and prints the correct InpPathIndex range:

python export_pipeline_artifacts.py \
    --model-dir ./Models/.../a1b2c3d4 \
    --mql5-dir "C:\...\AppData\Roaming\MetaQuotes\Terminal\...\Common\Files" \
    --n-folds 6 --k-test 2
# Output: Exported 5 path masks — InpPathIndex: 0..4

Edit feature_spec.json. The export script writes placeholder type and lookback values. Update each entry to match the actual indicator type and window used during Python feature engineering. The field order must match feature_names exactly.
Compile CPCVBacktest.mq5 in MetaEditor. Verify zero errors and zero warnings. The #property strict directive belongs to MQL4 and is not needed here, because the MQL5 compiler is always strict.
Configure the Strategy Tester. Set symbol and period to match the training configuration. Select "Every tick based on real ticks". Set the date range to cover the full events span. In the optimization tab, set InpPathIndex from=0, to=4, step=1. Select the "Slow complete algorithm" optimization mode, so that every path index is run exactly once.
Run the optimization. Each of the five agents processes one path independently. On a machine with 8 cores, the run completes in roughly 1.5×–2× the time of a single pass.

Run the postprocessor:

python cpcv_postprocess.py \
    --results-dir "C:\...\Common\Files\ml_artifacts\results" \
    --n-paths 5

If the median path Sharpe is positive, the standard deviation is below 0.5, and the PBO is below 0.5, the strategy has passed the tick-level evidence bar. The next step is forward testing on a demo account, which is outside the scope of this article.

Conclusion

This article closes the loop from Python model training to MQL5 tick-accurate simulation. The pipeline exports five artifact types; an export script translates them into flat files. The EA loads those files at OnInit(), applies them bar by bar inside OnTick(), and reports path-level results via CSV and OnTester(). The Strategy Tester's parallelization runs all φ[N, k] paths concurrently, and Python assembles the returned series into a Sharpe distribution and PBO audit.

Key points from the implementation:

The sklearn pipeline (StandardScaler + classifier) is exported as a single ONNX graph. MQL5 must pass raw feature values to OnnxRun(); applying z-score normalization before inference double-scales the inputs and corrupts predictions silently.
MQL5's indicator API is handle-based: iRSI(), iATR(), and iMA() return handles, not values. Values come from CopyBuffer(). All handles must be created in OnInit() and released in OnDeinit().
The ONNX input tensor must be a 2D array of shape (1, n_features). Passing a 1D array fails silently on some runtime versions.
Path masks must be loaded from Common\Files\ using the FILE_COMMON flag. Without it, FileOpen() searches the local MQL5\Files\ sandbox of the terminal or testing agent and does not find the files written by the Python export script.
For N=6, k=2: C(6,2)=15 splits, φ=5 paths. The Strategy Tester optimization must be configured with InpPathIndex from=0, to=4, step=1. Setting to=14 iterates over non-existent paths.
The deployment decision rests on three numbers: median path Sharpe (positive?), Sharpe std (below 0.5?), and PBO (below 0.5?). Bar-level results are diagnostic; tick-level results are the evidence base.

Attached Files

	File	Language	Description
1.	CPCVBacktest.mq5	MQL5	Expert advisor implementing the full OnInit / OnTick / OnTester / OnDeinit cycle. Loads ONNX model, calibrator, feature specification, and path mask. Applies Part 10 and Part 11 sizing logic. Writes per-path equity CSV on each optimization pass.
2.	FeatureEngine.mqh	MQL5	Include file implementing LoadFeatureSpec(), CreateIndicatorHandles(), BuildFeatureVector(), and ReleaseIndicatorHandles(). Reads feature_spec.json and manages all indicator handles. Returns raw (unscaled) feature values.
3.	Calibrator.mqh	MQL5	Include file implementing LoadCalibrator() and ApplyCalibrator() for both isotonic (binary search over the exported breakpoints) and Platt (sigmoid) methods.
4.	export_pipeline_artifacts.py	Python	Loads pipeline output via load_from_path(), extracts calibrator breakpoints, generates feature specification JSON, precomputes CPCV path masks, and copies the ONNX file to Common\Files\.
5.	cpcv_postprocess.py	Python	Reads per-path equity CSVs, constructs the returns matrix, computes the path Sharpe distribution and PBO audit via compute_pbo(), and writes summary.json.

