Discussing the article: "MetaTrader 5 Machine Learning Blueprint (Part 18): Sequential Bootstrap, Corrected — Clone, Class Erasure, and the Comparison Toolkit"

MetaQuotes 2026.06.29 14:49

Check out the new article: MetaTrader 5 Machine Learning Blueprint (Part 18): Sequential Bootstrap, Corrected — Clone, Class Erasure, and the Comparison Toolkit.

The article diagnoses two defects that neutralize sequential bootstrap during cross‑validation: type erasure of SequentiallyBootstrappedBaggingClassifier and a fold‑level shape mismatch from cloning full samples info sets. It retains the classifier's identity, adds find seq bagging to re‑inject fold‑sliced t1 in CalibratorCV.fit, and resets state per split. A new bootstrap_comparison module reports OOF and OOB metrics and memory, letting you verify that sequential sampling is applied correctly and quantify its impact.

After the calibration step in ModelDevelopmentPipeline, a diagnostic check revealed that the out-of-fold Brier score of the sequential bootstrap arm was 0.2442 — the same, to four decimal places, as the standard bootstrap arm at 0.2441. The two numbers should not be that close. Part 5 demonstrated, both analytically and through Monte Carlo simulation, that sequential bootstrap draws samples of higher average uniqueness than standard bootstrapping, and that this uniqueness advantage propagates into better out-of-bag discrimination. Identical out-of-fold Brier scores from two samplers that differ in their construction principle are not a coincidence — they are a symptom.

Tracing the symptom reveals two defects operating in sequence. The first is a type erasure: _apply_sequential_bagging transferred the fitted estimators from a SequentiallyBootstrappedBaggingClassifier into a plain BaggingClassifier shell, discarding the class identity that CalibratorCV would need to treat the sequential arm differently from the standard one. With the genuine class erased, CalibratorCV saw two standard BaggingClassifier pipelines and produced identical fold refits for both. The second defect is a shape mismatch that a naive removal of the erasure would expose: sklearn's clone() preserves constructor parameters — including the full samples_info_sets of length N — but each fold refit receives only n_train < N rows, causing the classifier's indicator-matrix shape check to raise a ValueError before any training begins.

Both defects are corrected in this article. We retain the genuine SequentiallyBootstrappedBaggingClassifier in the production pipeline. We also add _find_seq_bagging() to CalibratorCV.fit() to detect the sequential classifier and re-inject fold-sliced samples_info_sets before each refit. Finally, we ship a standalone bootstrap_comparison module to show the before-and-after difference at both the sampling and predictive levels.

Author: Patrick Murimi Njoroge

New comment