Discussing the article: "MetaTrader 5 Machine Learning Blueprint (Part 9): Integrating Bayesian HPO into the Production Pipeline"


Check out the new article: MetaTrader 5 Machine Learning Blueprint (Part 9): Integrating Bayesian HPO into the Production Pipeline.

This article integrates the Optuna hyperparameter optimization (HPO) backend into a unified ModelDevelopmentPipeline. It adds joint tuning of model hyperparameters and sample-weight schemes, early pruning with Hyperband, and crash-resistant SQLite study storage. The pipeline auto-detects primary vs. secondary models, prepends a fitted column-dropping preprocessor for safe inference, supports sequential bootstrapping, generates an Optuna report, and includes bid/ask and LearnedStrategy links. Readers get faster, resumable runs and deployable, self-contained models.

The original ModelDevelopmentPipeline.train_model calls clf_hyper_fit, which wraps GridSearchCV or RandomizedSearchCV. Sample weights flow in as a pre-computed array, scoring uses those same weights, and the best pipeline is returned directly. This works, but it has three limitations that the Optuna integration resolves.
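A minimal sketch of that path, using a plain RandomizedSearchCV as a stand-in for clf_hyper_fit and toy data in place of PurgedKFold-split tick data (the names and data here are illustrative, not the article's actual implementation):

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import RandomizedSearchCV

# Toy stand-ins for the article's features, labels, and the sample
# weights that get_optimal_sample_weight would have fixed beforehand.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
y = (X[:, 0] + rng.normal(scale=0.5, size=200) > 0).astype(int)
sample_weight = rng.uniform(0.5, 1.5, size=200)  # chosen *before* HPO

# The sklearn path in one line of intent: weights are an input to the
# search, not something the search itself can tune.
search = RandomizedSearchCV(
    RandomForestClassifier(random_state=0),
    param_distributions={"n_estimators": [10, 25, 50],
                         "max_depth": [2, 4, 6]},
    n_iter=5, cv=3, random_state=0,
)
search.fit(X, y, sample_weight=sample_weight)  # sklearn slices weights per fold
best_model = search.best_estimator_
```

Note that every sampled configuration is evaluated on every fold, and the weight array never changes during the search; those are exactly the two behaviors the limitations below describe.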

Limitation 1: The weight scheme is fixed before HPO. In the sklearn path, get_optimal_sample_weight selects the best weighting scheme before HPO begins, so the weight scheme and the model hyperparameters are optimized sequentially rather than jointly. A configuration that only performs well with return-attribution weights might be eliminated because uniqueness happened to be selected first.

Limitation 2: No early stopping. GridSearchCV and RandomizedSearchCV evaluate every fold for every trial. With expensive PurgedKFold evaluations on several years of tick data, this wastes compute on configurations that are clearly inferior after fold 1.

Limitation 3: No persistent study. A crashed run discards all completed trials.

The Optuna path resolves all three: weight scheme, decay, and linearity are sampled jointly with model hyperparameters inside _WeightedEstimator; HyperbandPruner eliminates unpromising trials after the first fold; and SQLite storage means a re-run resumes from the last completed trial.

Author: Patrick Murimi Njoroge