BanditSelector EA

Overview

The BanditSelector EA is an advanced, adaptive trading system for MetaTrader 4 (MT4) that dynamically selects and manages multiple trading strategies across several symbols and timeframes.

It combines statistical evaluation, bandit optimization borrowed from machine learning (UCB1 / ε-greedy), and strict risk and execution controls in a self-learning, low-maintenance portfolio EA.

Core Concept

Instead of relying on a single strategy, the EA continuously evaluates a universe of trading systems (called “arms”) — for example:

Donchian Breakout

Bollinger Mean Reversion

EMA Pullback

For each system and symbol/timeframe combination, it runs a quick historical performance scan (“QuickEvaluate”) to estimate profitability and stability.
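The scan universe described above is the cross product of arms, symbols, and timeframes. The following is a minimal Python sketch of that enumeration, not the EA's MQL4 code; the arm names and the helper functions `parse_list` and `build_universe` are hypothetical and chosen only for illustration.

```python
from itertools import product

# Hypothetical short names for the three core systems.
ARMS = ["Donchian", "Bollinger", "EmaPullback"]


def parse_list(csv_text):
    """Split a comma-separated input string, dropping blanks and stray spaces."""
    return [item.strip() for item in csv_text.split(",") if item.strip()]


def build_universe(symbols_csv, timeframes_csv, arms=ARMS):
    """Return every (arm, symbol, timeframe) combination to be scanned."""
    symbols = parse_list(symbols_csv)
    timeframes = parse_list(timeframes_csv)
    return list(product(arms, symbols, timeframes))


universe = build_universe("EURUSD,GBPUSD,USDJPY,XAUUSD", "M5,M15,H1,H4")
print(len(universe))  # 3 arms x 4 symbols x 4 timeframes = 48
```

The size of this product is what drives scan cost, so adding a symbol or timeframe multiplies the number of evaluations rather than adding to it.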

The top-performing combinations are then selected and assigned a bandit-learning algorithm, which continuously adjusts position weightings according to real trading performance.
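How the EA scores a combination is not specified here, so the sketch below assumes a simple Sharpe-like score (mean trade result divided by its dispersion) to illustrate the filter-then-rank step. It is written in Python for readability rather than MQL4, and `score_combo`, `select_top`, and the sample trade results are all hypothetical.

```python
from statistics import mean, pstdev


def score_combo(trade_results, min_trades):
    """Sharpe-like score: mean trade result over its dispersion.

    Returns None when there are too few trades to trust the estimate.
    """
    if len(trade_results) < min_trades:
        return None
    spread = pstdev(trade_results)
    if spread == 0:
        return None  # degenerate sample; skip rather than divide by zero
    return mean(trade_results) / spread


def select_top(results_by_combo, min_trades, top_n):
    """Keep the top_n combos with a positive score, best first."""
    scored = []
    for combo, trades in results_by_combo.items():
        score = score_combo(trades, min_trades)
        if score is not None and score > 0:
            scored.append((score, combo))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [combo for _, combo in scored[:top_n]]


# Made-up per-trade results (in multiples of risk), for illustration only.
results = {
    "Donchian|EURUSD|H1": [1.0, -1.0, 2.0, 1.0, -1.0, 2.0],
    "Bollinger|GBPUSD|M15": [0.5, 0.5, -1.0, 0.5, 0.5, 0.5],
    "EmaPullback|USDJPY|H4": [3.0, -1.0],  # too few trades to validate
    "Donchian|XAUUSD|M5": [-1.0, -1.0, 1.0, -1.0, 0.5, -1.0],
}
active = select_top(results, min_trades=5, top_n=2)
print(active)
```

Requiring a minimum trade count before scoring keeps a combo with two lucky trades from outranking one with a longer, steadier record.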

This allows the EA to adapt dynamically — favoring profitable systems while reducing exposure to underperforming ones.
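The two bandit rules named in the overview can be sketched as follows. This is a textbook Python illustration of UCB1 and ε-greedy arm selection, assuming rewards scaled to the range 0 to 1; it is not the EA's MQL4 implementation, and how the EA maps the bandit's choice onto position weights is not shown. The class `ArmStats` and both function names are hypothetical.

```python
import math
import random


class ArmStats:
    """Running statistics for one arm (strategy/symbol/timeframe combo)."""

    def __init__(self):
        self.pulls = 0
        self.mean_reward = 0.0

    def update(self, reward):
        """Fold one observed reward into the running mean."""
        self.pulls += 1
        self.mean_reward += (reward - self.mean_reward) / self.pulls


def select_ucb1(stats):
    """UCB1: try each arm once, then pick the highest upper confidence bound."""
    for name, arm in stats.items():
        if arm.pulls == 0:
            return name
    total = sum(arm.pulls for arm in stats.values())

    def upper_bound(arm):
        return arm.mean_reward + math.sqrt(2.0 * math.log(total) / arm.pulls)

    return max(stats, key=lambda name: upper_bound(stats[name]))


def select_epsilon_greedy(stats, epsilon, rng=random):
    """With probability epsilon explore at random, otherwise exploit the best mean."""
    if rng.random() < epsilon:
        return rng.choice(list(stats))
    return max(stats, key=lambda name: stats[name].mean_reward)


stats = {"Donchian": ArmStats(), "Bollinger": ArmStats(), "EmaPullback": ArmStats()}
for reward in [1.0, 1.0, 0.0, 1.0]:
    stats["Donchian"].update(reward)
for reward in [0.0, 0.0, 1.0]:
    stats["Bollinger"].update(reward)
print(select_ucb1(stats))  # EmaPullback: it has not been tried yet
```

UCB1 needs no tuning parameter because its exploration bonus shrinks as an arm accumulates trades, while ε-greedy keeps exploring at a fixed rate set by the exploration factor.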

Main Features

| Category | Description |
| --- | --- |
| 🔍 Auto-Selection | Scans all defined symbols/timeframes and selects the best-performing strategy combinations (“combos”). |
| 🧠 Reinforcement Learning | Uses Multi-Armed Bandit Algorithms (UCB1 / ε-greedy) to allocate more risk to consistently winning strategies. |
| ⚙️ Strategy Portfolio | Three core systems: Donchian breakout, Bollinger mean reversion, EMA pullback. Extensible design for adding custom arms. |
| 📈 Performance-Adaptive | Periodically re-evaluates historical performance and adjusts the active portfolio. |
| 🛡️ Risk Control | Risk-based position sizing, configurable max trades, and order rate limits. |
| 🔒 Anti-Spam Protection | Multiple safeguards: 1-trade-per-bar rule, cooldown, per-symbol backoff, and global rate limits. Prevents broker flooding. |
| ⏱️ Automatic Re-Evaluation | Rebuilds the active portfolio every InpReevalMinutes to adapt to new market conditions. |
| 💾 Self-Learning Feedback | Tracks real trade outcomes and continuously refines the reward model in real-time. |
| 💹 Multi-Symbol Support | Handles multiple pairs (e.g. EURUSD, GBPUSD, USDJPY, XAUUSD) simultaneously. |

Technical Highlights

Written 100% in MQL4 (no DLLs, no external dependencies).

Modular structure (Utils.mqh, Arms.mqh, Algorithms.mqh, Rewards.mqh) for easy customization.

Compatible with 4- and 5-digit brokers.

Optimized for stability and low CPU load.

Every trade carries an order comment in a fixed format:

Bandit|<ArmName>|<Symbol>|<Timeframe>

This comment enables automatic reward tracking.
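A comment in this format can be built when an order is sent and split again when the trade closes, so the result is credited to the right arm. The Python sketch below illustrates that round trip only; the function names are hypothetical, and the 31-character cap is an assumption about MT4 order comments rather than something stated in this document.

```python
PREFIX = "Bandit"
MAX_COMMENT_LEN = 31  # assumed MT4 order-comment limit; verify for your terminal


def build_comment(arm, symbol, timeframe):
    """Join the fields into 'Bandit|<ArmName>|<Symbol>|<Timeframe>'."""
    comment = "|".join([PREFIX, arm, symbol, timeframe])
    if len(comment) > MAX_COMMENT_LEN:
        raise ValueError("comment too long: " + comment)
    return comment


def parse_comment(comment):
    """Return (arm, symbol, timeframe), or None for trades not opened by the EA."""
    parts = comment.split("|")
    if len(parts) != 4 or parts[0] != PREFIX:
        return None
    return (parts[1], parts[2], parts[3])


tag = build_comment("Donchian", "EURUSD", "H1")
print(tag)                  # Bandit|Donchian|EURUSD|H1
print(parse_comment(tag))   # ('Donchian', 'EURUSD', 'H1')
```

Returning None for unrecognized comments lets manual trades or trades from other EAs on the same account pass through without affecting the reward statistics.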

Anti-Spam & Safety Mechanisms

| Mechanism | Function |
| --- | --- |
| HasOpenPosition() | Prevents multiple open trades on the same symbol and Magic Number. |
| OneTradePerBar | Ensures only one trade per bar per strategy combination. |
| MinSecondsBetweenTrades | Cooldown between new entries for the same system/symbol. |
| MaxNewOrdersPerHour | Caps the total new orders per hour to avoid broker throttling. |
| Backoff System | Exponential delay after OrderSend failures (e.g., OffQuotes, Busy, Requotes). |
| Lot-by-Risk Calculation | Auto-adjusts lot size according to stop size and balance percentage. |

Together, these protections limit how often the EA sends orders, so it avoids flooding the broker even in fast-ticking markets.
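One way these checks could fit together is a single gate consulted before every order. The Python sketch below is an illustration under assumptions, not the EA's MQL4 code: the class `TradeGate`, its method names, and the doubling backoff schedule are hypothetical, and all times are plain seconds.

```python
from collections import deque


class TradeGate:
    """Combines the one-trade-per-bar, cooldown, hourly-cap, and backoff checks."""

    def __init__(self, min_seconds_between, max_orders_per_hour,
                 backoff_base, backoff_max):
        self.min_seconds_between = min_seconds_between
        self.max_orders_per_hour = max_orders_per_hour
        self.backoff_base = backoff_base
        self.backoff_max = backoff_max
        self.last_entry = {}           # arm key -> time of last entry
        self.last_bar = {}             # arm key -> bar open time of last entry
        self.recent_orders = deque()   # send times within the past hour
        self.failures = {}             # symbol -> consecutive failure count
        self.blocked_until = {}        # symbol -> time the backoff ends

    def can_trade(self, arm_key, symbol, bar_time, now):
        # Drop orders older than one hour, then apply the global cap.
        while self.recent_orders and now - self.recent_orders[0] >= 3600:
            self.recent_orders.popleft()
        if len(self.recent_orders) >= self.max_orders_per_hour:
            return False
        if now < self.blocked_until.get(symbol, 0):
            return False               # symbol is still backing off
        if self.last_bar.get(arm_key) == bar_time:
            return False               # already traded on this bar
        last = self.last_entry.get(arm_key)
        if last is not None and now - last < self.min_seconds_between:
            return False               # cooldown has not elapsed
        return True

    def record_success(self, arm_key, symbol, bar_time, now):
        self.last_entry[arm_key] = now
        self.last_bar[arm_key] = bar_time
        self.recent_orders.append(now)
        self.failures[symbol] = 0

    def record_failure(self, symbol, now):
        """Double the delay after each consecutive failure, up to backoff_max."""
        count = self.failures.get(symbol, 0) + 1
        self.failures[symbol] = count
        delay = min(self.backoff_base * 2 ** (count - 1), self.backoff_max)
        self.blocked_until[symbol] = now + delay
        return delay


gate = TradeGate(min_seconds_between=600, max_orders_per_hour=2,
                 backoff_base=5, backoff_max=60)
print([gate.record_failure("EURUSD", now=0) for _ in range(6)])
# [5, 10, 20, 40, 60, 60]
```

Checking the cheap global conditions first means a throttled EA does almost no work per tick, and resetting the failure count on success lets a symbol recover immediately once the broker accepts orders again.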

Inputs (Key Parameters)

| Parameter | Description |
| --- | --- |
| InpUniverseSymbols | Comma-separated list of symbols to scan (e.g. EURUSD,GBPUSD,USDJPY,XAUUSD). |
| InpTimeframes | Timeframes to evaluate (e.g. M5,M15,H1,H4). |
| InpLookbackBars | Number of bars for performance evaluation. |
| InpMinTradesEval | Minimum trades required to validate a combo. |
| InpTopCombos | Number of top-performing combos to keep active. |
| InpReevalMinutes | Re-evaluation period (minutes). |
| InpRiskPerTradePct | Risk per trade as percentage of account balance. |
| InpBanditAlgo | Bandit algorithm (UCB1 or ε-greedy). |
| InpEpsilon | Exploration factor for ε-greedy algorithm. |
| InpOneTradePerBar | Enables “one trade per bar” rule. |
| InpMinSecondsBetweenTrades | Cooldown time between trades for the same ArmKey. |
| InpMaxNewOrdersPerHour | Global cap on new orders/hour. |
| InpBackoffBaseSeconds / MaxSeconds | OrderSend error backoff timing. |
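To show how InpRiskPerTradePct and the stop distance could translate into a lot size, here is a minimal Python sketch of risk-based sizing. It is an assumption about the calculation, not the EA's MQL4 code: the function `lots_for_risk` and its parameters are hypothetical, and in MT4 the per-point value and the lot limits would typically come from MarketInfo().

```python
import math


def lots_for_risk(balance, risk_pct, stop_points, value_per_point_per_lot,
                  lot_step=0.01, min_lot=0.01, max_lot=100.0):
    """Size a position so that hitting the stop loses about risk_pct of balance.

    Returns 0.0 when even the minimum lot would exceed the risk budget.
    """
    if stop_points <= 0 or value_per_point_per_lot <= 0:
        return 0.0
    risk_money = balance * risk_pct / 100.0
    raw_lots = risk_money / (stop_points * value_per_point_per_lot)
    # Round down to the broker's lot step; the epsilon guards against
    # floating-point results such as 29.999999 for an exact 30 steps.
    steps = math.floor(raw_lots / lot_step + 1e-9)
    lots = round(steps * lot_step, 8)
    if lots < min_lot:
        return 0.0
    return min(lots, max_lot)


# 1% of a 10,000 balance is 100; a 200-point stop at 1.0 per point per lot
# costs 200 per lot, so the position is 0.5 lots.
print(lots_for_risk(10000, 1.0, 200, 1.0))  # 0.5
```

Rounding down rather than to the nearest step keeps the realized risk at or below the configured percentage, and skipping the trade when the minimum lot is too large avoids silently over-risking small accounts.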

Recommended Usage

Start on a demo account to observe automatic symbol/strategy selection.

Once stable, switch to a live account with micro lot sizes.

Monitor the Experts log for re-evaluation cycles and backoff behavior.

Adjust risk and re-evaluation parameters gradually.

Planned Enhancements (optional modules)

Spread filter and trading session restrictions.

Equity stop / daily drawdown guard.

Time-based or trailing exits.

News trading filter (pause before/after high-impact events).

Summary

BanditSelector EA is not a fixed trading system — it’s a portfolio manager and self-learning strategy selector.

It aims to limit overfitting by continuously testing, ranking, and reallocating risk among diverse strategies.

The result is an EA that evolves over time and adapts to changing market regimes, while maintaining strict control over risk and execution safety.