Building an Object-Oriented ONNX Inference Engine in MQL5
You trained a model in Python, exported it to ONNX, and now want to run inference directly inside MQL5 without external Python processes, websockets, or unstable DLLs.
In practice, such integrations typically fail in four places: the terminal errors out because input/output tensor shapes do not match the compiled graph; live predictions become invalid because live-data preprocessing differs from the training preprocessing; ONNX session handles leak memory when the session lifecycle is not tightly controlled; and running inference on every incoming tick overloads the CPU and kills backtesting performance.
This article targets a reproducible, production-oriented framework that treats these constraints as first-class requirements: explicitly define tensor shapes, implement deterministic in-terminal preprocessing, manage ONNX session lifecycle safely, and restrict inference frequency so the engine runs reliably both on live charts and in the Strategy Tester.

The Architecture of ONNX and Computational Graphs
To understand the implementation, it helps to look at how machine learning models are saved and exported. The Open Neural Network Exchange (ONNX) is an open format for representing machine learning models uniformly across programming environments. When a model built in Python with a library such as TensorFlow or scikit-learn is converted to ONNX, it is serialized into a static computational graph of nodes and edges. Each node represents an operation, such as matrix multiplication or a non-linear activation function, while the edges carry the multidimensional arrays, known as tensors, that flow between these operations.
When the MetaTrader 5 terminal loads an ONNX file, its native engine reconstructs this graph in memory. The terminal needs to know the dimensionality of the data it will receive and of the data it must return. If a network was trained in Python on five technical indicators per sample, its input tensor has shape [1, 5]: one row (a single sample) and five columns (the features). Passing an array whose size or layout does not match this shape, for example a longer series of raw closing prices, causes the inference call to fail. Managing input and output shapes programmatically is therefore the foundation of a stable integration.
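The shapes a model expects can be read from the model itself rather than remembered from the training script. The script below is a minimal sketch of that check. The file name and its location in the shared Common\Files folder are assumptions, and the `OnnxTypeInfo` field names (`type`, `tensor.data_type`, `tensor.dimensions`) follow the MQL5 documentation as I understand it; they may differ between terminal builds, so verify them against your build before relying on this.

```mql5
// Script sketch: list the inputs and outputs of an ONNX model.
// Assumption: "predictive_model.onnx" is in the shared Common\Files folder.
void OnStart()
  {
   long handle = OnnxCreate("predictive_model.onnx", ONNX_COMMON_FOLDER);
   if(handle == INVALID_HANDLE)
     {
      PrintFormat("OnnxCreate failed, error %d", GetLastError());
      return;
     }

   // Describe every input tensor the graph declares
   long input_count = OnnxGetInputCount(handle);
   for(long i = 0; i < input_count; i++)
     {
      OnnxTypeInfo info;
      if(!OnnxGetInputTypeInfo(handle, i, info))
         continue;
      Print("Input ", i, ": ", OnnxGetInputName(handle, i));
      if(info.type == ONNX_TYPE_TENSOR)
        {
         Print("  element type: ", EnumToString(info.tensor.data_type));
         ArrayPrint(info.tensor.dimensions);
        }
     }

   // Describe every output tensor the graph declares
   long output_count = OnnxGetOutputCount(handle);
   for(long i = 0; i < output_count; i++)
     {
      OnnxTypeInfo info;
      if(!OnnxGetOutputTypeInfo(handle, i, info))
         continue;
      Print("Output ", i, ": ", OnnxGetOutputName(handle, i));
      if(info.type == ONNX_TYPE_TENSOR)
        {
         Print("  element type: ", EnumToString(info.tensor.data_type));
         ArrayPrint(info.tensor.dimensions);
        }
     }

   // Always release the session, even in a one-off script
   OnnxRelease(handle);
  }
```

Running this once after each export is a cheap way to confirm that the element type and dimensions match what the Expert Advisor will bind.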
Hardware Acceleration and AVX Instruction Sets
A common misconception among retail developers is that running a model through an external Python process over websockets is computationally equivalent to running it natively. It is not. When an Expert Advisor sends live tick data over local network ports to a Python script, every request pays serialization, transport, and deserialization costs, and that latency matters most during high-impact news events. Loading the ONNX graph directly into MetaTrader 5 removes the round trip, and the runtime can use the vector instructions of modern processors, such as Advanced Vector Extensions (AVX), where the hardware supports them. This is the case for binding tensors directly in MQL5 rather than relying on external integrations that add latency and reduce EA performance.
Object-Oriented Encapsulation of the Inference Session
Managing machine learning models requires meticulous resource allocation. Opening an ONNX session consumes significant local memory, and failing to release these resources will result in resource exhaustion. To ensure production stability, we encapsulate the entire inference pipeline within a dedicated MQL5 class. This approach avoids global variables and strictly regulates the lifecycle of the neural network graph. We declare private variables to hold the unique ONNX handle assigned by the terminal and the specific array structures required for passing data back and forth.
The initialization method of our class locates the ONNX file and creates the session handle with the native OnnxCreate function. In this example the file is read from the terminal's shared Common\Files folder. Once the handle exists, the class and the model must agree on tensor data types. Models exported from Python frameworks commonly use 32-bit float tensors, while the MQL5 code in this article works with double arrays. OnnxRun converts between the two automatically unless the ONNX_NO_CONVERSION flag is set, in which case the MQL5 array type must match the model's tensor type exactly.
```mql5
//+------------------------------------------------------------------+
//|                                                  ONNX_Engine.mqh |
//|                                  Copyright 2026, MetaQuotes Ltd. |
//+------------------------------------------------------------------+
#property copyright "Open Source"
#property version   "1.00"

//+------------------------------------------------------------------+
//| Class: CONNXPredictor                                            |
//| Purpose: Manages ONNX session, tensor binding, and inference     |
//+------------------------------------------------------------------+
class CONNXPredictor
  {
private:
   long              m_onnx_handle;     // Unique session identifier
   string            m_model_path;      // Directory path to the .onnx file
   bool              m_is_initialized;  // State validation flag

   bool              SetTensorShapes(void);

public:
                     CONNXPredictor(string model_filename);
                    ~CONNXPredictor(void);

   bool              Initialize(void);
   double            PredictDirection(const double &features[]);
   void              NormalizeData(double &features[], double min_val, double max_val);
   bool              IsReady(void) const { return m_is_initialized; }
  };

//+------------------------------------------------------------------+
//| Constructor: Assigns the file path and sets default states       |
//+------------------------------------------------------------------+
CONNXPredictor::CONNXPredictor(string model_filename)
  {
   m_model_path     = model_filename;
   m_onnx_handle    = INVALID_HANDLE;
   m_is_initialized = false;
  }

//+------------------------------------------------------------------+
//| Destructor: Safely releases the ONNX graph from system memory    |
//+------------------------------------------------------------------+
CONNXPredictor::~CONNXPredictor(void)
  {
   if(m_onnx_handle != INVALID_HANDLE)
     {
      OnnxRelease(m_onnx_handle);
     }
  }
```
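The class above loads the model from disk. An alternative that keeps the same create/release lifecycle is to compile the model into the program as a resource and open the session with OnnxCreateFromBuffer, so the compiled EA carries its own model. The sketch below is an illustration and not part of the article's class; the file name and its location next to the source file are assumptions.

```mql5
// Assumption: predictive_model.onnx sits next to this source file at compile time.
#resource "predictive_model.onnx" as uchar ExtModel[]

long g_handle = INVALID_HANDLE;   // session handle owned by this EA

int OnInit()
  {
   // Create the session from the embedded byte buffer instead of a file path
   g_handle = OnnxCreateFromBuffer(ExtModel, ONNX_DEFAULT);
   if(g_handle == INVALID_HANDLE)
     {
      PrintFormat("OnnxCreateFromBuffer failed, error %d", GetLastError());
      return INIT_FAILED;
     }
   return INIT_SUCCEEDED;
  }

void OnDeinit(const int reason)
  {
   // Release the session exactly once when the EA is removed
   if(g_handle != INVALID_HANDLE)
     {
      OnnxRelease(g_handle);
      g_handle = INVALID_HANDLE;
     }
  }

void OnTick()
  {
   // Inference would go here; omitted in this lifecycle sketch
  }
```

The trade-off is that replacing the model means recompiling the EA, whereas the file-based approach only needs the .onnx file swapped.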
Tensor Shape Definition and Execution Pipeline
After creating the session handle, the engine fixes the tensor dimensions. The OnnxSetInputShape and OnnxSetOutputShape functions tell the terminal the shapes of the model's input and output tensors. For demonstration, we assume a classification model that takes five market features and returns a single probability. The input shape is therefore {1, 5}, one row and five columns, and the output shape is {1, 1}. If the shapes set in the MQL5 code differ from those of the exported model, the shape call or the later OnnxRun call returns false and no prediction is produced.
The actual inference is executed through the PredictDirection method. This method accepts an array of double-precision features collected from the live market. The native OnnxRun function acts as the bridge: it passes the MQL5 array to the graph's input and writes the result of the graph's output into a second array. Before calling it, the method checks that the session is initialized and that the input array contains exactly the five elements the bound shape expects, so a mis-sized array never reaches the runtime. On any failure the method returns 0.0, which the calling code must treat as "no signal" rather than as a genuine prediction.
```mql5
//+------------------------------------------------------------------+
//| Initializes the ONNX session and binds the required tensor shapes|
//+------------------------------------------------------------------+
bool CONNXPredictor::Initialize(void)
  {
   // ONNX_COMMON_FOLDER loads the file from the shared Common\Files folder
   m_onnx_handle = OnnxCreate(m_model_path, ONNX_COMMON_FOLDER);
   if(m_onnx_handle == INVALID_HANDLE)
     {
      PrintFormat("Error: Failed to load ONNX model. Code: %d", GetLastError());
      return false;
     }

   if(!SetTensorShapes())
     {
      PrintFormat("Error: Failed to set tensor shapes. Code: %d", GetLastError());
      return false;   // the destructor releases the handle
     }

   m_is_initialized = true;
   return true;
  }

//+------------------------------------------------------------------+
//| Defines the strict mathematical boundaries for the data exchange |
//+------------------------------------------------------------------+
bool CONNXPredictor::SetTensorShapes(void)
  {
   long input_shape[] = {1, 5};    // one sample, five features
   if(!OnnxSetInputShape(m_onnx_handle, 0, input_shape))
      return false;

   long output_shape[] = {1, 1};   // one sample, one probability
   if(!OnnxSetOutputShape(m_onnx_handle, 0, output_shape))
      return false;

   return true;
  }

//+------------------------------------------------------------------+
//| Injects live data into the graph and extracts the prediction     |
//+------------------------------------------------------------------+
double CONNXPredictor::PredictDirection(const double &features[])
  {
   if(!m_is_initialized || ArraySize(features) != 5)
      return 0.0;

   double vector_input[];
   double vector_output[1];
   ArrayCopy(vector_input, features);

   // ONNX_DEFAULT lets the terminal convert double data to the model's
   // tensor type (often float32); ONNX_NO_CONVERSION would require the
   // model itself to use double tensors.
   if(!OnnxRun(m_onnx_handle, ONNX_DEFAULT, vector_input, vector_output))
     {
      PrintFormat("Error: ONNX inference execution failed. Code: %d", GetLastError());
      return 0.0;
     }

   return vector_output[0];
  }

//+------------------------------------------------------------------+
//| Applies Min-Max Normalization to the extracted feature array     |
//+------------------------------------------------------------------+
void CONNXPredictor::NormalizeData(double &features[], double min_val, double max_val)
  {
   int    total = ArraySize(features);
   double range = max_val - min_val;
   if(range <= 0.0)
      return;   // Prevent division by zero

   for(int i = 0; i < total; i++)
     {
      features[i] = (features[i] - min_val) / range;

      // Clamp values to ensure strict boundaries
      if(features[i] > 1.0) features[i] = 1.0;
      if(features[i] < 0.0) features[i] = 0.0;
     }
  }
```
Feature Extraction and Real-Time Normalization Mathematics
A machine learning model depends entirely on the quality and formatting of its input data, and neural networks are especially sensitive to the absolute scale of their inputs. If a model was trained in Python on data normalized between zero and one, feeding it raw currency prices such as 1.0950 does not change its weights, but it presents values on a scale the weights were never fitted to, so the prediction is meaningless. The preprocessing used during training must therefore be replicated exactly inside MQL5 before the array is passed to the ONNX engine.
In our architecture, the Expert Advisor extracts the raw data points from the live chart, and the NormalizeData method of our class min-max scales them: it subtracts the minimum from each value, divides by the range, and clamps the result to the interval from zero to one. The minimum and maximum passed to this method must be the ones held by the Python scaler object used during training, either hardcoded or imported, and never values recomputed from live data. Two details need care. NormalizeData applies one pair of bounds to the whole array, which is only correct when all features share a scale, as the five closing prices in the example EA do; features on different scales need per-feature bounds. Clamping is only consistent with training if the Python scaler clipped its output as well. With these conditions met, the array passed to the ONNX session matches the scaling the network was trained on.
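When the features do not share a scale, each one needs its own bounds: x'[i] = (x[i] - min[i]) / (max[i] - min[i]). The script below is a minimal sketch of that per-feature variant. `NormalizePerFeature` is a hypothetical helper, not part of the article's CONNXPredictor class, and the feature values and bounds are made-up numbers standing in for what the training-time scaler would supply.

```mql5
// Hypothetical helper: min-max scale each feature with its own bounds.
// Returns false and leaves the array untouched if the bounds are unusable.
bool NormalizePerFeature(double &features[], const double &mins[], const double &maxs[])
  {
   int total = ArraySize(features);
   if(ArraySize(mins) != total || ArraySize(maxs) != total)
      return false;

   // Validate every range first so the array is never half-normalized
   for(int i = 0; i < total; i++)
      if(maxs[i] - mins[i] <= 0.0)
         return false;

   for(int i = 0; i < total; i++)
      features[i] = (features[i] - mins[i]) / (maxs[i] - mins[i]);

   return true;
  }

void OnStart()
  {
   // Illustrative values only: a price, an oscillator reading, a volatility measure
   double       features[] = {1.0950, 55.0, 0.0012};
   const double mins[]     = {1.0000,  0.0, 0.0000};
   const double maxs[]     = {1.2000, 100.0, 0.0050};

   if(NormalizePerFeature(features, mins, maxs))
      ArrayPrint(features, 4);   // values are roughly 0.475, 0.55, 0.24
   else
      Print("Normalization bounds are invalid");
  }
```

This version does not clamp, which matches a training-time scaler that does not clip; add clamping only if the training pipeline clipped as well.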
Out-of-Sample Decay and Statistical Degradation
Even a flawlessly integrated mathematical engine cannot protect an algorithmic portfolio from statistical degradation. Financial markets undergo continuous regime changes, shifting aggressively between low-volatility mean reversion phases and high-volatility momentum phases. A neural network trained on a specific historical dataset will inevitably suffer from out-of-sample decay as new, unseen market structures emerge that invalidate the original training assumptions. While our object-oriented class ensures the seamless execution of the ONNX model, developers must actively monitor the predictive accuracy over time. Deploying an Expert Advisor based on machine learning requires establishing rigid retraining schedules, automatically updating the model weights in Python and replacing the ONNX file to prevent trading on stale model parameters.
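One simple way to monitor predictive accuracy inside the terminal is a rolling hit rate over the most recent predictions. The sketch below is an illustration under stated assumptions: `CHitRateMonitor` is a hypothetical class, how a prediction is judged "correct" is left to the caller, and the window size and the 0.55 alert level are arbitrary example values.

```mql5
// Hypothetical helper: rolling hit rate over the last N judged predictions.
class CHitRateMonitor
  {
private:
   bool              m_hits[];     // ring buffer: true = prediction was correct
   int               m_capacity;   // window size
   int               m_count;      // number of stored outcomes
   int               m_next;       // index of the next slot to overwrite

public:
                     CHitRateMonitor(const int capacity)
     {
      m_capacity = (capacity > 0) ? capacity : 1;
      ArrayResize(m_hits, m_capacity);
      m_count = 0;
      m_next  = 0;
     }

   // Store one outcome, overwriting the oldest once the window is full
   void              Record(const bool was_correct)
     {
      m_hits[m_next] = was_correct;
      m_next = (m_next + 1) % m_capacity;
      if(m_count < m_capacity)
         m_count++;
     }

   // Fraction of correct predictions in the current window
   double            HitRate(void) const
     {
      if(m_count == 0)
         return 0.0;
      int hits = 0;
      for(int i = 0; i < m_count; i++)
         if(m_hits[i])
            hits++;
      return (double)hits / m_count;
     }

   bool              IsFull(void) const { return m_count == m_capacity; }
  };

void OnStart()
  {
   CHitRateMonitor monitor(4);   // tiny window, for demonstration only
   monitor.Record(true);
   monitor.Record(true);
   monitor.Record(true);
   monitor.Record(false);
   PrintFormat("Hit rate: %.2f", monitor.HitRate());   // 3 of 4 correct

   monitor.Record(false);        // overwrites the oldest (correct) outcome
   PrintFormat("Hit rate: %.2f", monitor.HitRate());   // 2 of 4 correct

   // Example alert level (assumption): flag the model for retraining
   if(monitor.IsFull() && monitor.HitRate() < 0.55)
      Print("Rolling hit rate below alert level; consider retraining");
  }
```

In an EA, `Record` would be called once per bar after the outcome of the previous bar's prediction is known, and the alert would trigger the retraining schedule described above.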
Matrix Multiplication Dynamics and CPU Protection
To deploy this engine, we construct an Expert Advisor that instantiates the CONNXPredictor class. The fundamental rule of high-performance quantitative development remains constant: protect the central processing unit from redundant calculations. Machine learning inference is a computationally expensive operation. Even a relatively small neural network requires the processor to perform heavy matrix operations to resolve the computational graph. Running the OnnxRun function on every single incoming market tick is a severe architectural error that will degrade live performance and dramatically slow down backtesting processes.
We avoid redundant inference with a simple filter in the OnTick event handler. The EA stores the open time of the current bar and compares it on every tick; feature extraction and inference run only when that time changes, which is once per new bar. At that moment the previous bar has just closed, so the features are built from completed bars, normalized, and passed to the model. If the network outputs a probability at or above a defined threshold, the EA uses it for its execution decision. All other ticks return immediately, which keeps both live operation and backtesting fast.
```mql5
//+------------------------------------------------------------------+
//|                                            EA_ONNX_Predictor.mq5 |
//|                                  Copyright 2026, MetaQuotes Ltd. |
//+------------------------------------------------------------------+
#property copyright "Open Source"
#property version   "1.00"

#include <Trade\Trade.mqh>
#include "ONNX_Engine.mqh"

//--- Input Parameters
input string InpModelName = "predictive_model.onnx"; // Model filename
input double InpThreshold = 0.75;                    // Confidence threshold

//--- Global Objects
CONNXPredictor *g_ai_engine;
CTrade          g_trade;

//+------------------------------------------------------------------+
//| Expert initialization function                                   |
//+------------------------------------------------------------------+
int OnInit()
  {
   g_ai_engine = new CONNXPredictor(InpModelName);
   if(g_ai_engine == NULL || !g_ai_engine.Initialize())
     {
      return INIT_FAILED;
     }
   return(INIT_SUCCEEDED);
  }

//+------------------------------------------------------------------+
//| Expert deinitialization function                                 |
//+------------------------------------------------------------------+
void OnDeinit(const int reason)
  {
   // Deleting the object runs the destructor, which releases the session
   if(CheckPointer(g_ai_engine) == POINTER_DYNAMIC)
     {
      delete g_ai_engine;
     }
  }

//+------------------------------------------------------------------+
//| Expert tick function with strict CPU optimization                |
//+------------------------------------------------------------------+
void OnTick()
  {
   static datetime last_bar_time = 0;
   datetime current_bar_time = iTime(_Symbol, PERIOD_CURRENT, 0);
   if(current_bar_time == 0)
      return;   // bar data not available yet

   if(current_bar_time != last_bar_time)
     {
      last_bar_time = current_bar_time;

      if(CheckPointer(g_ai_engine) != POINTER_INVALID && g_ai_engine.IsReady())
        {
         double live_features[5];

         // Feature Extraction Logic (Example variables):
         // closes of the five most recently completed bars
         for(int i = 0; i < 5; i++)
            live_features[i] = iClose(_Symbol, PERIOD_CURRENT, i + 1);

         // Normalize the data between theoretical min and max bounds
         g_ai_engine.NormalizeData(live_features, 1.0000, 1.2000);

         // Request prediction from the computational graph
         double prediction = g_ai_engine.PredictDirection(live_features);
         PrintFormat("AI Model Confidence: %.2f", prediction);

         // Execute trade routing based on model probability
         if(prediction >= InpThreshold && !PositionSelect(_Symbol))
           {
            g_trade.Buy(0.10, _Symbol, 0, 0, 0, "AI Output");
           }
        }
     }
  }
//+------------------------------------------------------------------+
```
Conclusion
We produced a compact, runnable integration that addresses the four practical failure modes above. The core artifact is the CONNXPredictor class which safely creates and releases the ONNX session handle, explicitly binds input/output tensor shapes, validates input vector dimensions, performs deterministic min-max normalization in MQL5, and executes OnnxRun with error checks. The deliverables let you drop an ONNX file into the project, supply a fixed-size feature vector, observe prediction outputs in the terminal log, and actuate trading decisions based on a configurable threshold.
Boundaries and verification steps are explicit: normalization parameters must match those used during training; the EA runs inference once per new bar to enforce CPU safety; and the class destructor releases the ONNX graph to prevent memory leaks. For production deployment, add monitoring for prediction drift and automated retraining, and consider an asynchronous execution layer for order routing. These extensions integrate without altering the core tensor-binding and lifecycle logic.
File Structure Table
| File Name | Description |
|---|---|
| ONNX_Engine.mqh | Complete source code for the object-oriented ONNX session manager. |
| EA_ONNX_Predictor.mq5 | Complete source code for the execution Expert Advisor. |
Warning: All rights to these materials are reserved by MetaQuotes Ltd. Copying or reprinting of these materials in whole or in part is prohibited.
This article was written by a user of the site and reflects their personal views. MetaQuotes Ltd is not responsible for the accuracy of the information presented, nor for any consequences resulting from the use of the solutions, strategies or recommendations described.