6 January 2026, 10:39
Saeid Soleimani
MQL5 AI Data Architecture: The Battle Between Numbers and Images — A Comprehensive Guide for Professional Developers to Optimize Speed, Cost, and System Robustness

Multimodal LLM technology is exciting, but in a latency-sensitive, high-risk trading environment, technical efficiency must take precedence over visual appeal. This guide argues that sending raw numerical data via structured prompts is the most viable path to a profitable and stable Expert Advisor (EA).

1. Dissecting the MQL5 Latency Trap 

In the process of executing a trade command, lost time is never recovered. In high-frequency trading (HFT) or even scalping, a 200-millisecond delay can turn a profitable trade negative. When a signal is generated in MQL5, the external LLM connection process involves four layers of latency:

  • Latency 1: Local Processing: Generating the image and converting it to Base64 (this step is nearly zero when using numerical data).
  • Latency 2: Network Latency: The time the data packet (payload) takes to travel from your terminal to the API server. High-volume Base64 consumes bandwidth.
  • Latency 3: LLM Processing (Inference Time): The time the AI model spends decoding, understanding the image (OCR), and inferring the signal. This is the heaviest part.
  • Latency 4: Return Trip: The signal return journey from the API back to your MQL5 terminal.

1.a. Base64, Visual Processing, and OCR Overhead

Image files (such as PNG) must be converted into Base64 text strings to be transmitted in a JSON payload. Base64 maps every 3 bytes to 4 characters, so these strings are about 33% larger than the original file. In contrast, a numerical prompt containing dozens of indicator values is only a few kilobytes (KB), typically hundreds of times less data to transmit and for the model to process.
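The size overhead can be checked with a few lines of arithmetic. The script below is a minimal sketch; the 300 KB screenshot size is a hypothetical figure chosen for illustration, not a measurement.

```mql5
// Size in bytes of the padded Base64 encoding of rawBytes input bytes.
long Base64Size(long rawBytes)
{
    return 4 * ((rawBytes + 2) / 3);   // every 3 input bytes become 4 output characters
}

void OnStart()
{
    long png     = 300 * 1024;         // hypothetical 300 KB chart screenshot
    long encoded = Base64Size(png);
    double overheadPct = 100.0 * (double)(encoded - png) / (double)png;
    Print("raw bytes: ", png, ", Base64 bytes: ", encoded,
          ", overhead: ", DoubleToString(overheadPct, 1), "%");
}
```

The overhead approaches one third regardless of the image size, so the only way to shrink the payload meaningfully is to avoid sending the image at all.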

Technical Warning: Using an image forces the AI model to perform Optical Character Recognition (OCR) on tiny price texts, indicator names, or numerical values. This process is inherently error-prone and time-consuming. Why ask the AI to do what MQL5 has already done with absolute precision?

1.b. Financial Implications: The Token Economy

Most multimodal model providers bill images according to their resolution, converting each image into the equivalent of roughly a thousand tokens or more. This drastically increases the cost of each query. By sending numerical data, we ensure the cost is based on low-volume text, and the token budget is spent on reasoning rather than on vision.

1.c. Quantitative Data Cost Analysis: Image vs. JSON

To better understand, let's conduct a quantitative comparison. A standard chart image (e.g., 512 × 512 pixels) might be estimated to cost roughly 1000 to 1500 tokens for visual models. This is the cost per query.

Approximate Cost Comparison (Based on Common API Rates):

Metric | Numerical Data (JSON) | Visual Data (Image 512x512)
Data Volume (Approx.) | 0.5 to 1.5 KB | 250 to 500 KB (Base64)
Token Consumption (Input) | 100 to 400 Tokens | 1000 to 1500 Tokens
Latency Impact | Lowest (~100 ms) | Highest (500-1000 ms)


2. The Absolute Superiority of Structured Numerical Data Streams

Our goal in data architecture is to send the "absolute truth" to the AI. This truth is not a blurry image, but a dense, filtered set of numerical facts calculated by native MQL5 functions.

2.a. Feature Engineering for AI

Instead of sending raw data (OHLCV), we send engineered features. These features are themselves a summary of the market status across various timeframes:

  • Trend: The difference between the current price and MA50 on the D1 timeframe.
  • Momentum: The value of RSI or Stochastic on the M15 timeframe.
  • Volatility: The ATR value on the H1 timeframe.
  • Structure: The distance of the current price to the nearest key support/resistance level.

2.b. Advanced Features: Combining Ichimoku and Volatility

Professional developers extract deeper features for the prompt so that the AI can understand market structure not only from a moving average perspective but also from a balance and volatility standpoint.

  • Ichimoku Cloud State: A categorical value (or three Boolean flags) indicating whether the price is above the cloud (bullish), below the cloud (bearish), or within the cloud (neutral/choppy). This is a powerful long-term trend filter.
  • Volatility Index (VIX) Proxy: For currency pairs, this metric can be approximated by comparing a short-term ATR (e.g., M30) with the ATR over a very long period (e.g., W1), for example as a ratio. Sending this value helps the AI decide whether to employ breakout or reversal strategies.
  • Correlation: If the EA monitors multiple symbols simultaneously, sending the correlation status between them (e.g., EURUSD vs. USDCHF) can assist the AI in portfolio management.
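As an illustration of the correlation feature, the sketch below computes a Pearson coefficient over recent H1 closes. Several choices here are assumptions rather than anything prescribed above: the 50-bar window, the use of raw closes instead of returns, and the output key name `EURUSD_USDCHF_Corr`. It also assumes both symbols have aligned history available.

```mql5
// Pearson correlation of two series; returns 0.0 when it is undefined.
double Correlation(const double &a[], const double &b[])
{
    int n = ArraySize(a) < ArraySize(b) ? ArraySize(a) : ArraySize(b);
    if(n < 2) return 0.0;
    double meanA = 0.0, meanB = 0.0;
    for(int i = 0; i < n; i++) { meanA += a[i]; meanB += b[i]; }
    meanA /= n;
    meanB /= n;
    double cov = 0.0, varA = 0.0, varB = 0.0;
    for(int i = 0; i < n; i++)
    {
        double da = a[i] - meanA, db = b[i] - meanB;
        cov  += da * db;
        varA += da * da;
        varB += db * db;
    }
    if(varA <= 0.0 || varB <= 0.0) return 0.0;
    return cov / MathSqrt(varA * varB);
}

void OnStart()
{
    int bars = 50;                       // hypothetical window length
    double eurusd[], usdchf[];
    if(CopyClose("EURUSD", PERIOD_H1, 0, bars, eurusd) != bars ||
       CopyClose("USDCHF", PERIOD_H1, 0, bars, usdchf) != bars)
    {
        Print("History not ready for one of the symbols");
        return;
    }
    Print("\"EURUSD_USDCHF_Corr\": ", DoubleToString(Correlation(eurusd, usdchf), 2));
}
```

A single two-decimal number like this costs only a few tokens, which fits the goal of sending dense, pre-computed facts.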


2.c. Structured Prompt Architecture (AI as a Logic Processor)

The prompt must include two distinct sections: System Instruction and Input Data. This approach locks the LLM's behavior into a rigid decision matrix, preventing philosophical analyses.

Example: Defining AI Persona and Goal

// System Instruction: Define Persona and Output Format

"You are a High-Frequency Trading Logic Engine. Your only goal is to analyze the provided JSON data and output a signal based on the rules. 
Your output MUST be a JSON object with keys: 'Signal', 'SL', 'TP', 'Reason'. 
'Signal' must be 'BUY', 'SELL', or 'HOLD'. 
Do NOT include any external commentary or philosophical discussion."
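One way to keep the two sections separate on the wire is to place them in distinct fields of the request body. The sketch below is illustrative only: the field names `system` and `input` are hypothetical placeholders, and the real schema depends on the API provider. The escaping helper matters because the prompt contains quotes and line breaks that would otherwise corrupt the JSON.

```mql5
// Escapes the characters that would break a JSON string literal.
string JsonEscape(string s)
{
    StringReplace(s, "\\", "\\\\");   // backslash first, so later escapes are not doubled
    StringReplace(s, "\"", "\\\"");
    StringReplace(s, "\n", "\\n");
    StringReplace(s, "\r", "\\r");
    StringReplace(s, "\t", "\\t");
    return s;
}

// Builds a request body with separate instruction and data fields.
// The field names "system" and "input" are hypothetical placeholders.
string BuildRequestBody(string systemInstruction, string inputData)
{
    return "{\"system\":\"" + JsonEscape(systemInstruction) +
           "\",\"input\":\"" + JsonEscape(inputData) + "\"}";
}

void OnStart()
{
    string sys  = "Your output MUST be a JSON object with keys: 'Signal', 'SL', 'TP', 'Reason'.";
    string data = "{ \"Symbol\": \"EURUSD\", \"M15_RSI\": 31.20 }";
    Print(BuildRequestBody(sys, data));
}
```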
                    

2.d. LLM as a State Machine and Chain-of-Thought (CoT)

To enhance accuracy and reduce hallucination in the LLM, we must force the model to follow its thought process step-by-step before issuing the final verdict. This technique is called Chain-of-Thought (CoT). In trading, CoT converts the model into a state machine with separate tasks: state detection, reasoning, and command issuance.

  • Task 1: State Detection: The model must convert raw data into understandable states (e.g., "The daily trend is bullish," "M15 is in the oversold zone").
  • Task 2: Logical Reasoning: The model compares the detected states with your trading rules. This section is the core of the AI's intellectual structure.
  • Task 3: Structured Command Issuance: The model provides the final output in a JSON or string format that MQL5 can easily parse.

This process steers the AI's reasoning toward a fixed logical path, much like a classic algorithmic system, although the output must still be validated on the MQL5 side.

3. System Robustness and Data Error Management

The key difference between a "testing" EA and a "production" EA is how it handles errors and ambiguity in the data returned by the LLM. Numerical data makes error handling straightforward.

3.a. The Output Validation Trap

Even with the most precise prompts, the LLM might sometimes corrupt the output with unnecessary commentary (e.g., "I hope this trade goes well!"). If the output is not clean JSON, parsing in MQL5 will fail. Here, the difference between numerical and visual data is critical:

  • In visual mode, if the LLM misreads the image or misinterprets the text (e.g., reading 30,000 as 3,000), your error is hidden and leads to a wrong trade.
  • In numerical mode, if the LLM returns text instead of a number for a numerical value (e.g., stop loss as "two times ATR"), `StringToDouble` in MQL5 cannot convert it and returns 0, which a simple validity check catches before the EA executes the trade. Your error is explicit and manageable.
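A stricter variant of this check rejects anything that is not a plain decimal number before the value is converted. The helper below is a minimal sketch; the name `TryParsePrice` is hypothetical, and it deliberately refuses exponents and thousands separators.

```mql5
// Returns true only if text is a plain decimal number such as "1.0500" or "-0.25".
bool TryParsePrice(string text, double &value)
{
    StringTrimLeft(text);
    StringTrimRight(text);
    int len = StringLen(text);
    if(len == 0) return false;

    int digitsSeen = 0, dots = 0;
    for(int i = 0; i < len; i++)
    {
        ushort c = StringGetCharacter(text, i);
        if(c >= '0' && c <= '9') digitsSeen++;
        else if(c == '.')        dots++;
        else if(c == '-' && i == 0) continue;   // a leading minus sign is allowed
        else return false;                      // any other character disqualifies the text
    }
    if(digitsSeen == 0 || dots > 1) return false;

    value = StringToDouble(text);
    return true;
}

void OnStart()
{
    double sl = 0.0;
    Print("1.0500        -> ", TryParsePrice("1.0500", sl) ? "accepted" : "rejected");
    Print("two times ATR -> ", TryParsePrice("two times ATR", sl) ? "accepted" : "rejected");
}
```

Checking the characters first also catches partially numeric strings, which a bare conversion could silently truncate.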

3.b. Internal MQL5 Risk Management: The Last Line of Defense

MQL5 must always be the last line of defense. Because the response is numerical, we can add a final validation step. For example, if the LLM issues a buy signal with an SL below the current price, MQL5 should check that the SL distance does not exceed a reasonable threshold (e.g., 50 pips): a 20-pip stop passes, while a wider one is rejected. This internal control over numerical variables helps protect capital.
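The check described above might look like the following sketch. The function name and the pip convention (10 points on 3- and 5-digit symbols) are assumptions; adapt both to the instruments actually traded.

```mql5
// Returns true if the stop loss is on the correct side of the entry price
// and no further away than maxPips.
bool IsStopLossSane(string symbol, bool isBuy, double entry, double sl, double maxPips)
{
    if(sl <= 0.0) return false;
    if(isBuy  && sl >= entry) return false;   // a buy stop must sit below the entry
    if(!isBuy && sl <= entry) return false;   // a sell stop must sit above the entry

    double point  = SymbolInfoDouble(symbol, SYMBOL_POINT);
    int    digits = (int)SymbolInfoInteger(symbol, SYMBOL_DIGITS);
    // Assumed convention: one pip equals 10 points on 3- and 5-digit quotes.
    double pip = (digits == 3 || digits == 5) ? 10.0 * point : point;
    if(pip <= 0.0) return false;

    return MathAbs(entry - sl) / pip <= maxPips;
}

void OnStart()
{
    double ask = SymbolInfoDouble(_Symbol, SYMBOL_ASK);
    double sl  = ask - 200 * _Point;          // hypothetical stop proposed by the LLM
    Print("SL accepted: ", IsStopLossSane(_Symbol, true, ask, sl, 50.0));
}
```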

4. Systemic Issues with Visual Data: Conflict and Ambiguity

Vision models, while powerful, are generally not trained specifically to detect specialized chart patterns, and they carry the risk of misinterpreting scale, color, or unfamiliar indicators.

4.a. The Normalization and Scaling Problem

When you zoom in or out of a chart, the pixel-to-pip ratio changes. A "compressed" pattern might be recognizable to the human eye, but it creates ambiguity for a vision model trained on standard images.

4.b. Ambiguity in Indicators and Filters

Every trader might display indicators (like MA) with different colors or thicknesses. For a vision model, precisely distinguishing between a "Moving Average 50" and a "Moving Average 100" in an image, if visually similar, is nearly impossible. This information is completely separate and unambiguous in MQL5 numerical data with precise tags.

5. MQL5 Implementation: Advanced CoT Prompting

The `GetAIPrompt` function is updated to incorporate the Chain-of-Thought (CoT) technique and more advanced features (iIchimoku) to increase the quality of the AI's reasoning.

Final Prompt Preparation Function with CoT and Advanced Features (GetAIPrompt)

// MQL5 - Complete Function to Generate Numerical Prompt with Advanced Features & CoT

//+------------------------------------------------------------------+
//| Reads the latest value of one buffer of an indicator handle.     |
//| (In MQL5, iMA/iRSI/iATR/iIchimoku return handles, not values.)   |
//+------------------------------------------------------------------+
bool LatestValue(int handle, int bufferIndex, double &value)
{
    if(handle == INVALID_HANDLE) return false;
    double buf[1];
    if(CopyBuffer(handle, bufferIndex, 0, 1, buf) != 1) return false; // data not ready yet
    value = buf[0];
    return true;
}

//+------------------------------------------------------------------+
//| Generates the structured numerical payload for the LLM API.      |
//| Incorporates Advanced Features and Chain-of-Thought (CoT).       |
//+------------------------------------------------------------------+
string GetAIPrompt(string symbol, ENUM_TIMEFRAMES timeframe)
{
    // 1. Get Core Price Data
    MqlRates rates[1];
    if(CopyRates(symbol, timeframe, 0, 1, rates) < 1) return "";
    double currentPrice = rates[0].close;
    int digits = (int)SymbolInfoInteger(symbol, SYMBOL_DIGITS); // digits of the requested symbol, not the chart's

    // 2. Calculate Technical Features (Feature Engineering)
    // For brevity the handles are requested on every call; in production, create them once in OnInit().
    double D1_MA50_val = 0, H4_MA20_val = 0, M15_RSI_val = 0, H1_ATR_val = 0, Senkou_A = 0, Senkou_B = 0;

    // Trend features
    if(!LatestValue(iMA(symbol, PERIOD_D1, 50, 0, MODE_SMA, PRICE_CLOSE), 0, D1_MA50_val)) return "";
    if(!LatestValue(iMA(symbol, PERIOD_H4, 20, 0, MODE_SMA, PRICE_CLOSE), 0, H4_MA20_val)) return "";

    // Momentum/Oversold features
    if(!LatestValue(iRSI(symbol, PERIOD_M15, 14, PRICE_CLOSE), 0, M15_RSI_val)) return "";

    // Volatility feature for Risk Management
    if(!LatestValue(iATR(symbol, PERIOD_H1, 14), 0, H1_ATR_val)) return "";

    // Advanced Feature: Ichimoku Cloud Status (H4)
    int ichimoku = iIchimoku(symbol, PERIOD_H4, 9, 26, 52);
    if(!LatestValue(ichimoku, SENKOUSPANA_LINE, Senkou_A)) return "";
    if(!LatestValue(ichimoku, SENKOUSPANB_LINE, Senkou_B)) return "";

    string Ichimoku_Status = "Neutral";
    if (currentPrice > MathMax(Senkou_A, Senkou_B)) Ichimoku_Status = "Bullish_Above_Cloud";
    else if (currentPrice < MathMin(Senkou_A, Senkou_B)) Ichimoku_Status = "Bearish_Below_Cloud";

    // 3. Construct the Data String (JSON format for machine readability)
    string dataJSON = StringFormat(
        "{ \"Symbol\": \"%s\", \"Current_Price\": %s, " + 
        "\"D1_MA50\": %s, \"H4_MA20\": %s, \"M15_RSI\": %s, " +
        "\"H1_ATR\": %s, \"H4_Ichimoku_Status\": \"%s\" }",
        symbol, 
        DoubleToString(currentPrice, digits),
        DoubleToString(D1_MA50_val, digits),
        DoubleToString(H4_MA20_val, digits),
        DoubleToString(M15_RSI_val, 2),
        DoubleToString(H1_ATR_val, digits),
        Ichimoku_Status
    );
    
    // 4. Define the Trading Logic with CoT steps (MTA Logic)
    string ruleset = "\nRULES:\n" +
                     "1. Step 1: Major Trend Filter (H4 Ichimoku): If H4_Ichimoku_Status is 'Neutral', Signal must be 'HOLD' regardless of other indicators.\n" +
                     "2. Step 2: Medium Trend Check (D1/H4 MA): Determine Trend Status. BULLISH if Current_Price > D1_MA50. BEARISH if Current_Price < D1_MA50.\n" +
                     "3. Step 3: Signal Confirmation (M15 RSI): Look for exhaustion/reversal signals. Potential Buy if M15_RSI < 35. Potential Sell if M15_RSI > 65.\n" +
                     "4. Step 4: Decision Synthesis (Reasoning - MTA): ONLY issue 'BUY' if Step 1 is Bullish, Step 2 is BULLISH, AND Step 3 is Potential Buy. ONLY issue 'SELL' if Step 1 is Bearish, Step 2 is BEARISH, AND Step 3 is Potential Sell. Otherwise, signal 'HOLD'.\n" +
                     "5. Step 5: Risk Management (Structured Output): If Signal is BUY/SELL, calculate SL/TP using H1_ATR. SL = Current_Price +/- (2 * H1_ATR) and TP = Current_Price +/- (3 * H1_ATR). Round SL/TP to the same number of decimal places as Current_Price.\n" +
                     "6. Final Output Format: Output ONLY the final decision in the strict format: 'Signal:SL_Value:TP_Value' (e.g., BUY:1.0500:1.0600) or just 'HOLD'. DO NOT include the CoT steps in the final output.";
                         
    // 5. Final Prompt Assembly
    string finalPrompt = "You are a specialized High-Speed Trading Engine. Follow all 6 steps of the provided RULES precisely to ensure a robust trading decision. Your response must be clean and immediately executable by MQL5.\n\n" +
                         "DATA:\n" + dataJSON + "\n" + ruleset;
                         
    return finalPrompt;
}
                    

By combining precise data with CoT instructions, this function steers the AI from free-form text generation toward multi-stage, rule-bound reasoning. Better-constrained reasoning should reduce incorrect trades, although the response must still be validated before execution.
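On the receiving side, the strict 'Signal:SL_Value:TP_Value' format requested in rule 6 can be split and checked in a few lines. This is a minimal sketch: the `AISignal` struct and `ParseSignal` function are hypothetical names, and any response that does not match the format exactly is rejected rather than repaired.

```mql5
struct AISignal
{
    string side;   // "BUY", "SELL" or "HOLD"
    double sl;
    double tp;
};

// Parses "BUY:1.0500:1.0600", "SELL:...:..." or "HOLD"; returns false on anything else.
bool ParseSignal(string response, AISignal &result)
{
    StringTrimLeft(response);
    StringTrimRight(response);
    result.side = "HOLD";
    result.sl   = 0.0;
    result.tp   = 0.0;
    if(response == "HOLD") return true;

    string parts[];
    if(StringSplit(response, ':', parts) != 3) return false;
    if(parts[0] != "BUY" && parts[0] != "SELL") return false;

    double sl = StringToDouble(parts[1]);
    double tp = StringToDouble(parts[2]);
    if(sl <= 0.0 || tp <= 0.0) return false;   // non-numeric text converts to 0

    result.side = parts[0];
    result.sl   = sl;
    result.tp   = tp;
    return true;
}

void OnStart()
{
    AISignal sig;
    if(ParseSignal("BUY:1.0500:1.0600", sig))
        Print(sig.side, " SL=", DoubleToString(sig.sl, 4), " TP=", DoubleToString(sig.tp, 4));
    if(!ParseSignal("I hope this trade goes well!", sig))
        Print("Rejected malformed response");
}
```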

6. Conclusion: Architecture Choice Determines Profit

Your EA's architecture is a technical decision that directly impacts your capital return. In the MQL5 AI domain, efficiency, precision, and operational costs are conclusively in favor of the numerical approach.

Professional developers view AI models as a high-speed logical inference system, not an artificial eye for chart viewing. Shift your philosophy to processing on the client-side (MQL5) and inference on the server-side (LLM) to transform a slow, costly EA into an optimized, scalable, and profitable trading system. The emphasis on system robustness and numerical validation is essential for surviving in fast-moving markets.

7. Operational Scaling: Architecting for High-Throughput and Stability

A professional Expert Advisor (EA) must manage multiple symbols and timeframes without crashing due to network or API constraints. This requires moving beyond a simple synchronous call model.

7.a. Dynamic Rate-Limiting and Asynchronous Polling

Relying on hard-coded Sleep() functions for rate limiting is inefficient. A robust system uses asynchronous polling and dynamically manages its request queue to stay below the API's Queries Per Minute (QPM) threshold while maximizing throughput.

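  • The MQL5 Challenge: Each Expert Advisor runs in a single thread, and WebRequest is a synchronous call that blocks the program until the response arrives or the timeout expires. Non-blocking operation therefore requires external libraries (DLLs) or a disciplined pattern around WebRequest: short timeouts, timer-driven dispatch, and careful monitoring of its return codes.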

  • Rate Management Implementation: The EA should maintain a global counter for requests sent within the last 60 seconds. Before sending a new request, it must check this counter against the API's QPM limit (e.g., 60 requests per minute). If the limit is approached, the system should queue the next symbol's prompt and wait for the time window to reset.

    Wait Time = (60 seconds) / (QPM limit)

  • Request Queue (FIFO): Use a string array or a custom class to implement a First-In, First-Out (FIFO) queue of fully constructed numerical prompts (GetAIPrompt output). This lets the EA keep processing events instead of sleeping until the next permissible send time.
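The counter and the queue described above can be combined as in the sketch below. The `QPM_LIMIT` value, the function names, and the `Print` call standing in for the real WebRequest are all assumptions; in an EA, `PumpQueue` would be called from `OnTimer()`.

```mql5
#define QPM_LIMIT 60              // hypothetical API limit (requests per minute)

datetime g_sent[];                // send times of requests within the last 60 seconds
string   g_queue[];               // FIFO queue of fully constructed prompts

// Drops timestamps older than 60 seconds and reports whether a new request may be sent.
bool CanSendNow(datetime now)
{
    int n = ArraySize(g_sent), keep = 0;
    for(int i = 0; i < n; i++)
        if(now - g_sent[i] < 60) g_sent[keep++] = g_sent[i];
    ArrayResize(g_sent, keep);
    return keep < QPM_LIMIT;
}

void RecordSend(datetime now)
{
    int n = ArraySize(g_sent);
    ArrayResize(g_sent, n + 1);
    g_sent[n] = now;
}

void Enqueue(string prompt)
{
    int n = ArraySize(g_queue);
    ArrayResize(g_queue, n + 1);
    g_queue[n] = prompt;
}

// Removes and returns the oldest prompt; returns false if the queue is empty.
bool Dequeue(string &prompt)
{
    int n = ArraySize(g_queue);
    if(n == 0) return false;
    prompt = g_queue[0];
    for(int i = 1; i < n; i++) g_queue[i - 1] = g_queue[i];
    ArrayResize(g_queue, n - 1);
    return true;
}

// Sends at most one queued prompt per call, and only if the rate window allows it.
void PumpQueue()
{
    datetime now = TimeLocal();
    string prompt;
    if(!CanSendNow(now) || !Dequeue(prompt)) return;
    RecordSend(now);
    Print("Sending prompt of length ", StringLen(prompt));   // placeholder for the real WebRequest call
}

void OnStart()
{
    Enqueue("prompt for EURUSD");
    Enqueue("prompt for USDCHF");
    PumpQueue();
    PumpQueue();
}
```

The rate check runs before the dequeue, so a prompt is never removed from the queue unless it can actually be sent.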

7.b. Handling API Failures and Retries (Circuit Breaker Pattern)

In a high-frequency environment, network jitter and temporary API service disruptions are inevitable.

  • Failure Detection: MQL5 must check the HTTP status code returned by the WebRequest. A 429 (Rate Limit) or 5xx (Server Error) should trigger a controlled response.

  • Exponential Backoff: Instead of immediately retrying a failed request, the system should implement an exponential backoff strategy. The wait time before the next retry attempt is progressively increased:

    Wait_Retry = Base Wait × 2^(Attempt Number)

    This prevents overwhelming an already struggling external API.

  • Circuit Breaker: If a symbol experiences a high number of consecutive failures (e.g., 5 retries), a circuit breaker should be activated, temporarily disabling AI inference for that symbol for a defined cooling-off period (e.g., 5 minutes). This protects the entire EA from one problematic symbol.
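A compact way to hold the backoff and breaker state for each symbol is a small struct. The sketch below follows the formula above with the attempt number counted from 1; the constants and names are hypothetical, and the simulated 503 status stands in for a real WebRequest result.

```mql5
#define BASE_WAIT_SEC 2      // hypothetical base wait in seconds
#define MAX_FAILURES  5      // consecutive failures before the breaker opens
#define COOLDOWN_SEC  300    // cooling-off period (5 minutes)

struct BreakerState
{
    int      failures;       // consecutive failed attempts
    datetime retryAfter;     // earliest time of the next attempt
};

// 429 and 5xx responses are treated as temporary and worth retrying later.
bool IsRetryable(int httpStatus)
{
    return httpStatus == 429 || (httpStatus >= 500 && httpStatus <= 599);
}

void OnFailure(BreakerState &s, datetime now)
{
    s.failures++;
    if(s.failures >= MAX_FAILURES)
    {
        s.retryAfter = now + COOLDOWN_SEC;                        // breaker opens
        s.failures   = 0;
    }
    else
        s.retryAfter = now + BASE_WAIT_SEC * (1 << s.failures);   // Base Wait x 2^attempt
}

void OnSuccess(BreakerState &s)
{
    s.failures   = 0;
    s.retryAfter = 0;
}

bool MayAttempt(const BreakerState &s, datetime now)
{
    return now >= s.retryAfter;
}

void OnStart()
{
    BreakerState eurusd;
    eurusd.failures   = 0;
    eurusd.retryAfter = 0;
    datetime now = TimeLocal();
    for(int attempt = 1; attempt <= MAX_FAILURES; attempt++)
    {
        if(IsRetryable(503)) OnFailure(eurusd, now);              // simulated server error
        Print("Failure ", attempt, ": next attempt in ", (int)(eurusd.retryAfter - now), " s");
    }
}
```

Keeping one `BreakerState` per symbol is what isolates a problematic symbol from the rest of the EA.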

8. Advanced Feature Engineering: Optimizing the Numerical Payload

The value of the numerical prompt lies not in the quantity of data, but in its information density. Every token must count as a highly distilled insight.

8.a. Multi-Timeframe Feature Compression

Instead of sending raw indicator values, compress the data into directional or normalized strength metrics. This shifts the cognitive load from the LLM to the MQL5 client, aligning with the principle of "processing on the client-side."

Feature Name | MQL5 Calculation Example | Token Value/Compression
Trend_Score | Normalized difference between Price and MA50 across D1, H4, M30 (e.g., +2 for strong bullish, -2 for strong bearish). | Single float/string. Compresses 3 MA values into 1 directional score.
Volatility_Ratio | ATR(M30) / ATR(H4). | Single float. Tells the LLM if short-term volatility is accelerating or decelerating relative to the medium term.
Pivotal_Proximity | Distance to the nearest Pivot Point (PP) or Support/Resistance (S/R) zone, normalized to ATR (e.g., +0.5 ATR above nearest S/R). | Single float. Directly communicates structural market position relative to risk.
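The three features in the table reduce to a few lines of arithmetic once the indicator values are available. The sketch below is one possible scoring scheme, not a prescribed one: the ATR normalization and the clamp to the range -2 to +2 are assumptions, and the numbers in `OnStart` are hypothetical inputs.

```mql5
double Clamp(double x, double lo, double hi)
{
    return MathMax(lo, MathMin(hi, x));
}

// Mean ATR-normalized distance of price from the MA on each timeframe, clamped to [-2, +2].
double TrendScore(double price, const double &ma[], const double &atr[])
{
    int n = ArraySize(ma);
    if(n == 0 || ArraySize(atr) != n) return 0.0;
    double sum = 0.0;
    for(int i = 0; i < n; i++)
    {
        if(atr[i] <= 0.0) return 0.0;
        sum += (price - ma[i]) / atr[i];
    }
    return Clamp(sum / n, -2.0, 2.0);
}

// Above 1: short-term volatility exceeds the medium-term level.
double VolatilityRatio(double atrM30, double atrH4)
{
    return atrH4 > 0.0 ? atrM30 / atrH4 : 0.0;
}

// Signed distance from the nearest level, expressed in ATR units.
double PivotalProximity(double price, double level, double atr)
{
    return atr > 0.0 ? (price - level) / atr : 0.0;
}

void OnStart()
{
    // Hypothetical values for D1, H4 and M30.
    double ma[]  = {1.0800, 1.0850, 1.0890};
    double atr[] = {0.0090, 0.0035, 0.0012};
    double price = 1.0900;

    Print("\"Trend_Score\": ",       DoubleToString(TrendScore(price, ma, atr), 2));
    Print("\"Volatility_Ratio\": ",  DoubleToString(VolatilityRatio(atr[2], atr[1]), 2));
    Print("\"Pivotal_Proximity\": ", DoubleToString(PivotalProximity(price, 1.0894, atr[2]), 2));
}
```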

8.b. Encoding Order Flow and Market Depth (Simulated)

Since the LLM has no direct access to market depth data, developers can engineer a proxy for order flow imbalance using volume-based features.

  • Volume Spike Detection: A Boolean feature (or a normalized score) indicating whether the current bar's volume is significantly higher than the average of the last 20 bars (e.g., more than 2 standard deviations above it). This can suggest institutional involvement or a news event.

    • "Volume_Spike": true/false

  • Delta Close/Open: The signed difference between the closing and opening price of the most recent significant candle (e.g., H1). A large positive delta on high volume can be a stronger bullish signal than a high RSI.
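The volume spike feature can be derived from tick volume as sketched below. The helper name, the H1 timeframe, and the use of tick volume (rather than real volume) are assumptions. Note that the newest bar is still forming, so a stricter variant might test the last closed bar instead.

```mql5
// True if the newest volume exceeds mean + k * stddev of the preceding bars.
// Expects v[] in chronological order (oldest first), the default for Copy* functions.
bool IsVolumeSpike(const long &v[], double k)
{
    int n = ArraySize(v) - 1;            // number of history bars; v[n] is the newest bar
    if(n < 2) return false;

    double mean = 0.0;
    for(int i = 0; i < n; i++) mean += (double)v[i];
    mean /= n;

    double var = 0.0;
    for(int i = 0; i < n; i++)
    {
        double d = (double)v[i] - mean;
        var += d * d;
    }
    double sd = MathSqrt(var / n);

    return (double)v[n] > mean + k * sd;
}

void OnStart()
{
    long vol[];
    // 20 history bars plus the current one.
    if(CopyTickVolume(_Symbol, PERIOD_H1, 0, 21, vol) != 21)
    {
        Print("History not ready");
        return;
    }
    Print("\"Volume_Spike\": ", IsVolumeSpike(vol, 2.0) ? "true" : "false");
}
```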

By sending these highly engineered, contextualized numbers, we maximize the intelligence-to-token ratio, ensuring the LLM spends its compute cycles on high-level reasoning rather than simple arithmetic.


9. Data Integrity and Contract-Based Response Validation

The final critical stage is ensuring the LLM's output is not only correct but also immediately usable and safe for automated execution. The LLM response is treated as a data contract that MQL5 must strictly validate.

9.a. Enforcing the JSON Contract via MQL5

The StringToDouble check discussed in Section 3.a is only a basic safeguard. Professional EAs should use a structured JSON parsing approach that enforces the existence and data type of every key.

Validation Check | Purpose | MQL5 Implementation
Key Presence | Ensure all mandatory keys (Signal, SL, TP, and Reason) exist. | Look up each key with your JSON parser and reject null/empty results (MQL5 has no built-in JSON parser; JSONParse.GetValue(key) stands in for whichever library is used).
Type Coercion | Ensure SL and TP are valid numerical data types. | Use StringToDouble and reject values indistinguishable from zero (e.g., require MathAbs(value) > 1e-10).
Value Range Check | Validate the logical bounds. | Reject values implausibly far from the price, e.g., if (SL > CurrentPrice * 1.05).
Signal Enumeration | Ensure Signal is one of the permitted strings (BUY, SELL, HOLD). | Compare with == or StringCompare against the allowed set (an MQL5 switch does not accept strings).
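Once the fields have been extracted by whichever JSON parser is in use, the remaining checks in the table are plain comparisons. The sketch below assumes the values arrive already converted to a string and two doubles; the function names, the symmetric 5% band, and the side-of-price rules are illustrative assumptions.

```mql5
bool IsAllowedSignal(string signal)
{
    return signal == "BUY" || signal == "SELL" || signal == "HOLD";
}

// Validates an already extracted response; on failure, error describes the first violated rule.
bool ValidateContract(string signal, double sl, double tp, double price, string &error)
{
    error = "";
    if(!IsAllowedSignal(signal)) { error = "Signal not in {BUY, SELL, HOLD}"; return false; }
    if(signal == "HOLD") return true;

    if(MathAbs(sl) < 1e-10 || MathAbs(tp) < 1e-10)
    { error = "SL/TP missing or non-numeric"; return false; }

    if(sl > price * 1.05 || sl < price * 0.95)
    { error = "SL outside the 5% band around price"; return false; }

    if(signal == "BUY"  && !(sl < price && price < tp))
    { error = "BUY requires SL < price < TP"; return false; }
    if(signal == "SELL" && !(tp < price && price < sl))
    { error = "SELL requires TP < price < SL"; return false; }

    return true;
}

void OnStart()
{
    string err;
    double price = 1.0550;   // hypothetical current price
    if(!ValidateContract("BUY", 1.0500, 1.0600, price, err)) Print("Rejected: ", err);
    else Print("Contract accepted");
    if(!ValidateContract("LONG", 1.0500, 1.0600, price, err)) Print("Rejected: ", err);
}
```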

9.b. Positional Filtering (The Final Gate)

The MQL5 terminal should always maintain absolute control over the trade command. After the LLM provides its signal and parameters, MQL5 performs a final, independent logical check based on predefined hard limits.

  1. Stop Loss Check: Ensure the proposed stop loss does not exceed the maximum allowed monetary risk per trade (e.g., 1% of the account equity). This overrides any potentially large or miscalculated SL from the LLM.

  2. Slippage Check: Before submitting the order, confirm that the current market price is still within an acceptable tolerance of the price the LLM used for its calculation.
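Both gates can be expressed as small predicate functions, as in the sketch below. The function names, the 0.10 lot size, and the 10-point tolerance are hypothetical; the monetary risk estimate relies on the broker-reported tick size and tick value for the symbol.

```mql5
// True if the loss at the stop does not exceed maxRiskFraction of account equity.
bool RiskWithinLimit(string symbol, double entry, double sl, double lots, double maxRiskFraction)
{
    double tickSize  = SymbolInfoDouble(symbol, SYMBOL_TRADE_TICK_SIZE);
    double tickValue = SymbolInfoDouble(symbol, SYMBOL_TRADE_TICK_VALUE);
    if(tickSize <= 0.0 || tickValue <= 0.0) return false;

    double risk = MathAbs(entry - sl) / tickSize * tickValue * lots;
    return risk <= AccountInfoDouble(ACCOUNT_EQUITY) * maxRiskFraction;
}

// True if the market has not moved further than tolerance from the price sent in the prompt.
bool PriceStillValid(double promptPrice, double marketPrice, double tolerance)
{
    return MathAbs(marketPrice - promptPrice) <= tolerance;
}

void OnStart()
{
    double ask         = SymbolInfoDouble(_Symbol, SYMBOL_ASK);
    double promptPrice = ask;                    // price that was sent to the LLM
    double sl          = ask - 300 * _Point;     // hypothetical SL returned by the LLM

    bool riskOk  = RiskWithinLimit(_Symbol, ask, sl, 0.10, 0.01);     // 1% of equity
    bool priceOk = PriceStillValid(promptPrice, SymbolInfoDouble(_Symbol, SYMBOL_ASK), 10 * _Point);
    Print("Risk check: ", riskOk, ", slippage check: ", priceOk);
}
```

An order would be submitted only when both predicates return true; otherwise the signal is discarded and the reason logged.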

By treating the LLM as a highly capable yet external and fallible black-box component, and implementing these robust client-side numerical checks, developers ensure system robustness and capital security remain the priority. This architecture is not just fast; it is fault-tolerant and financially deterministic.