[Dev Log] TICQ AI: A 3.5M Parameter Deep Reinforcement Learning Agent (PPO)

7 January 2026, 17:43

Kaan Caliskan

Hello MQL5 Community,

I would like to share the technical architecture and development journey of my latest project: TICQ AI.

Unlike traditional Expert Advisors that rely on rigid if/else logic or simple indicator signals (RSI levels, MACD crossovers), TICQ AI is built on a Deep Reinforcement Learning (DRL) framework, specifically Proximal Policy Optimization (PPO).

The objective was not to build a "bot" that signals entries, but to build an autonomous agent that understands market context, structure, and regime.

Here is a breakdown of the system architecture:

1. Feature Engineering (The Vision)

The input layer of the neural network is wide. The agent does not simply look at Open/Close prices. It processes a single normalized vector of 222 distinct features drawn from 5 timeframes (M1, M5, M15, H1, H4).

The feature set is engineered to capture institutional footprints:

Smart Money Concepts (SMC): The system algorithmically detects Order Blocks, Fair Value Gaps (FVG), and internal/swing structure breaks (BOS/CHoCH).
Advanced Order Flow: It calculates Cumulative Volume Delta (CVD), aggressive Buy/Sell imbalances, and large trade intensity z-scores in real-time.
Market Regime Detection: Using the Hurst exponent (H), the agent identifies whether the market is in a Mean-Reverting (H < 0.5), Random Walk (H ≈ 0.5), or Trending (H > 0.5) state, adjusting its bias accordingly.
Macro & Temporal: Inputs include DXY correlation, Session Overlaps (London/NY), and news proximity vectors.
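
To make the order flow features concrete, here is a minimal sketch of a Cumulative Volume Delta and a rolling z-score for trade size. It is an illustration under stated assumptions, not the production pipeline: the (volume, side) trade format, the function names, and the window length are all hypothetical.

```python
from collections import deque
from statistics import mean, pstdev


def cumulative_volume_delta(trades):
    """Running sum of signed volume.

    `trades` is an iterable of (volume, side) pairs, where side is +1 for a
    buyer-initiated trade and -1 for a seller-initiated one (assumed format).
    """
    cvd, total = [], 0.0
    for volume, side in trades:
        total += side * volume
        cvd.append(total)
    return cvd


def rolling_zscore(values, window):
    """z-score of each value against the `window` values that precede it.

    Returns None until the window is full, and 0.0 when the window has zero
    variance, so the feature never divides by zero.
    """
    history = deque(maxlen=window)
    scores = []
    for value in values:
        if len(history) == window:
            mu = mean(history)
            sigma = pstdev(history)
            scores.append((value - mu) / sigma if sigma > 0 else 0.0)
        else:
            scores.append(None)
        history.append(value)
    return scores
```

Scoring each value only against earlier values keeps the feature free of look-ahead, which matters when the same code feeds both backtests and live inference.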

2. Model Architecture (The Brain)

The core is an Actor-Critic Network with approximately 3.5 Million trainable parameters.

The Backbone: A deep residual network processes the 222 inputs to extract latent features.
The Actor (Policy): Outputs a probability distribution over the discrete actions, plus continuous values for dynamic SL/TP sizing.
The Critic (Value): Estimates the expected return of the current state to stabilize training.
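
The way the Actor and Critic interact during a PPO update can be sketched in a few lines. This shows the standard clipped surrogate objective for a single sample. The one-step advantage estimate, the clip range of 0.2, and the discount of 0.99 are assumptions for illustration; the post does not state which advantage estimator or hyperparameters are used.

```python
import math


def one_step_advantage(reward, value, next_value, gamma=0.99):
    """TD(0) advantage: how much better the outcome was than the Critic expected.

    Assumption: a one-step estimate for brevity; GAE is a common alternative.
    """
    return reward + gamma * next_value - value


def ppo_clipped_objective(new_logp, old_logp, advantage, clip_eps=0.2):
    """PPO clipped surrogate for one (state, action) sample.

    new_logp / old_logp are the Actor's log-probabilities of the taken action
    under the current and the data-collecting policy.
    """
    ratio = math.exp(new_logp - old_logp)
    clipped_ratio = max(1.0 - clip_eps, min(1.0 + clip_eps, ratio))
    # Taking the minimum removes any incentive to move the policy
    # further than clip_eps away from the old one in a single update.
    return min(ratio * advantage, clipped_ratio * advantage)
```

The Critic's value estimate enters only through the advantage, which is why a more accurate Critic lowers the variance of the Actor's updates.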

3. Discrete Action Space (Execution)

Standard EAs are binary (Buy or Sell). TICQ AI operates in a continuous environment with 8 Discrete Actions, allowing it to manage positions like a human trader:

HOLD (Do nothing / Wait for setup)
BUY (Open Long)
SELL (Open Short)
CLOSE_ALL (Exit position)
CLOSE_PARTIAL (Secure 50% profit)
MOVE_SL_BREAKEVEN (Risk-free trade)
ACTIVATE_TRAILING (Volatility-based trailing)
EXTEND_TP (Let profits run if trend is strong)

Note: We use Action Masking to ensure the agent cannot choose invalid actions (e.g., trying to Buy while already in a position).
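
A minimal sketch of action masking over the eight actions listed above: invalid actions get a logit of negative infinity before the softmax, so their probability is exactly zero. The masking rules here are assumptions for illustration. The post only confirms that BUY is blocked while a position is open; blocking SELL in that state and blocking the management actions while flat are guesses.

```python
import math

ACTIONS = [
    "HOLD", "BUY", "SELL", "CLOSE_ALL", "CLOSE_PARTIAL",
    "MOVE_SL_BREAKEVEN", "ACTIVATE_TRAILING", "EXTEND_TP",
]


def valid_action_mask(in_position):
    """Return one boolean per action; True means the action is allowed."""
    if in_position:
        invalid = {"BUY", "SELL"}  # assumed: no new entries while in a trade
    else:
        # Assumed: nothing to close or adjust while flat.
        invalid = {"CLOSE_ALL", "CLOSE_PARTIAL", "MOVE_SL_BREAKEVEN",
                   "ACTIVATE_TRAILING", "EXTEND_TP"}
    return [action not in invalid for action in ACTIONS]


def masked_softmax(logits, mask):
    """Softmax over the allowed actions only.

    HOLD is always allowed in this sketch, so at least one logit stays finite.
    """
    masked = [l if ok else -math.inf for l, ok in zip(logits, mask)]
    top = max(masked)  # subtract the max for numerical stability
    exps = [math.exp(l - top) for l in masked]
    total = sum(exps)
    return [e / total for e in exps]
```

For example, masked_softmax(logits, valid_action_mask(in_position=True)) assigns zero probability to BUY and SELL, so sampling from the result can never pick them.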

4. Hybrid Protection Engine (Risk Management)

This is the most critical component for live deployment. To solve the "model drift" issue where AI performance degrades in unseen market conditions, I implemented a Hybrid Execution Mode:

Live Mode: Active when the rolling 20-trade Win Rate is > 55%.
Paper Mode (Kill Switch): If the Win Rate drops below 45%, the system automatically disconnects from the live order flow. It continues to trade in an internal simulation (Paper Mode) until statistical confidence is restored.

This acts as an automated circuit breaker, preventing account blow-ups during highly erratic market events.
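
The switching logic can be sketched as a small state machine over a rolling window of trade results. The 20-trade window and the 55% / 45% thresholds come from the description above. Two behaviours are assumptions, since the post does not specify them: the mode is left unchanged while the win rate sits between 45% and 55%, and the system starts in Paper Mode until the first 20 trades are recorded. The class and method names are hypothetical.

```python
from collections import deque


class HybridModeSwitch:
    """Tracks a rolling win rate and decides between LIVE and PAPER mode."""

    def __init__(self, window=20, live_above=0.55, paper_below=0.45):
        self.results = deque(maxlen=window)
        self.live_above = live_above
        self.paper_below = paper_below
        self.mode = "PAPER"  # assumed starting mode until the window fills

    def record_trade(self, won):
        """Record one closed trade (live or simulated) and return the mode."""
        self.results.append(bool(won))
        if len(self.results) < self.results.maxlen:
            return self.mode  # not enough trades for a rolling win rate yet
        win_rate = sum(self.results) / len(self.results)
        if win_rate > self.live_above:
            self.mode = "LIVE"
        elif win_rate < self.paper_below:
            self.mode = "PAPER"
        # Between the two thresholds the mode is left as it was (assumed),
        # which avoids flipping back and forth around a single cutoff.
        return self.mode
```

Because simulated trades keep feeding record_trade while in Paper Mode, the same object can switch back to LIVE once the rolling win rate recovers.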

5. Multi-Agent Deployment

The system is currently deployed as 5 separate agents, each fine-tuned for specific asset classes:

XAUUSD (Gold)
EURUSD (Major Forex)
GBPJPY (Cross Pair / Volatility)
NAS100 (US Tech Index)
BTCUSD (Crypto Asset)

I am happy to discuss the technical implementation, the challenges of training PPO models on financial data, or the specific feature engineering techniques used for Order Flow and SMC detection.

Let's discuss: How do you handle feature normalization for multi-timeframe inputs in your ML projects?

📋 TICQ AI - Official Links

🌐 Official Website: https://fxairobot.com

✈️ Telegram Channel: https://t.me/ticqai

✈️ X Account: https://x.com/ticqai

📺 YouTube Channel: https://www.youtube.com/@xtraderai

📈 Vantage Markets Copy Trade: (Connect your account and start auto-copying)

🔗 Join Link: https://www.vantagemarkets.com/open-live-account/?sub1=spid_ODc5MTYz&affid=192756&platform=copytrading&autoCertification=false

👉 Strategy Name: TICQ AI

📡 MQL5 Live Signal Service: https://www.mql5.com/en/signals/2352533 (⚠️ Note: Results may vary on different brokers due to spread/latency.)

#xauusd, forex, AI, robot, Reinforcement
