[Dev Log] TICQ AI : A 3.5M Parameter Deep Reinforcement Learning Agent (PPO)

7 January 2026, 17:43
Kaan Caliskan

Hello MQL5 Community,

I would like to share the technical architecture and development journey of my latest project: TICQ AI.

Unlike traditional Expert Advisors that rely on rigid if/else logic or simple indicator crossovers (RSI/MACD), TICQ AI is built on a Deep Reinforcement Learning (DRL) framework, specifically Proximal Policy Optimization (PPO).

The objective was not to build a "bot" that signals entries, but to build an autonomous agent that understands market context, structure, and regime.

Here is a breakdown of the system architecture:


1. Feature Engineering (The Vision)

The input layer of the neural network is wide. Rather than looking only at Open/Close prices, the agent processes a single normalized vector of 222 distinct features drawn from 5 timeframes (M1, M5, M15, H1, H4).
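As a rough illustration of how such a multi-timeframe vector could be normalized (this is not the actual TICQ AI pipeline), the sketch below z-scores each feature against its own trailing window, clips outliers, and concatenates the results in a fixed timeframe order. The function names, the clip value of 5.0, and the example feature names are all assumptions made for the example.

```python
from statistics import mean, pstdev

TIMEFRAMES = ["M1", "M5", "M15", "H1", "H4"]

def rolling_zscore(history, clip=5.0):
    """Z-score the latest value against its trailing window (assumed scheme)."""
    mu = mean(history)
    sigma = pstdev(history)
    if sigma == 0.0:
        return 0.0  # flat window carries no information; avoid dividing by zero
    z = (history[-1] - mu) / sigma
    return max(-clip, min(clip, z))  # clip so one outlier cannot dominate the input

def build_input_vector(features_by_timeframe):
    """Concatenate normalized features in a fixed timeframe and feature order.

    features_by_timeframe: {"M1": {"atr": [window...], ...}, "M5": {...}, ...}
    """
    vector = []
    for tf in TIMEFRAMES:
        for name in sorted(features_by_timeframe[tf]):  # stable ordering
            vector.append(rolling_zscore(features_by_timeframe[tf][name]))
    return vector
```

Normalizing each timeframe against its own history keeps an H4 feature and an M1 feature on a comparable scale, which is one common way to approach the question raised at the end of this post.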

The feature set is engineered to capture institutional footprints:

  • Smart Money Concepts (SMC): The system algorithmically detects Order Blocks, Fair Value Gaps (FVG), and internal/swing structure breaks (BOS/CHoCH).

  • Advanced Order Flow: It calculates Cumulative Volume Delta (CVD), aggressive Buy/Sell imbalances, and large trade intensity z-scores in real-time.

  • Market Regime Detection: Using the Hurst exponent (H), the agent classifies the market as a Random Walk (H ≈ 0.5), Mean-Reverting (H < 0.5), or Trending (H > 0.5) and adjusts its bias accordingly.

  • Macro & Temporal: Inputs include DXY correlation, Session Overlaps (London/NY), and news proximity vectors.
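To make two of the features above concrete, here is a minimal sketch of a Cumulative Volume Delta series and a Fair Value Gap detector. It uses the common three-candle FVG definition and assumes each trade arrives tagged with its aggressor side. The function names and data layouts are hypothetical, and the real detection logic in TICQ AI may differ.

```python
def cumulative_volume_delta(trades):
    """Running sum of buy volume minus sell volume.

    trades: iterable of (volume, side), where side is "buy" or "sell"
    (the aggressor side; assumed to be available from the data feed).
    """
    cvd, series = 0.0, []
    for volume, side in trades:
        cvd += volume if side == "buy" else -volume
        series.append(cvd)
    return series

def find_fair_value_gaps(candles):
    """Return (middle_candle_index, direction) for each three-candle gap.

    candles: list of dicts with "high" and "low" keys.
    A gap exists when candle i-2 and candle i do not overlap, leaving
    part of the middle candle's range untraded.
    """
    gaps = []
    for i in range(2, len(candles)):
        first, third = candles[i - 2], candles[i]
        if first["high"] < third["low"]:
            gaps.append((i - 1, "bullish"))
        elif first["low"] > third["high"]:
            gaps.append((i - 1, "bearish"))
    return gaps
```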

2. Model Architecture (The Brain)

The core is an Actor-Critic Network with approximately 3.5 Million trainable parameters.

  • The Backbone: A deep residual network processes the 222 inputs to extract latent features.

  • The Actor (Policy): Outputs a probability distribution over the discrete actions, plus continuous values for dynamic SL/TP sizing.

  • The Critic (Value): Estimates the expected return of the current state to stabilize training.
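For readers new to PPO, the sketch below shows the standard clipped surrogate loss that the actor is trained on, and the simplest way the critic's value estimate turns into an advantage. It is written in plain Python for readability. The clip range of 0.2 is a commonly used default rather than a confirmed TICQ AI setting, and a production setup would more likely use a tensor library and Generalized Advantage Estimation.

```python
import math

def advantages_from_critic(returns, values):
    """Simplest advantage estimate: observed return minus the critic's prediction."""
    return [r - v for r, v in zip(returns, values)]

def ppo_clipped_loss(new_log_probs, old_log_probs, advantages, clip_eps=0.2):
    """Standard PPO clipped surrogate loss (to be minimized).

    clip_eps=0.2 is an assumed, commonly used default.
    """
    total = 0.0
    for new_lp, old_lp, adv in zip(new_log_probs, old_log_probs, advantages):
        ratio = math.exp(new_lp - old_lp)  # pi_new(a|s) / pi_old(a|s)
        clipped = max(1.0 - clip_eps, min(1.0 + clip_eps, ratio))
        # Take the pessimistic of the two objectives so that a single
        # update cannot move the policy too far from the old one.
        total += min(ratio * adv, clipped * adv)
    return -total / len(advantages)
```

Subtracting the critic's estimate acts as a baseline: it lowers the variance of the policy gradient, which is the sense in which the critic stabilizes training.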

3. Discrete Action Space (Execution)

Standard EAs are effectively binary (Buy or Sell). TICQ AI observes a continuous market state and chooses among 8 discrete actions, allowing it to manage positions like a human trader:

  1. HOLD (Do nothing / Wait for setup)

  2. BUY (Open Long)

  3. SELL (Open Short)

  4. CLOSE_ALL (Exit position)

  5. CLOSE_PARTIAL (Secure 50% profit)

  6. MOVE_SL_BREAKEVEN (Risk-free trade)

  7. ACTIVATE_TRAILING (Volatility-based trailing)

  8. EXTEND_TP (Let profits run if trend is strong)

Note: I use Action Masking to ensure the agent cannot choose invalid actions (e.g., trying to BUY while already in a position).
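Action masking is usually implemented by setting the logits of invalid actions to negative infinity before the softmax, so those actions receive exactly zero probability. The sketch below applies that idea to the 8 actions listed above (indexed from 0 here). The rule for which actions are valid when flat versus in a position is an assumption for illustration, since only the BUY-while-in-position case is stated.

```python
import math
from enum import IntEnum

class Action(IntEnum):
    HOLD = 0
    BUY = 1
    SELL = 2
    CLOSE_ALL = 3
    CLOSE_PARTIAL = 4
    MOVE_SL_BREAKEVEN = 5
    ACTIVATE_TRAILING = 6
    EXTEND_TP = 7

def valid_actions(in_position):
    """Assumed validity rule: open only when flat, manage only when in a trade."""
    if in_position:
        return set(Action) - {Action.BUY, Action.SELL}
    return {Action.HOLD, Action.BUY, Action.SELL}

def masked_policy(logits, in_position):
    """Softmax over the logits with invalid actions forced to probability 0."""
    allowed = valid_actions(in_position)
    masked = [
        logit if Action(i) in allowed else -math.inf
        for i, logit in enumerate(logits)
    ]
    peak = max(masked)  # HOLD is always valid, so peak is finite
    exps = [math.exp(m - peak) for m in masked]  # exp(-inf) evaluates to 0.0
    total = sum(exps)
    return [e / total for e in exps]
```

Masking before the softmax, rather than rejecting invalid actions after sampling, keeps the probabilities used in the policy-gradient update consistent with the actions the agent can actually take.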


4. Hybrid Protection Engine (Risk Management)

This is the most critical component for live deployment. To mitigate "model drift", where AI performance degrades in unseen market conditions, I implemented a Hybrid Execution Mode:

  • Live Mode: Active when the rolling 20-trade Win Rate is > 55%.

  • Paper Mode (Kill Switch): If the Win Rate drops below 45%, the system automatically disconnects from the live order flow. It continues to trade in an internal simulation (Paper Mode) until statistical confidence is restored.

This acts as an automated circuit breaker, intended to limit drawdowns during highly erratic market events.
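The two thresholds above form a hysteresis band, which can be sketched as a small state machine over a rolling window of trade outcomes. The class name, the choice to start in Paper Mode, the rule of waiting for a full 20-trade window, and the behavior of keeping the current mode between 45% and 55% are assumptions made for this example and are not confirmed details of the live system.

```python
from collections import deque

class HybridModeSwitch:
    """Rolling win-rate circuit breaker with hysteresis (illustrative only)."""

    def __init__(self, window=20, live_above=0.55, paper_below=0.45):
        self.results = deque(maxlen=window)  # oldest result drops off automatically
        self.live_above = live_above
        self.paper_below = paper_below
        self.live = False  # assumption: start in paper mode until proven

    @property
    def mode(self):
        return "live" if self.live else "paper"

    def record_trade(self, won):
        """Record one closed trade (live or simulated) and return the new mode."""
        self.results.append(bool(won))
        if len(self.results) < self.results.maxlen:
            return self.mode  # assumption: no switching before the window is full
        win_rate = sum(self.results) / len(self.results)
        if win_rate > self.live_above:
            self.live = True
        elif win_rate < self.paper_below:
            self.live = False
        # Between the two thresholds the current mode is kept (hysteresis).
        return self.mode
```

Because paper trades keep feeding the same window, the switch can return to Live Mode on its own once the simulated win rate climbs back above the upper threshold.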


5. Multi-Agent Deployment

The system is currently deployed as 5 separate agents, each fine-tuned for a specific instrument:

  • XAUUSD (Gold)

  • EURUSD (Major Forex)

  • GBPJPY (Cross Pair / Volatility)

  • NAS100 (US Tech Index)

  • BTCUSD (Crypto Asset)


I am happy to discuss the technical implementation, the challenges of training PPO models on financial data, or the specific feature engineering techniques used for Order Flow and SMC detection.

Let's discuss: How do you handle feature normalization for multi-timeframe inputs in your ML projects?

📋 TICQ AI - Official Links

🌐 Official Website: https://fxairobot.com

✈️ Telegram Channel: https://t.me/ticqai

✈️ X Account: https://x.com/ticqai

📺 YouTube Channel: https://www.youtube.com/@xtraderai

📈 Vantage Markets Copy Trade: (Connect your account and start auto-copying) 🔗 Join Link: https://www.vantagemarkets.com/open-live-account/?sub1=spid_ODc5MTYz&affid=192756&platform=copytrading&autoCertification=false 👉 Strategy Name: TICQ AI

📡 MQL5 Live Signal Service: https://www.mql5.com/en/signals/2352533 (⚠️ Note: Results may vary on different brokers due to spread/latency.)