Discussing the article: "Foundation Models in Trading: Time Series Forecasting with Google's TimesFM 2.5 in MetaTrader 5"

 

Check out the new article: Foundation Models in Trading: Time Series Forecasting with Google's TimesFM 2.5 in MetaTrader 5.

Time series forecasting in trading has evolved from traditional statistical models (like ARIMA) to deep learning approaches, but both require heavy tuning and training. Inspired by advances in NLP, Google’s TimesFM introduces a pretrained “foundation model” for time series that can produce strong forecasts even without task-specific training. For traders, this matters because the model can be fine-tuned efficiently on their own data using lightweight methods like LoRA, which reduces overfitting while adapting to changing market conditions.

In the natural language processing domain, we have witnessed a paradigm shift: large foundation models, pretrained on vast corpora and then adapted to specific tasks, have displaced the "train from scratch" workflow. The question naturally arises: can this transfer learning revolution work for time series?

Google Research answered this affirmatively with TimesFM (Time Series Foundation Model), introduced in the paper "A decoder-only foundation model for time-series forecasting" and accepted at ICML 2024. TimesFM is a 200-million-parameter, decoder-only transformer pretrained on 100 billion real-world time points. Despite being much smaller than contemporary LLMs, it achieves strong zero-shot performance across domains and granularities. In many cases, it matches or outperforms supervised models trained on the target datasets.

This is particularly useful for algorithmic trading because it can be fine-tuned on proprietary financial data. Using PEFT with LoRA adapters, we can specialize TimesFM for specific instruments while keeping trainable parameters below 100K. This helps reduce overfitting on non-stationary market data.
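To make the parameter budget concrete, here is a rough back-of-the-envelope sketch of how LoRA's trainable parameter count scales. The layer width, rank, and number of adapted matrices below are hypothetical values chosen for illustration; they are not taken from the article or from TimesFM's actual configuration.

```python
def lora_trainable_params(d_in: int, d_out: int, rank: int) -> int:
    """Parameters in one LoRA pair: A is (rank x d_in), B is (d_out x rank)."""
    return rank * d_in + d_out * rank


def dense_params(d_in: int, d_out: int) -> int:
    """Parameters in the frozen dense weight matrix that the pair adapts."""
    return d_in * d_out


# Hypothetical sizes, for illustration only.
d_model = 1280      # assumed width of each adapted projection
rank = 4            # assumed LoRA rank
n_adapted = 8       # assumed number of adapted weight matrices

per_layer = lora_trainable_params(d_model, d_model, rank)
total_trainable = per_layer * n_adapted
total_frozen = dense_params(d_model, d_model) * n_adapted

print(f"trainable per matrix: {per_layer}")
print(f"trainable total:      {total_trainable}")
print(f"share of adapted weights: {total_trainable / total_frozen:.4%}")
```

Under these assumed sizes the adapters stay below 100K trainable parameters, because the count grows with rank times width rather than with width squared.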

In this article, we build a complete end-to-end pipeline that:

  1. Exports OHLCV data from MetaTrader 5 for 14 instruments (forex pairs, indices, commodities)
  2. Constructs a rich covariate dataset incorporating moon phases, economic calendar events, market sessions, and technical features
  3. Fine-tunes TimesFM 2.5 with LoRA adapters on this financial data
  4. Generates probabilistic forecasts with quantile estimates (10th, 50th, 90th percentiles)
  5. Exports the forecasts back into MetaTrader 5 as CSV files
  6. Visualizes the predictions directly on the chart via a custom MQL5 indicator with confidence bands
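Steps 4 and 5 can be sketched roughly as follows. This is a minimal illustration, not the article's exporter: the column names (time, q10, q50, q90), the timestamp format, and the sample numbers are all assumptions made for the example.

```python
import csv
import io
from datetime import datetime, timedelta


def forecast_rows(last_bar_time, step_minutes, q10, q50, q90):
    """Turn three quantile paths into CSV rows, one row per forecast step."""
    if not (len(q10) == len(q50) == len(q90)):
        raise ValueError("quantile paths must have equal length")
    rows = []
    for i, (lo, mid, hi) in enumerate(zip(q10, q50, q90)):
        if not lo <= mid <= hi:
            raise ValueError(f"quantiles out of order at step {i}")
        t = last_bar_time + timedelta(minutes=step_minutes * (i + 1))
        # Assumed timestamp layout "YYYY.MM.DD HH:MM".
        rows.append([t.strftime("%Y.%m.%d %H:%M"),
                     f"{lo:.5f}", f"{mid:.5f}", f"{hi:.5f}"])
    return rows


def write_forecast_csv(fileobj, rows):
    """Write a header plus forecast rows to an open text file object."""
    writer = csv.writer(fileobj)
    writer.writerow(["time", "q10", "q50", "q90"])  # hypothetical header
    writer.writerows(rows)


# Made-up numbers standing in for model output on an H1 chart.
rows = forecast_rows(datetime(2025, 1, 6, 10, 0), 60,
                     q10=[1.0310, 1.0305],
                     q50=[1.0320, 1.0322],
                     q90=[1.0331, 1.0340])
buf = io.StringIO()
write_forecast_csv(buf, rows)
print(buf.getvalue())
```

In practice the file object would come from open(path, "w", newline="") pointing at a folder the terminal can read; the indicator would then parse each row and draw the q10 to q90 band around the median.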


    Author: Seyedsoroush Abtahiforooshani

     

    I'm not an expert in this area, so it's unclear to me how you feed vectors with 40+ features into TimesFM, which is supposedly univariate (1 feature per point). Is it hidden behind patching or adapters? Neither is disclosed in the article.

    Also, if I understand correctly, you train and forecast every instrument independently. Wouldn't it be more appropriate to feed all the features for all instruments in order to predict the future horizon of every instrument? The market is a whole system, where every instrument affects the others.

     
    Stanislav Korotky #:

    I'm not an expert in this area, so it's unclear to me how you feed vectors with 40+ features into TimesFM, which is supposedly univariate (1 feature per point). Is it hidden behind patching or adapters? Neither is disclosed in the article.

    Also, if I understand correctly, you train and forecast every instrument independently. Wouldn't it be more appropriate to feed all the features for all instruments in order to predict the future horizon of every instrument? The market is a whole system, where every instrument affects the others.

    True joint multi-instrument forecasting would require a different model (e.g., a channel-mixing variant of PatchTST, Moirai, or TFT with cross-series static embeddings), not TimesFM. Within TimesFM's constraints, the practical compromise is to add other instruments' returns as covariates and accept that this captures the cross-instrument coupling only weakly.
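A minimal sketch of that compromise, under two assumptions: that dynamic covariates have to span both the context and the forecast horizon, and that they are passed as a dict mapping a name to one sequence per input series. Because the other instrument's future returns are unknown at forecast time, the sketch lags them by the horizon. The instrument, the prices, and the covariate name are made up.

```python
import math


def log_returns(closes):
    """Log return between each pair of consecutive closes."""
    return [math.log(b / a) for a, b in zip(closes, closes[1:])]


def lagged_covariate(returns, context_len, horizon):
    """Covariate of length context_len + horizon, lagged by `horizon` steps.

    The value aligned with time t is the return observed at t - horizon,
    so every value aligned with the forecast window is already known.
    """
    need = context_len + horizon
    if len(returns) < need:
        raise ValueError(f"need {need} returns, got {len(returns)}")
    return returns[-need:]


# Made-up closes for a second instrument (hypothetical "GBPUSD").
other_closes = [1.250, 1.260, 1.240, 1.270, 1.280, 1.290, 1.300, 1.295]
other_returns = log_returns(other_closes)          # 7 returns

context_len, horizon = 4, 2
cov = lagged_covariate(other_returns, context_len, horizon)

# Assumed layout: one covariate sequence per input series.
dynamic_numerical = {"gbpusd_ret_lag2": [cov]}
print(len(cov), dynamic_numerical)
```

The lag is the price of avoiding look-ahead: the target only ever sees the other instrument's past, which is one reason the coupling captured this way stays weak.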

    You're right that TimesFM's core is univariate — the transformer tokenizes patches of a single series. The 40+ features don't enter through the patching path. They enter through TimesFM 2.5's separate xreg (external regressors) mechanism, which the article references in the forecast_with_covariates call:

    point, quantiles = model.forecast_with_covariates(
        inputs=[close_context],                       # the univariate series
        dynamic_numerical_covariates=dynamic_numerical, # the 40+ features
        static_categorical_covariates={...},
        xreg_mode="xreg + timesfm",
    )


    So there are two parallel paths:

    • The TimesFM path sees only the close series, which is patched and fed through the decoder transformer as designed.
    • The xreg path takes the covariates and fits a separate regression component (in TimesFM's xreg implementation this is essentially a linear/ridge model over the covariates, fit on the in-context window).

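    The xreg_mode setting controls the order in which the two are combined. With "xreg + timesfm", the regression on the covariates is fit first and TimesFM then forecasts the residual it leaves unexplained; with "timesfm + xreg", TimesFM forecasts first and the regression is fit to its residuals. In both cases the final forecast is the sum of the two components. It's not patching, and it's not adapters in the LoRA sense: adapters here are used only for fine-tuning TimesFM's transformer weights, which is completely separate from how covariates are consumed.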
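The two-stage idea can be sketched in a few lines of plain Python. This is a toy illustration, not TimesFM's code: it uses a single covariate, a closed-form ridge fit, and a persistence forecaster as a loudly labelled stand-in for the TimesFM call. All names and numbers are hypothetical.

```python
def fit_ridge_1d(x, y, ridge=0.0):
    """Closed-form ridge fit of y on one covariate x, with an intercept.

    Assumes x is not constant when ridge is 0 (otherwise division by zero).
    """
    n = len(x)
    x_mean = sum(x) / n
    y_mean = sum(y) / n
    sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    slope = sxy / (sxx + ridge)
    return slope, y_mean - slope * x_mean


def persistence_forecaster(series, horizon):
    """STAND-IN for the TimesFM forecast: repeats the last observed value."""
    return [series[-1]] * horizon


def xreg_then_model(target, cov_context, cov_future, forecaster, ridge=0.0):
    """Regression first, then a univariate forecast of the residuals."""
    slope, intercept = fit_ridge_1d(cov_context, target, ridge)
    fitted = [intercept + slope * c for c in cov_context]
    residuals = [t - f for t, f in zip(target, fitted)]
    residual_forecast = forecaster(residuals, len(cov_future))
    xreg_forecast = [intercept + slope * c for c in cov_future]
    # The final forecast is the sum of the two components.
    return [a + b for a, b in zip(xreg_forecast, residual_forecast)]


target = [2.1, 3.9, 6.2, 7.8]          # univariate series (context window)
cov_context = [1.0, 2.0, 3.0, 4.0]     # covariate over the context
cov_future = [5.0, 6.0]                # covariate over the horizon

forecast = xreg_then_model(target, cov_context, cov_future,
                           persistence_forecaster)
print(forecast)
```

Note that the covariate has to be supplied for the horizon as well as the context, which is why only features that are known in advance (calendar, sessions, lagged values) fit this path cleanly.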
     
    Hi, I noticed that the PEFT library doesn't appear in the implementation shown in this article. Is HF PEFT compatible with it?