Has anyone tried cross-validation for testing EAs?

 

Forward testing is the usual way to train and test a strategy: you split your data into two parts, train on one, and test on the other.

Cross-validation, on the other hand, divides the data into equal parts (folds) and trains/tests the model across multiple iterations. Let's say we have 5 years (2020–2025). With cross-validation, we could train and test a model across multiple periods, for example:

- Train on 2021–2025, test on 2020
- Train on 2020 & 2022–2025, test on 2021
- Train on 2020–2021 & 2023–2025, test on 2022
- Train on 2020–2022 & 2024–2025, test on 2023
- Train on 2020–2023 & 2025, test on 2024
- Train on 2020–2024, test on 2025

This is just an example. We could also use different time periods and a different number of folds, which lets us use all the available data for more robust validation.
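A minimal sketch of the fold schedule above, assuming each fold is one whole calendar year. `yearly_folds` is a hypothetical helper written for illustration, not part of MetaTrader or any library; it only produces the train/test year lists, and running the optimizer and tester on each fold is left out.

```python
def yearly_folds(first_year, last_year):
    """Yield (train_years, test_year) pairs, holding out one year per fold."""
    years = list(range(first_year, last_year + 1))
    for test_year in years:
        # Every year except the held-out one is used for training.
        train_years = [y for y in years if y != test_year]
        yield train_years, test_year


for train_years, test_year in yearly_folds(2020, 2025):
    print(f"train on {train_years}, test on {test_year}")
```

Changing the two arguments gives a different period; folds of another length (quarters, half-years) would need the same idea applied to date ranges instead of years.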

Has anyone tried this?

 
In reply to Isaac Uriel Arenas Caldera:

What is it that you hope to gain by randomly decreasing your data sample size?

I simply backtest the full data sample on live-account historical data, and then demo trade or micro live trade.

FYI, years 2020 through 2025 = 6 years.

 
In reply to Ryan L Johnson:

More ways to validate an EA before exposing it to real market conditions.

I mean, I want to find out whether there are other ways to validate it before going to market.

 
In reply to Isaac Uriel Arenas Caldera:

Hmm...

An important principle of statistical analysis holds that larger sample sizes lead to more accurate test results.

As a partial caveat, an important principle of pattern recognition holds that recent data should be weighted more heavily than older data.

Therefore, maybe you could do:

  • 2020–2025, with a weighting factor of 1,
  • 2021–2025, with a weighting factor of 2,
  • 2022–2025, with a weighting factor of 3,
  • 2023–2025, with a weighting factor of 4, and
  • 2024–2025, with a weighting factor of 5.

You could average your test results manually, or code an MQL5 service to do it for you (this is not easy).

(The weighting factors above are merely generic examples to illustrate how the two principles could be applied.)
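As a rough sketch of the averaging step, here it is in Python rather than as an MQL5 service. It assumes you have one summary metric per backtest window (profit factor, for instance); `weighted_average` and both dictionaries are hypothetical names, and the metric values are placeholders made up for illustration, not real test results.

```python
def weighted_average(results):
    """Weighted mean of (value, weight) pairs."""
    total_weight = sum(weight for _, weight in results)
    if total_weight == 0:
        raise ValueError("weights must not sum to zero")
    return sum(value * weight for value, weight in results) / total_weight


# Window start year -> weighting factor, as in the list above
# (every window ends in 2025).
weights = {2020: 1, 2021: 2, 2022: 3, 2023: 4, 2024: 5}

# Placeholder metric per window; replace with your own backtest results.
metric_by_start_year = {2020: 1.0, 2021: 1.0, 2022: 1.0, 2023: 1.0, 2024: 2.0}

pairs = [(metric_by_start_year[year], weight) for year, weight in weights.items()]
print(weighted_average(pairs))
```

Dividing by the sum of the weights keeps the result on the same scale as the individual metrics, so it can be compared directly with a single full-sample backtest.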