Discussing the article: "MetaTrader 5 Machine Learning Blueprint (Part 1): Data Leakage and Timestamp Fixes"
Activity-driven bars do not solve all the problems you mentioned for time bars. For example, you wrote:
The Subtle Intra-Bar Leakage: However, a more subtle form of data leakage can still occur within the very formation of that time bar. If a significant event transpires midway through a 1-minute bar (e.g., at 09:00:35), any features derived from that bar (such as its high price or a flag for the event) will inevitably incorporate this information by the bar's end.
If you build equal-volume, equal-range, or other tick-based custom bars, you will still mark each bar with a single label, and it will leak (or, more precisely, blur) information about the high price across the entire bar.
The only way to solve this is to build "bars" with the specific features you are going to use in mind. For example, if highs or lows are the main features, you should probably try zigzag "bars" with the extremums marked with their exact times.
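To make the zigzag idea concrete, here is a minimal Python sketch, not taken from the article or from any MQL5 code. It assumes ticks arrive as chronological `(datetime, price)` pairs and that an extremum counts as confirmed once price has reversed by a fixed `threshold`. The function name `zigzag_pivots`, the threshold rule, and the sample prices are all hypothetical choices made for illustration.

```python
from datetime import datetime, timedelta


def zigzag_pivots(ticks, threshold):
    """Return pivots as (extreme_time, extreme_price, kind, confirmed_time).

    ticks: iterable of (datetime, price) in chronological order.
    threshold: reversal size (price units) that confirms an extremum.
    """
    pivots = []
    it = iter(ticks)
    try:
        t0, p0 = next(it)
    except StopIteration:
        return pivots
    hi_t, hi_p = t0, p0
    lo_t, lo_p = t0, p0
    direction = 0  # 0 = undetermined, +1 = rising leg, -1 = falling leg
    for t, p in it:
        # Track the running extreme of the current leg with its exact time.
        if direction >= 0 and p > hi_p:
            hi_t, hi_p = t, p
        if direction <= 0 and p < lo_p:
            lo_t, lo_p = t, p
        if direction >= 0 and hi_p - p >= threshold:
            # Price fell far enough: the running high becomes a pivot.
            pivots.append((hi_t, hi_p, "high", t))
            direction = -1
            lo_t, lo_p = t, p
        elif direction <= 0 and p - lo_p >= threshold:
            # Price rose far enough: the running low becomes a pivot.
            pivots.append((lo_t, lo_p, "low", t))
            direction = 1
            hi_t, hi_p = t, p
    return pivots


# Illustrative data only: one tick per second from an arbitrary start time.
base = datetime(2024, 1, 2, 9, 0, 0)
prices = [100.0, 101.0, 103.0, 102.0, 100.5, 100.0, 101.0, 102.5]
ticks = [(base + timedelta(seconds=i), p) for i, p in enumerate(prices)]
pivots = zigzag_pivots(ticks, threshold=2.0)
for pivot in pivots:
    print(pivot)
```

Each pivot carries two timestamps: when the extremum actually occurred and when it became known. Keeping both matters for leakage, because a feature built on the extremum is only usable from the confirmation time onward.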
Actually, the approach with constant timeframes, and specifically limiting them to M1, is problematic in the context of data leakage in MT5. Labelling M1 bars with the ending time is not much better than with the beginning time, imho.
For those who are interested in building custom bars (charts) natively in MT5, there is an article with an MQL5 implementation of equal-volume, equal-range, and renko bars. Of course, you can mark the bars with their ending time in the open source code.
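For readers working in Python rather than MQL5, a rough sketch of equal-volume bars stamped with their ending time could look like the following. This is not the implementation from the linked article. The function name `volume_bars`, the `(time, price, volume)` tick layout, and the sample values are assumptions for illustration. As a simplification, a tick that overshoots the volume target is not split across bars, and a trailing incomplete bar is dropped.

```python
from datetime import datetime, timedelta


def volume_bars(ticks, volume_per_bar):
    """Group ticks into bars of roughly equal volume, labelled by end time.

    ticks: iterable of (datetime, price, volume) in chronological order.
    """
    bars = []
    cur = None
    for t, price, vol in ticks:
        if cur is None:
            cur = {"start": t, "open": price, "high": price,
                   "low": price, "volume": 0}
        cur["high"] = max(cur["high"], price)
        cur["low"] = min(cur["low"], price)
        cur["close"] = price
        cur["volume"] += vol
        if cur["volume"] >= volume_per_bar:
            # The label is the time of the last tick in the bar, i.e. the
            # first moment at which every value in the bar is known.
            cur["time"] = t
            bars.append(cur)
            cur = None
    return bars


# Illustrative data only: six ticks, two units of volume each.
vb_base = datetime(2024, 1, 2, 9, 0, 0)
vb_prices = [1.10, 1.12, 1.11, 1.09, 1.13, 1.14]
vb_ticks = [(vb_base + timedelta(seconds=i), p, 2)
            for i, p in enumerate(vb_prices)]
vb_bars = volume_bars(vb_ticks, volume_per_bar=4)
for bar in vb_bars:
    print(bar["time"], bar["open"], bar["high"], bar["low"], bar["close"])
```

Note that the bar still carries a single label, so the point made above stands: the end-time stamp says when the high was known, not when it occurred inside the bar.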

Activity-driven bars aim to improve the statistical properties of the information contained in the bars, such as reduced heteroskedasticity and improved normality. The solution I proposed to the subtle intra-bar leakage is to label bars by their end times, so that the timestamp covers all events that occur within the bar. A useful example is when you train your model on features derived from the timestamp, such as Fourier transformations. If you use the MetaTrader 5 convention, where bars are labelled by the start of the period, you are misinforming your model. The distinction may not matter much for some models, but it has a huge impact on those that aim to exploit the cyclical nature of markets. I hope I have clarified my intent.
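As a small illustration of the timestamp point, the sketch below encodes time of day as a single sine/cosine pair. This is a simple stand-in for the Fourier-style features mentioned above, not the article's actual feature set. It compares the values obtained from a bar's start label with those from its end label. The helper name `time_of_day_features`, the M5 bar length, and the date are assumptions chosen for the example.

```python
import math
from datetime import datetime, timedelta


def time_of_day_features(ts):
    """Map a timestamp to (sin, cos) of its position in the 24-hour cycle."""
    seconds = ts.hour * 3600 + ts.minute * 60 + ts.second
    angle = 2 * math.pi * seconds / 86400
    return math.sin(angle), math.cos(angle)


# Hypothetical M5 bar; the date is arbitrary.
bar_start = datetime(2024, 1, 2, 18, 55)   # MetaTrader 5 style start label
bar_length = timedelta(minutes=5)
bar_end = bar_start + bar_length           # end label: when the bar is complete

start_features = time_of_day_features(bar_start)
end_features = time_of_day_features(bar_end)
print("start-labelled:", start_features)
print("end-labelled:  ", end_features)
```

The two labels give different phase values for the same bar, so a model that relies on cyclical features sees the bar's data attached to a point in the cycle that is one bar length too early when the start label is used.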
What do you mean when you state "If you create bars of the same volume, range, or other tick-based custom bars, you'll be marking such a bar with a single label anyway, and information about the maximum price will leak (or more accurately, blur) across the entire bar"?

Check out the new article: MetaTrader 5 Machine Learning Blueprint (Part 1): Data Leakage and Timestamp Fixes.
Before we can even begin to make use of ML in our trading on MetaTrader 5, it’s crucial to address one of the most overlooked pitfalls—data leakage. This article unpacks how data leakage, particularly the MetaTrader 5 timestamp trap, can distort our model's performance and lead to unreliable trading signals. By diving into the mechanics of this issue and presenting strategies to prevent it, we pave the way for building robust machine learning models that deliver trustworthy predictions in live trading environments.
Data snooping, or data leakage, might seem subtle, but its impact on machine learning models can be monumental and devastating. Imagine studying for a test where you unknowingly peek at the answers beforehand. Your perfect score feels earned, but it's actually cheating. This is precisely what happens when we use MetaTrader 5's default timestamps in machine learning: data leakage unexpectedly corrupts our model's integrity.
How MetaTrader 5's Timestamps Trick You
By timestamping each bar at its start, MetaTrader 5 implies that the data of, say, an M5 bar stamped 18:55:00 was available at 18:55:00, a full 5 minutes before the bar actually closed! If your model uses this in training, it's like giving a student exam answers 5 minutes before the test begins. To counteract this, we should avoid using MetaTrader 5's prebuilt time bars and instead use tick data to create the bars we use in our models.
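A minimal sketch of that idea, assuming ticks are available as chronological `(datetime, price)` pairs, is shown below. It is an illustration under stated assumptions rather than the article's code. The function name `time_bars_end_labelled` and the sample ticks are hypothetical. Each bar covers the half-open interval from its start up to, but not including, its end, and it is stamped with the end.

```python
from datetime import datetime, timedelta


def time_bars_end_labelled(ticks, bar_length):
    """Aggregate ticks into fixed-length OHLC bars labelled by end time.

    ticks: iterable of (datetime, price) in chronological order.
    bar_length: timedelta giving the bar duration.
    """
    epoch = datetime(1970, 1, 1)
    bars = {}
    for t, price in ticks:
        n = (t - epoch) // bar_length          # index of the bar holding t
        end = epoch + (n + 1) * bar_length     # end time is the bar's label
        bar = bars.get(end)
        if bar is None:
            bars[end] = {"time": end, "open": price, "high": price,
                         "low": price, "close": price}
        else:
            bar["high"] = max(bar["high"], price)
            bar["low"] = min(bar["low"], price)
            bar["close"] = price
    return [bars[k] for k in sorted(bars)]


# Illustrative ticks around the 18:55 bar; the date and prices are arbitrary.
tb_ticks = [
    (datetime(2024, 1, 2, 18, 55, 10), 1.0),
    (datetime(2024, 1, 2, 18, 57, 0), 1.2),
    (datetime(2024, 1, 2, 18, 59, 59), 1.1),
    (datetime(2024, 1, 2, 19, 0, 0), 1.3),
]
tb_bars = time_bars_end_labelled(tb_ticks, timedelta(minutes=5))
for bar in tb_bars:
    print(bar)
```

The first three ticks land in a bar stamped 19:00:00, the moment its values are fully known, while the tick at exactly 19:00:00 opens the next bar.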
Author: Patrick Murimi Njoroge