Data Deterioration constant pain 'TRYING' to use Strategy Tester: No evidence that 'period converters' work! - page 2

 
FourX wrote >>

I would think that the Hst files have to be from the same broker in order for them to work properly? But perhaps not. I can make a case for both, so I'll have to check it out.

The hst files ARE transferable, in that any broker-specific MT4 terminal/instance will use the hst file "as is", but be aware that the TIMESTAMPS for the candles in the hst files are broker/server specific, depending on each server's offset from GMT...



This is an overlay chart I created by exporting ~12hrs of EURUSD data from the same date and then shifting the timestamp by a given number of hours until each broker's data overlapped. (just meant to be an example of how one can visually verify the timeshift needed for any given broker's hst data)
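The same alignment can also be estimated numerically instead of by eye. Below is a rough sketch in Python, not anything built into MT4: `best_hour_shift` is a hypothetical helper, and it assumes each broker's candles have already been loaded into a dict mapping candle open time to close price, and that the brokers differ by a whole number of hours.

```python
from datetime import datetime, timedelta

def best_hour_shift(source, target, max_shift=12):
    """Return the whole-hour shift of `source` timestamps that best matches `target`.

    Both arguments map candle open time (datetime) -> close price.
    The score is the mean absolute close difference over overlapping candles.
    """
    best = None
    for hours in range(-max_shift, max_shift + 1):
        delta = timedelta(hours=hours)
        diffs = [abs(price - target[t + delta])
                 for t, price in source.items() if t + delta in target]
        if len(diffs) < len(source) // 2:  # ignore shifts with too little overlap
            continue
        score = sum(diffs) / len(diffs)
        if best is None or score < best[1]:
            best = (hours, score)
    return best[0] if best else None
```

Because two brokers' prices never match exactly, the lowest score will not be zero on real data; it is still worth confirming the result visually on an overlay chart.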

Unfortunately, you may have noticed that you can't timeshift the data when importing prices from an hst file. You must first use the History Center to export the prices in one of the other formats (csv works fine) and then re-import the data as csv (or any of the other non-hst file formats). Only then will you have the option to timeshift the data so that your source broker's timestamps align properly with the destination broker's server time.
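If you would rather shift the timestamps yourself before re-importing, the exported csv can be rewritten outside the terminal. A minimal sketch in Python, assuming the export has rows of the form `YYYY.MM.DD,HH:MM,open,high,low,close,volume` with no header line (check your own export first); `shift_csv_hours` is a hypothetical helper name.

```python
import csv
import io
from datetime import datetime, timedelta

def shift_csv_hours(text, hours):
    """Shift the date and time columns of every row by `hours` (may be negative)."""
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    for row in csv.reader(io.StringIO(text)):
        if not row:
            continue  # skip blank lines
        stamp = datetime.strptime(row[0] + " " + row[1], "%Y.%m.%d %H:%M")
        stamp += timedelta(hours=hours)
        # Price and volume columns are copied through untouched.
        writer.writerow([stamp.strftime("%Y.%m.%d"), stamp.strftime("%H:%M")] + row[2:])
    return out.getvalue()
```

Note that a shift can carry a candle across midnight, which is why the date and time columns are parsed together rather than adjusting the hour alone.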


Also be aware that properly accounting for timeshift differences between brokers still does not negate the fact that each broker is effectively their own market maker, and the price data for one broker will never be the same (except by coincidence at random points in time) as any other broker's. This is a consequence of the market we are accessing being retail off-exchange forex. So while you can transfer FXDD historical data into your IBFX account, the future IBFX market data might not have the same "statistical nature" as the FXDD data, which can have unintended consequences for your EA's performance in forward/live trading.

If this is NOT the case, any chance of getting copies of your 6 year ForEx Hst DataBase?

It's all the same data I linked to before (https://www.mql5.com/en/forum/124322) from FXDD/Forexite/Disktrading, so there is no need for me to push 100MB files at you (which you have no justification to believe are not doctored or compromised in some way anyway. Sure, it's respectful of you to assume I am a trustworthy individual, but when it comes to you risking your money over the unknown possibility that some random dude on the internet did not give you screwy data files, you really should seek out a qualified source...such as FXDD, Forexite, or Disktrading, as well as the dukascopy links gordon gave in that other thread).

Will take longer to run, but the more data one has, the more accurate the back testing is going to be. Can always specify the dates and use partial data sets anyway.


Longer backtesting does reduce the probability of the sampled distribution failing to resemble the parent population, so I agree with the sentiment and I understand your motivation. If you need further assistance in tracking down those historical data files do continue to hit me up about it and I'll walk thru as much of the process as you need.

I have mountains of RAID 1/0 HDD space, so no worries there.

The issue with hst files in this day and age isn't hard drive space but rather the addressable memory space of MT4, which is limited to 2GB per instance regardless of whether you are running a 64-bit OS and regardless of your installed RAM capacity. You will find that it is simply impractical/impossible to backtest with or manipulate hst files containing ~7m bars. https://www.mql5.com/en/forum/124146 (<- while the dialogue in that thread leaves open the question of whether the statements made here are correct, I have since learned, after my last update to that thread, that these statements/observations are in fact correct...I have not determined whether or not MT5 surmounts any of these RAM/dataset limitations)


 

Hi Phillip,

...depending on what their specific offset to GMT is....

Good points and something I understand. With your experience, knowledge and descriptions I should be able to compile these myself thanks.

.....(properly accounting for timeshift differences between brokers) still does not negate the fact that each broker is effectively their own market maker and the price data for one broker will never be the same.... accessing being retail off-exchange forex...

Understood. But I think that, if anything, making a composite from numerous different sources would tend to produce an averaged/RMS data set that would be more realistic and reliable than one taken from any single 'off market' source.


sure it's respectful of you to assume I am a trustworthy individual but when it comes to you risking your money over the unknown possibility that some random dude on the internet did not give you screwy data files then you really should seek out a qualified source...

A valid point, but you are a well known (long term) participant in these forums with established credibility, so in this instance I'm not too worried on that account.

.

memory space of MT4 being limited to 2GB per instance regardless whether you are operating a 64bit OS and regardless your installed ram capacity. You will find that it is simply impractical/impossible to backtest with or manipulate hst files containing ~7m bars.

A significant factor indeed. But running various data subsets and integrating and combining the results should still give more accurate and reliable and truly indicative results.

A 64-bit system with significantly more (addressable) RAM would help. But if one is getting into the 2 GB zone of the maximum RAM allowable to an instance of MT4 (strategy tester), it is going to become problematic. Still, one could run more instances of MT4, with more stability and reliability, on a true 64-bit system with greater addressable RAM and speed.

It is worth noting that while MT4 will not make use of multithreading on CPUs with more than one core, one CAN readily assign different instances to different CPU cores if the PC is trying to run them all on one core. This makes a significant difference.

When I have numerous MT4 instances running and doing simultaneous back testing and optimizing as well as forward testing, I raise the priority of the instance(s) that are also running MT on live accounts.

Thanks once more for sharing your insights and knowledge based on your experience.

"Experience" comes before "Expertise", even in the dictionary!

 

Brokers making their own prices gives the independent trader large numbers of arbitrage opportunities.
A programmer who can make price comparisons across many brokers and a few key symbols could clean up.

 
FourX wrote >>

.....(properly accounting for timeshift differences between brokers) still does not negate the fact that each broker is effectively their own market maker and the price data for one broker will never be the same.... accessing being retail off-exchange forex...

Understood. But I think that, if anything, making a composite from numerous different sources would tend to produce an averaged/RMS data set that would be more realistic and reliable than one taken from any single 'off market' source.

The term I like to use is "robust"...we want our EAs and the trade strategies therein to be robust against broker-specific pricing nuances, since the nuances are by definition artificial rather than a reflection of actual market supply/demand dynamics, and as such they are not something we should ever have confidence in being able to profit from trading.

But how do we know when our EA/strategy is too dependent on broker-induced price nuances? When we get wildly different results backtesting across different brokers' price data. A robust EA will produce similar profit/loss results regardless of the broker and the subtleties of that broker's price data.

When folks report things like "I get great backtesting results with FXDD but things fall apart when testing IBFX", that raises a big red flag (to me anyway) that this person is unfortunately dealing with a non-robust EA/strategy. That means that as FXDD (or whatever broker they optimized the EA for) changes its price nuances (spreads, etc.), the EA's profit/loss characteristics are going to change as well.

Server time errors can be a big factor here. When a price server is running 15s behind the correct atomic clock time, it skews the open and close price points of every candle in every timeframe relative to the candles of those brokers who have their server timestamps calibrated to the second. IBFX used to have a large server timestamp skew, and it varied depending on whether you were using their live account server or their demo server. It gave me lots of trouble until I recognized what was going on.
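To make the skew effect concrete, here is a toy illustration in Python with made-up ticks (not real broker data). The hypothetical `m1_candles` helper buckets the same tick stream into one-minute candles twice, once with a correct clock and once with a clock running 15 seconds behind.

```python
def m1_candles(ticks, skew_seconds=0):
    """Group (epoch_seconds, price) ticks into one-minute candles.

    `skew_seconds` is how far the server clock runs behind true time, so each
    tick is stamped that many seconds early. Returns {minute_start: (open, close)}.
    """
    candles = {}
    for t, price in sorted(ticks):
        minute = ((t - skew_seconds) // 60) * 60
        if minute not in candles:
            candles[minute] = (price, price)              # first tick sets the open
        else:
            candles[minute] = (candles[minute][0], price)  # later ticks move the close
    return candles

# Made-up ticks: (seconds since some reference time, price)
ticks = [(600, 1.3000), (620, 1.3004), (650, 1.3002), (665, 1.3007), (700, 1.3005)]
print(m1_candles(ticks))
print(m1_candles(ticks, skew_seconds=15))
```

With these ticks, the candle starting at 600 opens and closes at different prices in the two runs even though the underlying tick stream is identical, which is exactly the kind of broker-specific nuance an EA can accidentally get fitted to.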

Regarding your "composite" comment...there are two different forms of composite results. The first is to create a single unified set of price data...a composite of multi-broker price feeds...and this is referred to as "indicative data" in this industry. Then you backtest using this indicative data. The premise for using indicative data is that you have essentially "averaged out" the various artificial pricing nuances that any given broker is imprinting into their price stream (sort of a cancellation of errors approach) and thus you are better able to assess the market's fundamental supply/demand characteristics.
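As a rough sketch of the first form, the simplest possible composite is an unweighted average of the brokers' close prices at each timestamp. This is only an illustration of the idea, not how any commercial indicative feed is built. `indicative_closes` is a hypothetical helper, and it assumes the feeds have already been timeshifted to a common server time.

```python
def indicative_closes(feeds):
    """Average close prices across brokers for timestamps present in every feed.

    `feeds` maps broker name -> {timestamp: close}. Timestamps are assumed to be
    already aligned to a common time zone.
    """
    # Only keep candles that every broker has, so no single feed dominates a bar.
    common = set.intersection(*(set(series) for series in feeds.values()))
    return {t: sum(series[t] for series in feeds.values()) / len(feeds)
            for t in sorted(common)}
```

Dropping timestamps that are missing from any feed is a design choice made here for simplicity; averaging over whichever brokers happen to have a bar would keep more data but change the mix of sources from bar to bar.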

This is helpful when doing fundamental work but naturally you must still come full circle at some point and build your EA in such a way as to handle the broker specific price nuances that you will eventually contend with when attempting live trading.

The second form of composite results approaches the situation from the other end: keep each broker's price feed clean and uncontaminated by other brokers' price data (never convolve FXDD data with IBFX, etc.), backtest your EA across all the various brokers' price feeds, and create a composite profit/loss result from all those backtests.

The benefit of the second approach is that the spread in the backtest results captures the magnitude of the dependency your EA/strategy has on a given broker's price nuances. A tight spread in the backtest results across varying brokers is good evidence that you've built a robust EA which is broker agnostic and tolerates their artificial pricing shifts well (which gives you confidence that it stands a chance of continuing to do so as any given broker changes up their price subtleties in the future, such as by re-calibrating their server's clock, etc.).
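One way to put a number on that spread is to collect the net profit from each broker's backtest and compare the standard deviation to the mean. A minimal sketch in Python; `robustness_summary` is a hypothetical helper and the profit figures in the usage line are invented for illustration.

```python
from statistics import mean, pstdev

def robustness_summary(net_profit_by_broker):
    """Summarise how much backtest results vary from broker to broker.

    `net_profit_by_broker` maps broker name -> net profit from one backtest
    of the same EA, same settings, same date range.
    """
    values = list(net_profit_by_broker.values())
    avg = mean(values)
    spread = pstdev(values)  # population standard deviation across brokers
    return {
        "mean": avg,
        "stdev": spread,
        # Spread relative to the average result; smaller suggests less broker dependence.
        "relative_spread": spread / abs(avg) if avg else float("inf"),
    }

# Invented figures, for illustration only.
print(robustness_summary({"FXDD": 100.0, "IBFX": 120.0, "Forexite": 80.0}))
```

What counts as a "tight" relative spread is a judgment call, and the same comparison can be made on drawdown or profit factor instead of net profit.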

sure it's respectful of you to assume I am a trustworthy individual but when it comes to you risking your money over the unknown possibility that some random dude on the internet did not give you screwy data files then you really should seek out a qualified source...

A valid point, but you are a well known (long term) participant in these forums with established credibility, so in this instance I'm not too worried on that account.

So long as you are aware of the risks I have no problem sharing the data. I'll rapidshare it at some point in the coming days and will shoot you a PM on your account here.


memory space of MT4 being limited to 2GB per instance regardless whether you are operating a 64bit OS and regardless your installed ram capacity. You will find that it is simply impractical/impossible to backtest with or manipulate hst files containing ~7m bars.

A significant factor indeed. But running various data subsets and integrating and combining the results should still give more accurate and reliable and truly indicative results.

A 64-bit system with significantly more (addressable) RAM would help. But if one is getting into the 2 GB zone of the maximum RAM allowable to an instance of MT4 (strategy tester), it is going to become problematic. Still, one could run more instances of MT4, with more stability and reliability, on a true 64-bit system with greater addressable RAM and speed.

It is worth noting that while MT4 will not make use of multithreading on CPUs with more than one core, one CAN readily assign different instances to different CPU cores if the PC is trying to run them all on one core. This makes a significant difference.

When I have numerous MT4 instances running and doing simultaneous back testing and optimizing as well as forward testing, I raise the priority of the instance(s) that are also running MT on live accounts.

I can add nothing more to your observations stated here, these are all "best practices" as far as I am aware. Sounds like we've both iterated our methods independently to similar endpoints at this stage of the game. If anyone has taken it to the next level and wants to share how/why they did so I would certainly appreciate the help moving to the next level on the learning curve. I'd really like to know which, if any, of these limitations in MT4 are addressed/eliminated with MT5.
 
Ickyrus wrote >>

Brokers making their own prices gives the independent trader large numbers of arbitrage opportunities.
A programmer who can make price comparisons across many brokers and a few key symbols could clean up.


Others here may have had more success with this particular arbitrage method, but in my limited efforts to implement such a strategy a while back I found that the brokers kept their price fluctuations basically within the spread of one another. At best you could expect to break even on the most extreme cases of price divergence between brokers, but more often you'd lose by a small fraction of the spread on the instruments. No free lunch.

(and it stands to reason, if there was arbitrage money left on the table by the time the price data got to us end-customers then that means the broker chose to leave that money on the table...which is unlikely...the broker is more likely to arbitrage the price themselves as they see the opportunity and make a little extra money for themselves as they get to "see" the prices before we do)
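The break-even arithmetic is easy to check for any pair of quotes: a two-broker arbitrage only exists when one broker's bid is above the other broker's ask. A small sketch in Python with invented quotes; `arbitrage_pips` is a hypothetical helper, and it ignores commissions, slippage and execution delay, all of which make the real picture worse.

```python
def arbitrage_pips(quote_a, quote_b, pip=0.0001):
    """Return the best locked-in profit in pips from buying at one broker's ask
    and selling at the other's bid; a negative value means no arbitrage exists.

    Each quote is a (bid, ask) tuple.
    """
    bid_a, ask_a = quote_a
    bid_b, ask_b = quote_b
    # Either sell at A and buy at B, or sell at B and buy at A; take the better one.
    return max(bid_a - ask_b, bid_b - ask_a) / pip

# Invented quotes one pip apart, each with a two-pip spread: the result is negative.
print(arbitrage_pips((1.3000, 1.3002), (1.3001, 1.3003)))
```

As long as the brokers' quotes stay within each other's spread, as described above, the result stays at or below zero.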
 
1005phillip:

FourX it sounds like you are using the process I posted about a while back (including the "all time periods" script) to refresh your hst files with the broker's data.

Hi Phillip,
I'm wondering about the advisability and implications of just leaving the last entry in the one minute intervals for all of the currency pairs in the history center set to the year of 1970 all the time as opposed to deleting this last entry after updating the data sets?
 
1005phillip 2010.04.11 23:56 FourX wrote >>

I would think that the Hst files have to be from the same broker in order for them to work properly? But perhaps not. I can make a case for both, so I'll have to check it out.

The hst files ARE transferable, in that any broker-specific MT4 terminal/instance will use the hst file "as is", but be aware that the TIMESTAMPS for the candles in the hst files are broker/server specific, depending on each server's offset from GMT...



This is an overlay chart I created by exporting ~12hrs of EURUSD data from the same date and then shifting the timestamp by a given number of hours until each broker's data overlapped. (just meant to be an example of how one can visually verify the timeshift needed for any given broker's hst data)

Unfortunately, you may have noticed that you can't timeshift the data when importing prices from an hst file. You must first use the History Center to export the prices in one of the other formats (csv works fine) and then re-import the data as csv (or any of the other non-hst file formats). Only then will you have the option to timeshift the data so that your source broker's timestamps align properly with the destination broker's server time.

Nice work Phillip!
You definitely know your way around this stuff and are very knowledgeable about it. One of the 'walking encyclopedias' on MQL here, along with Gordon, Phy, CB, the 'Economist', BarrowBoy, obviously Rosh and a number of others. Haven't seen much of a number of them around here lately. Could be they are focusing on MQL5 now. Also, the Economist has his own web site now (I don't remember the URL), which requires a significant time commitment to do right and keep up. I wonder how his web site is doing? I hope it is working out for him.
Do you work for MQ or in the ForEx/Investment industry, or did you come by this from lots of 'Time In' using it for personal investment and interest?
Regards,
4X
 
1005phillip:

[...] I found that the brokers kept their price fluctuations basically within the spread of one another [...]

Isn't that because they're brokers, and they're just providing access to underlying liquidity? With MT4 brokers it's often the same liquidity pool (Currenex), and as a retail investor with a tool such as MT4 you have absolutely no chance of arbing any difference in price between e.g. Currenex and Hotspot. What varies between MT4 brokers is basically the liquidity package which the broker has bought, and any mark-up they apply to the raw spread. Broadly speaking, you're basically getting the same prices from all MT4 brokers, but with different charges built in.

 
FourX wrote >>
Do you work for MQ or in the ForEx/Investment industry, or did you come by this from lots of 'Time In' using it for personal investment and interest?
Regards,
4X



Self-employed (account manager) in the industry, all that I know relating to MQL is from "lots of time in". Elbow grease and tenacity.

FourX wrote >>
Hi Phillip,
I'm wondering about the advisability and implications of just leaving the last entry in the one minute intervals for all of the currency pairs in the history center set to the year of 1970 all the time as opposed to deleting this last entry after updating the data sets?


No issues with leaving known false data in your hst files so long as you remember that it is there. This is only relevant when backtesting, as you (probably) don't want your strategy tester to start with a known bogus timestamp/candle. Just be sure to set your strategy tester to use a date range with a start date that is more recent than the youngest known bad data.

In practice, though, there is no reason to retain the 1970 candle datum. Once you've used the "trick" to extract as much broker-specific data as possible from a given broker/server, it's not like you are ever going to refresh the chart some day in the future and suddenly find yourself with an extra year's worth of M1 data in your hst file. It is a "use it once and forget it" trick. Once you've done it, the most recent "real" candle serves the same purpose as that fake 1970 candle when it comes to refreshing your charts to extract all the most recent price data from your broker's server.

 
1005phillip:

it's not like you are ever going to refresh the chart some day in the future and suddenly find yourself with an extra year's worth of M1 data in your hst file. It is a "use it once and forget it" trick. Once you've done it, the most recent "real" candle serves the same purpose as that fake 1970 candle when it comes to refreshing your charts to extract all the most recent price data from your broker's server.

Hi Phillip,

You are right. Once one has all the back data from the broker, they are not all of a sudden going to put another year from a decade ago into their database. Updating the info from the server occasionally will update the recent data, which of course isn't going to be from 40 years ago.

I'm concerned about the time offsets in the ForEx DataBase that you so kindly shared with me. Are they all at GMT+0? I want to download the latest data from a number of different brokers. I'll have to be cognizant of this when updating recent data or it is all going to be SNAFU'd. This applies even to a single broker if its time offset differs.

Thanks again Phil
