Backtesting Data Range and Quality Problem: Am I the only one?

 
For any Expert Advisor that I test, I set my testing period from 11/01/2001 to 10/31/2005. I run backtests on all currency pairs and on at least two different timeframes (e.g. daily and 4h, or 4h and 1h).

In most of the tests, the first trade takes place only in 2003 or later. Also, if I run the same expert with the same settings at two different times, I get a different number of trades and a different date for the first trade. The problem gets worse with non-major currency pairs, but even for majors I've seen wide fluctuations.

I raised the issue in another thread but didn't get a reply (maybe my question was not visible), so I'm posting this as a new thread. I'm attaching a sample of the outcomes I got for one EA test on the 1h chart using the "every tick" modeling mode.

Any explanation/fix would be appreciated. The data below may not come out formatted in the message body (can you format data as a table in a message here?), but if you copy-paste it into Excel, you should see the columns.

Pair    # Trades    Bars      Ticks  First Trade       Last Trade
EURGBP        40  16,430  2,432,401  03/31/2003 07:05  05/05/2003 16:05
GBPJPY     2,823  16,396  5,102,933  03/25/2003 04:09  10/26/2005 13:25
GBPCHF     3,669  16,396  5,106,423  04/01/2003 12:26  10/28/2005 13:58
EURJPY     1,635  16,394  3,514,203  03/27/2003 17:10  10/27/2005 11:39
EURUSD     2,927  16,366  4,904,721  03/26/2003 13:18  10/28/2005 12:53
EURAUD     1,481  13,762  3,687,279  08/19/2003 08:41  10/27/2005 08:55
AUDJPY       122   8,052  2,967,536  08/18/2004 13:02  01/26/2005 09:26
CHFJPY       378   7,987  1,492,524  08/18/2004 13:02  01/28/2005 10:59
NZDUSD       342   6,485  1,950,082  11/15/2004 23:01  10/28/2005 12:35
GBPCHF       703   2,239  1,814,461  08/04/2005 09:20  10/28/2005 13:45
GBPUSD       441   2,172  1,426,862  07/29/2005 11:07  10/25/2005 08:11
USDCHF       443   2,172  1,572,221  07/29/2005 13:00  10/28/2005 12:53
USDJPY       553   2,171  1,496,422  07/29/2005 08:05  10/28/2005 14:50
EURCHF        97   2,169  1,180,901  07/29/2005 09:35  10/28/2005 15:59
AUDNZD       239   2,060  1,700,490  08/10/2005 02:10  10/25/2005 06:07
EURCAD       168   2,058  1,391,681  08/09/2005 13:38  10/28/2005 15:05
NZDJPY        84   2,057  1,200,855  08/09/2005 10:03  10/28/2005 09:22
AUDCAD        29   1,948  1,042,159  08/09/2005 07:52  10/26/2005 12:15
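A rough way to read the Bars column is to compare it with the number of 1h bars the requested period could hold at most. The Python sketch below is only an estimate: it assumes 24 bars on every weekday and ignores holidays and the exact weekly open and close hours, and the function names are made up for illustration.

```python
from datetime import date, timedelta

def expected_h1_bars(start, end):
    """Rough upper bound on 1h bars: 24 per weekday, both dates inclusive."""
    days = (end - start).days + 1
    weekdays = sum(1 for n in range(days)
                   if (start + timedelta(days=n)).weekday() < 5)
    return weekdays * 24

def coverage(bars_in_test, start, end):
    """Fraction of the estimated maximum that the tester actually had."""
    return bars_in_test / expected_h1_bars(start, end)

# Requested test period and a few Bars values taken from the results above.
start, end = date(2001, 11, 1), date(2005, 10, 31)
for pair, bars in [("EURGBP", 16430), ("EURAUD", 13762), ("GBPUSD", 2172)]:
    share = coverage(bars, start, end)
    print(f"{pair}: {bars} bars, about {share:.0%} of the requested range")
```

A low share suggests the tester simply had no history for the early part of the period, which would fit the late first-trade dates.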
 
I have had similar problems. It is really difficult to diagnose or see exactly what is happening!

regards
 
Hardwood,

Two settings extended the data range for me on the following five majors (EURGBP, GBPJPY, GBPCHF, EURJPY and EURUSD):

1. In Tools->Options->Server, ensure that "Enable DDE server" is checked
2. In Tools->Options->Charts, set "Max bars in history" and "Max bars in chart" to 250000

However, for non-majors, I'm still facing the same problem... :(

Are we the only fools facing this issue? ;)
 
OK... after two days of Google searching, I found a few explanations for the differences in backtester results. They all point to the quality of the data available for demo accounts. Getting history that goes back beyond Jun 2004 is difficult (especially if you don't want to cough up some serious money).

But the MT4 strategy tester itself has received good ratings for reliability and accuracy. Folks on the web say you can expect real-life results to vary 5-10% from the strategy tester (in terms of % wins, drawdowns, profit potential, etc.) as long as you are careful with the data.

Now, regarding data, there is a nice FAQ on the Strategy Builder forum. I'm attaching the link here:
http://www.strategybuilderfx.com/forums/showthread.php?t=15309

Link to Alpari data is: http://www.alpari-idc.com/en/dc/databank.php.htm

I feel more comfortable now... Also, the point I'm noticing is that although your strategy is important, the time frame you trade and your stop loss and position sizing are more important. Small time frames (anything up to 30m) do not fit my risk appetite. But I found decent success on longer time frames (my favourites: 1H and 4H) with a risk:reward ratio of 1:3 and a trailing stop of 10 pips, even with very simple strategies like MA cross-overs.

Good luck...
 
Also,

You will see the change in data quality through the following measures:

Bars in test will jump by ~400% or more
Ticks modeled will increase proportionately
and ...
Modeling quality will jump from the 50% range to 90% plus!!!
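For anyone wondering where the 50% and 90% figures come from, here is a small Python sketch of the modelling quality formula as I remember it from the MetaQuotes article on the tester report. Treat the weights (0.25, 0.5, 0.9) and the meaning of the four bar numbers as my assumption and check them against the official text; the function name is made up.

```python
def modelling_quality(start_bar, start_gen, start_gen_m1, history_total):
    """Weighted share of tested bars by data source, in percent.

    Assumed weights: 0.25 for bars modelled from the test timeframe only,
    0.5 for bars modelled from the nearest smaller timeframe,
    0.9 for bars modelled from M1 data.
    Bar numbers count from the oldest bar in history:
      start_bar      first bar of the test
      start_gen      first bar with smaller-timeframe data
      start_gen_m1   first bar with M1 data
      history_total  total bars in history
    """
    tested = history_total - start_bar
    if tested <= 0:
        raise ValueError("no bars in the tested range")
    weighted = (0.25 * (start_gen - start_bar)
                + 0.5 * (start_gen_m1 - start_gen)
                + 0.9 * (history_total - start_gen_m1))
    return 100.0 * weighted / tested

# Smaller-timeframe data for the whole test, but no M1 data at all.
print(modelling_quality(100, 100, 16430, 16430))
# M1 data for the whole test.
print(modelling_quality(100, 100, 100, 16430))
```

If the formula is right, full M1 coverage is what moves the figure from the 50% range to the 90% ceiling, which matches what happens after importing the 1-minute data.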

Of course, Alpari has data only on majors...but, if your strategy works on all majors...why wouldn't it work on other financial instruments with similar liquidity and spread? ;)
 
For data checks I use a script. It builds, for example, 1H bars from 1MIN bars and saves them to a file. Then, in Excel, I compare this data with the data exported from the History Center in MT. If the data is OK, the differences are no more than 1-2 pips...
I suggest you check your small-timeframe data against your big-timeframe data using this script. Sometimes you can even be missing hours, because of time zone differences, when you download small-timeframe data.
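Missing hours can also be spotted without Excel. The Python sketch below is just one possible way to do it: it assumes bar times written as "YYYY.MM.DD HH:MM" and lists every place where consecutive 1h bars are more than one hour apart. Normal weekend closures will show up too, so those entries have to be ignored by eye. The function name and the sample times are made up.

```python
from datetime import datetime, timedelta

def find_gaps(times, step=timedelta(hours=1)):
    """Return (previous_bar, next_bar, missing_bars) for each hole in the series."""
    gaps = []
    for prev, nxt in zip(times, times[1:]):
        missing = int((nxt - prev) / step) - 1
        if missing > 0:
            gaps.append((prev, nxt, missing))
    return gaps

# Invented sample: the 02:00 and 03:00 bars are absent.
raw = ["2005.01.03 00:00", "2005.01.03 01:00", "2005.01.03 04:00"]
times = [datetime.strptime(t, "%Y.%m.%d %H:%M") for t in raw]
for prev, nxt, missing in find_gaps(times):
    print(f"{missing} bar(s) missing between {prev} and {nxt}")
```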

Use TimeFrameFrom and TimeFrameTo to set the source and target periods (in minutes).
Code:

//+------------------------------------------------------------------+
//| QuotesCreate.mq4 |
//| RD |
//| marynarz15@wp.pl |
//+------------------------------------------------------------------+
#property copyright "RD"
#property link "marynarz15@wp.pl"

// Source and target periods in minutes. Run the script on a chart
// whose period equals TimeFrameFrom.
#define TimeFrameFrom 1
#define TimeFrameTo 60

double O,H,L,C,V;
string FileName,value;
int handle;

// Round the open time of source bar i down to the start of the
// target-period bar that contains it.
datetime CalcTimeTo(int i)
  {
   int seconds=60*TimeFrameTo;
   return((Time[i]/seconds)*seconds);
  }

//+------------------------------------------------------------------+
//| script program start function |
//+------------------------------------------------------------------+
int start()
  {
   if(Period()!=TimeFrameFrom) return(0);
   FileName=Symbol()+DoubleToStr(TimeFrameTo,0)+".csv";
   handle=FileOpen(FileName,FILE_CSV|FILE_WRITE,";");
   if(handle<1)
     {
      Print("File ",FileName," could not be opened, the last error is ",GetLastError());
      return(0);
     }
   Print("File ",FileName," opened.");

   int digits=Digits;
   bool Filled=false;
   // Begin with the target bar that contains the oldest source bar.
   datetime TimeTo=CalcTimeTo(Bars-1);
   O=0;H=0;L=0;C=0;V=0;

   // Walk from the oldest source bar (Bars-1) to the newest (0).
   for(int i=Bars-1;i>=0;i--)
     {
      if(CalcTimeTo(i)!=TimeTo)
        {
         // Bar i belongs to a new target bar: write out the finished one
         // as date,time,open,high,low,close,volume.
         if(Filled)
           {
            value=TimeToStr(TimeTo,TIME_DATE)+","+
                  TimeToStr(TimeTo,TIME_MINUTES)+","+
                  DoubleToStr(O,digits)+","+
                  DoubleToStr(H,digits)+","+
                  DoubleToStr(L,digits)+","+
                  DoubleToStr(C,digits)+","+
                  DoubleToStr(V,0);
            FileWrite(handle,value);
           }
         TimeTo=CalcTimeTo(i);
         Filled=false;
        }
      if(!Filled)
        {
         // The first source bar of a target bar seeds open, high and low.
         O=Open[i]; H=High[i]; L=Low[i]; V=0;
         Filled=true;
        }
      if(H<High[i]) H=High[i];
      if(L>Low[i]) L=Low[i];
      C=Close[i];
      V+=Volume[i];
     }
   // The newest target bar may still be forming, so it is not written.
   FileClose(handle);
   Print("File ",FileName," closed.");
   return(0);
  }
//+------------------------------------------------------------------+
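The Excel comparison can also be scripted. The Python sketch below is a minimal version of that check, under some assumptions: both files hold rows of date,time,open,high,low,close,volume with no header line (the layout the script above writes), one pip is 0.0001 (use 0.01 for JPY pairs), and anything over 2 pips counts as a mismatch. The function names and the sample rows are invented for illustration.

```python
import csv
import io

def load_bars(text):
    """Parse CSV rows into {(date, time): (open, high, low, close)}."""
    bars = {}
    for row in csv.reader(io.StringIO(text)):
        if len(row) < 6:
            continue  # skip blank or malformed lines
        bars[(row[0], row[1])] = tuple(float(x) for x in row[2:6])
    return bars

def compare_bars(built, exported, pip=0.0001, max_pips=2.0):
    """Return (missing_in_built, missing_in_exported, mismatches)."""
    missing_in_built = sorted(set(exported) - set(built))
    missing_in_exported = sorted(set(built) - set(exported))
    mismatches = []
    for key in sorted(set(built) & set(exported)):
        # Largest open/high/low/close difference for this bar, in pips.
        worst = max(abs(a - b) for a, b in zip(built[key], exported[key])) / pip
        if worst > max_pips:
            mismatches.append((key, round(worst, 1)))
    return missing_in_built, missing_in_exported, mismatches

# Invented sample rows; in practice read the two files with open(path).read().
built_csv = (
    "2005.01.03,00:00,1.3555,1.3560,1.3550,1.3557,120\n"
    "2005.01.03,01:00,1.3557,1.3570,1.3555,1.3565,90\n"
)
exported_csv = (
    "2005.01.03,00:00,1.3555,1.3561,1.3550,1.3557,118\n"
    "2005.01.03,01:00,1.3557,1.3575,1.3555,1.3565,95\n"
    "2005.01.03,02:00,1.3565,1.3568,1.3560,1.3562,70\n"
)
print(compare_bars(load_bars(built_csv), load_bars(exported_csv)))
```

Bars that exist in only one file are reported separately from price mismatches, since a missing hour and a bad price usually point to different causes.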
 
Marynarz,

Thanks! Actually, importing the Alpari data solved my data quality problem in backtesting. My observation is that modeling quality has to be at 90%+ for the results to be reliable. Even then, do expect minor variations, but nothing that would turn your particular strategy from a winning one into a losing one...

Maratha