Ticks: amplitude and delay distributions

 
I downloaded the data from http://ratedata.gaincapital.com/ for several different weeks and tried to analyse it. It turned out to be an interesting story!

Here is the second week of April, from 9 to 13 April 2007. The total is 27516 ticks, i.e. slightly fewer than 4 ticks per minute on average. And here are the statistics (the number is the difference, in pips, between the current tick and the previous one):

-1: 13600 ticks
+1: 13742 ticks
0: 12 ticks (wrong?)
-2: 71 ticks
+2: 78 ticks

And just a little bit more of the rest:

+3: 3
+4: 1
+8: 1
-3: 5
-4: 2

Everything else amounts to just 12 ticks, i.e. practically nothing. If we exclude the zeros, which should not be present at all, we find that ±1 makes up 99.4% of all ticks and ±2 about 0.55%. The rest is simply negligible!

Note that this was quite a trending week, during which the euro confidently gained a couple of figures. In the process, the euro rose 142 pips due to the single-pip ticks, 14 pips due to the ±2 ticks, and -2 pips due to everything else.
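The pip contributions above are easy to verify from the tick counts. A minimal sketch (counts copied from the lists above, assuming they are complete):

```python
# Net pip contribution per tick-size bucket for the 9-13 April 2007 week.
week1 = {
    -1: 13600, +1: 13742,
    -2: 71,    +2: 78,
    +3: 3, +4: 1, +8: 1, -3: 5, -4: 2,
}

singles = sum(d * n for d, n in week1.items() if abs(d) == 1)
doubles = sum(d * n for d, n in week1.items() if abs(d) == 2)
rest    = sum(d * n for d, n in week1.items() if abs(d) > 2)

print(singles, doubles, rest, singles + doubles + rest)  # 142 14 -2 154
```

The total of 154 pips is the week's net movement implied by the tick counts alone.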

What are the conclusions?

The rise and decline are produced by small steps, not by large spurts. The large tick spurts have no effect on the overall picture of the rate movement (at least in a trending week)!

OK, here is the previous week's picture: the euro hardly moved at all. The stats:

-1: 11884 ticks
+1: 11909 ticks
0: 18 ticks (wrong?)
-2: 96 ticks
+2: 100 ticks

Contribution of the ±1 and ±2 ticks: plus 33 pips.

And the rest (minus 31 pips):

-3: 13
-4: 3
-5: 2
-7: 1
+3: 6
+4: 2
+5: 1
+6: 1

The picture is different. But again ±1 is the vast majority, and again it essentially defined the overall movement.
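The same arithmetic check for this flat week, again taking the counts above as complete:

```python
# Net pip contribution for the previous (flat) week, from the counts above.
week2 = {
    -1: 11884, +1: 11909,
    -2: 96,    +2: 100,
    -3: 13, -4: 3, -5: 2, -7: 1,
    +3: 6, +4: 2, +5: 1, +6: 1,
}

small = sum(d * n for d, n in week2.items() if abs(d) <= 2)  # +-1 and +-2
large = sum(d * n for d, n in week2.items() if abs(d) >= 3)  # everything else

print(small, large, small + large)  # 33 -31 2
```

The small and large ticks nearly cancel, leaving a net move of just 2 pips for the week.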

Should this be the case - or is it heavily normalised and purified data?
 
Mathemat:

The rise and decline are produced by small steps, not by large spurts. The large tick spurts have no effect on the overall picture of the rate movement (at least in a trending week)!

The quotes look normal. If you take quotes from brokerage companies that abuse filters and therefore lag on sharp movements, you will find gaps of 15-20 pips, and half as many ticks.
 
Mathemat:
I downloaded the data from http://ratedata.gaincapital.com/ for several different weeks and tried to analyse it. It turned out to be an interesting story!

Here is the second week of April, from 9 to 13 April 2007. The total is 27516 ticks, i.e. slightly fewer than 4 ticks per minute on average. And here are the statistics (the number is the difference, in pips, between the current tick and the previous one):

Dividing 27516 (the number of ticks in the week) by 5 (the number of trading days) gives 5503.2 ticks per day.
If we look at the quotes in the History Center, we see the following:



"If there is no difference, why pay more?" (c) :)
 
Is this the way it should be - or is it heavily normalised and cleaned data?

Clean data should always have a unit difference; the closer the tick-to-tick difference is to one pip, the cleaner the data.
A zero difference means a dropped tick, so any number of ticks could be hiding in those zeros, and even within the unit steps, not to mention the larger ones.
Unfortunately, the pattern I have identified points to filters that pass a difference of one pip and discard anything larger, thereby stretching the tick intervals.
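If such a filter really passes only one-pip steps, its effect is easy to model. This is a hypothetical reconstruction of the described behaviour, not any broker's actual algorithm:

```python
def one_pip_filter(prices):
    """Keep only ticks that differ from the last *kept* price by exactly
    one pip. A hypothetical sketch of the filter hypothesised above:
    larger jumps are silently dropped, stretching the tick intervals."""
    if not prices:
        return []
    kept = [prices[0]]
    for p in prices[1:]:
        if abs(p - kept[-1]) == 1:
            kept.append(p)
    return kept

raw = [100, 101, 103, 102, 101, 100]   # prices in pip units
print(one_pip_filter(raw))             # [100, 101, 102, 101, 100]
```

Note how the 2-pip jump to 103 disappears entirely: the surviving series shows only unit differences, exactly the signature observed in the statistics above.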

It is hard even to imagine how much data may be discarded, and not sequentially, so that any sequence is broken.
As far as I know nobody fabricates data; they only filter it, and requotes have nothing to do with it.
That is why different DCs may have exactly the same data in most cases, differing only in the amount of it.
 
In general, I still don't understand what a requote means in tick language. It always seemed to me that it is expressed only in the broker offering a different price, not in the broker fabricating data; yet some people present it as if a requote changes the tick itself, i.e. its stamps, time and price. So what is going on here? Can anyone elaborate? And the same question: does anyone actually fabricate tick data, brokers in general? That is, can non-existent ticks appear? As far as I understand, no, otherwise it makes no sense; that could be used against the broker in no time.
 

The question is: what is the point of working with ticks if half of them, or say a third, or even one percent for a single day, is missing? The real picture of events is broken; the very data for which we started the tick analysis has been thrown away :) In principle this data could be collected by drawing clients of completely different brokerage companies into the task. If there is filtering, there is inaccuracy, and there is a big difference in tick counts between brokerage companies, then pooling the data makes sense: you would like to see fluctuations that you will never see using a single DC's data; you would like to see the real development.

Take an ADC (analog-to-digital converter): developers in that field usually say that to analyse digitised data properly you have to digitise it under a real-time operating system (RTOS), systems like DOS, QNX, some modifications of Linux, otherwise some small part of the picture is lost, and precisely because of that you cannot see all the influences and tendencies. If you cannot see everything at that level of accuracy, then I ask: what technical analysis are we talking about, if you cannot tell where a wave is going to surge, because the surge has been cut out? :) The more accurate the picture of the market, the more clearly we see its development; in our case we only see a blurred picture.

Yes, I understand that quotes depend on many factors. But can we really look at the picture if quotes can change abruptly, i.e. simply jump? Then we get the opposite situation: clean data does not collapse into unit differences at all. What if the filters work in such a way that the real quotes differ by more than one pip in price, and that is what the clean data looks like? Then we cannot work on a horizon of less than, say, a week, because even minute-level accuracy is very vague: the jumps that are real simply do not exist for us.

P.S.: Just thinking out loud :)

 
And there is evidence of this: comparing two DCs, I have seen exactly such a data cut, with ticks differing by more than one pip removed, which means my earlier statement that a one-pip difference indicates accurate data is one hundred percent wrong. A one-pip difference practically guarantees that the data has been filtered. And zero is definitely an error, since a tick by definition implies a price difference :)

P.S.: That's how you get to the truth, by trial and error... Ugh, how much time is spent on research; it would be even worse if a whole system were built on these errors. So thanks for the thread, Mathemat. I'll go to sleep now, without painful thoughts :)
 

Oh no, God forbid, it was never my intention to build a strategy on ticks. To me it is like driving a car while tracking the signs of a turn by the fine structure of the road beneath the wheels. Roads are maintained in different ways. The good news is that, while there are significant differences in tick histories from one vendor to another, the results on not-so-small timeframes are almost identical.

 
xnsnet:
The same question, is anyone involved in the fabrication of tick data, from brokers, in general, i.e. is it possible for non-existent ticks to appear? Not as far as I can see, otherwise it makes no sense. ... That could instantly be used against a broker.

From what I have read on the Internet, and what traders have said, some brokers (DCs) do this, or have done it.
And it's not about ticks, but about serious movements of ±100 pips.
This happened before the holidays, and afterwards there was a sea of money lost to stop losses and margin calls.
After that, traders mostly stopped using stop losses with that brokerage company.
The point is that the money stays in the "kitchen"!
Even assuming someone managed to use it to their advantage, the brokerage company's profit is still significant.
 
Mathemat:

...To me it is like driving a car while tracking the signs of a turn by the fine structure of the road beneath the wheels.



Cool. The most accurate definition of tick strategies.
 
Let's analyse the ticks further. Take the same trending week again, 9-13 April 2007, and draw some charts in MS Excel.

The first chart is a histogram of the tick probability distribution over waiting time since the previous tick (in seconds). Horizontally - the waiting time itself; vertically - frequency. The distribution is quite smooth and beautiful, except for a very steep region near zero. What kind of distribution is it? It doesn't look exponential, which is what it would be if the ticks arrived as a Poisson process.
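One way to test the Poisson hypothesis: for a Poisson arrival process the waiting times are exponential, so their standard deviation should equal their mean. A sketch of the check, with synthetic timestamps standing in for the real ticks (which are not reproduced here):

```python
import numpy as np

# Synthetic tick times at the rate observed above (27516 ticks in 5 days).
# With real data, load the actual tick times (in seconds) into `t` instead.
rng = np.random.default_rng(0)
rate = 27516 / (5 * 24 * 3600)                   # ~0.064 ticks per second
t = np.cumsum(rng.exponential(1 / rate, size=27516))

waits = np.diff(t)                               # inter-tick waiting times
print(f"mean wait: {waits.mean():.1f} s (1/rate = {1/rate:.1f} s)")
# For exponential waits, std/mean ~ 1; a clear mismatch on real data
# (e.g. the steep pile-up near zero) argues against Poisson arrivals.
print(f"std/mean: {waits.std() / waits.mean():.2f}")
```

On real ticks, a std/mean ratio well away from 1, or a histogram shape that departs from a straight line on a log scale, would quantify the "not Poisson" impression.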



The second chart shows the same tick intervals (waiting times), but arranged in their order of arrival over the week. Horizontally - the time scale; vertically - waiting time in seconds. It is much harder here. You can see some periodicity superimposed on the random process, due to the slack of the Asian session. How to deal with it - I have no idea.
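One simple way to quantify that intraday periodicity is to average the waiting times by hour of day. The timestamps below are synthetic placeholders; with real data the actual tick times would go into `t`, and a bulge in the profile at the Asian session's hours would confirm the pattern:

```python
import numpy as np

# Synthetic tick times over 5 days, in seconds from the start of the week.
rng = np.random.default_rng(1)
t = np.sort(rng.uniform(0, 5 * 24 * 3600, 27516))

waits = np.diff(t)
hour = ((t[1:] % 86400) // 3600).astype(int)   # hour-of-day of each tick

# Mean waiting time per hour of day: a flat profile means no intraday
# periodicity; systematically long waits in some hours reveal quiet sessions.
profile = np.array([waits[hour == h].mean() for h in range(24)])
print(np.round(profile, 1))
```

Uniform synthetic arrivals give a flat profile; the real data, by the chart's appearance, would not.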



And one more chart, a very interesting one. Now it shows the amplitudes of the ticks, also in order of arrival. Horizontally - the timeline; vertically - amplitude. Here the situation is almost unambiguous: there is no particular time heterogeneity as in the previous chart. 99.5% of the ticks are ±1 and almost everything else is ±2. The solid blue band between -1 and +1 reflects precisely the overwhelming frequency of minimal-amplitude ticks. The process can be considered almost stationary.
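The "almost stationary" reading can be checked by splitting the week into days and computing the share of ±1 ticks in each; near-constant shares support it. The amplitudes below are simulated with roughly the observed frequencies, purely as a sketch of the procedure:

```python
import numpy as np

# Simulated amplitudes: ~99.5% are +-1, the rest +-2, as observed above.
rng = np.random.default_rng(2)
amps = rng.choice([-2, -1, 1, 2], size=27516,
                  p=[0.0027, 0.4958, 0.4988, 0.0027])
days = np.array_split(amps, 5)   # one chunk per trading day

# Share of minimal (+-1) ticks per day; if these barely vary, the
# amplitude process looks stationary at the daily scale.
shares = [float(np.mean(np.abs(d) == 1)) for d in days]
print([round(s, 3) for s in shares])
```

With real data one would also compare the per-day means and variances of the amplitudes, not just the ±1 share.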