From theory to practice

 
Nikolay Demko:

By setting a step filter you kill the time coordinate, i.e. you throw it away completely.

If that is acceptable, then IMHO we should make it a conditional filter: register a step only if both the Bid and the Ask price have changed.

After that, calculate the average step size on the saw-tooth (zigzag) moves, and only then apply the step filter to those data.

If we do such filtering, the tick data should carry one more field, just as bars do: how many ticks each point contains. It may be useful for analysis, e.g. how many times the market hammered at that level.
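A minimal Python sketch of one possible reading of the quoted proposal: keep only the points where both Bid and Ask have changed, then measure the size of each saw-tooth (zigzag) leg on the resulting mid-price series and average the leg sizes to calibrate the step filter. The function name and the zigzag definition are illustrative assumptions, not something taken from the thread.

```python
def average_sawtooth_step(bids, asks):
    """Average zigzag-leg size measured on points where both Bid and Ask changed."""
    if len(bids) < 2:
        return 0.0

    # keep only the points where both quotes have moved
    mids, prev_bid, prev_ask = [], bids[0], asks[0]
    for b, a in zip(bids[1:], asks[1:]):
        if b != prev_bid and a != prev_ask:
            mids.append((b + a) / 2.0)
        prev_bid, prev_ask = b, a
    if len(mids) < 2:
        return 0.0

    # split the filtered series into monotone legs and average their sizes
    legs, start, direction, prev = [], mids[0], 0, mids[0]
    for p in mids[1:]:
        step = p - prev
        if step == 0:
            prev = p
            continue
        new_dir = 1 if step > 0 else -1
        if direction and new_dir != direction:  # reversal ends the current leg
            legs.append(abs(prev - start))
            start = prev
        direction, prev = new_dir, p
    legs.append(abs(prev - start))
    return sum(legs) / len(legs)
```

The value returned here could then serve as the threshold for the step filter itself.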


No, Nikolai, we do not kill it. We know at what point in time the price crossed the specified threshold, and we can save that timestamp together with the price value. Yes, the timestamps will not be distributed uniformly or according to an exponential law, but does that really matter? Ticks do not arrive uniformly either. And there is no need to analyse Bid/Ask separately; we can work with (Bid+Ask)/2 both in processing and in trading. I agree, the tick volume is useful. It may well give some advantage. As a result, the structure of the data file produced by the threshold filter may look like:

Date/Time Price TickVolume

The methodology is similar to the usual bar/candle breakdown, with the only difference being that classic bars are sliced by time while ours are sliced by price. This is probably the one and only effective way to get rid of market and broker tick noise.
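A minimal Python sketch of such a threshold filter, producing exactly the Date/Time, Price, TickVolume records described above. The threshold is given in price units; the function name, the input layout (an iterable of (datetime, bid, ask) tuples) and the sample values are assumptions for illustration only.

```python
from datetime import datetime

def threshold_filter(ticks, threshold):
    """Slice the (Bid+Ask)/2 stream by price instead of by time: emit a record
    whenever the mid price has moved by `threshold` or more since the last record."""
    records = []          # (date_time, price, tick_volume)
    ref_price = None
    tick_volume = 0
    for dt, bid, ask in ticks:
        mid = (bid + ask) / 2.0
        tick_volume += 1
        if ref_price is None or abs(mid - ref_price) >= threshold:
            records.append((dt, mid, tick_volume))
            ref_price = mid
            tick_volume = 0   # TickVolume = ticks seen since the previous record
    return records

# toy usage (illustrative quotes)
ticks = [
    (datetime(2018, 2, 1, 0, 0, 0), 1.10010, 1.10020),
    (datetime(2018, 2, 1, 0, 0, 1), 1.10011, 1.10021),
    (datetime(2018, 2, 1, 0, 0, 3), 1.10035, 1.10045),
]
print(threshold_filter(ticks, threshold=0.00020))
```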

 
Alexander_K2:

It's no accident that I chose exponential time intervals between ticks, and I also take the average value between them.

That is the only way I get a fairly clean t2-distribution for the increments; I couldn't achieve it any other way.

Try simply reading the ticks at larger intervals, 10 seconds for example. If all your metrics are retained, it might remove the dependence on "different ticks at different DCs".
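For comparison, a minimal Python sketch of what "reading ticks at exponential time intervals with quote averaging" might look like: waiting times are drawn from an exponential distribution, and all mid quotes that arrive inside a waiting window are averaged into a single reading. This is only one plausible interpretation of Alexander_K2's procedure; the mean interval and every name here are assumptions.

```python
import random

def sample_exponential(ticks, mean_interval=4.0, seed=0):
    """One reading per exponentially distributed waiting time; the reading is the
    average mid price of the ticks that arrived inside the waiting window.
    `ticks` is a list of (t_seconds, bid, ask) sorted by time."""
    rng = random.Random(seed)
    readings, window = [], []
    next_t = ticks[0][0] + rng.expovariate(1.0 / mean_interval)
    for t, bid, ask in ticks:
        while t >= next_t:                      # waiting window has elapsed
            if window:
                readings.append((next_t, sum(window) / len(window)))
                window = []
            next_t += rng.expovariate(1.0 / mean_interval)
        window.append((bid + ask) / 2.0)
    return readings
```

The increments between consecutive readings would then be the series whose histogram is later compared against the t2-distribution.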
 
bas:
Try simply reading the ticks at larger intervals, 10 seconds for example. If all your metrics are retained, it might remove the dependence on "different ticks at different DCs".
You will lose information in my opinion and the result will be even worse.

I would suggest analysing minute bars instead.

BUT, it's up to you.

 
Alexander Sevastyanov:

I sincerely wish you success, but I'm afraid you are making a mistake by investigating tick-by-tick increments specifically. In my humble opinion, instead of taking (Bid+Ask)/2 and counting at equal or exponentially distributed time intervals, it is better to switch to filtering/sampling the price increments by a fixed value of 1-2 average spreads, i.e. to a simple threshold filter. What is the disadvantage of counting at fixed intervals? The price can go back and forth by an appreciable amount within such an interval; that move may not be an accidental spike, yet you will miss it. The threshold filter will greatly reduce the amount of data for analysis by filtering out the tick noise generated by the market and the broker, and it will not miss significant price movements. Another advantage is that increments of 1-2 spreads can already be traded not just in theory but in practice, and with your "quantile function and confidence levels" apparatus even more so.

Good to hear from you again!
I think the topic starter (and not only he) would benefit from the material at the old links: H-volatility and more. Kagi and renko: no better way has been found so far.
I do not think that thinning the ticks is reasonable in principle: we will get a stroboscope effect and miss all the interesting stuff.
And in the Wiener process's wandering back to the mean we can also apply the "choosy bride" (best-choice) problem; then the entry point (per Berezovsky) will be a bit better...

😎

 
Alexander Sevastyanov:

No, Nikolai, we do not kill it. We know at what point in time the price crossed the specified threshold, and we can save that timestamp together with the price value. Yes, the timestamps will not be distributed uniformly or according to an exponential law, but does that really matter? Ticks do not arrive uniformly either. And there is no need to analyse Bid/Ask separately; we can work with (Bid+Ask)/2 both in processing and in trading. I agree, the tick volume is useful. It may well give some advantage. As a result, the structure of the data file produced by the threshold filter may look like:

Date/Time Price TickVolume

The methodology is similar to the usual bar/candle breakdown, with the only difference being that classic bars are sliced by time while ours are sliced by price. This is probably the one and only effective way to get rid of market and broker tick noise.


What about the difference in tick arrivals between different DCs?

Or do we align them by stretching with matching coefficients?

 
Mikhail Dovbakh:

I do not think that thinning the ticks is reasonable in principle: we will get a stroboscope effect and miss all the interesting stuff.

The point is that the topic starter's trades last for hours and are tens of pips in size. From that point of view, the price does not change significantly over a few seconds. Renko, of course, filters the noise well, but it would probably lead to a completely different distribution of increments.
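To illustrate why the distribution would change: a renko builder only ever emits whole bricks, so every increment of the resulting series is exactly plus or minus one brick. A minimal Python sketch (simplified: reversals here also need just one brick, unlike classic two-brick renko; the brick size is an assumed parameter):

```python
def renko(prices, brick_size):
    """Emit a new brick close each time price moves a full brick away
    from the last brick close."""
    closes = [prices[0]]
    for p in prices:
        while p >= closes[-1] + brick_size:
            closes.append(closes[-1] + brick_size)
        while p <= closes[-1] - brick_size:
            closes.append(closes[-1] - brick_size)
    increments = [b - a for a, b in zip(closes, closes[1:])]
    return closes, increments   # every increment is +/- brick_size
```

Since every increment is +/- brick_size, the increment histogram collapses to two spikes rather than anything resembling a heavy-tailed t2 shape.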

 
Vladimir:

You seem to use the word "repeat" in the sense of "prove", "justify". And it is as if you yourself believe in such justifications. That's how it's done in the media nowadays, but why here?


Here, Vladimir - especially for you.

This is the histogram of EURJPY increments for the last week, read at exactly exponential time intervals with quote averaging


Here are the statistics

Column D shows the observed (empirical) probabilities.

Column E shows the corresponding values calculated from the t2-distribution.

What more proof do you need????????
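For reference, assuming "t2-distribution" here means a Student's t-distribution with \nu = 2 degrees of freedom (possibly rescaled by a fitted parameter \sigma), the density that column E would presumably be computed from is:

```latex
\[
  f(x) = \frac{\Gamma\!\left(\frac{\nu+1}{2}\right)}
              {\sqrt{\nu\pi}\,\Gamma\!\left(\frac{\nu}{2}\right)\,\sigma}
         \left(1 + \frac{x^{2}}{\nu\sigma^{2}}\right)^{-\frac{\nu+1}{2}},
  \qquad
  \nu = 2:\quad
  f(x) = \frac{1}{2\sqrt{2}\,\sigma}
         \left(1 + \frac{x^{2}}{2\sigma^{2}}\right)^{-3/2}.
\]
```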

 
bas:

The point is that the topic starter's trades last for hours and are tens of pips in size. From that point of view, the price does not change significantly over a few seconds. Renko, of course, filters the noise well, but it would probably lead to a completely different distribution of increments.

Well, my TS also keeps each order frozen for thirteen hours. (

 
Alexander_K2:

Here, Vladimir - especially for you.

This is the histogram of EURJPY increments for the last week, read at exactly exponential time intervals with quote averaging


Here are the statistics

Column D shows the observed (empirical) probabilities.

Column E shows the corresponding values calculated from the t2-distribution.

What more proof do you need????????


The forum allows you to attach files to your posts.

.xls files cannot be attached, but they can be zipped.

 
Alexander_K2:

Here, Vladimir - especially for you.

This is the histogram of EURJPY increments for the last week, read at exactly exponential time intervals with quote averaging

We know what an increment is. We also know what thinning a sample is.
And what does averaging of the quotes mean in this case?

Can we formally describe what is taken as the sampling interval in that beautiful histogram?

о)
