Distribution of price increments

 

Dear traders!

In my spare time I have read many threads on this forum. Many of them discuss how to determine the type of distribution of returns (the so-called price increments) treated as a random variable. I came to the conclusion that this problem has not been solved and, having some :) appropriate education and skills, I decided to take part in solving it.

So, the task definition:

Using tick data for a given currency pair, determine the probability distribution of successive Bid and Ask price increments (that is, the data set analyzed consists of the differences between the current and the previous Ask price, plus the same set for the Bid price). The probability density function, the distribution function and the quantile function of the resulting distribution must be given in analytical form.

The task has certainly proved to be difficult. Let me say that this distribution is not one of those widely discussed - neither normal, nor logistic, nor Laplace, nor Cauchy, etc., etc.

Before I name this distribution (more precisely, a family of distributions, since different currency pairs have different values of the scale coefficient, which in general does not coincide with the standard deviation), please answer a couple of questions: what exactly does knowing this distribution give us? How does it help in Forex trading?

Sincerely,

A passer-by who has taken an interest in the Forex market

Alexander_K :) :)

 
If you know the distribution, you know the regularity that makes the distribution what it is, and that regularity can be traded. But if it were that simple, mathematicians would have cleaned out the whole market.
Everything in the market changes: even if you know the distribution type today, it will be different tomorrow. The real problem is not the distribution itself but how to make a stable profit when every measured parameter is unstable.
There is another problem :) Tick Bids and Asks on Forex are not real. Every broker generates its own ticks, so the distribution will differ from broker to broker.
But there is a way out!
 
In fact (IMHO), the current price does not depend on the previous one. Those who look for this distribution simply want to identify the current trend in the market (uptrend, downtrend or flat) over the chosen time interval. Once the trend is identified, the trader looks for an opportunity to profit from it.

 
Vitalii Ananev:

In fact (IMHO) there is no correlation between the current price and the previous one. Those who look for this distribution simply want to identify the current trend in the market (up trend, down trend or flat) at the selected time interval. Once the trend is identified, the trader is looking for the opportunity to profit from it.

As a matter of fact, there is a correlation. The market has memory: every trade is money, and a trade that has been opened will be closed sooner or later.
 
Alexander_K:

- What does knowing this distribution actually do? How does it help in Forex trading?

GARCH models, which take the logarithms of increments as input, consist of three parts: a trend model, a volatility model and a model for the distribution of the increments. There is a huge literature on these distributions, their influence on the algorithms, how currency pairs differ by distribution type, and so on. The question you raise is about 30 years old. The main mathematical tool in financial markets is GARCH, of which there are many variants. I posted a selection of literature in the machine learning thread, and I am attaching it again.

By far the most widely used is the skewed t-distribution. But I repeat that a complete model consists of three components.

There are off-the-shelf software packages that are widely used in real trading, and results are available in public publications. In R we can name fGarch and rugarch, but they are not the only ones.
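
To make the three components concrete, here is a minimal sketch of a GARCH(1,1) simulation in plain Python. It is only an illustration under stated assumptions: the trend model is a constant `mu`, the volatility model is the textbook GARCH(1,1) recursion, and the increments are drawn from a standard normal rather than the skewed t mentioned above. The function name and all parameter values are made up for the example; this is not the API of fGarch or rugarch.

```python
import math
import random


def simulate_garch11(n, omega=0.05, alpha=0.10, beta=0.85, mu=0.0, seed=0):
    """Simulate n increments r_t = mu + sqrt(h_t) * z_t.

    Volatility model: h_t = omega + alpha * eps_{t-1}^2 + beta * h_{t-1}.
    All default parameter values are illustrative assumptions.
    """
    if omega <= 0 or alpha < 0 or beta < 0 or alpha + beta >= 1:
        raise ValueError("need omega > 0, alpha, beta >= 0 and alpha + beta < 1")
    rng = random.Random(seed)
    # Start the recursion at the unconditional variance omega / (1 - alpha - beta).
    h = omega / (1.0 - alpha - beta)
    increments, variances = [], []
    for _ in range(n):
        z = rng.gauss(0.0, 1.0)      # distribution model (assumed normal here)
        eps = math.sqrt(h) * z       # shock scaled by the current volatility
        increments.append(mu + eps)  # trend model: a constant mean
        variances.append(h)
        h = omega + alpha * eps * eps + beta * h  # volatility model update
    return increments, variances


if __name__ == "__main__":
    r, h = simulate_garch11(1000)
    print("largest conditional variance:", max(h))
    print("smallest conditional variance:", min(h))
```

Replacing the `rng.gauss` draw with a sampler for a heavier-tailed distribution changes only the third component and leaves the other two untouched.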

 
Maxim Romanov:
As a matter of fact it does. There is a memory in the market because every deal is money and if a deal was opened it will be closed sooner or later.

I won't argue; everyone has their own opinion. But if such a correlation existed, extrapolation would let us predict future price movements with much better accuracy than 50/50.

 
СанСаныч Фоменко:

GARCH models, which take the logarithms of increments as input, consist of three parts: a trend model, a volatility model and a model for the distribution of the increments. There is a huge literature on these distributions, their influence on the algorithms, how currency pairs differ by distribution type, and so on. The question you raise is about 30 years old. The main mathematical tool in financial markets is GARCH, of which there are many variants. I posted a selection of literature in the machine learning thread, and I am attaching it again.

By far the most widely used is the skewed t-distribution. But I repeat that a complete model consists of three components.

There are off-the-shelf software packages that are widely used in real trading, and results are available in public publications. In R we can name fGarch and rugarch, but they are not the only ones.


Yes, you have named a very close approximation: the skewed t-distribution.

In fact, my calculations gave the so-called non-standardized Student's t-distribution with 2 degrees of freedom. The scale coefficient is not equal to the standard deviation and is calculated separately for each currency pair.
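
For reference, a Student's t-distribution with exactly 2 degrees of freedom has simple closed forms for the density, the distribution function and the quantile function, which is what the task definition asks for. The sketch below writes them out for the location-scale (non-standardized) version. The function names and the `loc` and `scale` parameters are naming assumptions for this example, not the author's notation. Note also that the variance of a t-distribution with 2 degrees of freedom is infinite, so the scale coefficient cannot coincide with a standard deviation.

```python
import math


def t2_pdf(x, loc=0.0, scale=1.0):
    """Density of the location-scale Student's t with 2 degrees of freedom."""
    z = (x - loc) / scale
    return 1.0 / (scale * (2.0 + z * z) ** 1.5)


def t2_cdf(x, loc=0.0, scale=1.0):
    """Distribution function: F(z) = 1/2 + z / (2 * sqrt(2 + z^2))."""
    z = (x - loc) / scale
    return 0.5 + z / (2.0 * math.sqrt(2.0 + z * z))


def t2_quantile(p, loc=0.0, scale=1.0):
    """Quantile function: Q(p) = (2p - 1) / sqrt(2p(1 - p)), then rescaled."""
    if not 0.0 < p < 1.0:
        raise ValueError("p must be strictly between 0 and 1")
    return loc + scale * (2.0 * p - 1.0) / math.sqrt(2.0 * p * (1.0 - p))


if __name__ == "__main__":
    for p in (0.5, 0.9, 0.975, 0.995):
        q = t2_quantile(p)
        # Round trip: the cdf evaluated at the quantile should return p.
        print(p, q, t2_cdf(q))
```

The quantile formula is obtained by solving the distribution function for z, so the round trip in the usage loop is a quick consistency check.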

However, this holds only for price increments. Real prices form some mixture of these t-distributions, and knowing the distribution of increments does not give me, personally, an understanding of the process as a whole.

Nevertheless, I ask that this thread not be closed. Perhaps some bright mind will tell me how to get from knowledge of the particular to knowledge of the general; that would be incredibly cool.

For my part, I promise to post my mathematical exercises in this area and carefully read the feedback and comments.

Regards.

Alexander_K

 
Vitalii Ananev:

I won't argue; everyone has their own opinion. But if such a correlation existed, extrapolation would let us predict future price movements with much better accuracy than 50/50.


There are several reasons why extrapolation cannot be used. First, the sampling rate must be correct: if you sample even a sine wave at random times, it cannot be predicted. Second, how can anything be predicted when it is not known when each participant will close their order? We know that all participants who have opened trades will close them, but when they will do so is unclear. Or do you disagree that all open trades will eventually be closed?

 

What this gives is that there are more large deviations from the mean than in a normal distribution, and the smaller the sample, the greater the chance of a larger error. It is simply a distribution with fat tails; everyone has known for a long time that quotes are not normally distributed. This is usually attributed to memory or inertia, i.e. big changes in quotes are followed by big changes and small changes are followed by small ones (on average), but small changes still outnumber big ones.

If this is the case, then quotes cannot be predicted within a single timeframe, i.e. the moment of switching from large changes to small ones and vice versa cannot be guessed statistically. So we have to look at the quotes on different timeframes and compare the probabilities. In the end we are still left with the quote history at the largest scale, where it is difficult or impossible to tell whether the current market moves are small or large.

But for some acceptable event horizons and in certain situations, prediction is probably possible, along with a search for the inefficiencies that partly form the fat tails.

 
Maxim Dmitrievsky:

What this gives is that there are more large deviations from the mean than in a normal distribution, and the smaller the sample, the greater the chance of a larger error. It is simply a distribution with fat tails; everyone has known for a long time that quotes are not normally distributed. This is usually attributed to the presence of memory, i.e. big changes in quotes are followed by big changes and small changes are followed by small ones (on average), but small changes still outnumber big ones.


Let me give a concrete example from my calculations.

For the currency pair EURJPY, the distribution of price increments is a non-standardized Student's t-distribution with 2 degrees of freedom and a scale coefficient (sigma) of 1.43 points (my apologies for being so mathematically meticulous). 95% of price increments fall within the tolerance range of ±6.19 sigma. Does this mean that a trade can be opened when, for a given sample, the price moves outside this range? Does it make sense to carry out my calculations to an accuracy of thousandths of a percent?
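
As a rough cross-check of these numbers, the sketch below uses the closed-form distribution function of a Student's t with 2 degrees of freedom to convert between a central coverage level and the half-width of the corresponding range. The function names are hypothetical and the only input taken from the post is the scale of 1.43 points. Under these assumptions, 95% coverage corresponds to roughly ±4.30 scale units, or about ±6.15 points, which is close to the 6.19 quoted; that may mean the quoted range is expressed in points rather than in sigma, but this is only a reading of the numbers, not something the post confirms.

```python
import math

SCALE_POINTS = 1.43  # scale coefficient for EURJPY quoted in the post


def central_half_width(coverage, scale=1.0):
    """Half-width w such that P(|X| <= w) = coverage for a t(2) with the given scale."""
    if not 0.0 < coverage < 1.0:
        raise ValueError("coverage must be strictly between 0 and 1")
    return scale * coverage * math.sqrt(2.0) / math.sqrt(1.0 - coverage * coverage)


def central_coverage(half_width, scale=1.0):
    """P(|X| <= half_width) for a t(2) with the given scale: z / sqrt(2 + z^2)."""
    z = half_width / scale
    return z / math.sqrt(2.0 + z * z)


if __name__ == "__main__":
    w_units = central_half_width(0.95)
    w_points = central_half_width(0.95, SCALE_POINTS)
    print("95% half-width in scale units:", w_units)
    print("95% half-width in points:", w_points)
    print("coverage of +/-6.19 scale units:", central_coverage(6.19))
    print("coverage of +/-3 scale units:", central_coverage(3.0))
```

The last line of the usage example shows how much probability a ±3 scale-unit range captures under this distribution, which is noticeably less than the familiar figure for a normal distribution.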

 
Alexander_K:

Here is a concrete example from my calculations.

For the currency pair EURJPY, the distribution of price increments is a non-standardized Student's t-distribution with 2 degrees of freedom and a scale coefficient (sigma) of 1.43 points (excuse me for being too mathematically meticulous). 95% of price increments fall within the tolerance range of ±6.19 sigma. Does this mean that a trade can be opened when, for a given sample, the price moves outside this range? Does it make sense to carry out my calculations to an accuracy of thousandths of a percent?


I am embarrassed to ask, but a tolerance for whom? It seems that one usually takes 3 sigma...

SanSanych has given a lot of interesting information and sources in this field. As far as I remember, though, the GARCH models he mentions work not with ticks but with increments of daily close prices.