Distribution of price increments - page 15

 
nahdi:

Actually, that's what I wanted to ask: why would an experienced physicist or statistician (or whatever you are) be interested in this topic? Wouldn't finance be better handled by financiers? Everyone should mind his own business, and if he has none, that makes you wonder.

Or perhaps being a physicist is a vocation, as Mr Medvedev used to say... If you want money, go into business. If you want to lose money, go into the financial markets...


I agree. In terms of everyday concepts and values, I have no business in Forex (as a physicist), because I need a clear understanding of the process, expressed in analytical formulas. Still, I will sometimes come to the forum with theoretical results. It's a hobby for me now - better than drinking vodka in my spare time, really :))))

 
Alexander_K:

I agree. In terms of everyday concepts and values, I have no business in Forex (as a physicist), because I need a clear understanding of the process, expressed in analytical formulas. Still, I will sometimes come to the forum with theoretical results. It's a hobby for me now - better than drinking vodka in my spare time, really :))))

If the market had a formula, it wouldn't be the market!!! It all comes down to trivial supply and demand. If you want formulas, read up on pricing models - but those are nothing more than ways of limiting risk.

And who knows - maybe a glass of vodka is better than racking your brain over incomprehensible numbers.

 
Alexander_K:

Here's what I was thinking.

If the claim that the nonparametric skew of the Forex increment distribution is invariant and equal to ±0.185 is true, it can mean (no mysticism :))))) only one thing.

Note that for a normal distribution, its half (the so-called half-normal distribution) has a nonparametric skew = 0.36279.

In that case we are dealing, on average, with some unknown "half" distribution with nonparametric skew = 0.185; looked at from both sides, it would appear as a symmetric, normal-like distribution.

Questions again:

1. Since you repeatedly use the word "invariant", I ask again: what do you mean by it in this case, for the ratio k = (median - mean)/(standard deviation)?

2. I was curious what data were selected for the analysis. My guess is that the upward steps were analysed separately from the downward steps; otherwise, in samples of 10 thousand or more, both the median and the mean would be hundreds of times smaller than the standard deviation, and the value |k| = 0.185 would be nowhere to be found. Is that right?

3. If so, how can the median be less than the mean in the presence of heavy tails (outliers)? From https://ru.wikipedia.org/wiki/%D0%9C%D0%B5%D0%B4%D0%B8%D0%B0%D0%BD%D0%B0_(%D1%81%D1%82%D0%B0%D1%82%D0%B8%D1%81%D1%82%D0%B8%D0%BA%D0%B0):

"Suppose there are 19 poor people and one millionaire in the same room. Each poor person has $5 and the millionaire has $1 million (106). The total comes to $1,000,095. If we divide the money equally among the 20 people, we get $50,004.75. This will be the arithmetic average of the amount of money that all 20 people in that room had.

The median in this case is $5 (the half-sum of the tenth and eleventh values of the ranked series). We can interpret it as follows: by dividing our company into two equal groups of 10 people, we can say that everyone in the first group has no more than $5, while everyone in the second group has no less than $5. In general, the median is how much the "average" person brought. By contrast, the arithmetic mean is an unsuitable characteristic here, as it significantly exceeds the amount of cash the average person has."
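The quoted example is easy to check numerically, and the same helper can compute the ratio k = (median - mean)/(standard deviation) from question 1 (a minimal sketch; the `nonparametric_skew` name is my own):

```python
import statistics

def nonparametric_skew(xs):
    """k = (median - mean) / standard deviation, as defined in question 1."""
    return (statistics.median(xs) - statistics.fmean(xs)) / statistics.pstdev(xs)

# 19 poor people with $5 each and one millionaire with $1,000,000.
room = [5] * 19 + [1_000_000]

print(sum(room))                 # total: 1000095
print(statistics.fmean(room))    # arithmetic mean: 50004.75
print(statistics.median(room))   # median: 5.0
print(nonparametric_skew(room))  # negative: the single huge outlier drags the mean above the median
```

Note that k comes out negative here under the question's (median - mean) convention, which is exactly the point of question 3: a heavy right tail pulls the mean above the median.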


And a request: could you please, in accordance with your suggestion https://www.mql5.com/ru/forum/218475/page14#comment_6040781

"4. There are no graphs - the arrays are generated dynamically and they are gigantic in size - I only saved the results. In principle, those interested can repeat my experiments in VisSim or MathLab (in this system - not sure, as I haven't worked with it)."

publish here the whole million (and a half) of analysed ticks? I think Excel can handle calculating k for a million rows.
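For the record, computing k over a million rows is also trivial outside Excel. A sketch with NumPy; since the requested tick file is not available here, a standard-normal sample stands in for it (with real data you would load the increments instead):

```python
import numpy as np

rng = np.random.default_rng(0)

# A million synthetic increments stand in for the requested tick file.
increments = rng.standard_normal(1_000_000)

# k = (median - mean) / standard deviation over the whole array.
k = (np.median(increments) - increments.mean()) / increments.std()
print(k)  # close to 0 for symmetric data; the thread claims |k| = 0.185 on real ticks
```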

 
Vladimir:

... So, by analysing ticks we analyse not Forex at all, but the properties of the quote-generation algorithms of a given brokerage company, for a given pair, on a given account type, over the selected period. And there you can find plenty of miracles. For example, serving raw (roughly speaking, unfiltered) or even deliberately tampered quotes (for example, by "overregulation") on demo accounts as a way of luring clients to real accounts. Or such signs of a company's "youth" as allowing a lot of arbitrage (which you probably noticed when you spoke of 7-sigma outliers) even on real accounts.

Good point! And, by the way, a solvable problem: it is enough to take several brokerage companies and compare the tick distributions for the same currency pair. If they differ, then shamanism is at work...
 
Dennis Kirichenko:
Good point! And, by the way, a solvable problem: it is enough to take several brokerage companies and compare the tick distributions for the same currency pair. If they differ, then shamanism is at work...
I agree. Simple filters are necessary. I re-checked: I took the average of each two consecutive ticks. The distribution becomes more compressed and "smooth", i.e. the scale factor changes and it becomes more convenient to work with, while the invariance does not change. And that is good!
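The two-tick averaging filter is easy to reproduce. A sketch on synthetic ticks (the random-walk-plus-noise model is my own assumption, purely to illustrate the compression effect on the increment distribution):

```python
import numpy as np

rng = np.random.default_rng(1)

# Synthetic "ticks": a slow random walk plus microstructure noise
# (an assumed model; real ticks would come from the broker feed).
walk = np.cumsum(rng.standard_normal(100_000)) * 0.0001
ticks = 1.1000 + walk + rng.standard_normal(100_000) * 0.0002

# The filter from the post: replace each tick with the average
# of two consecutive ticks.
filtered = (ticks[:-1] + ticks[1:]) / 2

raw_spread = np.diff(ticks).std()
filtered_spread = np.diff(filtered).std()
print(raw_spread, filtered_spread)  # the filtered increments are more compressed
```

For i.i.d. noise the averaging halves the increment variance, which matches the "more compressed, scale factor changes" observation in the post.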
 

In short, I have not yet managed to find skew = 0.185. I checked it on EURUSD bid ticks. Maybe because there were zeros too? Without them I got something like 0.3.

 

Yes, that's actually what I'm working on at the moment.

If we are dealing with a single distribution which is present "on average" in every timeframe, i.e. in any sample size, then a first-approximation algorithm for solving the problem is as follows:

1. For a particular sample size, the average variance over a long period of time is calculated. The variance changes from one sample to the next, i.e. it is not invariant, and it is its average value that needs to be known.

2. Support/resistance lines are plotted around a weighted moving average (where the weight is the probability density value for a given increment) for the given sample size, taking into account the calculated average variance and the quantiles of the t2-distribution. This is the essential basic construct that describes the "memory" effect of a non-Markovian process.

3. When the price goes beyond these lines, we analyse those coefficients that are invariant on average but whose current value differs from the reference value.

For example, if the nonparametric skew is now 0.4, comparing it with 0.185 we conclude that the distribution is considerably skewed and the price should return to the weighted average, so we trade against the trend. And vice versa.

However, I suspect one invariant coefficient is not enough - we need to find at least one more...
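For what it's worth, steps 1-3 above might be sketched roughly as below. Everything here is an assumption on my part: the window size, reading "t2-distribution" as Student's t with 2 degrees of freedom, and using a Gaussian density as the weighting function. It is an illustration of the idea, not the author's actual code.

```python
import numpy as np

REFERENCE_SKEW = 0.185   # the invariant claimed in the thread
WINDOW = 1000            # "a particular sample size" -- an assumed value

def t2_quantile(p):
    """Closed-form quantile of Student's t with 2 degrees of freedom
    (my reading of the post's t2-distribution)."""
    u = 2.0 * p - 1.0
    return u * np.sqrt(2.0 / (1.0 - u * u))

def signal(prices):
    """Steps 1-3 in rough form; every parameter choice is an assumption."""
    increments = np.diff(prices)[-WINDOW:]
    sigma = increments.std()  # step 1 (one window, not the long-run average)

    # Step 2: weighted moving average, weight = Gaussian density of the
    # increment that led to each price (a stand-in for the post's
    # probability-density weights).
    w = np.exp(-0.5 * (increments / sigma) ** 2)
    wma = prices[-WINDOW:] @ w / w.sum()
    half_width = t2_quantile(0.975) * sigma

    # Step 3: outside the bands, compare the running nonparametric skew
    # with the 0.185 reference before trading against the trend.
    k = (np.median(increments) - increments.mean()) / sigma
    if prices[-1] > wma + half_width and abs(k) > REFERENCE_SKEW:
        return "sell"   # expect reversion to the weighted average
    if prices[-1] < wma - half_width and abs(k) > REFERENCE_SKEW:
        return "buy"
    return "hold"
```

The closed form for the df = 2 quantile follows from inverting F(t) = 1/2 + t / (2 * sqrt(2 + t^2)); at p = 0.975 it gives the familiar value 4.303.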

 
Dennis Kirichenko:

In short, I have not yet managed to find skew = 0.185. I checked it on EURUSD bid ticks. Maybe because there were zeros too? Without them I got something like 0.3.

Well done, Denis! What did you use - MATLAB? Does 0.3 stay the same for all samples???
 
Alexander_K:

1. For a particular sample size, the average variance over a long period of time is calculated. The variance changes from one sample to the next, i.e. it is not invariant, and it is its average value that needs to be known.

2. Support/resistance lines are plotted around a weighted moving average (where the weight is the probability density value for a given increment) for the given sample size, taking into account the calculated average variance and the quantiles of the t2-distribution. This is the essential basic construct that describes the "memory" effect of a non-Markovian process.

3. When the price goes beyond these lines, we analyse those coefficients that are invariant on average but whose current value differs from the reference value.

For example, if the nonparametric skew is now 0.4, comparing it with 0.185 we conclude that the distribution is considerably skewed and the price should return to the weighted average, so we trade against the trend. And vice versa.

Doesn't this bring us back to a parameter that has to be optimised - in this case "a particular sample size"? And that drags along all the "charms" of optimisation, undermining the probabilistic approach.

 
Stanislav Korotky:

Doesn't this bring us back to a parameter that has to be optimised - in this case "a particular sample size"? And that drags along all the "charms" of optimisation, undermining the probabilistic approach.

At present the picture is as follows: entry points are successful when the sample size "covers" the majority of the t2-distribution's values, i.e. 1000 and more. Exit points are not. They somehow depend on other parameters - you cannot say the price will necessarily reach the weighted moving average when trading counter-trend. Sometimes it falls just 100 ticks short and starts moving back up without reaching the moving average. Something to think about. But for exit points you are right: the sample size needs to be optimised...