Volumes, volatility and Hurst index - page 6

 

Table 2a
n N K R M D
2 4 52000 2.3818 1.5070 4.0252
3 8 56000 3.6364 2.1770 7.9456
4 16 95000 5.4861 3.1450 15.9989
5 32 134000 8.1050 4.4831 32.0493
6 64 185000 11.8046 6.3378 63.6909
7 128 250000 17.1001 9.0244 128.6451
8 256 317000 24.5862 12.7986 257.5228
9 512 481000 35.1518 18.0730 513.5267
10 1024 639000 50.0614 25.5199 1022.8466
11 2048 936000 71.2224 36.1104 2048.1000
12 4096 1381000 101.1421 51.0515 4097.8097
13 8192 1640000 143.4602 72.2285 8198.6059
14 16384 2452000 203.3874 102.2592 16425.9632
15 32768 3183000 287.8928 144.5695 32858.2299
 
Table 2b
n N LOG(R) LOG(M) LOG(D) LOG(N) Hurst
2 4 1.2520 0.5917 2.0090 2.0000
3 8 1.8625 1.1224 2.9902 3.0000 0.6105
4 16 2.4558 1.6531 3.9999 4.0000 0.5932
5 32 3.0188 2.1645 5.0022 5.0000 0.5630
6 64 3.5613 2.6640 5.9930 6.0000 0.5425
7 128 4.0959 3.1738 7.0073 7.0000 0.5346
8 256 4.6198 3.6779 8.0086 8.0000 0.5238
9 512 5.1355 4.1758 9.0043 9.0000 0.5158
10 1024 5.6456 4.6735 9.9984 10.0000 0.5101
11 2048 6.1543 5.1743 11.0001 11.0000 0.5086
12 4096 6.6602 5.6739 12.0006 12.0000 0.5060
13 8192 7.1645 6.1745 13.0012 13.0000 0.5043
14 16384 7.6681 6.6761 14.0037 14.0000 0.5036
15 32768 8.1694 7.1756 15.0040 15.0000 0.5013
 

The third column of Table 2a shows K, the number of intervals that had to be generated to reach the specified accuracy acc=0.001. Given that the total number of possible trajectories is 2^N, from N=32 onward K is only a tiny fraction of that total, and the fraction shrinks rapidly as N grows.
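
The size of this fraction can be checked directly from the K column of Table 2a. A minimal Python sketch follows; the (N, K) pairs are copied from the table, the helper name `log2_fraction` is chosen here for illustration, and the ratio is reported as a base-2 logarithm because 2^N quickly exceeds floating-point range.

```python
import math

# (N, K) pairs copied from Table 2a.
TABLE_2A_K = {
    4: 52000, 8: 56000, 16: 95000, 32: 134000, 64: 185000,
    128: 250000, 256: 317000, 512: 481000, 1024: 639000,
    2048: 936000, 4096: 1381000, 8192: 1640000,
    16384: 2452000, 32768: 3183000,
}


def log2_fraction(n_steps: int, k_samples: int) -> float:
    """log2 of K / 2^N, the sampled share of all possible trajectories."""
    return math.log2(k_samples) - n_steps


for n_steps, k_samples in TABLE_2A_K.items():
    print(f"N={n_steps:>5}  K/2^N = 2^{log2_fraction(n_steps, k_samples):.1f}")
```

A positive exponent means more intervals were generated than there are distinct trajectories, which is the case up to N=16; from N=32 the exponent is negative and falls by roughly N with each row.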

From a practical point of view, however, this is little comfort. Based on the tick density in 2009, an interval of N=16384 corresponds to roughly one day. Calculating the average range R to an accuracy of 0.001 in a stationary market would therefore take 2452000 trading days (i.e. 9430 years), which is unlikely to interest anyone. If the accuracy requirement is relaxed significantly, though, statistically adequate data sets may be within reach.
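
The arithmetic behind these figures, and the effect of relaxing the accuracy, can be sketched as follows. Two things here are assumptions rather than statements from the post: the figure of 260 trading days per year (inferred from 2452000 days being about 9430 years), and the usual Monte Carlo rule that the required sample size grows as 1/acc^2, which may not match the stopping criterion actually used for Table 2a.

```python
TRADING_DAYS_PER_YEAR = 260   # assumption: inferred from 2452000 days ~ 9430 years
K_AT_N_16384 = 2452000        # from Table 2a, for acc = 0.001
BASE_ACC = 0.001


def years_needed(k_intervals: float) -> float:
    """Years needed to collect k_intervals of about one trading day each."""
    return k_intervals / TRADING_DAYS_PER_YEAR


def k_for_accuracy(acc: float) -> float:
    """Assumed scaling K ~ 1/acc^2, anchored at the Table 2a value."""
    return K_AT_N_16384 * (BASE_ACC / acc) ** 2


for acc in (0.001, 0.01, 0.1):
    k = k_for_accuracy(acc)
    print(f"acc={acc}: K ~ {k:,.0f} intervals, ~ {years_needed(k):,.1f} years")
```

Under that assumed scaling, each tenfold relaxation of the accuracy cuts the required history by a factor of one hundred.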

The sixth column (D) of Table 2a matches the second (N) quite closely, and in Table 2b LOG(D) likewise matches LOG(N), as it should according to the formula for the variance of increments given earlier. The values of R at N=4, 8 and 16 also agree with the corresponding values in the previous table, which lists the exact theoretical values of the mean range. So the chosen accuracy level and the corresponding sample sizes K do ensure that the resulting data are reliable.
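
For small N the exact means can be obtained by brute-force enumeration of all 2^N trajectories, which gives an independent check of the first rows of Table 2a. The sketch below assumes a random walk with steps of +1 or -1, the range R measured as maximum minus minimum of the path including the starting point 0, M as the mean absolute increment over the interval and D as the mean squared increment. These definitions are an assumed reading of the columns, not something stated in this post, and `exact_means` is an illustrative name.

```python
from itertools import accumulate, product


def exact_means(n_steps: int) -> tuple[float, float, float]:
    """Average R, M and D over all 2^n_steps paths of a +/-1 random walk."""
    total_range = total_abs = total_sq = 0
    for steps in product((-1, 1), repeat=n_steps):
        # Path positions, including the starting point 0.
        path = [0, *accumulate(steps)]
        total_range += max(path) - min(path)
        total_abs += abs(path[-1])
        total_sq += path[-1] ** 2
    n_paths = 2 ** n_steps
    return total_range / n_paths, total_abs / n_paths, total_sq / n_paths


for n_steps in (4, 8, 16):
    r, m, d = exact_means(n_steps)
    print(f"N={n_steps:>2}  R={r:.4f}  M={m:.4f}  D={d:.4f}")
```

For N=4 this gives R=2.375, M=1.5 and D=4, close to the first row of Table 2a. Enumeration stops being feasible well before N=32, which is why larger N need sampling.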

The main interest lies in the last column of Table 2b, which gives the values of the Hurst index. The result in the n-th row was calculated from two points, the n-th and the previous one. Theoretically, the Hurst index for the random walk (SB) considered here should equal 0.5. As we can see, it does not. For small intervals N the index differs significantly from 0.5, and it approaches 0.5 only as N increases, apparently asymptotically. I want to stress how fundamental this point is: depending on the interval lengths into which we divide the series to calculate the Hurst index, we get markedly different values. Therefore, when trying to assess the character of SR with the Hurst index, we must either have a tabulated curve for a pure SB (the required calibration) against which to compare the experimental data, or use very large intervals. Both options are practically unacceptable for real use.
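
The two-point calculation described above can be reproduced from the LOG(R) column of Table 2b. A minimal sketch: the table's logarithms are evidently base 2 (LOG(N) equals n), so LOG(N) grows by exactly 1 per row and the estimate reduces to the difference of neighbouring LOG(R) values. The function name `two_point_hurst` is illustrative.

```python
# LOG(R) values from Table 2b for n = 2..15, where LOG(N) = n.
LOG_N = list(range(2, 16))
LOG_R = [1.2520, 1.8625, 2.4558, 3.0188, 3.5613, 4.0959, 4.6198,
         5.1355, 5.6456, 6.1543, 6.6602, 7.1645, 7.6681, 8.1694]


def two_point_hurst(log_n: list[float], log_r: list[float]) -> list[float]:
    """Slope of log R versus log N between each point and the previous one."""
    return [
        (log_r[i] - log_r[i - 1]) / (log_n[i] - log_n[i - 1])
        for i in range(1, len(log_r))
    ]


for n, h in zip(LOG_N[1:], two_point_hurst(LOG_N, LOG_R)):
    print(f"n={n:>2}  H={h:.4f}")
```

The results match the Hurst column to within rounding of the last digit and fall steadily from about 0.61 towards 0.5 without reaching it.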

 

To illustrate, R, M and D are plotted against N in log-log coordinates.

The red line, which shows LOG(R) as a function of LOG(N), is not straight. To make this visible, two lines, Line-1 and Line-2, are drawn on the graph: the first through the first pair of points of the red curve, the second through the last pair. The Hurst index is defined as the tangent of the curve's slope angle relative to the X-axis and, as the graph shows, this angle varies from point to point.

The LOG(M) line is also a curve, although less strongly curved than LOG(R). It has the same asymptotic slope of 0.5 and therefore never intersects the red curve. Of the three, only LOG(D) is a straight line.

In principle, any of these three lines could be used to calculate the Hurst index. Unfortunately, none of them is clearly preferable: each has its advantages and its disadvantages. The disadvantages are, alas, significant enough to make practical use in trading ineffective.
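
How the three lines compare as estimators can be seen by taking the slope between the first pair and the last pair of points in each column of Table 2b. This is a sketch under the standard assumption that R and M scale as N^H while the variance D scales as N^(2H), so the LOG(D) slope is halved; the helper `end_slopes` is a hypothetical name.

```python
# First two and last two rows of Table 2b (n = 2, 3 and n = 14, 15).
# The logarithms are base 2, so LOG(N) changes by 1 between adjacent rows.
FIRST_PAIR = {"R": (1.2520, 1.8625), "M": (0.5917, 1.1224), "D": (2.0090, 2.9902)}
LAST_PAIR = {"R": (7.6681, 8.1694), "M": (6.6761, 7.1756), "D": (14.0037, 15.0040)}

# Assumed exponent of N for each quantity: H for R and M, 2H for D.
SCALE = {"R": 1.0, "M": 1.0, "D": 2.0}


def end_slopes(name: str) -> tuple[float, float]:
    """Hurst estimates from the first and the last pair of points of a column."""
    lo0, lo1 = FIRST_PAIR[name]
    hi0, hi1 = LAST_PAIR[name]
    return (lo1 - lo0) / SCALE[name], (hi1 - hi0) / SCALE[name]


for name in ("R", "M", "D"):
    first, last = end_slopes(name)
    print(f"{name}: first pair H={first:.4f}, last pair H={last:.4f}")
```

With the table's values the estimate from R moves from about 0.61 to 0.50 and the one from M from about 0.53 to 0.50, while the one from D is about 0.49 and 0.50.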

Thus we draw the following conclusions.

The Hurst index is not a "good" market characteristic, because it depends on how the time series is partitioned into intervals. To get proper results, this dependence must be known and used to reduce them to a normal form.

The Hurst index is meaningful as a global characteristic of stationary series with fairly large statistics. A market process is not stationary and needs local characteristics with a short lag for its description. Using the Hurst exponent in that role is very problematic.

 
Nevertheless, someone on the forum persisted in arguing that Hurst could be useful. Who was it?
 

Very useful, cleaned up the folder - half a dozen fewer indicators...

 
Mathemat:
Nevertheless, someone on the forum persisted in arguing that Hurst could be useful. Who was it?


Was it me? :-)

 
Isn't it Neutron?
 
joo:
Isn't it Neutron?

I don't think we ever figured out how to calculate it correctly (I mean the classic version): https://www.mql5.com/ru/forum/102239/page13
 
As far as the SB model series considered here is concerned, I am confident that the calculation is correct. Arbitrary series, however, still need to be brought to an appropriate form first; otherwise the result may be nonsense. We still need to think about this reduction procedure.