Machine learning in trading: theory, models, practice and algo-trading - page 1590

 
Andrey:

I read this article about 5 years ago. It is interesting, but it does not add much: the author manipulates OHLC data to get a more "convenient" volatility metric, which is not new in principle. The classic Dacorogna book, "An Introduction to High-Frequency Finance", back in the last century already recommended using the mean absolute value of returns, not the RMS value, as a measure of volatility. The predictability of volatility is also a well-known fact. It comes from two factors, seasonality and inertia, which account for 95% of its predictability. But even if we normalize the (log) returns by volatility, that gives us nothing: for trading we need the sign, and this does not affect its distribution in any way.
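
To make the difference between the two volatility measures concrete, here is a minimal Python sketch on synthetic Gaussian returns. The value of `sigma` and the size of the injected spike are assumptions chosen for illustration, not numbers from the book or the article. It only shows that the mean absolute return reacts much less to a single outlier than the RMS does.

```python
import math
import random

def volatility_measures(returns):
    """Return (mean absolute return, RMS return) for a list of returns."""
    n = len(returns)
    mean_abs = sum(abs(r) for r in returns) / n
    rms = math.sqrt(sum(r * r for r in returns) / n)
    return mean_abs, rms

rng = random.Random(0)
sigma = 0.001  # assumed per-bar volatility, purely illustrative
returns = [rng.gauss(0.0, sigma) for _ in range(100_000)]

mean_abs, rms = volatility_measures(returns)
ratio = mean_abs / rms  # for Gaussian returns this tends to sqrt(2 / pi)

# Add one large spike (assumed size) and see how much each measure moves.
spiked_abs, spiked_rms = volatility_measures(returns + [50 * sigma])
abs_change = spiked_abs / mean_abs - 1.0
rms_change = spiked_rms / rms - 1.0
print(f"ratio={ratio:.3f} abs_change={abs_change:.5f} rms_change={rms_change:.5f}")
```

On heavy-tailed data the gap between the two measures would be larger; Gaussian noise is used here only to keep the sketch self-contained.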

For example, take Gaussian noise: it is obviously impossible to predict the next samples from the previous ones, regardless of stationarity. But if you sort that series, for example, the distribution does not change, while the series becomes completely predictable. You can then vary the volatility dynamically over a wide range, making the series non-stationary but still easily predictable.
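
The sorting argument can be checked in a few lines of Python. This is only a sketch: lag-1 autocorrelation is used here as a crude stand-in for "predictability", which is my choice and not something stated in the post.

```python
import random

def lag1_autocorr(xs):
    """Sample lag-1 autocorrelation of a sequence."""
    n = len(xs)
    mean = sum(xs) / n
    var = sum((x - mean) ** 2 for x in xs)
    cov = sum((xs[i] - mean) * (xs[i + 1] - mean) for i in range(n - 1))
    return cov / var

rng = random.Random(1)
noise = [rng.gauss(0.0, 1.0) for _ in range(10_000)]
sorted_noise = sorted(noise)  # same values, hence the same histogram

ac_noise = lag1_autocorr(noise)          # close to 0: the past says nothing
ac_sorted = lag1_autocorr(sorted_noise)  # close to 1: the next value is almost known
print(f"noise: {ac_noise:.3f}, sorted: {ac_sorted:.3f}")
```

Both series contain exactly the same set of values; only the ordering differs, and the ordering is what carries the predictability.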

It makes some sense to do all this not on a single timeframe but over a range of them, comparing the resulting picture with what a Gaussian random walk with similar variance would give.

 
Aleksey Nikolayev:

If rigor is needed, we can assume that we are talking about the lack of wide-sense stationarity of the log returns, for example.

https://github.com/BlackArbsCEO/mixture_model_trading_public/blob/master/notebooks/current_public_notebooks/03_Are_Gaussian_Mixture_Components_More_Stationary_2019-01-01.ipynb

 
Aleksey Nikolayev:

It makes some sense to do all this not on a single timeframe but over a range of them, comparing the resulting picture with what a Gaussian random walk with similar variance would give.

Regarding the distribution of returns, it is very important that the higher the timeframe, the more Gaussian the distribution becomes, for the trivial reason of averaging (we all remember that aggregating non-normal distributions gives a normal one). The only truly "random" events in the market are changes of the best bid/ask caused by placing or withdrawing an order, or by a trade. Aggregating ticks even over one minute already changes the distribution, bringing it closer to Gaussian (to get a Gaussian from a uniform distribution, it is enough to sum 12 samples). The real market distribution is only the tick distribution, and it is not normal at all.
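
The remark about 12 uniform samples can be illustrated with a short Python sketch. Excess kurtosis is used as the measure of "distance from Gaussian"; that choice and the sample size are my assumptions. For a single uniform variable the theoretical excess kurtosis is -1.2, for a sum of 12 it is -0.1, and for a Gaussian it is 0.

```python
import random

def excess_kurtosis(xs):
    """Sample excess kurtosis (0 for a Gaussian)."""
    n = len(xs)
    mean = sum(xs) / n
    m2 = sum((x - mean) ** 2 for x in xs) / n
    m4 = sum((x - mean) ** 4 for x in xs) / n
    return m4 / (m2 * m2) - 3.0

rng = random.Random(2)
n_samples = 50_000

uniform = [rng.random() for _ in range(n_samples)]
# Sum of 12 U(0, 1) draws, shifted by 6: mean 0 and variance 1.
sum12 = [sum(rng.random() for _ in range(12)) - 6.0 for _ in range(n_samples)]

k_uniform = excess_kurtosis(uniform)
k_sum12 = excess_kurtosis(sum12)
print(f"uniform: {k_uniform:.2f}, sum of 12: {k_sum12:.2f}")
```

The same mechanism is at work when ticks are aggregated into bars, although real ticks are neither independent nor identically distributed, so the convergence there is not guaranteed to be this fast.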

 
Andrey:

Regarding the distribution of returns, it is very important that the higher the timeframe, the more Gaussian the distribution becomes, for the trivial reason of averaging (we all remember that aggregating non-normal distributions gives a normal one). The only truly "random" events in the market are changes of the best bid/ask caused by placing or withdrawing an order, or by a trade. Aggregating ticks even over one minute already changes the distribution, bringing it closer to Gaussian (to get a Gaussian from a uniform distribution, it is enough to sum 12 samples). The real market distribution is only the tick distribution, and it is not normal at all.

For currencies, even that is not real. More precisely, it is not real at all and not "normal" at all (and not only in the sense of distributions).

Because there is no center. There is no single source of ticks and no guarantee that they will reach the user. Not only is the "hypothetical tick flow" of a particular server the product of aggregating other servers, but that flow is also thinned out by both the server and the terminal for technical reasons.

The statistical characteristics of the ticks depend on the particular dealing center (DC), its peers, and their software.

 
Andrey:

Regarding the distribution of returns, it is very important that the higher the timeframe, the more Gaussian the distribution becomes, for the trivial reason of averaging (we all remember that aggregating non-normal distributions gives a normal one). The only truly "random" events in the market are changes of the best bid/ask caused by placing or withdrawing an order, or by a trade. Aggregating ticks even over one minute already changes the distribution, bringing it closer to Gaussian (to get a Gaussian from a uniform distribution, it is enough to sum 12 samples). The real market distribution is only the tick distribution, and it is not normal at all.

Still, at the tick level a more correct model is some variation of the Poisson process, e.g. a compound Poisson process with a discrete jump distribution and a non-constant intensity (a non-constant function of time). This, however, ignores the discreteness of real trading time.
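
A compound Poisson process of this kind can be simulated with the standard thinning method. The Python sketch below is only an illustration: the cosine-shaped intensity `seasonal_intensity`, the jump sizes, and their probabilities are hypothetical and are not fitted to any market data.

```python
import math
import random

def simulate_ticks(t_end, intensity, lam_max, jumps, weights, rng):
    """Simulate (time, price) ticks of a compound Poisson process by thinning.

    intensity: function t -> rate, with 0 <= intensity(t) <= lam_max.
    jumps, weights: discrete jump sizes (in points) and their probabilities.
    """
    t, price, ticks = 0.0, 0, []
    while True:
        t += rng.expovariate(lam_max)  # candidate event at the maximal rate
        if t >= t_end:
            return ticks
        # Keep the candidate with probability intensity(t) / lam_max.
        if rng.random() * lam_max <= intensity(t):
            price += rng.choices(jumps, weights)[0]
            ticks.append((t, price))

def seasonal_intensity(t):
    """Hypothetical rate (ticks per second): busy at the edges of the hour."""
    return 1.0 + 0.8 * math.cos(2.0 * math.pi * t / 3600.0)

rng = random.Random(3)
ticks = simulate_ticks(3600.0, seasonal_intensity, 1.8,
                       jumps=[-2, -1, 1, 2], weights=[0.1, 0.4, 0.4, 0.1], rng=rng)

busy = sum(1 for t, _ in ticks if t < 900 or t >= 2700)
quiet = len(ticks) - busy
print(len(ticks), busy, quiet)
```

Time is continuous in this sketch, so it shares the limitation mentioned above: the discreteness of real trading time is ignored.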

The shape of the histogram depends on which segments end up in the sample (Maxim Dmitrievsky wrote about mixtures just above). Sometimes the result is even a bimodal (double-humped) histogram.

 

Since I don't know how to port a full Markov model to MetaTrader, the idea is to cluster all the seasonal components in Python, then train a simple ML model to predict the clusters and check it on a test sample. Then port it to the terminal. This will be bomb #3.

Each cluster is expected to have a constant variance and mean.
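
A heavily simplified Python sketch of the clustering step, under loud assumptions: the data are synthetic, the only seasonal component is the hour of the day, the clustering is a plain two-cluster k-means on the hourly mean absolute return, and a per-hour lookup table stands in for the ML model. No train/test split is shown. None of this is the actual pipeline from the post.

```python
import random

def two_means_1d(values, n_iter=20):
    """Plain 1-D k-means with k=2; label 1 marks the higher-valued cluster.

    Assumes the values are not all equal, so neither cluster ends up empty.
    """
    lo, hi = min(values), max(values)
    labels = [0] * len(values)
    for _ in range(n_iter):
        labels = [int(abs(v - hi) < abs(v - lo)) for v in values]
        group0 = [v for v, lab in zip(values, labels) if lab == 0]
        group1 = [v for v, lab in zip(values, labels) if lab == 1]
        lo = sum(group0) / len(group0)
        hi = sum(group1) / len(group1)
    return labels

ACTIVE_HOURS = range(8, 17)  # assumed "active session", purely illustrative

def sigma_for(hour):
    return 0.002 if hour in ACTIVE_HOURS else 0.0005

rng = random.Random(4)
# 200 synthetic days of hourly returns, grouped by hour of day.
by_hour = {h: [rng.gauss(0.0, sigma_for(h)) for _ in range(200)] for h in range(24)}

# Seasonal profile: mean absolute return per hour, then cluster the 24 hours.
profile = [sum(abs(r) for r in by_hour[h]) / len(by_hour[h]) for h in range(24)]
hour_to_cluster = two_means_1d(profile)

def predict_cluster(hour):
    """Lookup standing in for the ML model that would predict the cluster."""
    return hour_to_cluster[hour % 24]

def cluster_variance(cluster):
    rs = [r for h in range(24) if hour_to_cluster[h] == cluster for r in by_hour[h]]
    mean = sum(rs) / len(rs)
    return sum((r - mean) ** 2 for r in rs) / len(rs)

print(hour_to_cluster, cluster_variance(0), cluster_variance(1))
```

In this toy setup each cluster has a constant variance by construction; on real data that is exactly the property that would have to be checked out of sample.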

 
Maxim Dmitrievsky:

Since I don't know how to port a full Markov model to MetaTrader, the idea is to cluster all the seasonal components in Python, then train a simple ML model to predict the clusters and check it on a test sample. Then port it to the terminal. This will be bomb #3.

Each cluster is expected to have a constant variance and mean.

I think that even if both drift within narrow limits, that's not a big problem either.
 
Maxim Dmitrievsky:

Since I don't know how to port a full Markov model to MetaTrader, the idea is to cluster all the seasonal components in Python, then train a simple ML model to predict the clusters and check it on a test sample. Then port it to the terminal. This will be bomb #3.

Each cluster is expected to have a constant variance and mean.

Bomb #5 is trend moves at specific rates that repeat at equal intervals. But you have to get through number 4 first.
 
Aleksey Nikolayev:

Still, at the tick level a more correct model is some variation of the Poisson process, for example a compound Poisson process with a discrete jump distribution and a non-constant intensity (NOT a constant function of time).

It won't work, for many reasons. This has been researched for a long time, and it's not even about tick filtering by the DC server.

From what I know where to find: https://www.mql5.com/ru/forum/102066/page9#comment_2968124 in that picture, the arrow marks a spike.

Such ticks will always be there; that is how the market works. Why they occur is another question.

And if you follow your assumption about jumps in ticks, you will be counting these spikes, but such ticks simply do not set the direction of the subsequent movement; at most they may end up at the high/low of a bar.

I couldn't find any screenshots of Prival's tick indicator; it displays these spikes very well. There is no need to guess where such a tick comes from: one explanation is that the MM often mixes ask/bid with a time lag into the quote flow, but it is a real quote! )))

 
Maxim Kuznetsov:
Bomb #5 is trend moves at specific rates that repeat at equal intervals. But you have to get through number 4 first.

That's already in bomb number 2 :)
