Machine learning in trading: theory, models, practice and algo-trading - page 656

 
Yuriy Asaulenko:


Well, it's non-stationary. Any process, if you break it into pieces, becomes non-stationary, and if it doesn't, then it isn't random.

I don't get it. In GARCH, the process is decomposed into components, not broken into pieces. The formula itself contains: previous value + noise.
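
For reference, a minimal sketch of what "previous value + noise" means in a textbook GARCH(1,1); the parameter values below are arbitrary, chosen only to illustrate the recursion, and this is not code from anyone in the thread:

set.seed(1)
n      <- 1000
omega  <- 1e-5; alpha <- 0.1; beta <- 0.85        # illustrative GARCH(1,1) parameters
e      <- numeric(n); sigma2 <- numeric(n)
sigma2[1] <- omega / (1 - alpha - beta)           # start at the unconditional variance
e[1]      <- sqrt(sigma2[1]) * rnorm(1)
for (t in 2:n) {
  sigma2[t] <- omega + alpha * e[t - 1]^2 + beta * sigma2[t - 1]  # previous shock + previous variance
  e[t]      <- sqrt(sigma2[t]) * rnorm(1)                         # fresh noise scaled by current volatility
}
plot(e, type = "l")   # increments with volatility clustering

The series itself is modelled as a whole; only the conditional variance is built recursively from previous values, which is the "decomposition into components" rather than cutting into pieces.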

By the way, judging by the distributions over several long intervals (each longer than 3 months), I haven't noticed any significant difference between them.

In publications on GARCH it is shown that to estimate the distribution parameters the number of observations must exceed 5000; fewer than 1000 makes the model unstable.

About the economic sense - well, I don't know. I assume that the market is random to the observer. Whether it is actually random or not doesn't really matter. The key phrase here is "for the observer".

On forex, I completely agree, as I believe that exchange rates are politics.

As for other types of assets: their prices today are detached from fundamentals anyway. The price of oil has changed several times over, while consumption is roughly the same.

 
Maxim Dmitrievsky:


But you didn't get rid of the outliers.


Of course not, and you can't. Moreover, one of the points of GARCH models is how well the model REALLY fits the process after an outlier.

 
SanSanych Fomenko:

Of course not, and you can't. Moreover, one of the points of GARCH models is how well the model REALLY fits the process after an outlier.

I mean that if you take just increments and logarithms of increments, the graphs will be equivalent, but only on a different price scale

 
Dr. Trader:

I've been thinking about this a lot, too.

If the regression model predicts the price change per bar, and R2 is above zero on both fronttests and backtests, that's already a good start. The problem is that the result, although stable, is small: it cannot beat the spread.

Analytically, the problem is that R2 penalizes the model more heavily for large errors while ignoring small errors and wrong trade directions. If you look at the distribution of the increments, most price movements are only a couple of pips. So instead of predicting the correct direction of those small movements, the model learns to predict the long tails of the distribution, for which it gets a higher R2. As a result it more or less predicts the large movements, but constantly gets the direction wrong on the small ones and bleeds it all away to the spread.
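
A rough illustration of that effect with synthetic numbers (nobody's real model; the fat-tailed "increments" and the noisy predictor are made up for the example):

set.seed(42)
n      <- 5000
actual <- rt(n, df = 3) * 2                   # fat-tailed per-bar increments, in pips
pred   <- 0.1 * actual + rnorm(n, sd = 0.8)   # weak predictor: sees big moves, noisy on small ones

r2        <- 1 - sum((actual - pred)^2) / sum((actual - mean(actual))^2)
dir_all   <- mean(sign(pred) == sign(actual))               # direction accuracy, all bars
small     <- abs(actual) < 2                                # the "couple of pips" moves
dir_small <- mean(sign(pred[small]) == sign(actual[small])) # direction accuracy on small moves
c(r2 = r2, dir_all = dir_all, dir_small = dir_small)

R2 comes out above zero because the tails dominate the sum of squares, while the direction accuracy on the small moves stays close to a coin flip - exactly the combination that loses to the spread.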

Conclusion: standard regression metrics are a poor fit for forex. What is needed is some kind of fitness function that takes the trade direction, the spread and the accuracy into account. Then even with an accuracy only slightly above 50% there is a chance of profit.
Accuracy, the Sharpe ratio, the recovery factor and other metrics that evaluate the trade history are too discrete; a neural network with standard backprop will get stuck in a local minimum and won't train properly.
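
A minimal sketch of such a fitness function, under my own assumptions (a fixed spread in pips, trading every bar in the predicted direction); the names and numbers are illustrative, not anyone's working code:

fitness_after_spread <- function(pred, actual_increment, spread_pips = 1) {
  direction <- sign(pred)                                   # +1 long, -1 short, 0 flat
  pnl <- direction * actual_increment - abs(direction) * spread_pips
  sum(pnl)                                                  # total pips after costs
}

# usage with random placeholders instead of real predictions and increments:
set.seed(1)
fitness_after_spread(pred = rnorm(1000), actual_increment = rnorm(1000, sd = 5))

Note that sign() makes the score piecewise-constant with respect to the model weights, which is exactly the "too discrete" problem mentioned above: such a function suits an evolutionary or other global optimizer better than plain backprop.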

An alternative conclusion: completely ignore the network's weak signals and trade only on the strong ones. The problem here is that you can always find a threshold that gives good results on the backtest but bad ones on the fronttest. Something to think about here, too.
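
A sketch of that threshold idea with made-up data, just to show the mechanics of picking the cutoff on one part of the history and re-checking it on a later part:

threshold_pnl <- function(pred, actual, thr, spread_pips = 1) {
  d <- ifelse(abs(pred) > thr, sign(pred), 0)   # trade only the strong signals
  sum(d * actual - abs(d) * spread_pips)
}

set.seed(2)
pred   <- rnorm(2000)
actual <- 0.2 * pred + rnorm(2000, sd = 5)      # synthetic relationship, not real quotes
train  <- 1:1000; test <- 1001:2000
thrs   <- seq(0, 3, by = 0.1)
best   <- thrs[which.max(sapply(thrs, function(t) threshold_pnl(pred[train], actual[train], t)))]
best                                            # threshold chosen on the "backtest"
threshold_pnl(pred[test], actual[test], best)   # how it holds up on the "fronttest"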

R2, IMHO, like logloss, is a rather inconvenient metric because of its non-linearity. For me a simple correlation of returns with predictions is much more convenient: it is roughly the square root of R2, and multiplied by 100 it gives the percentage of the market's movement you can capture. I get 3-5%, but as you correctly said, these signals are too frequent, and filtering or averaging almost completely kills the alpha. I think this is where the effort should go, because you can't squeeze more than 5% out of ordinary data anyway.
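
A tiny example of that metric with random numbers (the 3-5% figure above comes from real data and is not reproduced here):

set.seed(3)
returns <- rnorm(5000)
preds   <- 0.04 * returns + rnorm(5000)   # a deliberately weak predictor
cor(preds, returns) * 100                 # "percent of the movement captured" - a few percent here
summary(lm(returns ~ preds))$r.squared    # compare: R2 is about the square of that correlation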

 
SanSanych Fomenko:

I don't know about economic sense. I assume that the market is random to the observer. Whether it is actually random or not doesn't really matter. The key phrase here is "for the observer".

On forex, I completely agree, as I believe that exchange rates are politics.

As for other types of assets: their prices today are detached from fundamentals anyway. The price of oil has changed several times over, while consumption is roughly the same.

It's different everywhere. I mostly play the stock and futures market. To an observer everything is random; what's really there, who knows. You'd have to be an insider.)

What is "not random" are the pullbacks within a move and the oscillations around the mean (not to be confused with Alexander_K2's approach). From this point of view it is rather catching a move that should be called random, though not a rare event, even a regular one, since we never know when it will come or in which direction.

 
Yuriy Asaulenko:

Yes, it's different everywhere. I mostly play the stock and futures market. To an observer everything is random; what's really there, who knows. You'd have to be an insider.)

What is "not random" - it's rollbacks in the movement and the fluctuations around the mean (not to be confused with Alexander_K2-m). With this approach, rather hitting a movement can be called random, but not rare and even a regular phenomenon, given that we never know when and in which direction it will be.

That efficient market hypothesis is nonsense.

 
Maxim Dmitrievsky:

I mean, if you take just the increments and the logarithms of the increments, the graphs will be equivalent, but only on a different price scale

Strange graphs you have; the logarithm should have compressed them as well. What formula did you use to calculate them? With the decimal logarithm, for example, a 10-fold change in the original data leads to a 2-fold change; the natural logarithm compresses too, only more weakly. Your graphs show no vertical compression.
 
SanSanych Fomenko:

After all, it is not for nothing that GARCH models are applied to increments, and they are the most widespread models at the moment. The basic idea of defeating non-stationarity by decomposing a non-stationary series into components that have quite meaningful economic and statistical sense is very attractive.

Right now GARCH is too complicated for me. The books on it are written mostly for statisticians and econometricians and constantly operate with things I don't understand; to grasp even the basics I would first have to work through a bunch of other material that the books don't explain.

I play with the packages in R, but I haven't got any profit with the default settings even in tests; again, I would need to know what to tweak in the settings and how, and I can't do that at random.
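
For what it's worth, a hedged sketch with the rugarch package (the post doesn't name a package, so rugarch is only an assumption about what "packages in R" means). The settings shown - the GARCH order, the ARMA order of the mean model and the error distribution - are the knobs usually tweaked away from the defaults; the values here are just an example, and `close` is assumed to be a price series you already have:

library(rugarch)

returns <- diff(log(close))   # log-increments of an assumed price series `close`

spec <- ugarchspec(
  variance.model     = list(model = "sGARCH", garchOrder = c(1, 1)),
  mean.model         = list(armaOrder = c(1, 0), include.mean = TRUE),
  distribution.model = "std"   # Student-t errors instead of the default normal
)

fit <- ugarchfit(spec = spec, data = returns)
show(fit)                                 # estimated omega, alpha1, beta1, shape, ...
fc  <- ugarchforecast(fit, n.ahead = 1)   # one-step-ahead mean and volatility forecast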

I believe GARCH can do a lot, but the amount of time I would have to invest to understand it is too great, and I don't have much of it.

 
elibrarius:
Strange graphs you have; the logarithm should have compressed them as well. What formula did you use to calculate them? With the decimal logarithm, for example, a 10-fold change in the original data leads to a 2-fold change; the natural logarithm compresses too, only more weakly. Your graphs show no vertical compression.

log(close[i]/close[i-15])

What is there to compress, and why?
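
The likely reason the two graphs look the same apart from scale: for per-bar ratios close to 1, log(x) is approximately x - 1, so log-increments are nearly identical to plain relative increments. A small check with a made-up series:

set.seed(4)
close   <- cumprod(c(1.1000, 1 + rnorm(500, sd = 0.001)))  # synthetic price path with ~0.1% steps
log_ret <- log(close[-1] / close[-length(close)])          # log(close[i] / close[i-1])
rel_ret <- close[-1] / close[-length(close)] - 1           # plain relative increment
max(abs(log_ret - rel_ret))                                # difference is tiny
cor(log_ret, rel_ret)                                      # essentially 1

Any visible "compression" from the logarithm only shows up when the ratios are far from 1, which per-bar price changes are not.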

 
Maxim Dmitrievsky:

I mean, if you take just the increments and the logarithms of the increments, the graphs will be equivalent, but only on a different price scale

I guess it's not log(open[0] - open[1]),
but log(open[0]/open[1])
