Machine learning in trading: theory, models, practice and algo-trading - page 655

 
Dr. Trader:

I've been thinking about this a lot, too.

If a regression model predicts the price increment per bar and its R2 score is above zero on both backtests and forward tests, that is already a good start. The problem is that the result, although stable, is too small to beat the spread.

Analytically, the problem is that R2 penalizes the model heavily for large errors while ignoring small errors and wrong trade directions. If you look at the distribution of increments, most price movements are only a couple of pips. Instead of predicting the correct direction of such small movements, the model learns to predict the long tails of the distribution, for which it gets a higher R2. As a result, the model predicts large movements reasonably well but keeps getting the direction wrong on small ones and loses to the spread.

The conclusion is that standard regression metrics are a poor fit for forex. Some kind of fitness function is needed that takes trade direction, spread and accuracy into account, and it has to be smooth. Then, even with an accuracy only slightly above 50%, there is a chance of profit.
Accuracy, the Sharpe ratio, the recovery factor and other metrics that evaluate the sequence of trades are too discrete; a neural network trained with standard backpropagation will get stuck in a local minimum and won't train properly.
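A minimal sketch (my own illustration, not from the post) of such a smooth, spread-aware fitness function in R: the position is taken as tanh() of the forecast instead of a hard sign(), so the objective stays differentiable, and the spread is charged in proportion to position changes. The names fitness, pred, actual and the scale constant are assumptions.

fitness <- function(pred, actual, spread = 0.0001, scale = 100) {
  pos   <- tanh(scale * pred)               # smooth "direction" in [-1, 1]
  gross <- pos * actual                     # per-bar gross return of holding that position
  cost  <- spread * abs(diff(c(0, pos)))    # spread paid in proportion to position changes
  mean(gross) - mean(cost)                  # average net return per bar, to be maximized
}

Something like this can be maximized directly instead of minimizing MSE or maximizing R2.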

An alternative conclusion is to ignore the network's weak signals entirely and trade only on strong ones. The problem is that we can always find a threshold that gives good results on the backtest but poor ones on the forward test. Here, too, something has to be devised.
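As a rough illustration of the threshold problem (again my own sketch, reusing the hypothetical fitness() above, with pred_is/ret_is for an earlier stretch of data and pred_oos/ret_oos for a later one): pick the threshold on the earlier stretch, then re-score it on the later one, so the over-fitting of the threshold itself at least becomes visible.

thresholds <- seq(0, 0.002, by = 0.0001)                        # candidate cut-offs for |forecast|
is_scores  <- sapply(thresholds, function(th)
                fitness(ifelse(abs(pred_is) > th, pred_is, 0), ret_is))
best_th    <- thresholds[which.max(is_scores)]                  # best threshold on the earlier stretch
fitness(ifelse(abs(pred_oos) > best_th, pred_oos, 0), ret_oos)  # honest re-check on the later stretch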

Still, the very idea of using regression models in machine learning seems highly questionable to me, especially for predicting increments, and doubly so for neural networks, which are essentially a black box with some layers and perceptrons. What economic or statistical meaning do those words have?

After all, it is not for nothing that GARCH models are used for increments, and they are the most common approach at the moment. The basic idea of defeating non-stationarity by decomposing a non-stationary series into components that have quite definite economic and statistical meaning is very attractive.


In the GARCH approach, the modelling consists of the following steps:

  • The original series is detrended by taking the logarithm of the ratio of neighboring bars (which reduces the influence of outliers).
  • Since it is usually impossible to get rid of non-stationarity completely, the remainder is then modelled in stages:
  • model the remaining trend (ARIMA);
  • model the ARCH/GARCH effects (conditional heteroskedasticity);
  • model the distribution of the increments.

All of it is meaningful, substantive work.
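A minimal sketch of these steps in R, assuming the rugarch package and a numeric vector of closing prices called close (both the package choice and the variable name are my assumptions, not something specified in the post):

library(rugarch)

r <- diff(log(close))                        # log of the ratio of neighboring bars (log-returns)
r <- r[is.finite(r)]

spec <- ugarchspec(
  mean.model         = list(armaOrder = c(1, 1), include.mean = TRUE),  # residual trend (ARMA)
  variance.model     = list(model = "sGARCH", garchOrder = c(1, 1)),    # ARCH/GARCH part
  distribution.model = "std"                 # Student-t for the distribution of increments
)

fit <- ugarchfit(spec, data = r)
ugarchforecast(fit, n.ahead = 1)             # one-step-ahead forecast of mean and volatility

External regressors can be passed via the external.regressors element of mean.model or variance.model in ugarchspec.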

If we add the possibility of external regressors on top of this, we get quite a rich tool, though unfortunately an extremely varied and therefore labor-intensive one.

 
ARIMA+GARCH Trading Strategy on the S&P500 Stock Market Index Using R | QuantStart
  • www.quantstart.com
In this article I want to show you how to apply all of the knowledge gained in the previous time series analysis posts to a trading strategy on the S&P500 US stock market index. We will see that by combining the ARIMA and GARCH models we can significantly outperform a "Buy-and-Hold" approach over the long term. Strategy Overview The idea of the...
 
SanSanych Fomenko:

  • The original series is detrended by taking the logarithm of the ratio of neighboring bars (which reduces the influence of outliers).

on what basis?

 
SanSanych Fomenko:

Still, the very idea of using regression models in machine learning seems highly questionable to me, especially for predicting increments, and doubly so for neural networks, which are essentially a black box with some layers and perceptrons. What economic or statistical meaning do those words have?

After all, it is not for nothing that GARCH models are used for increments, and they are the most common approach at the moment. The basic idea of defeating non-stationarity by decomposing a non-stationary series into components that have quite definite economic and statistical meaning is very attractive.

You are wrong, SanSanych. A neural network is something like an equivalent of fuzzy logic, only trainable. Personally, I don't see anything mysterious in it. You can use other analogies as well.

As for non-stationarity: any process, if you break it into chunks, will turn out to be non-stationary, and if it doesn't, then it isn't random.

By the way, I haven't noticed any significant difference between the distributions over different long stretches (several stretches of 3 months each).

As for the economic sense, well, I don't know. I assume the market is random to the observer. Whether it is actually random or not doesn't really matter; the key words here are "to the observer".

 

You are an interesting man! It turns out you know everything!

 
Maxim Dmitrievsky:

on what basis?

I use the logarithm, what difference does it make?

 
SanSanych Fomenko:

I use the logarithm, what difference does it make?

Because the logarithm in this case does not get rid of outliers; calculating increments with an n-bar lag does get rid of them.

The logarithm simply centers the chart around 0.

To get rid of outliers by means of the logarithm, you would need to switch to a logarithmic scale.

[chart: simple increments]

[chart: logarithm of increments (natural)]
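For reference, a small sketch of how the two plotted series (and the lag-50 variant discussed further down) can be computed, assuming eur is the same xts price series that appears in the summaries below:

library(xts)
r_simple <- diff(eur)                  # simple increments, lag 1
r_lag50  <- diff(eur, lag = 50)        # increments with an n-bar lag (here n = 50)
r_log    <- diff(eur, log = TRUE)      # natural-log increments (log-returns)
plot(r_simple)                         # first chart: simple increments
plot(r_log)                            # second chart: log increments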


 
Maxim Dmitrievsky:

Because the logarithm in this case does not get rid of outliers; calculating increments with an n-bar lag does get rid of them.

The logarithm simply centers the chart around 0.

To get rid of outliers by means of the logarithm, you would need to switch to a logarithmic scale.



Outliers are a tricky thing. It is better to replace excessively large outliers with a more acceptable maximum.

It is impossible to get rid of outliers completely. But minimizing their impact on the distribution is both possible and necessary, and that is what the logarithm does.

> summary(diff(eur))
     Index                       diff(eur)         
 Min.   :2016-01-04 00:00:00   Min.   :-0.0230100  
 1st Qu.:2016-04-14 19:00:00   1st Qu.:-0.0005300  
 Median :2016-07-27 12:00:00   Median : 0.0000100  
 Mean   :2016-07-27 12:01:14   Mean   :-0.0000036  
 3rd Qu.:2016-11-08 06:00:00   3rd Qu.: 0.0005200  
 Max.   :2017-02-17 23:00:00   Max.   : 0.0143400  


> summary((diff(eur, log=T)))
     Index                     (diff(eur, log = T))
 Min.   :2016-01-04 00:00:00   Min.   :-0.0206443  
 1st Qu.:2016-04-14 19:00:00   1st Qu.:-0.0004810  
 Median :2016-07-27 12:00:00   Median : 0.0000090  
 Mean   :2016-07-27 12:01:14   Mean   :-0.0000034  
 3rd Qu.:2016-11-08 06:00:00   3rd Qu.: 0.0004755  
 Max.   :2017-02-17 23:00:00   Max.   : 0.0127862  
                               NA's   :1


If we take a hypothetical case of two neighboring quotes, 10 and 2, then

10/2 = 5

ln(10/2) ≈ 1.61
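A minimal sketch (my own, with an arbitrary 99th-percentile cut-off) of the idea above of replacing too-large outliers with a more acceptable maximum:

winsorize <- function(x, p = 0.99) {
  x  <- as.numeric(x)                        # work on a plain numeric vector
  hi <- quantile(abs(x), p, na.rm = TRUE)    # threshold: p-quantile of the absolute increments
  pmin(pmax(x, -hi), hi)                     # clip both tails to +/- threshold
}

# e.g. r_clipped <- winsorize(diff(eur))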

 
Maxim Dmitrievsky:

Because the logarithm in this case does not get rid of outliers; calculating increments with an n-bar lag does get rid of them.



An n-bar lag is just an increase in the timeframe, and the larger the timeframe, the larger the increments.

Your lag of 50 is essentially H8, only more fine-grained in the sense that your 8-hour bar starts every minute, unlike on a regular chart.

 
SanSanych Fomenko:

Outliers are a tricky thing. It is better to replace excessively large outliers with a more acceptable maximum.

It is impossible to get rid of outliers completely. But minimizing their impact on the distribution is both possible and necessary, and that is what the logarithm does.



If we take a hypothetical case of two neighboring quotes, 10 and 2, then

10/2 = 5

ln(10/2) ≈ 1.61

Well, fine, you have found the power to which the base e must be raised to get back the original increment,

but you still haven't gotten rid of the outliers.

I posted the two charts above.
