Machine learning in trading: theory, models, practice and algo-trading - page 469

 

Here's the code to find the ARIMA parameters; you can find it in the attachment to the post at the link. I corrected that first post a bit, replacing the first, less successful example with a normal one.

  arimaModel <- auto.arima(y = ts(DT$value[trainIndexes],frequency=48),
                           seasonal.test = "ocsb",
                           trace=TRUE,
                           stepwise = FALSE,
                           max.q = 48, 
                           max.order = 48+5
                         )

The auto.arima function searches for suitable p, d, q, P, D, Q parameters by itself.
ts(DT$value[trainIndexes],frequency=48) # the data are converted to the forecast package's format; the main thing is to specify frequency, otherwise seasonality will not be used
seasonal.test = "ocsb" # google says it is better, I don't know for sure
stepwise = FALSE # FALSE enables a fuller parameter search; the default TRUE means the search will most likely get stuck in a local minimum and stop
max.q = 48 # maximum value of q during the search; the default of 5 is not much for this data
max.order = 48+5 # maximum of the sum p+q+P+Q; the default of 5 is not much for this data

The function will take a long time, but it should end up with the same parameters I used, and maybe even find better ones.

I didn't wait for the function to finish myself; I just picked suitable parameters by intuition. The data are trended, so p = 1 and P = 1. And the graph shows prevailing periods of 24 and 48, so q = 24 and Q = 48/frequency = 1.
I couldn't fit period 336 into the ARIMA as well: that requires a second seasonality, which Arima in the forecast package can't do.
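(If both periods 48 and 336 were really needed at once, the same forecast package can model multiple seasonal periods outside of ARIMA, via msts() and tbats(); a minimal sketch, reusing DT and trainIndexes from above:)

  library(forecast)

  # TBATS is a different model class, not ARIMA, but it accepts several periods
  yMulti     <- msts(DT$value[trainIndexes], seasonal.periods = c(48, 336))
  tbatsModel <- tbats(yMulti)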

An Arima model with already-known parameters p, d, q, P, D, Q is created like this:

Arima(y = ts(DT$value[trainIndexes],frequency=48), order = c(1, 0, 24), seasonal = c(1, 0, 1))

The seasonality is actually not (1,0,1) but (1,0,48), because Q is effectively multiplied by the frequency.
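Once fitted, such a model can be used for prediction right away; a minimal sketch (again assuming DT and trainIndexes as above, and the forecast package loaded):

  library(forecast)

  fit <- Arima(y = ts(DT$value[trainIndexes], frequency = 48),
               order = c(1, 0, 24), seasonal = c(1, 0, 1))
  fc  <- forecast(fit, h = 48)   # predict one full season (48 steps) ahead
  plot(fc)                       # point forecast plus prediction intervals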



SanSanych Fomenko:

Discussing ARIMA without analyzing the residuals for ARCH is a completely empty exercise. There are series that have a stationary residual after an ARIMA fit, but discussing prediction error on the assumption that the residual is stationary is not serious: such a residual is anything but simple.

Yes, I agree. It's just that this data is very cyclic and simple, which is why ARIMA works without problems here. If I feed EURUSD M30 into the same code, the model fails to catch the sharp price jumps in the new data.
 
Dr. Trader:

Here's the code to find the ARIMA parameters; you can find it in the attachment to the post at the link. [...]

I was not interested in these "optimal" parameters but in the parameters that are the coefficients of the regression equation - it prints them after fitting.
 
summary(arimaModel)
Series: ts(DT$value[trainIndexes], frequency = period) 
ARIMA(1,0,24)(1,0,1)[48] with non-zero mean 

Coefficients:
         ar1     ma1     ma2     ma3     ma4     ma5     ma6     ma7     ma8     ma9    ma10    ma11    ma12    ma13    ma14    ma15    ma16    ma17    ma18
      0.8531  0.3469  0.3324  0.3512  0.3564  0.3176  0.2676  0.2223  0.1904  0.2015  0.2241  0.2529  0.2424  0.2383  0.2408  0.2507  0.2279  0.1701  0.1418
s.e.  0.0316  0.0350  0.0413  0.0462  0.0506  0.0542  0.0559  0.0554  0.0537  0.0514  0.0494  0.0481  0.0477  0.0469  0.0455  0.0451  0.0448  0.0439  0.0415
        ma19    ma20   ma21     ma22     ma23     ma24    sar1     sma1       mean
      0.0813  0.0525  0.028  -0.0152  -0.0226  -0.0159  0.9899  -0.4300  1816.9447
s.e.  0.0390  0.0358  0.032   0.0280   0.0224   0.0180  0.0015   0.0132   687.9652

sigma^2 estimated as 1442:  log likelihood=-23883.84
AIC=47825.68   AICc=47826.05   BIC=48012.95

Training set error measures:
                     ME     RMSE      MAE         MPE     MAPE      MASE         ACF1
Training set -0.1648644 37.86381 25.64976 -0.07217873 1.573367 0.1610166 0.0002493082
Files:
arimaModel.zip  140 kb
 
Dr. Trader:

It's a strange table.

Nevertheless.

Compare the value of each coefficient with its s.e.: with rare exceptions, the s.e. is more than 10% of the coefficient. For some reason I don't see a t-statistic, but taking it head-on, that 10% means the following.

The null hypothesis for a coefficient estimate is that the coefficient is not significant. An s.e. of more than 10% of the coefficient says that all these coefficients are NOT significant, i.e. you have NO regression equation.


PS.

Usually the coefficients that are significant are marked with asterisks. Since the coefficients are NOT significant, all the other numbers are just numbers.

hist(residuals(arimaModel), breaks = 100)


The reason the coefficients are NOT significant is that the left tail is thicker than the right one.

There are tests that let you identify such problems quantitatively, not by eye, and pick tools to deal with them.
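For instance, a sketch of such checks in R (Box.test and shapiro.test are base R; ArchTest assumes the FinTS package is installed):

  res <- residuals(arimaModel)

  Box.test(res, lag = 48, type = "Ljung-Box")          # autocorrelation left in the residuals?
  shapiro.test(res[seq_len(min(length(res), 5000))])   # normality (the test takes at most 5000 points)
  FinTS::ArchTest(res, lags = 12)                      # ARCH effects, the point raised above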

Conclusion:

The ARIMA model is not applicable to the time series used.

 
Maxim Dmitrievsky:

In the market any classifier gets overtrained, because the market is not stationary. If we don't want to overtrain, we would have to train the NS on the whole history; otherwise it will always turn out that the market cycle has changed and the model is spoiled. That's why the only correct approach is retraining, i.e. re-training in the process of trading :) We don't believe in grails that will stably give 1000% monthly over a 15-year history without any intervention.

In general, I still don't see that edge: what is an overtrained NN in forex? Is it when it's not making money on a test sample? No, no, no... it's about non-stationarity. Overtraining by itself is normal; you just need to look for other approaches in addition to the usual one.


Here you are absolutely right: how can you talk about overtraining when the concept itself isn't defined???? What does it mean to overtrain the NS??? Let each of us throw out how they see it, and I'll go first.

1. The NS doesn't work well on new data, meaning it doesn't separate signals consistently. It doesn't matter whether rightly or wrongly; what matters is the stability of separating the bad from the good. It can consistently lose (model inversion), but the very fact of separating the bad from the good is plain to see.

2. The model performed well for less than half the training interval. The implication of this approach is that a well-trained model should keep working for 50% or more of the length of the training interval.

3. The balance curve on the new data has sharp ups and downs (a random model that happened to work in a specific period and came out profitable thanks to 1-2 large trades, but is a loser overall).

And on the subject of classification here is my answer.

NON-STATIONARITY is a smoothly changing quantity: as soon as a bar closes, it begins to slowly drift away, and the further back in history the bar goes, the further away this notorious NON-STATIONARY quantity (a notional designation of some chaos or ethereal value that moves the market in general) gets. So, having trained a classification model, we find that its quality directly depends on this quantity: the older the model, the lower its quality falls, in line with the change in this very NON-STATIONARITY. The problem is to build a model that would work long enough to take a couple of pips out of it :-)
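That decay is easy to see in a toy experiment: train a simple classifier once on early data containing a slowly drifting component, then measure its accuracy on later and later windows. A sketch on synthetic data (all names and numbers here are illustrative):

  set.seed(1)
  n <- 3000
  regime <- cumsum(rnorm(n, sd = 0.02))   # slowly drifting non-stationary component
  d <- data.frame(x = rnorm(n))
  d$y <- factor(ifelse(d$x + regime + rnorm(n, sd = 0.5) > 0, "up", "down"))

  model <- glm(y ~ x, data = d[1:500, ], family = binomial)   # train once, early in history

  starts <- seq(501, n - 249, by = 250)   # successive 250-bar out-of-sample windows
  acc <- sapply(starts, function(s) {
    idx <- s:(s + 249)
    p <- predict(model, newdata = d[idx, ], type = "response")
    mean(ifelse(p > 0.5, "up", "down") == d$y[idx])
  })
  print(round(acc, 3))   # accuracy tends to fall off as the windows get older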

 
Mihail Marchukajtes:

What does it mean to overtrain the NS?

Data for training, especially in Forex, is usually noisy, and by training a model to 100% accuracy we teach it to reproduce this noise along with the desired result. Training needs to stop at the moment when the NS has already started to predict the result correctly, but hasn't yet started to simply memorize the correct answers together with the noise. That's in my own words. Scientifically: https://ru.wikipedia.org/wiki/Переобучение (overfitting).


https://commons.wikimedia.org/wiki/File:Overfitting.svg

Here is a good illustration. Two models:
the first one (the green line) learned this data with 100% accuracy. By eye we can see that many points on the boundary of the red and blue regions are shifted a little to the side (noise), and in fact the boundary between the two regions should not be a broken line but some kind of averaged curve.
The first model is overtrained.
And there is a second model (the black line) that ignores the noise and divides the plane sensibly.
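The stopping moment described above can be sketched with the nnet package: train the same network from the same starting weights for more and more iterations and watch the validation error (a toy illustration with made-up data, not the picture's actual models):

  library(nnet)
  set.seed(2)

  # noisy two-class data in the spirit of the Overfitting.svg picture
  n <- 400
  d <- data.frame(x1 = runif(n, -1, 1), x2 = runif(n, -1, 1))
  d$y <- factor(ifelse(d$x1^2 + d$x2 + rnorm(n, sd = 0.3) > 0, "red", "blue"))

  trainIdx <- 1:300
  w0 <- runif(41, -0.5, 0.5)   # fixed initial weights: (2+1)*10 + (10+1)*1 = 41

  valErr <- sapply(c(5, 10, 25, 50, 100, 250, 500), function(iters) {
    fit <- nnet(y ~ x1 + x2, data = d[trainIdx, ], size = 10,
                Wts = w0, decay = 0, maxit = iters, trace = FALSE)
    mean(predict(fit, d[-trainIdx, ], type = "class") != d$y[-trainIdx])
  })
  print(valErr)   # typically falls at first, then rises again: stop where it is lowest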

 
Dr. Trader:

Data for training, especially in Forex, is usually noisy, and by training a model to 100% accuracy we teach it to reproduce this noise along with the desired result. [...]

Sometimes my brain starts to break down... about this noise in forex: it's not a radio signal, is it? Where does noise in forex come from? If Michael's model made 30-50 trades a month, 1-2 a day, was it trading noise or what? Somehow the definition doesn't fit here :)

Overtraining in forex means fitting to wrong (temporary) patterns. But there are no other patterns in forex, so any model will be overtrained to one degree or another.

p.s. So you need to sort instruments and pick the time series that are most persistent at the moment, like rising stocks or indices.

 

All true!!!! But overtraining also has a mathematical explanation......

When training with a teacher (supervised learning), we try to reduce the network's error on the training set, and this error reduction can go on indefinitely if we are dealing with real numbers. But there comes a moment when further error reduction makes the model worse even on the test set. From this we can draw the following conclusion!

Theoretically, for each data set there is absolute learning: a certain line on the error scale separating, say, an error of 0.000000000000000000000001 (already overtrained) from 0.00000000000000000000009 (not yet overtrained), a kind of "absolute zero", in the language of physicists. All models whose errors are to the right of this point are considered undertrained, and those to the left overtrained. Let me remind you that this is only a theory of my personal understanding; I don't pretend to anything more.

A kind of ideal model for a particular set of data.

The task of any AI is to get as close as possible to this point of absolute learning, but not to overstep it. IMHO.
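This "ideal point" can be seen directly by sweeping model complexity and looking for the minimum of the validation error; a toy sketch with polynomial regression (my own illustration of the idea, nothing to do with JPrediction internals):

  set.seed(3)
  n <- 200
  x <- runif(n, -2, 2)
  y <- sin(2 * x) + rnorm(n, sd = 0.4)   # signal plus noise
  trainIdx <- sample(n, 140)
  rmse <- function(a, b) sqrt(mean((a - b)^2))

  errs <- t(sapply(1:15, function(k) {
    fit <- lm(y ~ poly(x, k), data = data.frame(x = x, y = y), subset = trainIdx)
    c(train = rmse(predict(fit), y[trainIdx]),
      valid = rmse(predict(fit, newdata = data.frame(x = x[-trainIdx])), y[-trainIdx]))
  }))
  print(round(errs, 3))               # training error only falls; validation error has a minimum
  print(which.min(errs[, "valid"]))   # the degree playing the role of the "absolute zero" point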


True, this theory assumes that it is not a specific point but a region of fully trained and overtrained models - a mixed region, and not a large one. Just imagine...... Why??? This I have identified from observation.

Anyway, the first thing the AI has to do is make sure it gets into this transition region. But here's the thing......

If you split the sample in a fixed way, then the overtraining boundary will most likely be some specific value (most likely); if the sample is split randomly each time, then it is a TRANSITION region.... IMHO

If the AI doesn't get into that region guaranteed, then it's not built right. Which model it stops at there is another matter!!!!

I got all this from using JPrediction.

Starting training on the same file over and over, with random sampling taken into account, I got 10 different training results: the worst was 75%, the best 85% generalization (we're taking the optimizer's numbers for now; right or wrong doesn't matter, it's just an example). That is, we can assume we have a region between 75 and 85 containing an infinite number of variants of the model, of a neural network. As a rule I choose an average around 80-82, and even then you can run into a model that will be weak on the OOS, because determining the final polynomial is not a simple matter.
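That spread between runs is easy to reproduce on any data; a minimal sketch with fresh random splits and a plain logistic model standing in for JPrediction (synthetic data, illustrative names):

  set.seed(4)
  n <- 600
  d <- data.frame(x1 = rnorm(n), x2 = rnorm(n))
  d$y <- factor(ifelse(d$x1 + d$x2 + rnorm(n) > 0, "buy", "sell"))

  acc <- replicate(10, {
    trainIdx <- sample(n, 0.7 * n)   # a new random split on every run
    fit <- glm(y ~ x1 + x2, data = d[trainIdx, ], family = binomial)
    p <- predict(fit, d[-trainIdx, ], type = "response")
    mean(ifelse(p > 0.5, "sell", "buy") == d$y[-trainIdx])
  })
  print(round(range(acc) * 100, 1))   # a spread of several percent between identical runs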

Here is a video; watch from minute 35, he talks about this there....

https://www.youtube.com/watch?v=qLBkB4sMztk

001. Вводная лекция - К.В. Воронцов (introductory lecture, K.V. Vorontsov), 2014.12.22, www.youtube.com
"Machine Learning" is one of the core courses of the Yandex School of Data Analysis and is mandatory for all its students. Lecturer: Konstantin Vyacheslavovich Vorontsov.
 
Mihail Marchukajtes:

Here is the video; watch from minute 35, he talks about this there....

https://www.youtube.com/watch?v=qLBkB4sMztk


Yeah, I've seen this guy before, I'll watch it again, thank you.)

The mathematical meaning is clear, but the real meaning - what an overtrained NS in Forex actually is - is a different matter, and there is no escape from that kind of overtraining :) That's why it's either rigid sorting and searching for trending instruments, or a permanently retrained adaptive NS - but by what criteria is a creative question.

 
Maxim Dmitrievsky:

Yeah, I've seen this guy before, I'll watch it again, thank you.)

The mathematical meaning is clear, but the real meaning - what an overtrained NS in Forex actually is - is a different matter, and there is no escape from that kind of overtraining :) That's why it's either rigid sorting and searching for trending instruments, or a permanently retrained adaptive NS - but by what criteria is a creative question.


But the real point is this: if there is weak separation on the test section - it doesn't matter whether right or wrong, the very fact of separation is weak - and the model worked for no more than 50% of the training interval, then such a model is considered overtrained.... IMHO
