Machine learning in trading: theory, models, practice and algo-trading - page 97

 
Dr.Trader:
Keep not trusting importance when using it for forex. Iris is very simple data; there are direct correlations between the available data and the classes. It is enough for RF to find a minimal set of predictors from which the iris classes can be determined, and the job is done.

So can RF catch anything beyond direct dependencies? It seems to me that it fails on the market solely because the predictors are rotten; with decent predictors it would work just as it does on iris, with accuracy around 95%.

 

With irises it's simple: if the petal length is between such-and-such values, it's class 1; if the width is between such-and-such values, it's class 2, and so on. All RF does is find the intervals of predictor values that best match the target values.

I don't even need a forest for this task; a single tree is enough for 90% accuracy:

 Rule number: 2 [Species=setosa cover=33 (31%) prob=1.00]
   Petal.Length< 2.6

 Rule number: 7 [Species=virginica cover=35 (33%) prob=0.00]
   Petal.Length>=2.6
   Petal.Length>=4.85

 Rule number: 6 [Species=versicolor cover=37 (35%) prob=0.00]
   Petal.Length>=2.6
   Petal.Length< 4.85

That is, if a certain class corresponds to a certain range of predictor values or combinations of them, and these intervals do not overlap, then a tree or a forest will solve it 100%.
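
For reference, a tree like this can be grown in R in a couple of lines. This is my own sketch on the built-in iris data, not necessarily what was used above: rpart for the tree, and rattle's asRules() for a rule listing in the style shown, both assumptions on my part.

# Fit a single classification tree on the built-in iris data
library(rpart)
tree <- rpart(Species ~ ., data = iris)

# Training-set accuracy of the lone tree (~0.96 with default settings)
pred <- predict(tree, iris, type = "class")
mean(pred == iris$Species)

# Rule listing in the same style as above (requires the rattle package)
# library(rattle)
# asRules(tree)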

Dependencies on forex are much more complicated, and the forest needs tens of predictors to describe the target values. The forest will certainly find intervals of predictors and combinations of them that describe the target values, but they will be merely fitted combinations, without any logic or analysis behind them. Whether the forest latches onto noise or onto a genuinely important predictor is a matter of chance. A forest will work properly on forex only if the unsuitable predictors are weeded out in advance and only the necessary ones are kept for training. And the problem is that the necessary predictors have to be identified or found somehow, and the forest is no help in this.

I haven't managed to get ForeCA working yet.

Most of the time went into sifting out predictors that had eigenvalue = 0 after cov() on the training table (apparently only predictors correlated in a particular way are acceptable). After 24 hours it got as far as training the ForeCA model itself, which failed with an error:

unit-variance: Mean relative difference: 3.520041e-06
Error in check_whitened(out$score, FALSE) : Data must be whitened:
         Each variable must have variance 1.

The package is very demanding about predictors; there are lots of restrictions of all sorts. I don't even know what this last error means; I'll keep digging into it.

I'll finish sorting this out later; for now:
Google says the predictors don't have to be removed. They can be transformed so that they are no longer correlated; the covariance matrix then has full rank, which is what ForeCA requires. The package itself has some functions for whitening (they didn't work right away, it needs figuring out), plus the theory is in the links below.
To use the ForeCA package properly, you first need to understand whitening and learn how to do it:
http://stats.stackexchange.com/questions/23420/is-whitening-always-good
http://courses.media.mit.edu/2010fall/mas622j/whiten.pdf
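
To make the whitening idea concrete, here is a small sketch of plain PCA whitening with base R's prcomp. This is my own illustration, not ForeCA's internal routine; the whiten()/check_whitened() helpers from the error above should be doing something equivalent.

# Stand-in predictor table; replace with your own matrix
X <- matrix(rnorm(1000 * 14), ncol = 14)

# PCA whitening: rotate into uncorrelated directions, drop
# zero-variance ones (the cause of rank deficiency), rescale to sd = 1
pc   <- prcomp(X, center = TRUE)
keep <- pc$sdev > 1e-8
Xw   <- scale(pc$x[, keep])

round(cov(Xw), 2)   # ~ identity matrix: the data is whitened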

 
Dr.Trader:

1) A forest will work properly on forex only if the unsuitable predictors are sifted out in advance and only the necessary ones are left for training. And the problem is that the necessary predictors have to be identified or found somehow, and the forest is no assistant in this.

2) The package is very demanding about predictors; there are a lot of restrictions. I don't even know what the last error means; I'll keep looking into it.

1) I suggested what I think is a very good idea for how to do such a selection, but no one is interested, and I can't implement it by myself.

2) I can't do that myself; all that's left is to reduce the amount of data, otherwise, well, you know.)

 

I already posted this once, but no one reacted to it.

In time series analysis there is a notion called dynamic time warping (DTW); it can be used to make a price chart more readable, and therefore more recognizable for the algorithm.

data_driven_time_warp <- function (y) {
  # New x coordinate: cumulative absolute change of y, so "time" runs
  # faster where the series moves a lot and slower where it is flat
  cbind(
    x = cumsum(c(0, abs(diff(y)))),
    y = y
  )
}

# Synthetic random-walk "price" series
y <- cumsum(rnorm(200)) + 1000

# Reference grid every 10 observations, drawn on both panels
i <- seq(1, length(y), by = 10)
op <- par(mfrow = c(2, 1), mar = c(.1, .1, .1, .1))

# Top panel: the original series in ordinary time
plot(y, type = "l", axes = FALSE)
abline(v = i, col = "grey")
lines(y, lwd = 3)
box()

# Bottom panel: the same series after the data-driven warp
d <- data_driven_time_warp(y)
plot(d, type = "l", axes = FALSE)
abline(v = d[i, 1], col = "grey")
lines(d, lwd = 3)
box()
par(op)

And everything seems fine, but the sad part is that as a result of this transformation we get two coordinates: x (a synthetic time) and y (the values).

d
                x        y
  [1,]   0.000000 1001.393
  [2,]   1.081118 1002.474
  [3,]   2.799970 1004.193
  [4,]   3.706653 1005.100
  [5,]   3.867351 1005.260
  [6,]   4.654784 1004.473
  [7,]   5.000202 1004.127
  [8,]   6.665623 1005.793
  [9,]   7.415255 1005.043
 [10,]   7.956572 1005.584
 [11,]   8.403185 1005.138
 [12,]   9.352230 1004.189
 [13,]   9.913620 1004.750
 [14,]  10.249985 1004.414

The question is how to turn this transformation back into a vector without it losing its properties.
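
One possible way back, purely my own suggestion rather than anything from the thread: resample the warped (x, y) curve onto a uniform x grid with base R's approx(), which gives an ordinary vector again at the cost of some interpolation error.

# Interpolate the warped curve at n evenly spaced x positions,
# turning the two-column result back into a plain vector
unwarp_to_vector <- function(d, n = nrow(d)) {
  approx(x = d[, "x"], y = d[, "y"], n = n)$y
}

v <- unwarp_to_vector(d)
length(v)   # same length as the original series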

This is what the transformation looks like: the ordinary series on top, the DTW version on the bottom.


 
mytarmailS:

1) I suggested what I think is a very good idea for how to do such a selection, but no one is interested, and I can't implement it by myself.

2) All that's left is to reduce the amount of data, otherwise, well, you know.)

What are you suggesting? What is it that I missed? Could you repeat your idea?
 
SanSanych Fomenko:
What are you suggesting? What did I miss? Could you repeat your idea?
Look for the place where I wrote about clustering; I explained the idea in great detail there.
 
mytarmailS:

The question is how to turn this transformation back into a normal vector so that it does not lose its properties.

So, any thoughts on this?
 

I made one more example with ForeCA; the archive has a small table for testing and the code to work with it.
This time I got it right.
You can use your own table of training data for the model; the main thing is that it must be a matrix without factors (training is done with lm, so regression only). The number of rows should be much greater than the number of predictors, otherwise ForeCA will throw errors.
My target values are 0 and 1; with anything else the accuracy will be computed incorrectly. If needed, adjust the code in RegressionToClasses01(regr) for your case, at the place where the regression result is rounded into classes.
trainData - data for training
trainDataFT - data for the forward test

Result:
lm on the raw data: 75% accuracy on the training data and 57% on new data.
lm on all 14 components from ForeCA: 75% on the training data and 61% on new data. A little better, but with this table +4% is only +1 correct prediction; the table is quite small :)

That is, if the predictors have already been pre-selected, then ForeCA should make things no worse, and may even add a couple of percent of accuracy.

I also added a graph of accuracy versus the number of ForeCA components: the more components, the higher the accuracy. The maximum allowed number of components equals the number of predictors.
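
For those without the archive, a rough reconstruction of the workflow described above. This is only my sketch, under assumptions: the last column of trainData holds the 0/1 target, and foreca() exposes princomp-style $scores; the actual archive code, e.g. RegressionToClasses01(), may differ.

library(ForeCA)

# Split a predictor matrix and a 0/1 target out of the training table
X <- trainData[, -ncol(trainData)]
y <- trainData[,  ncol(trainData)]

# Baseline: lm on the raw predictors, regression output rounded into classes
fit.raw <- lm(y ~ X)
acc.raw <- mean(round(fitted(fit.raw)) == y)

# lm on ForeCA components (at most ncol(X) of them are available);
# check your ForeCA version for the exact return structure
mod    <- foreca(ts(X), n.comp = ncol(X))
fit.fc <- lm(y ~ mod$scores)
acc.fc <- mean(round(fitted(fit.fc)) == y)

c(raw = acc.raw, foreca = acc.fc)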


 

The second part of the experiment:

I took the 14 previously selected predictors and added another 14 filled with random values. The maximum allowed number of ForeCA components is now 28.
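
The noise predictors for such a test can be appended in one line; a sketch, assuming X holds the matrix of the 14 selected predictors:

# Append 14 pure-noise predictors to the 14 real ones
X28 <- cbind(X, matrix(rnorm(nrow(X) * 14), ncol = 14))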

Prediction accuracy with all 28 components on the training data was 76% in both cases (with and without ForeCA); accuracy on new data was 57% in both cases.

In my opinion, ForeCA did not cope with the garbage among the predictors; I did not see the expected miracle.

 
mytarmailS:
So what?
It looks like a Renko chart. Renko charts are somehow drawn on bars in the MT5 terminal; a similar algorithm is needed.