Machine learning in trading: theory, models, practice and algo-trading - page 137

 
mytarmailS:


What's the problem? Please help.


This is how it works:

library(tseries)  # for adf.test

test_vec <- numeric()  # here we will store the test results (p-values)

ln <- nrow(data)  # assumed: loop over the rows of "data", which holds the series ri and si

for(i in 151:ln){
  print(i)
  idx <- (i-150):i
  # run a linear regression (no intercept) to find the right ratio between the two series
  x <- data[idx, ]
  model <- lm(ri ~ si + 0, x)
  # compute the price difference (the spread)
  spread <- x$ri - coef(model)[[1]] * x$si
  # run the Dickey-Fuller test for stationarity of the spread
  test <- adf.test(as.vector(spread), k = 0)
  test_vec[i-150] <- test$p.value
}

plot(test_vec, type = 's')
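As a side note: the null hypothesis of adf.test is a unit root, so low p-values mean the spread over that window looks stationary. A reference line at a conventional cut-off (0.05 here, purely illustrative) makes the plot easier to read:

abline(h = 0.05, lty = 2)  # illustrative 0.05 cut-off for "stationary enough"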

 
Alexey Burnakov:

First, an R^2 of 0.55 really can be achieved by applying some small functional transformation to the found meta-feature. The catch is that the function turns out to be somewhat complicated in form.

Also, try taking:

rowMeans(df[,1:10])

...
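A minimal sketch of generating such cumulative row-mean features, assuming df is the data frame of raw predictors (the column widths 10, 20, ..., 100 are only an assumption here):

widths <- seq(10, 100, by = 10)                           # assumed widths: 1:10, 1:20, ..., 1:100
new_cols <- sapply(widths, function(w) rowMeans(df[, 1:w]))
colnames(new_cols) <- paste0("rowMeans_1_", widths)
df_ext <- cbind(df, new_cols)                             # extended predictor set (10 extra columns)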

I added 10 new columns; it did not change anything for nnet, the results and learning curves stayed about the same, and the model selects the same predictors.

But the tree suddenly gave much better results. The forest got better too, but a single tree is in the lead by itself (pseudo R^2 = 0.39); here are the rattle graphs on the new data.

The tree chose only rowMeans[,1:50], memorizing its values from the training data. So there is a very close, but non-linear, relationship between rowMeans[,1:50] and the target.

Although, if we leave only these 10 new predictors, nnet trains to R^2 = 0.22, which is also better.

 
Dr.Trader:

I added 10 new columns; it did not change anything for nnet, the results and learning curves stayed about the same, and the model selects the same predictors.

But the tree suddenly gave much better results. The forest got better too, but a single tree is in the lead by itself (pseudo R^2 = 0.39); here are the rattle graphs on the new data.

The tree chose only rowMeans[,1:50], memorizing its values from the training data. So there is a very close, but non-linear, relationship between rowMeans[,1:50] and the target.

Although, if we leave only these 10 new predictors, nnet trains to R^2 = 0.22, which is also better.



Exactly right, the mean over columns 1:50. Good results. I will now try to improve the approximation of the function a bit. If you don't mind, post a scatter plot of the found feature vs. the simulated output, for the NN or for the random forest. I'll post mine later. There should be a non-linearity there.

 

Alexey Burnakov:

There should be a non-linearity.

Judging by the graph, there is a bit of a relationship, but both the neural network and the tree pick up only a general trend. On this predictor alone the model clearly cannot be trained; you cannot get more out of the tree.

 
Dr.Trader:

Judging by the graph, there is a bit of a relationship, but both the neural network and the tree pick up only a general trend. On this predictor alone the model clearly cannot be trained; you cannot get more out of the tree.

Thank you, thank you.

Yes, I agree.

My graphs (the plots themselves are omitted in this transcript): RMSE minimization; the real dependence; the model; real and model together.

The original idea in its pure form (plot omitted). How it actually came out with the noise added, you have already seen.

The maximum achievable modeling quality (plot omitted).

 
Alexey Burnakov:

This is how it works:

Thank you, I wouldn't have figured it out.
 
Dr.Trader:

Judging by the graph, there is a bit of a relationship, but both the neural network and the tree pick up only a general trend. On this predictor alone the model clearly cannot be trained; you cannot get more out of the tree.

We need to summarize the problem and write what we have learned.

All under the assumption that in real life we know nothing about the type of dependence.

My understanding is that:

1) Pulling a dependency out of a set of simple features like price returns is difficult, and many methods do not work well. But following general principles you can get an approximate solution through convolution (see the sketch after this list).

2) If you generate a lot of features in advance, there is a good chance that conventional methods will work well.

3) On raw features the best quality metric, after the convolutional NN, belongs to an ordinary NN; the other methods follow with roughly the same result.

4) On a large set of potentially useful generated features, the forest and the NN work well.

5) Whether it is preferable to let the convolutional NN assemble the features itself, instead of a human, is still an open question. Finding the right convolutional architecture is probably as much work as generating a bunch of features in advance.
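A minimal illustration of item 1 (this is not the code cited in the thread, just a toy sketch): a hand-made convolution of a return series with a short kernel, the kind of feature a convolutional NN would otherwise learn on its own.

set.seed(1)
price <- cumprod(1 + rnorm(500, 0, 0.01))                       # toy price series
returns <- diff(log(price))                                     # log returns
kernel <- rep(1/5, 5)                                           # simple smoothing kernel (assumed)
conv_feature <- stats::filter(returns, kernel, method = "convolution", sides = 1)
plot(conv_feature, type = "l")                                  # convolved returns as a candidate feature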

What can you add, Dr.?

 

I tried generating features when you first posted the problem; the algorithm spent all night going through different mathematical combinations, selecting the best new predictors through vtreat evaluation. There was no positive result; the model could not even train properly on the new predictors. So you either happen to guess the right predictors and the mathematical operations on them, or you don't. You can spend days generating and trying variants and it will still be useless. Since the convolutional network on the original predictors got better results than an ordinary network together with rowMeans, it is probably better to settle on the convolutional network.
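A hedged sketch of the kind of vtreat screening described above (the data frame candidates and the target y are made-up names for illustration, not the original run):

library(vtreat)
treatments <- designTreatmentsN(cbind(candidates, target = y),
                                varlist = colnames(candidates),
                                outcomename = "target")
score <- treatments$scoreFrame
good_vars <- score$varName[score$sig < 1/nrow(score)]   # keep only the significant candidates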

 
Dr.Trader:

I tried generating features when you first posted the problem; the algorithm spent all night going through different mathematical combinations, selecting the best new predictors through vtreat evaluation. There was no positive result; the model could not even train properly on the new predictors. So you either happen to guess the right predictors and the mathematical operations on them, or you don't. You can spend days generating and trying variants and it will still be useless. Since the convolutional network on the original predictors got better results than an ordinary network together with rowMeans, it is probably better to settle on the convolutional network.

Thanks, Dr!

Yes, that's a strong argument too. It's just that, out of habit, I derive sums over a sliding, growing window, as well as differences at a sliding lag, and everything else that slides into the past.

You see, I have this notion that to model almost everything (almost!) it's enough to take predictors of the form:

current_price - price(lag1)

current_price - price(lag2)

...

current_price - price(lag_n)

These can be treated as sliding sums (which reduce easily to averages), and from them any configuration of the trend can be reproduced: kinks in different places, speed, acceleration.
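A small sketch of that predictor family (price_series and the number of lags are placeholders): one column per lag, each being the difference between the current price and the price k bars ago, i.e. the sum of the last k returns.

make_lag_diffs <- function(price, lags = 1:20) {
  n <- length(price)
  sapply(lags, function(k) c(rep(NA, k), price[(k + 1):n] - price[1:(n - k)]))
}
X <- make_lag_diffs(price_series)            # price_series is a hypothetical price vector
colnames(X) <- paste0("diff_lag_", 1:20)     # diff_lag_k = current_price - price(lag_k)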

Speaking of the convolutional network, I suggest trying to do something practical with the code I cited, and focusing on analyzing the network's weights and kernels. The kernels will show what kind of convolution was learned; the weights can show importance and non-linearity.
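A minimal sketch of that kind of inspection, assuming purely for illustration that the net is built with the R keras package (an assumption, not the code cited above):

library(keras)
model <- keras_model_sequential() %>%
  layer_conv_1d(filters = 4, kernel_size = 5, activation = "relu", input_shape = c(100, 1)) %>%
  layer_flatten() %>%
  layer_dense(units = 1)
# ... compile and fit the model on the lagged-return matrix here ...
conv_kernels <- get_weights(get_layer(model, index = 1))[[1]]  # array: kernel_size x channels x filters
dense_wts <- get_weights(get_layer(model, index = 3))[[1]]     # dense-layer weights hint at importance
plot(conv_kernels[, 1, 1], type = "b")                         # shape of the first learned kernel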

Personally, I'm taking a break from my main experiment for now, although there are already some tolerable results there. I was just getting bored... I decided to try forecasting a single stock instrument with a convolutional network. The costs there are quite low (an order of magnitude lower than the forex spread), and there is hope that it might work. I'll tell you more about it later.

Quick course on CNN: http://cs231n.github.io/convolutional-networks/

 

These video lectures also used to be on YouTube, but they were removed; they are still available on archive.org: https://archive.org/details/cs231n-CNNs

In English, but very informative and useful. They are mostly about image recognition with convolutional nets, but there is a lot of useful information about neural networks in general.
