Machine learning in trading: theory, models, practice and algo-trading - page 214
P.S. And parallelize the lm() calculations; this is exactly the case where it is needed.
Thank you.
I've seen how to parallelize an operation in a loop with foreach %dopar%. I don't know how to hook it into the hidden loop inside data.table, and I don't know whether it would even be faster.
I meant this part of the code:
{
  lapply(1:20, function(x)
    summary(lm(formula = V21 ~ . - 1,
               data = .SD[, c(1:x, 21), with = FALSE]))$fstatistic[[1]])
}
, by = sampling
]
Instead of lapply, use foreach().
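A minimal sketch of that replacement, assuming a plain data frame with columns V1..V21 where V21 is the target (the data and variable names here are made up for illustration; the original code runs inside a data.table grouping):

```r
library(foreach)
library(doParallel)

cl <- parallel::makeCluster(2)  # number of workers is arbitrary here
registerDoParallel(cl)

# hypothetical data: 1000 rows, 21 columns, V21 is the target
set.seed(1)
df <- as.data.frame(matrix(rnorm(1000 * 21), ncol = 21))

# one F-statistic per nested model, the 20 fits computed in parallel
fstats <- foreach(x = 1:20, .combine = c) %dopar% {
  summary(lm(V21 ~ . - 1, data = df[, c(1:x, 21)]))$fstatistic[[1]]
}

parallel::stopCluster(cl)
```

Whether this is actually faster is an open question: each lm() fit is cheap, so the per-task communication overhead of %dopar% can easily eat the gain. Benchmark both versions on real data before committing.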
There is something wrong with graphs that take several tens of seconds to build.
Check out this package ("nhstplot"). It's fast and pretty good, in my opinion.
Ahh, I'll try that, thanks. So the lapply loop gets replaced by a parallel one. But all of this runs inside the data.table grouping, which has 1000 iterations.
Is there a way to run those 1000 iterations through foreach as well?
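One possible answer, sketched under assumptions: instead of parallelizing the inner lapply, iterate over the group values of `sampling` with foreach, so each worker processes whole groups. The table layout (V1..V21 plus a `sampling` column) is assumed from the earlier snippet; group sizes here are toy values.

```r
library(data.table)
library(foreach)
library(doParallel)

cl <- parallel::makeCluster(2)
registerDoParallel(cl)

# hypothetical data.table: V1..V21 plus a 'sampling' group id
set.seed(1)
dt <- data.table(matrix(rnorm(250 * 21), ncol = 21))
dt[, sampling := rep(1:5, each = 50)]

# parallelize over groups instead of over the inner lapply
res <- foreach(g = unique(dt$sampling), .combine = rbind,
               .packages = "data.table") %dopar% {
  sub <- as.data.frame(dt[sampling == g])
  fs <- sapply(1:20, function(x)
    summary(lm(V21 ~ . - 1, data = sub[, c(1:x, 21)]))$fstatistic[[1]])
  data.table(sampling = g, t(fs))
}

parallel::stopCluster(cl)
```

Note that the whole table is shipped to every worker here; with 1000 groups it may be better to split the data first and pass each worker only its share.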
I'll take a look at it. But where is the semi-transparency here, and where are the hundreds of superimposed objects? Test it under heavy conditions, and then we'll see whether it's faster or not.
Fast drawing of translucent objects can be done on the video card via OpenGL. In R there is the rgl library for this; it is mainly meant for 3D, but if you can set up an orthographic projection and draw lines, it will be just what you need. I couldn't figure it out right away; you need to read the documentation.
Done:
It's quite easy to draw semi-transparent lines; all you need is a table with X and Y coordinates. You can also add a third column Z for three dimensions.
library(rgl)  # OpenGL-based graphics for R

for (i in 1:1000) {
  # each pass overlays one translucent random walk
  lines3d(cbind(1:1000, cumsum(rnorm(1000))), col = "blue", alpha = 0.1)
}
But it turned out to be slow all the same. Judging by Process Explorer, the video card is only 5% utilized while one logical processor sits at 100%. I think R feeds data into OpenGL much more slowly than the card can receive it. That's just how it worked out.
Purely for fun :) run this, maximize the window to full screen, and spin the "figure" with the left mouse button.
library(rgl)

for (i in 1:100) {
  # 100 translucent 3D random walks
  lines3d(cbind(cumsum(rnorm(100)), cumsum(rnorm(100)), cumsum(rnorm(100))), alpha = 0.2)
}
Finally made a first test of the cluster idea I voiced earlier; a trial run, just to see what happens. The predictors are simple:
- rolling series of 5 OHLC values + volume + volatility: 6 predictors of 5 values each
- a history of 100,000 bars
Each predictor was normalized, of course ), and then clustered into 100 clusters. Sorry for the muddle.
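The post does not show the clustering code, so here is a hedged reconstruction of the "normalize, then cluster into 100" step for one predictor, assuming k-means and z-score window normalization (both assumptions; all variable names are made up):

```r
set.seed(1)
# hypothetical: one predictor = a rolling window of 5 values per bar
n_bars <- 2000
price <- cumsum(rnorm(n_bars))
win <- embed(price, 5)  # each row holds the last 5 values

# z-score each window so only the shape matters, not the price level
win <- t(apply(win, 1, function(r) (r - mean(r)) / (sd(r) + 1e-9)))

# quantize the shapes into 100 clusters; the cluster id is the "pattern letter"
km <- kmeans(win, centers = 100, iter.max = 50)
cluster_id <- km$cluster
```

Doing this once per predictor (open, high, low, close, volume, volatility) yields the 6 cluster numbers per bar that the patterns below are built from.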
The target was invented off the top of my head (whatever came to mind at that moment): I simply took a reversal in this form: the target is an extremum that is higher than the previous 4 candles and higher than the next 10 candles after it.
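That target definition can be sketched directly (this is my reading of the rule above, applied to highs; the function name and the use of highs rather than closes are assumptions):

```r
# hypothetical sketch: bar i is a top if its high exceeds the highs of
# the previous 4 bars and of the next 10 bars
find_tops <- function(high, left = 4, right = 10) {
  n <- length(high)
  sapply(seq_len(n), function(i) {
    if (i <= left || i > n - right) return(FALSE)  # not enough context at the edges
    all(high[i] > high[(i - left):(i - 1)]) &&
      all(high[i] > high[(i + 1):(i + right)])
  })
}

set.seed(2)
high <- cumsum(rnorm(200))
target <- find_tops(high)
```

Note the look-ahead: the label uses the next 10 bars, which is fine for building a training target but must never leak into live predictors.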
I started to look for repeating patterns...
The best I could find with such predictors was the following pattern:
91 6 30 41 91 100 0.4 9
In (open, high, low, close, volume, volatility) these are the numbers of the clusters that characterize the pattern.
target_avg is the probability of my reversal triggering in this pattern; I have not managed to find any patterns with a trigger probability of 80-90% with these predictors.
target_count is the number of times the pattern occurred in the history; I have not found any significant patterns that occurred 30-50 times with these predictors either.
In total, the best I found with these predictors is a pattern in which the reversal (target) works in 40% of cases, with 9 observations of it (occurrences of the pattern).
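The target_avg / target_count statistics described above are a straightforward grouped aggregation; a hedged sketch with data.table, on toy data (column names and cluster counts invented; real cluster ids would come from the earlier clustering step):

```r
library(data.table)
set.seed(3)
# hypothetical toy data: a cluster id per bar for each of 6 predictors, plus the target
n <- 5000
dt <- data.table(open   = sample(1:3, n, TRUE), high  = sample(1:3, n, TRUE),
                 low    = sample(1:3, n, TRUE), close = sample(1:3, n, TRUE),
                 volume = sample(1:3, n, TRUE), volat = sample(1:3, n, TRUE),
                 target = runif(n) < 0.1)

# target_avg = hit rate of the reversal within a pattern,
# target_count = how often the pattern occurred in history
stats <- dt[, .(target_avg = mean(target), target_count = .N),
            by = .(open, high, low, close, volume, volat)]

# keep only patterns that are both frequent and predictive
good <- stats[target_count >= 30 & target_avg >= 0.4][order(-target_avg)]
```

With 100 clusters per predictor there are 100^6 possible patterns against only 100,000 bars, so most groups have a handful of observations, which is exactly the sparsity and "one or two observations" problem complained about below.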
Maybe this is the only piece of useful information, the sought-after "not noise", that can be extracted from this set of predictors. And it explains only one cause of a reversal, and even then only 40% of the time; the causes of different reversals are different, and there are definitely not 10 or 30 of them, imho.
Now think about whether an algorithm can explain all market movements with such predictors. It cannot, because the predictors explain only 2%, and all the rest is noise...
Plus there is no control of statistical recurrence, i.e. the model can make a decision based on one or two observations, and that is in fact what happens in 95% of cases.
Anyway, I digress... back to the pattern. Having evaluated the quality of the entries on the new sample, I will say this: it is far from a Mercedes, but if the old approach is a beat-up Zaporozhets, then this one is a Lada "nine" fresh from the factory.
The quality of the entries is much better: clearer, fewer errors...
One more thing: the full pattern is...
91 6 30 41 91 100
When I ran recognition on the new data of 50,000 candles, the algorithm could not find a single such pattern; it simply never occurred ))
I had to trim the pattern and keep only the prices:
91 6 30 41 91 100
I found about 20 such patterns.
Here are the entries for the pattern. I did not cherry-pick any "best entries"; I just captured them as is, in the order the deals were made. Not all of the deals, of course, just the first few, so you can judge.
The equity is good, although the risk is greater than I expected.
Remember, this is only one pattern, and only the shorts.
If someone needs the code, I will post it, although I doubt anyone does; everything is quite clear.

214 pages is a lot to study. (And everyone is talking about something different, and not always understandably.)
Is it possible to summarize all these pages in one post, even if not a very short one? Something like: the goal set, solution methods, results, conclusions.
Let me say right away that my model of the market is a random process (Brownian motion), or rather the sum of several (possibly many) such processes with feedback. And it is absolutely useless to predict it or look for any regularities other than statistical ones. That is, meaningful predictors simply do not exist, at least for speculative purposes.