Machine learning in trading: theory, models, practice and algo-trading - page 115
Are you by any chance a clerk in an old-style brokerage house?
(Please do tell me "how one should". I suppose your "secret method" is to trade moving averages at random and double up after a loss, right?))
I have to live on commission, gentlemen, only on the commission ...
Committee creation and testing:
There's a problem: the original classes are of factor type, and in the result matrix they get converted to the integer codes corresponding to those factors, so in the end the comparison has to go through as.numeric().
For everything to work properly with factors, I would need to create predictionMatrix as a data.frame, but after that my rbind call produced warnings; something else needs changing, and I haven't figured out what's wrong there.
A few thoughts on the code:
1. You don't have to use the for() construct unless it's absolutely necessary. There is a wonderful alternative, foreach(), which, besides fast execution, lets you parallelize calculations across the available cores.
2. A model ensemble makes sense and gives results only if the models differ significantly. Two variants: one data set with different models (RF, DT, SVM), or one model with different data sets. An example of the latter variant is below.
Choose the models with the best performance and work with them from there.
Good luck
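To illustrate the second variant (one model type, different data sets), here is an illustrative Python sketch, not the R code from the thread: the same simple model is trained on several bootstrap resamples of one training set, and the committee decides by majority vote. The nearest-centroid "model" and the synthetic data are my own stand-ins.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic two-class data standing in for the trading features.
n = 200
X = np.vstack([rng.normal(-2.0, 1.0, (n, 2)), rng.normal(2.0, 1.0, (n, 2))])
y = np.array([0] * n + [1] * n)

# Shuffle, then split into train and test.
idx = rng.permutation(len(y))
X, y = X[idx], y[idx]
X_tr, y_tr, X_te, y_te = X[:300], y[:300], X[300:], y[300:]

def fit_centroids(Xs, ys):
    # A deliberately simple stand-in model: nearest class centroid.
    return {c: Xs[ys == c].mean(axis=0) for c in (0, 1)}

def predict(model, Xs):
    d0 = np.linalg.norm(Xs - model[0], axis=1)
    d1 = np.linalg.norm(Xs - model[1], axis=1)
    return (d1 < d0).astype(int)

# One model type, several bootstrap resamples of the training set:
# each resample produces a slightly different model.
votes = []
for _ in range(7):
    b = rng.integers(0, len(y_tr), len(y_tr))  # bootstrap indices
    votes.append(predict(fit_centroids(X_tr[b], y_tr[b]), X_te))

# Majority vote of the committee.
ensemble_pred = (np.mean(votes, axis=0) > 0.5).astype(int)
accuracy = (ensemble_pred == y_te).mean()
print(round(accuracy, 3))  # high on this well-separated toy data
```

The point is only the structure: the diversity between committee members comes from resampled data, not from different algorithms.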
Choose the models with the best performance and work with them from there.
This is where the trouble lies.
The best by what metrics, and on what data?
I ask because Vkontov is trying to figure out how to choose a model (out of many) using the training and testing data, and here you have it so straightforward: take the ones with the best metrics and work with them.
The initial set is split into train/test in a stratified way. We train on train and, accordingly, test on test. Is that really not clear from the code?
Good luck
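The thread's example does this split with R's rminer::holdout. As a hedged illustration of what "stratified" means here, a minimal numpy sketch (the function name stratified_holdout is my own): each class is split separately, so the class proportions are preserved on both sides.

```python
import numpy as np

def stratified_holdout(y, train_frac=0.7, seed=0):
    # Return train/test index arrays that preserve class proportions,
    # similar in spirit to what rminer::holdout does in R.
    rng = np.random.default_rng(seed)
    train_idx, test_idx = [], []
    for c in np.unique(y):
        idx = np.flatnonzero(y == c)
        rng.shuffle(idx)
        k = int(round(train_frac * len(idx)))
        train_idx.extend(idx[:k])
        test_idx.extend(idx[k:])
    return np.array(train_idx), np.array(test_idx)

# A 70/30 class imbalance survives the split on both sides.
y = np.array([0] * 70 + [1] * 30)
tr, te = stratified_holdout(y, train_frac=0.7)
print(len(tr), len(te))                           # 70 30
print((y[tr] == 1).mean(), (y[te] == 1).mean())   # 0.3 0.3
```

Without stratification, a small or imbalanced test set can end up with a class mix quite different from the training set, which distorts the comparison between models.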
I would like to see you more often. Do not disappear.
I'll try rminer::holdout, thanks for the example. Generally, from experience: if you pick a model and its parameters to get the best result on a test sample, the model will indeed end up showing a really good result on that test sample, but the result on new data is usually very poor. I'm talking specifically about forex data; in other domains this is a perfectly normal approach. So I don't expect rminer::holdout to change anything dramatically for forex.
The market moves against its own statistics. This is a theory I have confirmed in practice, and it is the only theory I know that answers all the questions, from why a model does not work on new data to why everyone, in general, loses money in the market...
Why is this so hard for you to accept?
Do old knowledge and habits really suppress the perception of new information that much?
Why concentrate so much on the model if the difference in performance between models is between 0.5% and 5%?
No model can help here, because the essence is in the data itself.
I've posted this picture more than once, but nevertheless...
Look closely! This is the difference between the cumulative buy and sell forecasts from two networks: cum(buy.signal) - cum(sell.signal). Ideally, if our model is good, the blue chart should correlate with the price; that would mean the network understands the data and reacts to it adequately. And what do we actually see?
We cannot say that the model does not understand the data: the correlation is inverse, but the structure is the same. Something is wrong with the direction, though; the market goes against the predictions and against the statistics that the network learned in the past.
Now tell me, what model can help with this? What cross-validation will help? Any model training followed by out-of-sample testing (new data) would be nothing more than fitting a model that happens to work well out of sample. And you see it all the time when you train models yourself: on brand-new data the model always fails, don't you see?! I am giving you the answer as to why it happens!
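The curve in question, cum(buy.signal) - cum(sell.signal), is easy to reproduce. Here is an illustrative Python sketch of my own construction, on synthetic data, in which the "networks" are deliberately inverted; it produces exactly the strong negative correlation with price described above.

```python
import numpy as np

rng = np.random.default_rng(2)

# A toy random-walk price and two hypothetical network outputs.
price = np.cumsum(rng.normal(0.0, 1.0, 500))
returns = np.diff(price, prepend=price[0])

# Suppose the networks learned an inverted relationship: the buy score
# fires exactly when the market falls (the situation described above).
buy_signal = (returns < 0).astype(float)
sell_signal = (returns > 0).astype(float)

# The curve from the picture: cum(buy.signal) - cum(sell.signal).
diff = np.cumsum(buy_signal) - np.cumsum(sell_signal)

# Same structure as the price, but the direction is flipped.
corr = np.corrcoef(diff, price)[0, 1]
print(round(corr, 2))  # strongly negative
```

A strongly negative correlation with matching structure is what the picture shows: the cumulative signal difference mirrors the price upside down.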
Is this a graph over the data on which the training itself took place, or only a test on new data? If you plot both time periods at once, training and test, will the blue and gray charts coincide completely on the first (training) part of the data, with a sharp switch to inverse correlation once the new data begins?
If it were that simple, it would be enough to train any model and just invert its predictions. Unfortunately, that doesn't work.
Training a model that gives 0% accuracy on new data is just as difficult as achieving 100% accuracy. The baseline, for example flipping a coin, is 50% accuracy, and moving a couple of dozen percentage points in either direction is a task of equal difficulty. The problem is not that the models give opposite results, but that on some bars the result is correct and on others it is wrong, all of it at random and with no way to filter out only the correct results.
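The symmetry claimed here is easy to check: inverting a binary classifier's predictions maps accuracy a to exactly 1 - a, so being consistently wrong is as hard as being consistently right, while a coin flip stays near 50% either way. An illustrative numpy sketch:

```python
import numpy as np

rng = np.random.default_rng(3)

# True outcomes and a skill-less coin-flip "model".
y = rng.integers(0, 2, 1000)
coin = rng.integers(0, 2, 1000)

acc = (coin == y).mean()
inv_acc = ((1 - coin) == y).mean()

print(round(acc, 2))  # near 0.50
print(acc + inv_acc)  # exactly 1.0: inverting maps a to 1 - a
```

So a model stuck at 50% gains nothing from inversion; only a model reliably far from 50% in either direction carries information.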
And why are you subtracting the S forecast from the B forecast? Maybe you should do the opposite, S - B? Then the correlation would suddenly come out right, too.