Machine learning in trading: theory, models, practice and algo-trading - page 15

 
Dr.Trader:

I use standard indicators as a basis for creating predictors. I'm still experimenting with it myself, trying ideas from this forum thread.

I've been doing this for the last few weeks; the best result so far is as follows (a lot of calculation is involved, so I'm studying this approach on the D1 timeframe to keep it fast; later I will try a smaller timeframe):

1) Export from MT5 to CSV: OHLC, time, indicators, all for the last 10 bars. Recently I started taking the time only from the newest bar; I believe the time of the other bars can be calculated from it and therefore brings no new information. This gives several hundred "primary" predictors. The required training result is "1" or "0" - whether the price rises or falls on the next bar. My results with zigzags are unstable and difficult; I do better with close prices for now. Once I have worked out the full algorithm for training a model from scratch, I can start working on zigzags and trend prediction.

2) In R I apply various mathematical operations to the available data - sums, deltas, min, max, etc. This already produces more than a thousand predictors.

3) Obviously, after the second step there is more garbage than anything useful. I sift it out using the method from the article on principal components, http://www.r-bloggers.com/principal-components-regression-pt-2-y-aware-methods/, which SanSanych wrote about earlier. I'm not training the PCR model itself; for now I've settled on the function for pre-screening predictors:

srcTable is a table with predictors; the last column should be the required training result. pruneSig is best left at -1.

As a result, the function returns a list of column names from the table that carry some useful information, or an empty list if nothing useful is found. The article describes this method as not very significant, but it turns out to be quite adequate: it sifts out garbage very well. The result list is also sorted by importance, from more useful to less useful.
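The original function itself is not reproduced on this page. Below is only a minimal sketch of what such a pre-screening could look like, assuming the vtreat package used in the referenced article; the function name, the handling of pruneSig = -1 and the default cut-off are my assumptions, not Dr.Trader's actual code.

library(vtreat)

# Sketch only: keep the columns whose significance passes a cut-off,
# sorted from most to least useful; return an empty vector otherwise.
screenPredictors <- function(srcTable, pruneSig = -1) {
  targetName     <- names(srcTable)[ncol(srcTable)]   # last column = training result, coded 0/1
  predictorNames <- setdiff(names(srcTable), targetName)

  treatment <- designTreatmentsC(srcTable, predictorNames,
                                 outcomename   = targetName,
                                 outcometarget = 1,
                                 verbose = FALSE)
  sf <- treatment$scoreFrame

  # pruneSig = -1 -> fall back to the usual 1/number-of-variables threshold (assumption)
  cutOff <- if (pruneSig < 0) 1 / nrow(sf) else pruneSig
  kept   <- sf[sf$sig < cutOff, ]
  kept   <- kept[order(kept$sig), ]                    # most useful first
  unique(kept$origName)
}

# usage: goodCols <- screenPredictors(trainTable, pruneSig = -1)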

4) If the function returns an empty list, I do the second step again, generating different mathematical combinations of the available data, and then the third step to sift them out. I have to repeat this 3-4 times. The volume of data grows with each repetition, so it's better to somehow limit how much new data is generated. The sifting function could be changed so that, if the list comes out empty, it returns the hundred or two best results and new predictors are generated only from them.

5) Next, according to the article, we need to train the principal components model itself. I have problems with it - so far the best R-squared for the trained model is 0.1, which is not enough; the article says at least 0.95 is needed. But I can train some other R model on the obtained predictors and it will give a better result. I have the most experience with neural networks; the best forward-test result with them comes out with an error of about 37%. A PCR model is supposed to be more stable, without overfitting, etc., but so far I can't get the predictors for it.
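For step 5, here is a rough sketch of the y-aware principal components idea from the article - my own illustration, not the article's or Dr.Trader's code; it assumes numeric, non-constant predictors and a numeric 0/1 result in the last column.

# Scale each predictor by its univariate regression slope against the target,
# run PCA on the scaled matrix, fit a model on the leading components and
# report its R-squared (the article wants something close to 0.95).
yAwarePCR <- function(srcTable, nComp = 5) {
  targetName <- names(srcTable)[ncol(srcTable)]
  y <- srcTable[[targetName]]
  x <- srcTable[, setdiff(names(srcTable), targetName), drop = FALSE]

  slopes  <- sapply(x, function(col) coef(lm(y ~ col))[2])   # y-aware scaling factors
  xScaled <- sweep(as.matrix(x), 2, slopes, `*`)

  pca   <- prcomp(xScaled, center = TRUE, scale. = FALSE)
  comps <- as.data.frame(pca$x[, 1:min(nComp, ncol(pca$x)), drop = FALSE])
  fit   <- lm(y ~ ., data = comps)

  summary(fit)$r.squared
}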


Congratulations, thanks for posting the result.

I hope this post of yours will be appreciated: you have, if not solved, then come close to solving the basic problem of trading, and that is without exaggeration.

Everything else will come with it.

Congratulations again, good luck!

 
Dr.Trader:

I use standard indicators as a basis for creating predictors. I am still experimenting with it myself, trying ideas from this forum thread.

I've been doing this for the last few weeks, now the best result is as follows: (a lot of calculations, I'm studying this approach in general on the D1 timeframe to be faster, then I'll move to smaller timeframes)

1) export from mt5 to csv: ohlc, time, indicators, everything for the last 10 bars. Recently I started to take the time only from the newest bar; I think the time of the other bars can be calculated and therefore carries no new information. The output is several hundred "primary" predictors. The required learning result is "1" or "0" - rise or fall of the price on the next bar. My results with zigzags are unstable and complicated; I am doing better with closing prices now. When I work out the full algorithm for training the model from scratch, I will be able to work on zigzags and trend prediction.

2) I perform various mathematical operations on the available data in R - addition, deltas, min, max, etc. I already have more than a thousand predictors.

3) Obviously, there is more rubbish after the second step than necessary. I sift it out using the method from the article on principal components, http://www.r-bloggers.com/principal-components-regression-pt-2-y-aware-methods/, which SanSanych wrote about here earlier. I don't train the PCR model itself; so far I've settled on this function for pre-screening predictors:

srcTable - table with predictors, the last column should be the required training result. pruneSig should be left at -1.

As a result, the function will return a list with the names of columns from the table that contain some useful information. Or an empty list if nothing useful is found. This method is indicated in the article as not very significant, but it turns out to be quite adequate, it filters out rubbish very well. Also, the list with results will be sorted by importance, from more useful to less useful.

4) If the function returns an empty list, I run the second step again, again generating different mathematical combinations on the available data, then the third step for screening. This way I have to repeat 3-4 times. The volume of data grows with each repetition, so it is better to somehow limit the volume of new generated data. You can change this function for screening, so that if the list is empty, it returns a hundred or two best results, and generate new predictors only from them.

5) Next, according to the article, we need to train the principal components model itself. I have problems with this: so far the best R-squared for the trained model is 0.1, which is not enough; the article says at least 0.95 is necessary. But it is possible to train some other R model on the obtained predictors, and the result will be better. I have the most experience with neural networks; the best forward-test result with them comes out with an error of about 37%. A PCR model should be more stable, without overfitting, etc., but I can't get the predictors for it yet.

If you have an error of about 30% in the forward test, that is already quite a profitable model; make an Expert Advisor for MT5 and check it in the strategy tester.

Keep going! You will get better with time.
 
Dr.Trader:
I started watching this course; it focuses a lot on the Pandas framework in Python. The first lessons feel more like tutorials on the framework itself than on data analysis. But the instructors come across as sensible, without the typical "I'm a Forex guru, I'll open your eyes and you will make millions" attitude of many other useless trainings, which gives hope that they will keep saying sensible things to the end. Keep in mind, though, that the course is designed for stock trading, not Forex; I don't know whether the model-training process is similar in these two areas.
The principle is the same; there are only some trading nuances. For example, very few terminals allow near-realistic exchange testing (slippage, partial execution, delays). Such terminals do exist, but as I understand it, MT5 is not one of them.
 
Dr.Trader:

I use standard indicators as a basis for creating predictors.

...

2) I perform different mathematical operations on the available data in R - addition, deltas, min, max, etc. It already comes out to more than a thousand predictors.

3) Obviously, there is more garbage after the second step than needed. I'm sifting it out by the method from the article on principal components, http://www.r-bloggers.com/principal-components-regression-pt-2-y-aware-methods/, which SanSanych wrote about earlier.

...

As a result, the function will return a list with names of columns from the table that contain some useful information. Or an empty list if there is nothing useful. This method is indicated in the article as not very significant, but it turns out that it is quite adequate, it sifts out garbage very well. Also, the list with results will be sorted by relevance, from more useful to less useful.

I have the feeling that the calculation is completely at random! The predictors are nothing but garbage. There may be a diamond in their midst, because Life could arise from the "soup"!

It turns out this approach is just a competent optimization of the computation - not brute force, but more intelligent algorithms. The input, though, is still the same garbage.

It also turns out that if we had a computing machine powerful enough to do any calculation in a second, we wouldn't need any training at all - and yet there would still be no qualitative change in obtaining a profitable trading system. That's too bad.

 
Alexey Volchanskiy:

A colleague sent me a link to a course on machine learning; take a look please, what do you think? The course is free, but for some reason it's in Python (

https://www.udacity.com/course/machine-learning-for-trading--ud501

Everything is shown better here.

Good luck

 
Anton Zverev:

I can't help but get the feeling that the calculations are TOTALLY based on chance! Predictors are nothing but garbage. Maybe there will be a diamond among them, because Life could arise from the "soup"!

.... That's sad.

You are completely wrong!

Let me explain using an example.

Let's take the kodobase. Either everything in it is garbage, or there is something worthwhile. Most likely there is - after all, it reflects the experience of so many people.

Now, for concreteness, let's assume we are going to trade trends. Which of the indicators available in the kodobase will be useful for trend trading? Do we judge by the name, or intuitively? And how many can we practically select to try - 10, 20, 100? I think running 100 through a tester would take a lifetime, given the number of combinations.

But the most important thing is not the number of indicators to try. What matters is whether the Expert Advisor will work in the future. And the EA will work in the future in only one case: if it is not overfitted. The main problem in building mechanical trading systems is the problem of overfitting. How many people have managed to overcome it?

I think that Dr.Trader didn't build his predictors out of thin air but had some idea behind them - though at the moment the idea of generating so many predictors is not what interests me.

What is interesting is something else entirely.

What is interesting is what you didn't pay attention to.

Among his thousands of predictors, Dr.Trader is able to select those that will not cause overfitting of the Expert Advisor.

Not only does he know how to select predictors that will not cause overfitting of Expert Advisors, he has also posted the code that lets you do it.

 
SanSanych Fomenko:

You are completely wrong!

Let me explain with an example.

Let's take the kodobase. Either everything in it is garbage, or there is something worthwhile. Most likely there is - after all, it reflects the experience of so many people.

Trash, of course! Well, take the entire kodobase as predictors...

Among his thousands of predictors, Dr.Trader can select those that will not cause overfitting of the Expert Advisor.

Not only does he know how to select predictors that will not cause overfitting of Expert Advisors, he has also posted the code that lets you do it.

And it turned out that not a single gem, let alone a diamond, was found in that huge pile of garbage. As I said, it was down to chance.

Or is anyone here capable of substantiating that such-and-such an indicator is not trash, and of showing in numbers the relative importance of that predictor?

 

Yes, there is a lot of randomness in my approach, I agree. But you can't just take one indicator and build an Expert Advisor from it - you will quickly lose money with it. Indicators are not 100% trash, but on their own they do not carry enough information to predict the price movement. In my research, however, I have found that combining indicators increases their predictive power, i.e. you really can make a diamond out of garbage. The problem is that there are thousands of possible combinations, only dozens of them are useful, and I don't know which indicators are better than others to begin with. So far I solve this as I wrote earlier - by brute force and long calculations. Over time I will accumulate statistics about which indicators end up among the final predictors most often, will be able to work with only those, and everything will go faster.

I have started building an Expert Advisor based on the obtained predictors; the tester will show the real result. They say that even with 60% of bars predicted correctly I can still lose money, because the price travels less distance on the correctly predicted bars than on the incorrect ones. If that is the case, we should write our own fitness function and train the neural network to estimate the profitability of the model rather than the percentage of correctly predicted bars.
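A minimal sketch of what such a fitness function could look like (an illustration with assumed names, not a tested implementation): instead of counting correctly predicted bars, each bar is weighted by the size of its price move, so a few large mistakes cannot be hidden behind many tiny wins.

profitFitness <- function(predicted, priceChange) {
  # predicted:   1 = the bar is expected to close up, 0 = down
  # priceChange: next close minus current close, in price units
  direction <- ifelse(predicted == 1, 1, -1)
  sum(direction * priceChange)        # total captured move; spread/commission ignored
}

# accuracy and profit can disagree: here 2 of 3 bars are predicted correctly,
# yet the result is negative because the one wrong bar had the biggest move
profitFitness(c(1, 1, 0), c(0.0010, 0.0012, 0.0100))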

 
Dr.Trader:

Yes, there is a lot of randomness in my approach, I agree. But you can't just take one indicator and build an Expert Advisor from it - you will quickly lose money with it. Indicators are not 100% trash, but on their own they do not carry enough information to predict the price movement. In my research, however, I have found that combining indicators increases their predictive power, i.e. you really can make a diamond out of garbage. The problem is that there are thousands of possible combinations, only dozens of them are useful, and I don't know which indicators are better than others to begin with. So far I solve this as I wrote earlier - by brute force and long calculations. Over time I will accumulate statistics about which indicators end up among the final predictors most often, will be able to work with only those, and everything will go faster.

You want to find dependencies within one single time series. And, moreover, you want dependencies that are present in that series at all times.

Both of these requirements seem strange, to say the least.

Machine learning methods have learned to recognize objects (a dog, an orange, etc.) in pictures - that is, to recognize things that humans or certain animals can also recognize. When a person or an animal looks at a price time series, they recognize nothing; they cannot make such comparisons in their own heads. However, when a person looks at several time series at once, he sees similarities even with the naked eye, and those similarities really are recognized. That is why it is logical to point machine learning methods at things that can actually be recognized.

First we recognize it ourselves, then we try the algorithms on it. I think you know what I mean.

At one time, overnight EURGBP trading was very good (profitable). Your neural network would not have recognized that. The profits were taken by those who understood the reasons behind that overnight behaviour. Only then were algorithms applied to those stretches of EURGBP history - to find the date when it suddenly became so profitable, so that the earlier data would not spoil the statistics, and the investigation started from there. Many made good money on it - just read the forums.

Now imagine that GOLD/SILVER is the profitable thing right now. There is no such pair, but you can trade it. Yet you have limited yourself to a single time series, whereas it is logical to look for interconnections between different series - that is how pairs like GOLD/SILVER appear. Intervals such as the week, the day and so on also play a huge role: people's behaviour depends on the time of day and the day of the week. That is data we can consciously understand, so that is where you have to dig, IMHO.
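As a small illustration of the synthetic-pair idea (purely made-up numbers), such a GOLD/SILVER series can be built as the ratio of two separate close-price series:

goldClose   <- c(1205.3, 1208.1, 1211.7, 1209.0)   # illustrative values only
silverClose <- c(16.91, 16.95, 17.12, 17.05)

ratio       <- goldClose / silverClose    # the synthetic "pair"
ratioChange <- diff(ratio)                # what a model would try to predict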

 
Anton Zverev:

You want to find dependencies within one single time series. And, moreover, you want dependencies that are present in that series at all times.

...

First we recognize it ourselves, then we try the algorithms on it. I think you know what I mean.

...

But you have limited yourself to a single time series. And it is logical to look for interconnections between different series - that is how pairs like "GOLD/SILVER" appear. Intervals such as the week, the day and so on also play a huge role: people's behaviour depends on the time of day and the day of the week. That is data we can consciously understand, so that is where you have to dig, IMHO.

So far we are trying to find how the conditional "future" depends on the "past" within the same time series. But that does not mean we will not try to do it for a combination of series.

About recognition: for oranges your reasoning applies - an expert can perhaps even distinguish several varieties of orange.

For financial time series, you need to distinguish a pattern - that is, behaviour of the series that repeats over the entire available time interval. Yes, sometimes there seems to be something in sight, but that knowledge is very vague and the parameters of the dependency are not precisely defined at all. Here you cannot do without the help of a computer - although I am not claiming there is no person who could easily find such a dependency and code it.

I agree with Dr.Trader in trying to gather a lot of garbage first and then extract valuable inputs from it. The value of those inputs is checked by training the model and validating it: if they are not noise, there will be a plus on validation. That is the whole of machine learning, really. )
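A minimal sketch of that check, with assumed names (a data frame dataSet whose 0/1 column is called target and whose other columns are numeric; this is not the actual code behind the results below): train on the older part, validate on the newer part, and compare validation accuracy with the 50% that pure noise would give.

splitPoint <- floor(0.7 * nrow(dataSet))             # keep the time order: old -> train, new -> validate
trainSet <- dataSet[1:splitPoint, ]
validSet <- dataSet[(splitPoint + 1):nrow(dataSet), ]

model     <- glm(target ~ ., data = trainSet, family = binomial)
predicted <- as.integer(predict(model, newdata = validSet, type = "response") > 0.5)

mean(predicted == validSet$target)                   # noticeably above 0.5 => not just noise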

I've tried such raw inputs:

> names(sampleA)
  [1] "lag_diff_2"        "lag_diff_3"        "lag_diff_4"        "lag_diff_6"        "lag_diff_8"        "lag_diff_11"       "lag_diff_16"
  [8] "lag_diff_23"       "lag_diff_32"       "lag_diff_45"       "lag_diff_64"       "lag_diff_91"       "lag_diff_128"      "lag_diff_181"
 [15] "lag_diff_256"      "lag_diff_362"      "lag_diff_512"      "lag_diff_724"      "lag_mean_diff_2"   "lag_mean_diff_3"   "lag_mean_diff_4"
 [22] "lag_mean_diff_6"   "lag_mean_diff_8"   "lag_mean_diff_11"  "lag_mean_diff_16"  "lag_mean_diff_23"  "lag_mean_diff_32"  "lag_mean_diff_45"
 [29] "lag_mean_diff_64"  "lag_mean_diff_91"  "lag_mean_diff_128" "lag_mean_diff_181" "lag_mean_diff_256" "lag_mean_diff_362" "lag_mean_diff_512"
 [36] "lag_mean_diff_724" "lag_max_diff_2"    "lag_max_diff_3"    "lag_max_diff_4"    "lag_max_diff_6"    "lag_max_diff_8"    "lag_max_diff_11"
 [43] "lag_max_diff_16"   "lag_max_diff_23"   "lag_max_diff_32"   "lag_max_diff_45"   "lag_max_diff_64"   "lag_max_diff_91"   "lag_max_diff_128"
 [50] "lag_max_diff_181"  "lag_max_diff_256"  "lag_max_diff_362"  "lag_max_diff_512"  "lag_max_diff_724"  "lag_min_diff_2"    "lag_min_diff_3"
 [57] "lag_min_diff_4"    "lag_min_diff_6"    "lag_min_diff_8"    "lag_min_diff_11"   "lag_min_diff_16"   "lag_min_diff_23"   "lag_min_diff_32"
 [64] "lag_min_diff_45"   "lag_min_diff_64"   "lag_min_diff_91"   "lag_min_diff_128"  "lag_min_diff_181"  "lag_min_diff_256"  "lag_min_diff_362"
 [71] "lag_min_diff_512"  "lag_min_diff_724"  "lag_sd_2"          "lag_sd_3"          "lag_sd_4"          "lag_sd_6"          "lag_sd_8"
 [78] "lag_sd_11"         "lag_sd_16"         "lag_sd_23"         "lag_sd_32"         "lag_sd_45"         "lag_sd_64"         "lag_sd_91"
 [85] "lag_sd_128"        "lag_sd_181"        "lag_sd_256"        "lag_sd_362"        "lag_sd_512"        "lag_sd_724"        "lag_range_2"
 [92] "lag_range_3"       "lag_range_4"       "lag_range_6"       "lag_range_8"       "lag_range_11"      "lag_range_16"      "lag_range_23"
 [99] "lag_range_32"      "lag_range_45"      "lag_range_64"      "lag_range_91"      "lag_range_128"     "lag_range_181"     "lag_range_256"
[106] "lag_range_362"     "lag_range_512"     "lag_range_724"     "symbol"            "month"             "day"               "week_day"
[113] "hour"              "minute"            "future_lag_2"      "future_lag_3"      "future_lag_4"      "future_lag_6"      "future_lag_8"
[120] "future_lag_11"     "future_lag_16"     "future_lag_23"     "future_lag_32"     "future_lag_45"     "future_lag_64"     "future_lag_91"
[127] "future_lag_128"    "future_lag_181"    "future_lag_256"    "future_lag_362"    "future_lag_512"    "future_lag_724"
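A sketch of how lag features of this kind could be generated from a single close-price vector (an illustration of the naming scheme above, not the original code; only the lag_diff_* and future_lag_* columns are shown, the rolling mean/max/min/sd/range and the calendar columns are built in the same spirit):

lagSteps <- c(2, 3, 4, 6, 8, 11, 16, 23, 32, 45, 64, 91, 128, 181,
              256, 362, 512, 724)

makeLagFeatures <- function(close, lags = lagSteps) {
  n   <- length(close)                          # assumes n > max(lags)
  out <- data.frame(matrix(nrow = n, ncol = 0))
  for (k in lags) {
    pastValue   <- c(rep(NA, k), close[1:(n - k)])     # close k bars ago
    futureValue <- c(close[(k + 1):n], rep(NA, k))     # close k bars ahead
    out[[paste0("lag_diff_", k)]]   <- close - pastValue      # predictor
    out[[paste0("future_lag_", k)]] <- futureValue - close    # target
  }
  out   # rows with NA (not enough history or future) must be dropped before training
}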

There's time and all sorts of metrics for price movement. Then I sifted them out. Here, look what I got.

This is part of the Expert Advisor that takes signals from the trained machine in R. Selected inputs are indicated there. And in the first place, by the way, is the hour when the deal is opened. That is, the time is important!

This is a test of the Expert Advisor on the entire history from 1999.02 to 2016.06:

It turned out crooked, but the machine is still learning NOT the noise, but the dependencies on the specified inputs.

That's why we are on the plus side. At least we improve the results of experiments.
