Taking Neural Networks to the next level - page 9

 

model and first training pass with "triple barrier" version:

[image: multi currency MLP first training]

 

Hi Chris, greetings from Spain.

I found this thread two days ago and realized that I'm not the only one who believes that NNs can give us an edge in trading. I just finished reading all the posts, and the information here has reinforced that belief.

The 'one-hot' encoding seems fantastic. I was planning to use the integer representation for the time (23:59 -> 2359 and so on), but your idea makes a lot of sense.

Other time inputs I use are the day of the week (Mon, Tue, Wed, etc.), in order to model some kinds of news, like tomorrow's Non-Farm employment report in the USA, and also the week of the month (1 to 5).

I think other seasonal information, like the month, could also be a determining factor.

My system is also designed as a kind of 'triple barrier': a classification that says whether the bar will reach some price above the entry (a buy) or some price below it (a sell).

I hope to share some of my results with you in one or two weeks.


Regards.



Chris70:

Okay... it's probably now really time to get back to neural networks... (yeah, I know, the off-topic intermezzo was mostly my own fault...):

In the meantime I finished the code of a multicurrency EA version for neural network price forecasting, so that it's now ready for training, validation and testing.

As I mentioned, the predictions in previous attempts (single currency) were not very reliable on average for any random moment in time, so I wanted to concentrate more on finding high-probability setups only.

I chose a classifier model now instead of forecasting exact prices, because the method that I'll use this time closely follows an approach suggested by Dr. Marcos López de Prado ("Quant of the Year" 2019, author of the book "Advances in Financial Machine Learning").

The network has 3 outputs that are labeled based on the "triple barrier method":

 - output 1: an upper price level (n pips, fixed distance) is hit first (=upper "horizontal barrier")

 - output 2: a lower price level (n pips, fixed distance) is hit first (=lower "horizontal barrier")

 - output 3: no price level is hit within a max. number of minutes (="vertical barrier")
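To make the three labels concrete, here is a minimal Python sketch of such a labeling rule. It is not the EA's MQL code: the function name, the use of a plain list of close prices and the barrier expressed in price units are all assumptions for illustration (a real implementation would probably also check the highs and lows inside each bar).

```python
def triple_barrier_label(prices, start, barrier, max_steps):
    """Label the move that begins at prices[start].

    Returns 0 if the upper barrier is hit first, 1 if the lower barrier
    is hit first, 2 if neither is hit within max_steps (vertical barrier).
    Hypothetical helper: works on closes only, barrier in price units.
    """
    entry = prices[start]
    last = min(start + max_steps, len(prices) - 1)
    for i in range(start + 1, last + 1):
        move = prices[i] - entry
        if move >= barrier:      # upper horizontal barrier
            return 0
        if move <= -barrier:     # lower horizontal barrier
            return 1
    return 2                     # time ran out: vertical barrier


closes = [1.1000, 1.1005, 1.1012, 1.0990]
label = triple_barrier_label(closes, start=0, barrier=0.0010, max_steps=3)
```

Here `label` is 0, because the close rises 12 pips above the entry before it ever falls 10 pips below it.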

The activation function of the output layer is "softmax", which has the nice quality that all three outputs together add up to 1.0, so that the individual outputs can be seen as probabilities within a distribution.

Because it is a classifier this time, the loss function that we want to minimize during training isn't MSE loss (mean squared error), but cross-entropy loss.
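For readers who want to see the two pieces side by side, this is a small Python sketch of softmax and of the cross-entropy loss for a single sample. It is only an illustration with made-up function names, not the code used in the EA.

```python
import math


def softmax(logits):
    """Turn raw output values into probabilities that add up to 1."""
    peak = max(logits)                         # subtract the max for numerical stability
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(probs, true_class):
    """Loss for one sample: negative log of the probability given to the correct class."""
    return -math.log(probs[true_class])


probs = softmax([2.0, 1.0, 0.1])               # three outputs, e.g. upper / lower / vertical barrier
loss_if_upper = cross_entropy(probs, 0)
loss_if_vertical = cross_entropy(probs, 2)
```

The loss is small when the network puts a high probability on the class that actually occurred and grows quickly when it is confidently wrong.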

The network has a normal MLP architecture for now, but I might also give it a try and compare with LSTM cells.

As I mentioned earlier, MLPs are good for pattern recognition, LSTMs and related types of recurrent networks are better for long time dependencies. So both have advantages. A multilayered fully connected LSTM network combines the advantages of both and this is also the model that I had initially used with the autoencoder. Without the autoencoder (which gets a little complicated with multi-currency trading), computation performance will suffer, which is why I start with a normal MLP; this doesn't mean that it can't have many neurons/layers, but not having to backpropagate on top of that through lots of time-steps is gonna make the training part a lot faster. We'll see.

Nevertheless, we're not done yet with a standard MLP network. Further following the suggestion of Dr. M. López de Prado, I'm comparing the outputs with the correct labels and thereby obtain true positives / true negatives / false positives / false negatives. A second (!) MLP network can then learn this "meta labeling" (after the training of the main network), so that I can calculate things like accuracy (validity), precision (reliability), recall and F-score. The objective is to use these values for the selection of high-probability setups only.
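As a rough illustration of the four outcomes and the scores derived from them, here is a Python sketch that works on plain lists of binary predictions and true labels. The function names are invented for this example, and the formulas are the standard textbook definitions, not something taken from the EA.

```python
def confusion_counts(predicted, actual):
    """Count true/false positives and negatives for binary labels (1 or 0)."""
    tp = fp = tn = fn = 0
    for p, a in zip(predicted, actual):
        if p and a:
            tp += 1
        elif p and not a:
            fp += 1
        elif not p and not a:
            tn += 1
        else:
            fn += 1
    return tp, fp, tn, fn


def scores(tp, fp, tn, fn):
    """Accuracy, precision, recall and F1 from the four counts."""
    total = tp + fp + tn + fn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


counts = confusion_counts([1, 1, 0, 0, 1, 0], [1, 0, 0, 1, 1, 0])
result = scores(*counts)
```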

For the inputs of the primary/main network, I'm using n periods of High/Low/Close prices (1.) of the main chart symbol and (2.) of additional symbols that are conveniently passed in as an input variable (=comma-separated list). Instead of pure prices, I take log returns as a differencing method. The plan is to use at least all major pairs (EURUSD, USDJPY, GBPUSD, USDCAD, USDCHF) plus AUDUSD, as long as MT5 can handle this many price histories simultaneously... It is the job of the neural network to find correlations among the currency pairs by itself and thereby derive possible consequences for the next upcoming prices of the main chart symbol.
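In case the term is unfamiliar: a log return is the logarithm of the ratio of two consecutive prices. A minimal Python sketch (the function name is just for illustration):

```python
import math


def log_returns(prices):
    """Log return between each pair of consecutive prices: ln(p[t] / p[t-1])."""
    return [math.log(curr / prev) for prev, curr in zip(prices, prices[1:])]


closes = [1.1000, 1.1011, 1.1005, 1.1020]
returns = log_returns(closes)
```

A handy property is that log returns simply add up: the sum of `returns` equals the log return from the first to the last price.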

I also added the month, the day of the week and the time as input variables.

For those of you who think about developing neural networks by themselves (be it in MQL or e.g. Python..), let's think for a moment about how to best feed these variables into a network (and if you don't know it yet, maybe I can show a neat trick):

Let's take the hour of the day as an example: 23 is followed by 0... does this really make sense? The minutes 23:59 and 0:00 are direct neighbors, but their values are at the highest possible distance. We have no continuity and the network will have some issues trying to make something meaningful out of this huge step. So what can we do?

One very common method (in fact the standard method for this purpose) is called "one-hot" encoding, which means we don't take just one input for the hour of the day, but 24 (i.e. 0-23). If for example the hour is 15:xx, then input number 15 gets the value 1, all other 23 of these inputs get the value 0. This method isn't rare at all. Think of image recognition: the class labels that such a network learns to predict (for example the ten digits 0-9) are usually encoded "one-hot" in exactly this way.
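As a tiny Python sketch of the idea (the helper name is made up for this example):

```python
def one_hot(index, size):
    """Vector of `size` zeros with a single 1.0 at position `index`."""
    if not 0 <= index < size:
        raise ValueError("index out of range")
    vector = [0.0] * size
    vector[index] = 1.0
    return vector


hour_inputs = one_hot(15, 24)   # 15:xx -> input number 15 is 1, the other 23 are 0
```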

If we only encode the hour, we need those 24 inputs. If we also encode the minute of the hour we have 60 more. Then 12 for the month... All this is absolutely feasible, but there might be a more elegant way...:

Think of the hour hand of a clock (and let's say this clock has a 24h clock face instead of 12h): instead of taking the value of the hour, we might instead take the angle of the hour hand, then we get a 360 degrees circle. Still, between 359° and 0°, there is this huge gap that we want to avoid. So how do we achieve continuity? The magic trick: the sine and cosine wave functions! They are continuous, no gaps between neighbor values. If we put this into code, the declaration of the inputs can then look something like this:
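A minimal sketch of such inputs, written in Python rather than MQL. The function name and the choice of "seconds since midnight" as the cyclic variable are assumptions made for this illustration, not the code of the original post.

```python
import math

SECONDS_PER_DAY = 24 * 60 * 60


def encode_time_of_day(hour, minute, second):
    """Map the time of day onto a circle and return (sin, cos) of the angle."""
    seconds_since_midnight = hour * 3600 + minute * 60 + second
    angle = 2.0 * math.pi * seconds_since_midnight / SECONDS_PER_DAY
    return math.sin(angle), math.cos(angle)


input_sin, input_cos = encode_time_of_day(23, 59, 59)   # almost identical to 0:00:00
```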

et voilà.. we just used only 2 inputs for continuous time information that is precise down to the second, instead of 24+60+60=144 inputs for the one-hot encoding method;

sin(2*M_PI*mon/12) and cos(2*M_PI*mon/12) do the same for the month; this method works for all kinds of such "cyclic" variables.


Okay... now let's see if the multicurrency network version is training without any surprises and I'll come back later with some results...

 
Saludos ;-),

Always nice to hear about other projects in the field.

Yes, you can of course apply the sin/cos method to any cyclic variable. But just keep in mind that you always need sin AND cos, because for example: sin(20°)=0.342, but sin(160°) is also(!) 0.342. Cos(20°) is 0.94, but cos(160°) is negative(!) 0.94, so if you use both in combination, any degree/angle of the circle is always assigned unambiguously. Still, two numbers are far fewer than e.g. 365 numbers for a year, which I'd consider a huge simplification.

Note: the formulas sin(2pi*x/n) and cos(2pi*x/n) take angles in radians, not degrees (in degree notation: sin(360°*x/n) or cos(360°*x/n)).
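The ambiguity is easy to check numerically. The following Python sketch (with a made-up helper name) reproduces the 20°/160° example and shows that the (sin, cos) pair is unique for each of the 12 months while the sine alone is not:

```python
import math


def encode_cyclic(x, n):
    """(sin, cos) of position x on a cycle of length n; the angle is in radians."""
    angle = 2.0 * math.pi * x / n
    return math.sin(angle), math.cos(angle)


# math.sin/math.cos expect radians, so degrees must be converted first
sin20, cos20 = math.sin(math.radians(20)), math.cos(math.radians(20))
sin160, cos160 = math.sin(math.radians(160)), math.cos(math.radians(160))

months = [encode_cyclic(m, 12) for m in range(12)]
unique_pairs = {(round(s, 9), round(c, 9)) for s, c in months}
unique_sines = {round(s, 9) for s, _ in months}
```

`unique_pairs` has 12 entries, one per month, whereas `unique_sines` has fewer, because several months share the same sine value.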

I also thought about using news as additional inputs. I wouldn't go so far as to try to automatically interpret the directional implications of any specific news event (=from a fundamental point of view), but a simple method could be to feed in the time distance (seconds) since the last event that affects this currency, its impact level, and the same for the next (scheduled) upcoming event. A few months ago I already wrote the code for a class that returns this number of seconds, so I only need to combine it with my neural networks. There are about 90,000 news events in the history available through MetaTrader's built-in news event functions. This should be enough for some reasonable neural network training.
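Just to sketch what such timing inputs could look like, here is a small Python example. It does not use MetaTrader's calendar functions: the event list, its `(timestamp, impact)` layout and the function name are all assumptions for illustration.

```python
import bisect


def news_timing_features(now, events):
    """events: list of (timestamp_in_seconds, impact_level), sorted by timestamp.

    Returns the seconds since the last event, the seconds until the next one
    and their impact levels (None where no such event exists).
    """
    times = [t for t, _ in events]
    i = bisect.bisect_right(times, now)        # events[:i] happened at or before `now`
    last = events[i - 1] if i > 0 else None
    upcoming = events[i] if i < len(events) else None
    return {
        "seconds_since_last": now - last[0] if last else None,
        "last_impact": last[1] if last else None,
        "seconds_until_next": upcoming[0] - now if upcoming else None,
        "next_impact": upcoming[1] if upcoming else None,
    }


calendar = [(100, 1), (200, 3), (400, 2)]      # hypothetical events
features = news_timing_features(250, calendar)
```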

Do you also implement AI with pure MQL only or do you interface MQL with another programming language? I'm really interested in what you have achieved so far.

-------

Bad news first: with the multi-currency approach, the results were not much better than with single-currency forecasting. It seems really hard to beat randomness in the forex market, i.e. to achieve anything significantly better than a coin-toss probability.

The meta-labeling method that specifically only looks for high probability setups now seems to do the trick.

I just trained an MLP network on EURUSD with 60x10 minutes HLC data on EURUSD+USDJPY+GBPUSD+AUDUSD+USDCHF+USDCAD plus time&date information. The forecasting target is the direction of the next 20-pip move of EURUSD as a binary classifier.

The results of this primary MLP network didn't look promising and were no different from my experience with single-currency forecasting. However, we can take its exact output numbers (however insignificant their deviation from a 50% probability might be) and feed them into a secondary network that learns to classify (softmax/cross entropy) the probabilities for true positive / false positive / true negative / false negative. From those outputs we can then calculate the positive predictive value = TP/(TP+FP), accuracy = (TP+TN)/(TP+TN+FP+FN) and precision = TP/(TP+FP) (which, by this formula, is the same number as the positive predictive value) and use these values as a decision filter to reject all the trades we don't want to take.

I just started a backtest that has these values as its only optimization criteria. While I chose a minimum value of 0.5 (up to 0.55) for each of them (with the positive predictive value probably being the most important one), the first results are remarkable. Not because of exceptionally high profits by any means, but because it hasn't produced a single losing pass yet. This is an early test of limited value, because it covers only the past 8 months (in this case I had no other unseen data left that were not part of the network's training), but I think it shows some potential. Another downside: the backtesting takes "forever" if you run two huge neural networks simultaneously (to give an idea: the primary network has more than 1 million weight connections, the secondary network about 44,000).
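How such a filter could be wired up is sketched below in Python. This is only an assumption about the mechanics: the function name, the argument order and the way the four softmax outputs are plugged straight into the formulas are illustrative, and the 0.5 defaults are the minimum values mentioned above.

```python
def passes_trade_filter(p_tp, p_fp, p_tn, p_fn, min_ppv=0.5, min_accuracy=0.5):
    """Decide whether to take a trade from the secondary network's four outputs.

    p_tp, p_fp, p_tn, p_fn: softmax probabilities for true positive,
    false positive, true negative and false negative (they sum to 1).
    """
    positives = p_tp + p_fp
    total = p_tp + p_fp + p_tn + p_fn
    ppv = p_tp / positives if positives > 0 else 0.0       # positive predictive value
    accuracy = (p_tp + p_tn) / total if total > 0 else 0.0
    return ppv >= min_ppv and accuracy >= min_accuracy


take_trade = passes_trade_filter(0.4, 0.1, 0.3, 0.2)   # PPV 0.8, accuracy 0.7
skip_trade = passes_trade_filter(0.1, 0.4, 0.3, 0.2)   # PPV 0.2
```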

I'll probably abort the test and start another test with a multi-currency + multilayer LSTM version and then decide which one starts off more promising for a more detailed evaluation.

[image: multicurrency backtesting]

 

You need a quantum computer, my friend, to run each pair's optimization in parallel dimensions... :D


You could also start a GoFundMe where we collect enough money to train on the data using Amazon AWS; the EA would then be shared amongst participants (and retrained from time to time).

 

Quite happy to see where this is going... we're finally getting somewhere...

[image: lstm multi currency]

This is a screenshot during a short live training session of the earlier described dual neural network multi-currency EA in visual tester mode (or to be more precise: I probably should say multi asset instead of multi-currency, because it could just as well be used on multiple indices like e.g. DAX vs. DJI, S&P500.. or stocks versus indices).

Here I'm training on 6 currencies simultaneously on the 15-minute charts with a lookback period (sliding window) of 100 candles for the LSTM network. On the middle/lower left you see the continuous decline of the loss function for the LSTM and metalabeling networks. Both the LSTM and the metalabeling classifier network in this example have 3 layers on top of the input layer and I chose 100 neurons per hidden layer.
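For anyone unsure what the sliding window means in practice, a short Python sketch (helper name invented for this example): each training sample is simply the most recent `lookback` values, and the window moves forward one candle at a time.

```python
def sliding_windows(series, lookback):
    """All consecutive windows of length `lookback`, oldest first."""
    return [series[i - lookback:i] for i in range(lookback, len(series) + 1)]


windows = sliding_windows([1, 2, 3, 4], lookback=3)
window_count = len(sliding_windows(list(range(250)), lookback=100))
```

With 250 candles and a lookback of 100, this yields 151 overlapping windows.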

On the upper left you see the currency symbols together with the close price bias of the prediction and the results for the positive predictive value, accuracy and precision.

In this example I'm predicting high/low/close prices 15 minutes ahead (=for all 6 currencies, not just the active chart). The predictions are made upon candle open and are shown by those blue lines, which are not repainting. If you look at the right end of the chart you can see the beginning of a new candle, and the predictions for the high/low/close of how this candle could look 15 minutes later are already made. From there on these lines don't change.

I'm pleased to see that the predictions actually capture trends visibly and are not just making a copy of the preceding candle (like you can see in many bad examples of time series predictions).

A quantum computer isn't needed, though I get that you were joking. Backtests with neural networks obviously take more time than training. Because backtests are done after the training, only the feed forward passes are necessary then, which saves some time. The problem comes with the many hundred or thousand test passes that backtests are usually based on. But we're talking days here, so still very doable on a home PC.

 
Can you please post a picture of the close prediction only and line chart instead of candles?
 

(???) What's the added information? The close price predictions are represented by the dotted line and the real close prices are visible from the candles.

Sure, it's possible, but I'd have to run another training session first to take such a screenshot.

 
The above screenshot seems to show the familiar 1 bar time-lag, but it is difficult to see due to the candles. Just the real close line and the predicted close line would give a clear picture of what is actually going on. 
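One way to check for such a lag with numbers instead of by eye is to compare the prediction against a naive "copy the previous close" forecast. The Python sketch below is only a suggestion with invented names; it assumes two equally long lists where `predicted[i]` is the forecast for `actual[i]`.

```python
def mean_abs_error(xs, ys):
    """Average absolute difference between two equally long sequences."""
    return sum(abs(x - y) for x, y in zip(xs, ys)) / len(xs)


def lag_check(actual, predicted):
    """Return (model_error, naive_error, copy_distance).

    model_error:   prediction vs. the close it was meant to forecast
    naive_error:   previous close used as the forecast
    copy_distance: how far the prediction is from the previous close
    """
    model_error = mean_abs_error(predicted[1:], actual[1:])
    naive_error = mean_abs_error(actual[:-1], actual[1:])
    copy_distance = mean_abs_error(predicted[1:], actual[:-1])
    return model_error, naive_error, copy_distance


closes = [1, 2, 3, 4, 5]
lagged_copy = [1, 1, 2, 3, 4]                  # a "prediction" that just repeats the last close
lag_result = lag_check(closes, lagged_copy)
```

If `copy_distance` is close to zero and `model_error` is no better than `naive_error`, as in this example, the prediction is essentially the previous close shifted by one bar.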

 
I think the predicted vs. real close line is quite obvious. Furthermore, the candles show a stronger connection to the algo's performance, i.e. you can see the market depth of that currency pair's strength/volume assumptions, whereas a line alone would be hard to understand. A screenshot with market volume bars would be more suitable if you really wanted a "clear picture" of the real close and the predicted line. The line doesn't tell you s--- apart from how closely the prediction and the real close coincide.
 
How is your manual trading going, Chris? It seems you could use this EA as a part of your manual trading?