Taking Neural Networks to the next level - page 25

Chris70

Safe to say that you won the challenge and that the network has failed with this case.

Yes, I guess this was to be expected. Although computers are by nature deterministic machines and have a hard time generating true randomness, I assume that even without knowing the algorithm behind the rand() function, its output, though formally pseudo-random, is not entirely useless for producing chaotic distributions. Now, with the myfn() function we again pseudo-randomly diluted this information y times over (homeopathy aficionados will love it ;-) ), so near-perfect chaos is to be expected.
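To illustrate the dilution idea with a toy Python sketch (the thread's actual rand()/myfn() code isn't shown here, so mix_once() and diluted() are hypothetical stand-ins): repeatedly mixing a value with fresh pseudo-random draws scatters even highly structured seeds over the whole output range, leaving nothing for a network to latch onto.

```python
import random

def mix_once(x, rng):
    # one "dilution" step: XOR the value with a fresh pseudo-random 32-bit draw
    return (x ^ rng.getrandbits(32)) & 0xFFFFFFFF

def diluted(seed, y):
    # hypothetical stand-in for the thread's myfn(): y mixing rounds
    rng = random.Random(seed)
    x = seed & 0xFFFFFFFF
    for _ in range(y):
        x = mix_once(x, rng)
    return x

# even consecutive integer seeds end up scattered across the 32-bit range
values = [diluted(s, 10) for s in range(1000)]
```

After ten rounds the 1000 consecutive seeds map to (almost certainly) distinct values spread over nearly the full 32-bit range.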

--

As for the other challenge: I'm not "in" yet. I don't say no, but let me think about it. After yesterday I'd prefer not to stay at the computer all day (and I need time for other stuff), but I'm also not too excited about that challenge because I've already done things like that and don't think it can work. The markets are not perfectly random, but close to it, and it will be hard to find something useful without additional functionality like multi-currency inputs and memory cells. I'm confident that it will quickly learn the first 99 prices as more or less copies (assuming that open[t] usually resembles close[t-1]), but for the last output, I guess it will return something close to the open, i.e. it won't be biased (= zero change). Different timeframes will make a small difference, but all in all probably too little to be left with a profit after spread/commissions.
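The "copies of the previous close" behaviour described above is just the classic persistence baseline. A minimal Python sketch (toy numbers and hypothetical helper names, only to make the idea concrete) of what the network would effectively be reproducing:

```python
def persistence_forecast(closes):
    # naive baseline: predict each next close as a copy of the current one
    return closes[:-1]

def mean_abs_error(pred, actual):
    # average absolute forecast error over the series
    return sum(abs(p - a) for p, a in zip(pred, actual)) / len(pred)

# toy price series (made-up numbers, just to exercise the helpers)
closes = [1.1000, 1.1003, 1.1001, 1.1005, 1.1004]
preds = persistence_forecast(closes)           # forecasts for closes[1:]
mae = mean_abs_error(preds, closes[1:])
```

A trained network that can't beat this baseline's error has learned nothing beyond "zero expected change", which is exactly the concern raised above.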

NELODI

Chris70:

As for the other challenge: I'm not "in" yet. I don't say no, but let me think about it. After yesterday I'd prefer not to stay at the computer all day (and I need time for other stuff), but I'm also not too excited about that challenge because I've already done things like that and don't think it can work. The markets are not perfectly random, but close to it, and it will be hard to find something useful without additional functionality like multi-currency inputs and memory cells. I'm confident that it will quickly learn the first 99 prices as more or less copies (assuming that open[t] usually resembles close[t-1]), but for the last output, I guess it will return something close to the open, i.e. it won't be biased (= zero change). Different timeframes will make a small difference, but all in all probably too little to be left with a profit after spread/commissions.

Ok. In that case, don't bother. I'll take your word for it.

NELODI

Well, here is what I did: https://www.mql5.com/en/code/26928

As you've said, it isn't useful for predicting where the price might be going next, but ... it is fun to watch ;) Or not. Depending on your definition of "fun" :P

NELODI BackProp Chart — www.mql5.com
"NLD_BPChart.mq5" is a custom price chart, painting price movements as a red/green colored line and using a very simple Artificial Neural Network (implemented in the "NLD_BackProp.mqh" file) to try predicting where the Close price might be going next. It projects the next X Close prices (default = 10) into the future by using the last Y Open...
Bayne

@Chris70

Could you provide a backtest result of the strategy in #97?

I am wondering if I did something wrong with my LSTM meta-model. (It sometimes puts TP and SL in the same direction, while it only takes trades in one direction.)

Also, a way to trade the signals would be interesting: how do you overcome margin problems? You can't open an order every hour with the limited margin available (even with meta-labeling you get many results).

Brian Rumbles
NELODI:

Well, here is what I did: https://www.mql5.com/en/code/26928

As you've said, it isn't useful for predicting where the price might be going next, but ... it is fun to watch ;) Or not. Depending on your definition of "fun" :P

Which folders do the files go in? libraries and indicators?
NELODI
Brian Rumbles:
Which folders do the files go in? libraries and indicators?

They should both go to the same folder, wherever you keep your indicators.

Chris70

Hey guys, - just in case anybody's interested... I know this thread has been silent for a while, but neural networks aren't dead ;-)

I've been busy in the meantime and can give a short update.

- I changed many of my older EAs to work with range bars instead of time bars; they make more sense to me because less data is generated in periods of low market activity (which should filter out ranging periods to a certain degree and give more relevance to trending periods), and I observed that they improve my neural network predictions a little.
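For anyone who wants to experiment with range bars: a minimal Python sketch (function and field names are my own, not from the EA) that aggregates ticks into fixed-range bars. As a side effect it also yields the bar duration and tick count, which come up again as network inputs further down.

```python
def build_range_bars(ticks, bar_range):
    """Aggregate (time_ms, price) ticks into fixed-range bars.

    A bar is closed as soon as price has moved bar_range away from its
    extremes; time plays no role, so quiet periods produce few bars.
    """
    bars = []
    bar = None
    for t_ms, price in ticks:
        if bar is None:
            bar = {"open": price, "high": price, "low": price,
                   "close": price, "start_ms": t_ms, "ticks": 1}
            continue
        bar["high"] = max(bar["high"], price)
        bar["low"] = min(bar["low"], price)
        bar["close"] = price
        bar["ticks"] += 1
        if bar["high"] - bar["low"] >= bar_range:
            bar["duration_ms"] = t_ms - bar["start_ms"]
            bars.append(bar)
            bar = None           # next tick opens a fresh bar
    return bars                  # any unfinished bar is discarded

# toy tick stream: (milliseconds, price)
ticks = [(0, 100.0), (10, 100.5), (20, 101.0),
         (30, 100.5), (40, 99.5), (50, 100.0)]
bars = build_range_bars(ticks, bar_range=1.0)
```

Note how the second bar closes after only 10 ms of fast movement while the first needed 20 ms: the bar duration itself carries the activity information.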

- I realized that I have to be more patient with network training and leave the networks learning for many hours or overnight, because sometimes the cost function has an s-shape: after several hours of not much happening (and the loss curve flattening out), it can still go into a significant late decline - and a better trained network means better predictions [edit: about possible curve-fitting concerns see my next post]

- I've mentioned several times that I'm a big proponent of pure price action and don't believe much in indicators; when asked whether a good selection of indicators as neural network inputs wouldn't be better than the price itself, I still believe that nothing can replace price: many indicators contain filtered information derived from the price, but they can't be reverse-engineered back into the price because some information is lost. However, regarding the question whether it's okay to add indicators to the price, i.e. as additional inputs instead of alternative inputs, I should admit that it probably won't hurt. The "universal approximation theorem" is still valid, which is why adding indicators shouldn't have a theoretical benefit for the end result. Whether it has an effect on training time remains to be answered. Therefore I said to myself: okay... why not? The extreme example of trying to reverse-engineer the input data of a complex pseudo-random number generator (--> challenge with @NELODI) has at least shown that something that's true in theory can still have practical limitations. And testing is better than believing ;-)

Therefore: the EA that I'm currently working on uses a multilayer classifier network with softmax as output layer activation and has the following inputs (over a sliding window of the last ~50 periods):

- price: range_bar.close vs. range_bar.open

- range_bar duration in milliseconds

- range_bar tick volume

- MACD (calculated from range bars)

- MACD signal line  ( " ")

- range_bar.close vs. fast range_bar EMA

- range_bar.close vs. slow range_bar EMA

- RSI (calculated from range bars)

- slow stochastic (calculated from range bars)

- volatility (n period standard deviation of range_bar close prices)

- momentum (custom formula)

- parabolic SAR (calculated from range bars)

- sin/cos encoding of the time of day (seconds)

- sin/cos encoding of the month

- one-hot encoding of the day_of_week

- refeed of the outputs of the last iteration
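The three time encodings at the end of the list can be sketched like this in Python (helper names are mine; the sin/cos pair puts cyclical values on the unit circle so that e.g. 23:59 and 00:00 end up close together, which a raw seconds count would not do):

```python
import math

def time_of_day_features(seconds):
    # map 0..86400 s onto the unit circle: midnight wraps around smoothly
    angle = 2.0 * math.pi * seconds / 86400.0
    return math.sin(angle), math.cos(angle)

def month_features(month):
    # month in 1..12, same cyclical trick (December is next to January)
    angle = 2.0 * math.pi * (month - 1) / 12.0
    return math.sin(angle), math.cos(angle)

def day_of_week_one_hot(day):
    # day in 0..4 for Mon..Fri; one-hot keeps weekdays categorical, not ordered
    vec = [0.0] * 5
    vec[day] = 1.0
    return vec
```

One-hot is the right choice for the weekday because there is no reason to assume Friday is "more" than Monday, whereas time of day and month really are cyclical quantities.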

outputs/labels (classifier):

- probability of bullish next range bar

- probability of bearish next range bar
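A two-class softmax head produces exactly this pair of probabilities; a minimal Python sketch (the example logits are made up):

```python
import math

def softmax(logits):
    # subtract the max logit for numerical stability before exponentiating
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    s = sum(exps)
    return [e / s for e in exps]

# classifier head: [bullish_logit, bearish_logit] -> probabilities summing to 1
p_bull, p_bear = softmax([1.2, -0.3])
```

Because the two outputs are forced to sum to 1, their difference can directly serve as a signal strength for the next range bar.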

I'm still experimenting with some network settings, but so far the backtest results on real tick data are very promising [edit: with testing on unseen data, of course, i.e. not the training set].

I won't make this EA or its code publicly available, but I thought maybe the ideas could be inspiring for some...

Here are 2 example screenshots of a network training session live "at work" with all the mentioned indicators + cost function graph + features distribution histograms (range bars as pink/green overlays over the time bars):

network training example 1

network training example 2

Jean Francois Le Bas
Chris70:

Hey guys, - just in case anybody's interested... I know this thread has been silent for a while, but neural networks aren't dead ;-)

I've been busy in the meantime and can give a short update.

[...]

- I realized that I have to be more patient with the network training and leave them for learning for many hours or over night, because sometimes the cost function has an s-shape and after several hours of not much happening (and the loss curve flattening out), it can still go into a significant late decline - and a better trained network means better predictions

[...]


but won't "learning more" mean curve fitting? wouldn't it be better to stop learning early?

Chris70
Jean Francois Le Bas:

but won't "learning more" mean curve fitting? wouldn't it be better to stop learning early?

Better fitting isn't overfitting as long as generalization improves, too.

Still, this is a valid concern, but there are also a few tricks to avoid overfitting/curve fitting that apply for the special case of neural networks:

1. Using dropout helps a lot: because we randomly switch off a certain percentage of the network's neurons (usually about 20%) for every learning step (= per batch in batch or mini-batch training, or per iteration in "online" training), every step is basically like training a slightly different network.
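A minimal Python sketch of the standard ("inverted") dropout formulation described here; the helper name and the 1000-unit toy layer are just for illustration:

```python
import random

def dropout_forward(activations, rate, rng, training=True):
    # inverted dropout: zero each unit with probability `rate` during training
    # and scale survivors by 1/(1-rate) so the expected activation is unchanged
    if not training or rate == 0.0:
        return list(activations)
    keep = 1.0 - rate
    return [a / keep if rng.random() < keep else 0.0 for a in activations]

rng = random.Random(0)
out = dropout_forward([1.0] * 1000, rate=0.2, rng=rng)
dropped = sum(1 for v in out if v == 0.0)   # roughly 20% of the units
```

At inference time (`training=False`) the layer is a no-op, thanks to the 1/(1-rate) scaling applied during training.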

2. About "early stopping" - just as you say: yes, this can sometimes indeed be the better option, but instead of just guessing that we might be dealing with curve fitting, we can of course just measure it: if we've split the data into training set + validation set + test set, we can check the accuracy on the validation set against the training set and recognise the point in time when we're starting to just memorize the training set at the cost of poorer generalization (and stop once generalization starts getting worse).
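The "stop once generalization starts getting worse" rule can be sketched as a simple patience check on the validation loss (Python, hypothetical helper name):

```python
def early_stopping_epoch(val_losses, patience=3):
    """Return the epoch index to roll back to: the best epoch once the
    validation loss has failed to improve for `patience` consecutive
    epochs, or the overall best epoch if that never happens."""
    best = float("inf")
    best_epoch = 0
    since_best = 0
    for epoch, loss in enumerate(val_losses):
        if loss < best:
            best, best_epoch, since_best = loss, epoch, 0
        else:
            since_best += 1
            if since_best >= patience:
                return best_epoch
    return best_epoch
```

The patience parameter matters precisely because of the s-shaped loss curves mentioned earlier: stopping at the first flat stretch would miss the late decline.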

I almost always train with dropout. I haven't yet implemented automatic stopping, so at the moment I still need to click a few buttons to generate validation set test reports and decide myself when to stop.

3. Smart use of "sparsity" also helps: in this context I explained apoptosis (selective neuron sparsity) and pruning (selective weight sparsity) earlier in this thread; what can also help is using rectifier activation functions (like ReLU, leaky ReLU, ELU) for hidden neurons.
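A quick Python illustration (toy numbers) of why rectifiers give sparse hidden layers: every negative pre-activation becomes an exact zero, so only a fraction of the neurons fire for any given input.

```python
def relu(x):
    # rectifier: negative pre-activations become exact zeros -> sparse output
    return x if x > 0.0 else 0.0

def leaky_relu(x, alpha=0.01):
    # leaky variant keeps a small slope for negative inputs instead of a hard zero
    return x if x > 0.0 else alpha * x

pre_acts = [-1.5, -0.2, 0.0, 0.3, 2.0]
relu_out = [relu(x) for x in pre_acts]
sparsity = sum(1 for v in relu_out if v == 0.0) / len(relu_out)
```

The leaky variant trades a little of that sparsity for a non-vanishing gradient on negative inputs, which avoids "dead" neurons during long training runs.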

In practice my tests show - as long as I use dropout - that both training set accuracy and generalization accuracy usually continue to improve after many epochs (exactly how many epochs make sense can't be answered in general, because many factors like learning rate, learning rate decay (or schedule), optimizer method and number of training samples each play a big role).