Neural Networks - page 14

 
Yeah, training vs. predicted. I built a new network with the first 6-7 years of data used to train, test and cross-validate. I then fed it the remaining 3 years of data as a test with no learning, the idea being to mimic a live test. The training regression line had a slope of 0.99995, and when I fed it about 3 years of previously unseen data that dipped to 0.9995. I'm not sure how to interpret this. It seems a little too accurate for something I put together in less than an hour.

I'll explain it to you using an example.

Let's say that you want to predict a variable that can have values from 100 to 250 (like GBPJPY). You want to predict very small time steps compared to the training data (like H1 or H4). For some step the desired value is, let's say, 174.850 and the NN output is 176.350. The error is very small (about 0.8%) but in Forex terms the error is big - 150 pips.

It's much easier to predict the normal or logarithmic rate of return. Even if you make an error, the output will probably still be useful (if you predict a 20% increase in price and it really is 10%, then even though the error is 50%, the result is still very OK).
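Kazam's pip arithmetic can be checked in a few lines. This is only an illustrative sketch in Python (the thread itself uses Matlab), and the previous close used for the return target is a made-up number:

```python
import math

# Kazam's example: desired GBPJPY value 174.850, NN output 176.350
actual, predicted = 174.850, 176.350

error_pct = abs(predicted - actual) / actual * 100  # relative error
error_pips = abs(predicted - actual) * 100          # 1 pip = 0.01 on JPY pairs

# Predicting the rate of return instead of the raw price level:
prev_close = 174.100  # hypothetical previous close, not from the thread
log_return = math.log(actual / prev_close)          # target the NN would learn

print(f"relative error: {error_pct:.2f}%")  # small in percentage terms
print(f"error in pips: {error_pips:.0f}")   # big in Forex terms
```

The same absolute miss looks negligible as a percentage of the price level but is 150 pips of trading error, which is why a return target is more forgiving.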

Surely though, if you generate the first population at random, is there not a chance that you could generate a population in which no program solves the problem?

It's impossible.

Even if the NN gives a very bad prediction, it is still a solution.

More so than, say, the average desktop could handle?

You can use a desktop computer. Today's computers are way better than the ones I started with.

 

mrwobbles,

Would you be so kind as to tell us the inputs and outputs for your NN results? I'd like to run it through NeuroShell and see if I can generate an R correlation around the same tightness. Thanks in advance.

 
Kazam:
I'll explain it to you using an example.

Let's say that you want to predict a variable that can have values from 100 to 250 (like GBPJPY). You want to predict very small time steps compared to the training data (like H1 or H4). For some step the desired value is, let's say, 174.850 and the NN output is 176.350. The error is very small (about 0.8%) but in Forex terms the error is big - 150 pips.

It's much easier to predict the normal or logarithmic rate of return. Even if you make an error, the output will probably still be useful (if you predict a 20% increase in price and it really is 10%, then even though the error is 50%, the result is still very OK).

Yeah, but I'm talking about errors an order of magnitude smaller than that. The average error in pips of the network I've trained is about 10-20, which is approaching an acceptable level. Still, there are some anomalous results, a few of over 100 pips, which is obviously unacceptable.

It's impossible.

Even if the NN gives a very bad prediction, it is still a solution.

Yeah, I guess you could call them solutions, but would starting with a random population not lead to a longer convergence time? In that case starting with a population of pre-trained networks would surely speed up convergence and hopefully result in more accurate results.

 
Yeah, but I'm talking about errors an order of magnitude smaller than that. The average error in pips of the network I've trained is about 10-20, which is approaching an acceptable level. Still, there are some anomalous results, a few of over 100 pips, which is obviously unacceptable.

But you are still looking at the training data. I made a quick example - look at the picture below. The error measures are small and the correlation coefficient is very high, but look what happens when you try to predict the next 10 steps.

Yeah, I guess you could call them solutions, but would starting with a random population not lead to a longer convergence time? In that case starting with a population of pre-trained networks would surely speed up convergence and hopefully result in more accurate results.

Randomness is the key

With many random networks you have a bigger chance of finding the best possible solution in the end. Look at the second picture. If you make a pre-selection of NNs you might get stuck at a local optimum, but if you use random NNs you have a higher chance of finding the global optimum.

Of course there are ways to overcome the problem of getting stuck in a local optimum.
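Kazam's local-vs-global point can be shown with a toy hill climber on a made-up fitness function with two peaks. This is only an illustration, not his actual GA: a climber started near the smaller peak gets stuck there, while a random "population" of starts almost always finds the global one.

```python
import random

# Made-up fitness: local peak of height 1 at x = -3,
# global peak of height 2 at x = 4.
def fitness(x):
    return max(1 - abs(x + 3), 2 - abs(x - 4))

def hill_climb(x, step=0.05, iters=2000):
    """Greedy local search: keep moving while a neighbour is better."""
    for _ in range(iters):
        best = max((x - step, x, x + step), key=fitness)
        if best == x:
            break
        x = best
    return x

random.seed(1)
stuck = hill_climb(-5.0)  # starts in the local basin, ends near x = -3
starts = [random.uniform(-10, 10) for _ in range(20)]  # random population
winner = max((hill_climb(s) for s in starts), key=fitness)  # ends near x = 4
```

A single pre-selected starting point can converge to the wrong peak; 20 random starts effectively guarantee at least one lands in the global basin.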

Files:
 

That thought had crossed my mind. I mean, if the population of pre-trained NNs is too small, or doesn't have enough genetic variance, then some solutions might not be considered. Like you said, there's always the chance of choosing 12 NNs that are all stuck at different local minima, and that wouldn't be good. Or worse still, 12 NNs that are all stuck at the same local minimum. Although you could always encode some random gene mutations to try and increase the genetic stock, every 10 generations say. Though starting with a completely random structure would ensure that most possibilities are considered.
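The "random gene mutations every 10 generations" idea could look something like this sketch (the mutation rate, scale and schedule are all made-up values, not anything from the thread):

```python
import random

def mutate(weights, rate=0.1, scale=0.5):
    """Perturb a random fraction of the weights to keep genetic variance up."""
    return [w + random.gauss(0, scale) if random.random() < rate else w
            for w in weights]

random.seed(42)
genome = [0.0] * 1000           # stand-in for one network's weight vector
generation = 20
if generation % 10 == 0:        # e.g. inject mutations every 10 generations
    genome = mutate(genome)
changed = sum(1 for w in genome if w != 0.0)  # roughly rate * len(genome)
```

Even a small mutation rate keeps reintroducing variety, so a population of pre-trained networks is less likely to sit forever at one shared local minimum.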

Ah I see, okay, I've just found out how to use the network after it's trained - the sim function, apparently... oh, the joys of help files. You'll have to forgive my ignorance, I'm fairly new to working with NNs. I would've been disappointed if I'd solved it that easily. This is supposed to be what I'm doing over the summer holiday - well, that and working.

Edit: I trained the network on the first 7 years of data and then simulated its performance on the last few years of inputs. The mean error in pips was 40, but if you look at the plot it gets the first 1500 steps pretty spot on, then it loses it and starts to get the price wrong, though the direction is for the most part right. Then it hits about 10000 and picks it up again (just in time for the big crash). Have a look at this and tell me what you think. Dark blue is output, light blue is the target. Btw, I didn't supply the network with targets - those were overlaid for analysis.

Files:
gbpjpy60-4.jpg  40 kb
gbpjpy60-8.jpg  55 kb
 

I can't tell anything looking at the pictures, because there might be a "shadow effect" and the pictures are too small to tell.

But I can tell you how to check if the NN is OK. Export the testing output to an XLS or CSV file (there is an export and an import wizard in Matlab). Then put the real values next to the NN output, and in the next column put a formula that checks if the NN predicted the correct direction of price movement.

By counting how many 1s you get, you'll know the accuracy of the network.

Then you can write a formula that calculates the profit and loss for every step. Look at the picture below (I'm using the Polish version of Excel, so I don't know if I got the English formulas right). Skip the spread for now.
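The direction-check and profit/loss columns Kazam describes in Excel can be sketched in Python. The price and prediction numbers below are made up, and the spread is skipped as he suggests:

```python
# Hypothetical actual and NN-predicted closes for consecutive steps
actual    = [174.85, 175.10, 174.60, 174.95, 175.40, 175.20]
predicted = [174.90, 175.05, 174.80, 174.40, 175.55, 175.55]

hits, pnl_pips = [], []
for t in range(1, len(actual)):
    pred_move = predicted[t] - actual[t - 1]  # move the NN predicts
    real_move = actual[t] - actual[t - 1]     # move that actually happened
    hit = pred_move * real_move > 0           # the 1/0 column from the sheet
    hits.append(1 if hit else 0)
    # Trading the predicted direction wins the real move, otherwise loses it
    pnl_pips.append((abs(real_move) if hit else -abs(real_move)) * 100)

accuracy = sum(hits) / len(hits)  # share of correctly predicted directions
```

Summing the 1/0 column gives the direction accuracy, and summing the P&L column gives the (spread-free) result of trading every signal.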

Files:
 

Hi Kazam,

Is it possible to implement this NN package in Metatrader?

Files:
example.zip  106 kb
 

Looks like it might be okay - I ran the formula through the Open Office spreadsheet and it returned 73% accuracy on trade direction. I've still got a few more inputs to give the network that I think will improve accuracy. Here's the spreadsheet; I saved it in xls format, so you should be able to read it.

Files:
gj60.rar  831 kb
 

In sample, out of sample

mrwobbles:
Looks like it might be okay - I ran the formula through the Open Office spreadsheet and it returned 73% accuracy on trade direction. I've still got a few more inputs to give the network that I think will improve accuracy. Here's the spreadsheet; I saved it in xls format, so you should be able to read it.

Good... In sample or out of sample?

It will make a big difference in your account: 73% in sample is an account killer; 73% out of sample is a maybe...

Try to believe only the out-of-sample results, and remember the fewer inputs you have, the less overfitting you will get... so if you add new inputs, think about deleting some of the old ones, or expand the out-of-sample dataset on which you will forge your beliefs.

As a rule of thumb: fewer inputs, fewer connections, more out-of-sample points... better generalization.
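Simba's in-sample / out-of-sample distinction boils down to a strict chronological split. A minimal sketch (the series and the 70/30 ratio are just examples):

```python
# Stand-in for a chronological price series (e.g. ~10 years of hourly closes)
prices = [100 + 0.1 * i for i in range(1000)]

split = int(len(prices) * 0.7)   # example ratio, e.g. the first ~7 years
in_sample  = prices[:split]      # train, test and cross-validate only here
out_sample = prices[split:]      # touch once, at the end, with no learning

# Judge the network only on how it does on `out_sample`:
# 73% direction accuracy here means far more than 73% on `in_sample`.
```

The key property is that nothing from `out_sample` leaks into training, so the final check mimics a live test the way mrwobbles described.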

Regards

Simba

 

biddick

It's a DLL so have a look here:

http://www.metatrader.info/node/150

There's an example of how to use DLL functions in Metatrader.

mrwobbles

It's either the training data or you got something wrong - the results are too good.

If one could get an accuracy of 73% with a simple back propagation network, no one would give a shit about more complicated stuff.

SIMBA

You're right. Choosing the proper inputs is the most important thing in the process of creating a NN (there's a rule: "trash goes in, trash comes out").

But you can always use data mining tools to analyze many different variables and choose those that affect the one you want to predict.

There's a nice book about data mining (and about genetic algorithms, Bayesian classification, etc.) - "Data Mining Methods and Models" by Daniel T. Larose. It also shows how to use WEKA (a free, open-source piece of machine learning software).

My preferred way is to use a GA - for the purpose of time series prediction I usually allow it to choose from the 15-40 previous steps.
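Feeding a NN a chosen subset of previous steps amounts to building lagged inputs. A hedged sketch (the helper name and the example lags are made up; in Kazam's setup a GA would pick which of the last 15-40 steps to keep):

```python
def make_lagged(series, lags):
    """Build (inputs, target) pairs from the chosen look-back offsets."""
    max_lag = max(lags)
    X, y = [], []
    for t in range(max_lag, len(series)):
        X.append([series[t - k] for k in sorted(lags)])  # selected past steps
        y.append(series[t])                              # value to predict
    return X, y

# Toy series; suppose the GA kept lags 1, 3 and 5 out of 40 candidates
series = [float(i) for i in range(10)]
X, y = make_lagged(series, {1, 3, 5})
```

Each row of `X` is one training input for the network, and the GA's job is just to search over which set of lags produces the best out-of-sample fit.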

PS

I've mentioned using Bayesian probability for classification tasks, but it can also be used for time series prediction:

http://www.cis.hut.fi/juha/papers/ESTSPfinal.pdf