Machine learning in trading: theory, models, practice and algo-trading - page 376

 
elibrarius:

I found early-stopping training with a validation section in ALGLIB:

Neural network training using early stopping (base algorithm - L-BFGS with regularization).
...
The algorithm stops if the validation set error increases for long enough or the step size becomes small enough (there are tasks where the validation set error may keep decreasing forever). In any case, the solution returned corresponds to the minimum of the validation set error.
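
For anyone following along: the stopping rule described above boils down to something like the sketch below. This is a minimal Python/numpy illustration, not ALGLIB's actual code; the model object with get_weights/set_weights and the train_step/valid_error callbacks are hypothetical placeholders.

import numpy as np

def train_with_early_stopping(model, train_step, valid_error,
                              max_iters=10000, patience=50, min_step=1e-8):
    """Train while validation error keeps improving; return the weights
    that gave the minimum validation error, as the ALGLIB text describes."""
    best_err = np.inf
    best_weights = model.get_weights()
    stale = 0                                   # iterations without improvement
    for _ in range(max_iters):
        step_size = train_step(model)           # one optimizer step (e.g. L-BFGS)
        err = valid_error(model)                # error on the validation set
        if err < best_err:
            best_err, best_weights, stale = err, model.get_weights(), 0
        else:
            stale += 1
        # stop if validation error has not improved for long enough,
        # or the optimizer's step has become tiny
        if stale >= patience or step_size < min_step:
            break
    model.set_weights(best_weights)             # solution at minimum validation error
    return model, best_err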


Something about this seems wrong to me, because in real trading the bars will arrive in their own order, not mixed with bars from an hour or a day ago.
And if the "nature" of the market changes, we have to retrain or look for new NS models.


Do you have more than 500 connections in your net? They write that L-BFGS is less efficient than L-M (Levenberg-Marquardt) when there are few neurons.
 
Maxim Dmitrievsky:

Do you have more than 500 connections in your net? They write that L-BFGS is less efficient than L-M (Levenberg-Marquardt) when there are few neurons.
Fewer for now, to save time - this is the development stage; once everything is done, I'll put in the effort searching for predictors and a network scheme.
 
elibrarius:
Fewer for now, to save time - this is the development stage; once everything is done, I'll put in the effort searching for predictors and a network scheme.


Maybe you'll write an article when you've figured it all out? :) There are no good articles on the ALGLIB neural networks; there's one translated and hard to understand.

An article describing the NS (I couldn't even find proper documentation for ALGLIB) with an example of training/retraining and auto-optimization in an EA. It's just that I noticed there isn't enough material to study. People even get paid for this kind of thing) you wouldn't be wasting your time.

 
Maxim Dmitrievsky:


Maybe you'll write an article when you've figured it all out? :) There are no good articles on the ALGLIB neural networks; there's one translated and hard to understand.

An article describing the NS (I couldn't even find proper documentation for ALGLIB) with an example of training/retraining and auto-optimization in an EA. It's just that I noticed there isn't enough material to study. People even get paid for this kind of thing) you wouldn't be wasting your time.

It's unlikely - I can't find the time for an article at all... Besides, I've only just started figuring out the NS myself, so I can't say anything new or clever.

I took https://www.mql5.com/ru/articles/2279 as a basis. It took me about 8 hours to make it work. I don't think it would take most programmers any longer.

But then it's been a week of reworking, adding extra options, tests, etc.
Neural network: Self-Optimizing Expert Advisor
  • 2016.10.03
  • Jose Miguel Soriano
  • www.mql5.com
Is it possible to create an Expert Advisor that, as instructed by its code, would automatically optimize the criteria for opening and closing positions at a certain periodicity? What happens if we implement in the EA a neural network (multilayer perceptron) that, as a module, analyzes the history and evaluates the strategy? The code can be told to optimize the neural network monthly (weekly, daily or hourly), after which it continues working. This way it is possible to create a self-optimizing Expert Advisor.
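
The whole idea of the article - retrain the net on a schedule and keep trading with it - comes down to a loop like this (a rough Python sketch only; sklearn's MLPRegressor stands in for the MQL5/ALGLIB perceptron, and get_history/trade_with are hypothetical placeholders):

import numpy as np
from sklearn.neural_network import MLPRegressor

def run_self_optimizing(get_history, trade_with, n_bars, retrain_every=720):
    """Refit the net periodically on the history so far, trade with it in between."""
    model = None
    for t in range(n_bars):
        if model is None or t % retrain_every == 0:   # e.g. roughly monthly on H1 bars
            X, y = get_history(t)                     # features/targets up to bar t
            model = MLPRegressor(hidden_layer_sizes=(10,),
                                 max_iter=500).fit(X, y)
        trade_with(model, t)                          # use the current model on the new bar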
 
elibrarius:
It's unlikely - I can't find the time for an article at all... Besides, I've only just started figuring out the NS myself, so I can't say anything new or clever.

I took https://www.mql5.com/ru/articles/2279 as a basis. It took me about 8 hours to make it work. I don't think it would take most programmers any longer.

But then it's been a week of reworking, adding extra options, tests, etc.


I'm still looking towards a Bayesian classifier + genetics - the results are not bad. With the nets it's somehow murky in my head, too many nuances.

Yes, I meant the same article; it didn't seem very interesting to me, though I'm more of a trader than a programmer )

 
I still don't understand the situation with shuffling the data:

Early-stopping training on unshuffled data:

Average error on the training (80%) section = 0.535  nLearns=200 NGrad=142782 NHess=0 NCholesky=0 codResp=6
Average error on the validation (20%) section = 0.298  nLearns=200 NGrad=142782 NHess=0 NCholesky=0 codResp=6
Full set (training + validation sections):
Average training error = 0.497  nLearns=200 NGrad=142782 NHess=0 NCholesky=0 codResp=6
Average error on the test (20%) section = 0.132  nLearns=200 NGrad=142782 NHess=0 NCholesky=0 codResp=6

It feels as if there was a fit to the validation section. The test one looks good, but it didn't take part in the training and wasn't compared against, so it's probably just a coincidence.
The same is done in the ensembles code, where the split is 2/3 and everything is shuffled between both sections - I'll try to do the same...
Shuffled it:

Average error on the training (60%) section = 0.477  nLearns=10 NGrad=10814 NHess=0 NCholesky=0 codResp=6
Average error on the validation (40%) section = 0.472  nLearns=10 NGrad=10814 NHess=0 NCholesky=0 codResp=6
Full set (training + validation sections):
Average training error = 0.475  nLearns=10 NGrad=10814 NHess=0 NCholesky=0 codResp=6
Average error on the test (20%) section = 0.279  nLearns=10 NGrad=10814 NHess=0 NCholesky=0 codResp=6

With shuffling, the error evened out between the training and validation sections.
But it got worse on the test one.

It seems wrong to me to shuffle the data and then divide it into training and validation sections, because in real trading the bars will arrive in their own order, not mixed with bars from an hour, a day or a week ago. The same goes for the cross-validation algorithms, where the validation section is first at the beginning, then in the middle, then at the end.
And if the "nature" of the market changes, we have to retrain or look for new NS models.

And if you don't shuffle and validate on the last section, how do you avoid fitting to that section?
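
For the record, both experiments above can be reproduced schematically like this (a sketch, assuming X and y are numpy arrays of feature rows and class labels in chronological order; sklearn's MLPClassifier stands in for the ALGLIB net):

import numpy as np
from sklearn.neural_network import MLPClassifier

def run_experiment(X, y, shuffle_train_valid, seed=0):
    """Train/valid come from the first 80% of bars (optionally shuffled together);
    the test section is always the untouched chronological tail."""
    n = len(X)
    i_tv = np.arange(int(0.8 * n))                 # candidate train+valid rows
    i_test = np.arange(int(0.8 * n), n)            # most recent 20% of bars
    if shuffle_train_valid:
        i_tv = np.random.RandomState(seed).permutation(i_tv)
    cut = int(0.75 * len(i_tv))                    # 60%/20% of the whole set
    i_train, i_valid = i_tv[:cut], i_tv[cut:]
    m = MLPClassifier(hidden_layer_sizes=(10,), max_iter=1000,
                      random_state=seed).fit(X[i_train], y[i_train])
    return {name: 1 - m.score(X[i], y[i])          # classification error per section
            for name, i in (("train", i_train), ("valid", i_valid), ("test", i_test))}

With shuffling, train and valid rows come from the same mixed pool, so their errors naturally converge; without shuffling, valid is a different stretch of the market and the gap reappears.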
 
elibrarius:
So we end up with 4 sections? Training/validation/test1/test2?

How many training/validation cycles should be run? I haven't seen information about that anywhere... Just one cycle in total? - and right after it we either approve the result or change something in the predictor set or the network scheme? More precisely, out of N training cycles we are shown the single best one.


Test section 2 is the verdict: if it doesn't match - we start all over again, preferably from the predictor set.


PS.

By the way, there is also the tester - the final verdict on the trading system.
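
The four-section scheme being discussed (train / valid / test1 / test2, with the tester as the last word) is just a chronological partition of the history, roughly like this (a sketch; the proportions are arbitrary and X, y are assumed to be numpy arrays in time order):

import numpy as np

def four_sections(X, y, fractions=(0.5, 0.2, 0.15, 0.15)):
    """Cut the history, preserving time order, into train/valid/test1/test2."""
    assert abs(sum(fractions) - 1.0) < 1e-9
    n = len(X)
    bounds = np.cumsum([int(f * n) for f in fractions])[:-1]
    parts = np.split(np.arange(n), bounds)         # four blocks of consecutive bars
    return {name: (X[i], y[i])
            for name, i in zip(("train", "valid", "test1", "test2"), parts)}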

 
elibrarius:
I still don't understand the situation with shuffling the data:

Early-stopping training on unshuffled data: average error 0.535 on the training (80%) section, 0.298 on validation (20%), 0.132 on the test (20%) section. Shuffled (60/40 split): 0.477 / 0.472 / 0.279.

With shuffling, the error evened out between the training and validation sections. But it got worse on the test one.

It seems wrong to me to shuffle the data and then divide it into training and validation sections, because in real trading the bars will arrive in their own order, not mixed with bars from an hour, a day or a week ago. The same goes for the cross-validation algorithms, where the validation section is first at the beginning, then in the middle, then at the end.
And if the "nature" of the market changes, we have to retrain or look for new NS models.

And if you don't shuffle and validate on the last section, how do you avoid fitting to that section?


1. My understanding is that you are not training anything at all - you are just getting a random result on predictors that have nothing to do with the target variable.


2. Shuffling.

I don't know NS.

But in many other ML algorithms, training is done on exactly one row at a time: ONE value of each predictor is taken and the target variable is matched to it. So shuffling changes nothing there. There are, however, ML algorithms that take neighbouring rows into account.
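
That point is easy to check with any full-batch, row-wise learner: permuting the training rows gives back the same model, since the fit only sees a set of (row, target) pairs. A quick sketch with sklearn's logistic regression on synthetic data:

import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.RandomState(0)
X = rng.randn(500, 5)
y = (X[:, 0] + 0.1 * rng.randn(500) > 0).astype(int)   # toy target

perm = rng.permutation(len(X))
m1 = LogisticRegression().fit(X, y)                    # original row order
m2 = LogisticRegression().fit(X[perm], y[perm])        # shuffled rows
# any difference is only floating-point / solver-tolerance noise
print(np.abs(m1.coef_ - m2.coef_).max())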

But in any case our points of view coincide, and I always do the initial testing on test2 without shuffling.


PS.

Once again.

If the error on two different samples differs as much as yours does, the system is hopeless - fit only to be thrown away.

 

Wandering around the bottomless cesspool called the Internet, I came across this paper.

Artificial Neural Networks architectures for stock price prediction: comparisons and applications


Files:
 
elibrarius:
I still don't understand the situation with shuffling the data:


It seems wrong to me to shuffle the data and then divide it into training and validation sections, because in real trading the bars will arrive in their own order, not mixed with bars from an hour, a day or a week ago. The same goes for the cross-validation algorithms, where the validation section is first at the beginning, then in the middle, then at the end.
And if the "nature" of the market changes, we have to retrain or look for new NS models.

And if you don't shuffle and validate on the last section, how do you avoid fitting to that section?

After splitting into train/test/valid, shuffle train. Do not shuffle the other sets.
This holds for classification with neural networks. Moreover, when training deep neural networks, shuffle every minibatch before feeding it to the network.
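
A sketch of that recipe (numpy only; X and y are assumed to be arrays in chronological order, split here as train/test/valid in the same order as above):

import numpy as np

def split_and_shuffle(X, y, f_train=0.6, f_test=0.2, seed=0):
    """Chronological train/test/valid split; ONLY the training rows get shuffled."""
    n = len(X)
    a, b = int(f_train * n), int((f_train + f_test) * n)
    i_train = np.random.RandomState(seed).permutation(a)    # shuffle train only
    return (X[i_train], y[i_train]), (X[a:b], y[a:b]), (X[b:], y[b:])

def minibatches(X, y, batch_size, rng):
    """Reshuffle every epoch, then yield mini-batches to feed the network."""
    order = rng.permutation(len(X))
    for start in range(0, len(X), batch_size):
        i = order[start:start + batch_size]
        yield X[i], y[i]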

Good luck
