Discussion of article "Deep Neural Networks (Part V). Bayesian optimization of DNN hyperparameters" - page 3

 
Vladimir Perervenko:

For Bayesian optimisation, you need to play not only with the number of passes but also with the number of points. You have to look for a faster option. This is very tedious.

To speed it up, add this to the parameters when calling BayesianOptimization:

maxit = 1  # 1 instead of 100 - the number of repetitions of GP_fit for predicting the hyperplane
I didn't notice any improvement with 100 repetitions compared to 1, so now I use 1.
I.e.

BayesianOptimization(everything as in your call, maxit = 1)

maxit = 1 will be passed via ... to GPfit::GP_fit, and the optimisation will run 1 time instead of 100.
You can also pass:
control = c(20*d, 10*d, 2*d)  # default is control = c(200*d, 80*d, 2*d) - from 200*d points select the 80*d best and build 2*d clusters, where d is the number of parameters being optimised.

These parameters are described here: https://github.com/cran/GPfit/blob/master/R/GP_fit.R
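
For reference, a minimal sketch of such a call, based on the behaviour described above (fitnes and bounds are placeholders for your fitness function and its list of parameter bounds; d is the number of parameters being optimised):

library(rBayesianOptimization)

d <- 4  # placeholder: number of hyperparameters being optimised

OPT_Res <- BayesianOptimization(
  FUN = fitnes,                   # placeholder fitness function
  bounds = bounds,                # placeholder named list of bounds
  init_grid_dt = NULL, init_points = 10, n_iter = 10,
  acq = "ucb", kappa = 2.576, eps = 0.0, verbose = TRUE,
  # forwarded via ... to GPfit::GP_fit, as described above:
  maxit = 1,                      # 1 optimisation run instead of 100
  control = c(20*d, 10*d, 2*d))   # default is c(200*d, 80*d, 2*d)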

Vladimir Perervenko:

PS. Why don't you switch to TensorFlow? It is simply a level above.

With Darch it seems I get 30% in training and 36% on the test. I'll finish the EA, put it to work, and then maybe I'll get to it.
Although Darch is poorly supported, they have corrected and improved some things; in January it was moved from CRAN to the archive for unfixed bugs (there was one in error evaluation when training with validation). In May they released version 13 but then rolled back to version 12. Now version 13 has appeared again - apparently it has been finished.
 
elibrarius:

To speed things up, add this to the parameters when calling BayesianOptimization:

maxit = 1  # 1 instead of 100 - the number of repetitions of GP_fit for predicting the hyperplane
I didn't notice any improvement with 100 repetitions compared to 1, so now I use 1.
I.e.

maxit = 1 will be passed via ... to GPfit::GP_fit, and the optimisation will run 1 time instead of 100.
You can also pass:
control = c(20*d, 10*d, 2*d)  # default is control = c(200*d, 80*d, 2*d) - from 200*d points select the 80*d best and build 2*d clusters, where d is the number of parameters being optimised.

With Darch it seems I get 30% in training and 36% on the test. I'll finish the EA, put it to work, and then maybe I'll get to it.
Although Darch is poorly supported, they have corrected and improved some things; in January it was moved from CRAN to the archive for unfixed bugs (there was one in error evaluation when training with validation). In May they released version 13 but then rolled back to version 12. Now version 13 has appeared again - apparently it has been finished.

Thanks for the information. I will try with your parameters.

I haven't visited their GitHub for a long time. I'll have to write a proposal. The darch package provides for GPU usage, but the package it relies on for this was removed from CRAN (for R 3.4.4). It would be interesting to see how the GPU would affect speed and quality.

Good luck

 

There's another bottleneck here

https://github.com/yanyachen/rBayesianOptimization/blob/master/R/Utility_Max.R

I also set maxit = 1 instead of 100.

It cannot be passed through ...; you can simply load your corrected Utility_Max function into R and use that version.
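
One way to do that, as a sketch (it assumes you saved an edited copy of Utility_Max.R from the link above with maxit = 1 in its optim() call; assignInNamespace() and asNamespace() are base R):

source("Utility_Max.R")  # defines the corrected Utility_Max in the global environment
# Rebind it to the package namespace so its internal helpers still resolve,
# then overwrite the package's own copy:
environment(Utility_Max) <- asNamespace("rBayesianOptimization")
assignInNamespace("Utility_Max", Utility_Max, ns = "rBayesianOptimization")
# Subsequent BayesianOptimization() calls will pick up the corrected version.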

 
elibrarius:

There's another bottleneck here

https://github.com/yanyachen/rBayesianOptimization/blob/master/R/Utility_Max.R

I also set maxit = 1 instead of 100.

It cannot be passed through ...; you can simply load your corrected Utility_Max function into R and use that version.

I checked it on the optimisation of the neural network ensemble from the Part VI article. Neither maxit nor control has a visible effect on the computation time. The biggest influence is the number of neurons in the hidden layer. I left it like this:

# fitnes and bonds are the fitness function and the list of hyperparameter
# bounds defined in the Part VI article; maxit and control are forwarded
# through ... to GPfit::GP_fit.
OPT_Res <- BayesianOptimization(fitnes, bounds = bonds,
                                init_grid_dt = NULL, init_points = 20,
                                n_iter = 20, acq = "ucb", kappa = 2.576,
                                eps = 0.0, verbose = TRUE,
                                maxit = 100, control = c(100, 50, 8))
elapsed = 14.42 Round = 1       numFeature = 9.0000     r = 7.0000      nh = 36.0000    fact = 9.0000   Value = 0.7530 
elapsed = 42.94 Round = 2       numFeature = 4.0000     r = 8.0000      nh = 46.0000    fact = 6.0000   Value = 0.7450 
elapsed = 9.50  Round = 3       numFeature = 11.0000    r = 5.0000      nh = 19.0000    fact = 5.0000   Value = 0.7580 
elapsed = 14.17 Round = 4       numFeature = 10.0000    r = 4.0000      nh = 35.0000    fact = 4.0000   Value = 0.7480 
elapsed = 12.36 Round = 5       numFeature = 8.0000     r = 4.0000      nh = 23.0000    fact = 6.0000   Value = 0.7450 
elapsed = 25.61 Round = 6       numFeature = 12.0000    r = 8.0000      nh = 44.0000    fact = 7.0000   Value = 0.7490 
elapsed = 8.03  Round = 7       numFeature = 12.0000    r = 9.0000      nh = 9.0000     fact = 2.0000   Value = 0.7470 
elapsed = 14.24 Round = 8       numFeature = 8.0000     r = 4.0000      nh = 45.0000    fact = 2.0000   Value = 0.7620 
elapsed = 9.05  Round = 9       numFeature = 7.0000     r = 8.0000      nh = 20.0000    fact = 10.0000  Value = 0.7390 
elapsed = 17.53 Round = 10      numFeature = 12.0000    r = 9.0000      nh = 20.0000    fact = 6.0000   Value = 0.7410 
elapsed = 4.77  Round = 11      numFeature = 9.0000     r = 2.0000      nh = 7.0000     fact = 2.0000   Value = 0.7570 
elapsed = 8.87  Round = 12      numFeature = 6.0000     r = 1.0000      nh = 40.0000    fact = 8.0000   Value = 0.7730 
elapsed = 14.16 Round = 13      numFeature = 8.0000     r = 6.0000      nh = 41.0000    fact = 10.0000  Value = 0.7390 
elapsed = 21.61 Round = 14      numFeature = 9.0000     r = 6.0000      nh = 47.0000    fact = 7.0000   Value = 0.7620 
elapsed = 5.14  Round = 15      numFeature = 13.0000    r = 3.0000      nh = 3.0000     fact = 5.0000   Value = 0.7260 
elapsed = 5.66  Round = 16      numFeature = 6.0000     r = 9.0000      nh = 1.0000     fact = 9.0000   Value = 0.7090 
elapsed = 7.26  Round = 17      numFeature = 9.0000     r = 2.0000      nh = 25.0000    fact = 1.0000   Value = 0.7550 
elapsed = 32.09 Round = 18      numFeature = 11.0000    r = 7.0000      nh = 38.0000    fact = 6.0000   Value = 0.7600 
elapsed = 17.18 Round = 19      numFeature = 5.0000     r = 3.0000      nh = 46.0000    fact = 6.0000   Value = 0.7500 
elapsed = 11.08 Round = 20      numFeature = 6.0000     r = 4.0000      nh = 20.0000    fact = 6.0000   Value = 0.7590 
elapsed = 4.47  Round = 21      numFeature = 6.0000     r = 2.0000      nh = 4.0000     fact = 2.0000   Value = 0.7390 
elapsed = 5.27  Round = 22      numFeature = 6.0000     r = 2.0000      nh = 21.0000    fact = 10.0000  Value = 0.7520 
elapsed = 7.96  Round = 23      numFeature = 7.0000     r = 1.0000      nh = 41.0000    fact = 7.0000   Value = 0.7730 
elapsed = 12.31 Round = 24      numFeature = 7.0000     r = 3.0000      nh = 41.0000    fact = 3.0000   Value = 0.7730 
elapsed = 7.64  Round = 25      numFeature = 8.0000     r = 4.0000      nh = 16.0000    fact = 7.0000   Value = 0.7420 
elapsed = 6.24  Round = 26      numFeature = 13.0000    r = 5.0000      nh = 6.0000     fact = 1.0000   Value = 0.7600 
elapsed = 8.41  Round = 27      numFeature = 11.0000    r = 8.0000      nh = 8.0000     fact = 7.0000   Value = 0.7420 
elapsed = 8.48  Round = 28      numFeature = 6.0000     r = 7.0000      nh = 15.0000    fact = 2.0000   Value = 0.7580 
elapsed = 10.11 Round = 29      numFeature = 12.0000    r = 6.0000      nh = 17.0000    fact = 4.0000   Value = 0.7310 
elapsed = 6.03  Round = 30      numFeature = 8.0000     r = 3.0000      nh = 12.0000    fact = 1.0000   Value = 0.7540 
elapsed = 8.58  Round = 31      numFeature = 13.0000    r = 5.0000      nh = 18.0000    fact = 2.0000   Value = 0.7300 
elapsed = 6.78  Round = 32      numFeature = 13.0000    r = 2.0000      nh = 15.0000    fact = 8.0000   Value = 0.7320 
elapsed = 9.54  Round = 33      numFeature = 10.0000    r = 3.0000      nh = 37.0000    fact = 9.0000   Value = 0.7420 
elapsed = 8.19  Round = 34      numFeature = 6.0000     r = 1.0000      nh = 42.0000    fact = 3.0000   Value = 0.7630 
elapsed = 12.34 Round = 35      numFeature = 7.0000     r = 2.0000      nh = 43.0000    fact = 8.0000   Value = 0.7570 
elapsed = 20.47 Round = 36      numFeature = 7.0000     r = 8.0000      nh = 39.0000    fact = 2.0000   Value = 0.7670 
elapsed = 11.51 Round = 37      numFeature = 5.0000     r = 9.0000      nh = 18.0000    fact = 3.0000   Value = 0.7540 
elapsed = 32.71 Round = 38      numFeature = 7.0000     r = 7.0000      nh = 40.0000    fact = 6.0000   Value = 0.7540 
elapsed = 28.33 Round = 39      numFeature = 7.0000     r = 9.0000      nh = 38.0000    fact = 5.0000   Value = 0.7550 
elapsed = 22.87 Round = 40      numFeature = 12.0000    r = 6.0000      nh = 48.0000    fact = 3.0000   Value = 0.7580 

 Best Parameters Found: 
Round = 12      numFeature = 6.0000     r = 1.0000      nh = 40.0000    fact = 8.0000   Value = 0.7730

Best 10

# dp is an alias for the dplyr namespace; %$% and %>% come from magrittr.
OPT_Res %$% History %>% dp$arrange(desc(Value)) %>% head(10) %>%
    dp$select(-Round) -> best.init
  best.init
   numFeature r nh fact Value
1           6 1 40    8 0.773
2           7 1 41    7 0.773
3           7 3 41    3 0.773
4           7 8 39    2 0.767
5           6 1 42    3 0.763
6           8 4 45    2 0.762
7           9 6 47    7 0.762
8          11 7 38    6 0.760
9          13 5  6    1 0.760
10          6 4 20    6 0.759

Value is the average F1 score. Not a bad performance.

To speed up the calculations, some of the package's functions need to be rewritten. The first step is to replace the numerous nrow() and ncol() calls with dim()[1] and dim()[2], which execute tens of times faster. And since the work is all matrix operations, probably use the GPU (the gpuR package). I won't be able to do it myself; perhaps I can suggest it to the developer? A quick way to check the substitution is sketched below.
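
A minimal sketch for verifying the claim yourself: nrow(x) corresponds to dim(x)[1] and ncol(x) to dim(x)[2]; the microbenchmark package here is an assumption for the timing:

m <- matrix(rnorm(1e6), nrow = 1000)

# The substitution is exact - both forms return identical values:
identical(nrow(m), dim(m)[1])  # TRUE
identical(ncol(m), dim(m)[2])  # TRUE

# Compare the per-call overhead directly:
library(microbenchmark)
microbenchmark(nrow(m), dim(m)[1], ncol(m), dim(m)[2], times = 1000L)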

Good luck

 
Vladimir Perervenko:

I checked it on the optimisation of the neural network ensemble from the Part VI article. Neither maxit nor control has a visible effect on the computation time. The biggest influence is the number of neurons in the hidden layer. I left it like this

Best 10

Value is the average F1 score. Not a bad performance.

To speed up the calculations, some of the package's functions need to be rewritten. The first step is to replace the nrow() and ncol() calls with dim()[1] and dim()[2], which execute tens of times faster. And since the work is all matrix operations, probably use the GPU (the gpuR package). I won't be able to do it myself; perhaps I can suggest it to the developer?

Good luck

It's just that you are optimising only a few parameters. I optimised 20 of them, and once there were 20-40 known points, the GPfit calculation alone took tens of minutes; under those conditions you will see the acceleration.

And the number of neurons only affects the calculation time of the neural network itself.

 
elibrarius:

It's just that you are optimising only a few parameters. I optimised 20 of them, and once there were 20-40 known points, the GPfit calculation alone took tens of minutes; under those conditions you will see the acceleration.

And the number of neurons only affects the calculation time of the neural network itself.

I guess so.

 
How exactly do I use this? Should I organise my trading system as a neural network as well, or as a more complex automated EA?
[Deleted]  
MetaQuotes Software Corp.:

New article Deep Neural Networks (Part V). Bayesian optimization of DNN hyperparameters has been published:

Author: Vladimir Perervenko

Hi Vladimir, 
I am working on derivatives of the MACD for Android mobile and need help writing an accurate algorithm for the properties parameter fill-in form. Would you be able to explain how the level settings can be positioned, and may I continue communications?
Thanks,
Paul