Discussion of article "Deep Neural Networks (Part VI). Ensemble of neural network classifiers: bagging"

 

New article "Deep Neural Networks (Part VI). Ensemble of neural network classifiers: bagging" has been published:

The article discusses methods for building and training ensembles of neural networks with a bagging structure. It also identifies the specifics of hyperparameter optimization for the individual neural network classifiers that make up the ensemble. The quality of the optimized neural network obtained in the previous article of the series is compared with the quality of the created ensemble of neural networks. Possibilities for further improving the classification quality of the ensemble are considered.

Even though the hyperparameters of the individual classifiers in the ensemble were chosen intuitively and are clearly not optimal, a high and stable classification quality was obtained, both with averaging and with simple majority voting.
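As a minimal illustration of the two combiners (a sketch only: here preds is assumed to be a matrix of the individual networks' class-1 probabilities on the test set, rows = samples, columns = networks):

  pred_avg  <- as.integer(rowMeans(preds) > 0.5)        # averaging combiner: average the probabilities, then threshold
  pred_vote <- as.integer(rowMeans(preds > 0.5) > 0.5)  # majority voting: threshold each network, then take the majority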

To summarize the above: schematically, the whole process of creating and testing an ensemble of neural networks can be divided into four stages:

Fig.3. Structure of training and testing the ensemble of neural networks with the averaging/voting combiner

Author: Vladimir Perervenko

 

Thanks, interesting. elmNN seems to be a worthy replacement for nnet, especially in an ensemble. I also learned about rBayesianOptimization; I will try to use it in the future.
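For others who, like me, are new to it, here is a minimal sketch of how rBayesianOptimization is typically called (the objective below is a made-up placeholder, not the code from the article; in practice it would return a cross-validated quality metric for the given nh):

  library(rBayesianOptimization)
  # the objective must return list(Score = ..., Pred = ...)
  obj <- function(nh) {
    score <- -abs(nh - 55) / 55          # placeholder instead of a real CV metric
    list(Score = score, Pred = 0)
  }
  res <- BayesianOptimization(obj,
                              bounds = list(nh = c(10L, 100L)),
                              init_points = 5, n_iter = 10,
                              acq = "ucb", verbose = FALSE)
  res$Best_Par                           # the best number of hidden neurons found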

 

Interesting article!
It was unexpected to see 500 networks trained at once. I didn't think it could be done so fast. With a DNN it would take hours...

1) Instead of the additional random number generator setRNG(rng[[k]]), you could use the built-in holdout(Ytrain, ratio = r/10, mode = "random", seed = i) # where i is the loop iteration number.
This way we would also get a newly shuffled set of input data at each iteration, and it would be reproducible across restarts.

2) In general, holdout is a great function: it shuffles the data very well, much better than my self-written one in MT5 that replaces each row with another random row.
It is precisely by changing the seed that you can stumble upon a very good training result. I tried changing it manually with darch: the error varied from 50% to 30% and the number of trades from a handful to hundreds, just from changing the shuffling key. Checking automatically in a loop is probably more reliable.

3) elmNN resembles a regular NN with a single training epoch. Taking 500 of them (and choosing the best one), we get an analogue of one NN trained for 500 epochs (from which we also choose the best epoch). However, this is just an association, not a claim.

Still, I think averaging several of the best networks is better than the single best result after 500 epochs. I would like to build an ensemble of DNNs, but I'm afraid training will take a very long time. I will experiment)
Thanks for the article!

 
elibrarius:

Interesting article!
It was unexpected to see 500 networks trained at once. I didn't think it could be done so fast. With a DNN it would take hours...

1) Instead of the additional random number generator setRNG(rng[[k]]), you could use the built-in holdout(Ytrain, ratio = r/10, mode = "random", seed = i) # where i is the loop iteration number.
This way we would also get a newly shuffled set of input data at each iteration, and it would be reproducible across restarts.

2) In general, holdout is a great function: it shuffles the data very well, much better than my self-written one in MT5 that replaces each row with another random row.
It is precisely by changing the seed that you can stumble upon a very good training result. I tried changing it manually with darch: the error varied from 50% to 30% and the number of trades from a handful to hundreds, just from changing the shuffling key. Checking automatically in a loop is probably more reliable.

3) elmNN resembles a regular NN with a single training epoch. Taking 500 of them (and choosing the best one), we get an analogue of one NN trained for 500 epochs (from which we also choose the best epoch). However, this is just an association, not a claim.

Still, I think averaging several of the best networks is better than the single best result after 500 epochs. I would like to build an ensemble of DNNs, but I'm afraid training will take a very long time. I will experiment)
Thanks for the article!

1. You can't. The main task of the RNG here is to ensure that the weights of the neural networks in the ensemble are initialised with the same random values on every run. To optimise the hyperparameters, an ensemble of constant quality is needed.

3. It is a single-layer NN, but without backpropagation training. Read the description at the links; there is a whole zoo of such networks there, and, as the developers claim, they work quite successfully.
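Very roughly, the principle can be sketched like this (just an illustration of the idea, not the elmNN source code):

  # ELM in a nutshell: a random, untrained hidden layer; only the output weights are solved analytically
  elm_fit <- function(X, y, nhid = 20) {
    W    <- matrix(runif(ncol(X) * nhid, -1, 1), ncol(X), nhid)  # random input weights, never trained
    b    <- runif(nhid, -1, 1)                                   # random biases
    H    <- sin(sweep(X %*% W, 2, b, "+"))                       # hidden-layer activations (actfun = "sin")
    beta <- MASS::ginv(H) %*% y                                  # least-squares output weights
    list(W = W, b = b, beta = beta)
  }
  elm_predict <- function(model, X)
    sin(sweep(X %*% model$W, 2, model$b, "+")) %*% model$beta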

The results of the ensemble with ELM were, frankly, a big surprise for me and confirmed the saying that not everything complicated is brilliant. In the next part we will try several methods to improve the quality of classification by averaging.

Good luck

 
Vladimir Perervenko:

1. You can't. The main task of the RNG here is to ensure that the weights of the neural networks in the ensemble are initialised with the same random values on every run. To optimise the hyperparameters, an ensemble of constant quality is needed.

3. It is a single-layer NN, but without backpropagation training. Read the description at the links; there is a whole zoo of such networks there, and, as the developers claim, they work quite successfully.

The results of the ensemble with ELM were, frankly, a big surprise for me and confirmed the saying that not everything complicated is brilliant. In the next part we will try several methods to improve the quality of classification by averaging.

Good luck

1) Got it. Besides initialising the shuffling, you also initialise the network weights.
And wouldn't just set.seed(i) give the same effect?

 
elibrarius:

1) Got it. Besides initialising the shuffling, you also initialise the network weights.
And wouldn't just set.seed(i) give the same effect?

No, it won't. It sets the RNG state only once, while we have 500 foreach iterations and need a different RNG state at each iteration. Look at the doRNG package description.
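For example, the pattern from the doRNG documentation looks roughly like this (a sketch only; n, Xtrain, Ytrain, r and nh are the same variables as in the article's loop):

  library(doParallel)
  library(doRNG)
  cl <- makeCluster(2)
  registerDoParallel(cl)
  # %dorng% derives an independent, reproducible RNG stream for every iteration from one seed
  Ens <- foreach(i = 1:n, .packages = "elmNN", .options.RNG = 42) %dorng% {
    idx <- rminer::holdout(Ytrain, ratio = r/10, mode = "random")$tr
    elmtrain(x = Xtrain[idx, ], y = Ytrain[idx], nhid = nh, actfun = "sin")
  }
  stopCluster(cl)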

Good luck

 
Vladimir Perervenko:

No, it won't. It sets the RNG state only once, while we have 500 foreach iterations and need a different RNG state at each iteration. Look at the doRNG package description.

Good luck

It will be inside the loop:

  Ens <- foreach(i = 1:n, .packages = "elmNN") %do% {
    set.seed(i);  # a new, reproducible RNG state for each ensemble member
    idx <- rminer::holdout(Ytrain, ratio = r/10, mode = "random")$tr  # random training subset for this member
    elmtrain(x = Xtrain[idx, ], y = Ytrain[idx], 
             nhid = nh, actfun = "sin")
  }
i.e. it will execute set.seed(1), then set.seed(2), set.seed(3), ..., set.seed(500).
 
elibrarius:

It will be inside the loop:

Try it. It might work.

 
Vladimir Perervenko:

Try it. It might work.

It should.

And I think multithreading does not have to be switched off in this case.

 
elibrarius:

It should.

And I think multithreading does not have to be switched off in this case.

Just test it practically. If you get the same result or better, then you can do it this way. The R language allows you to perform the same action in different ways.

Good luck

 
Vladimir Perervenko:

Just test it practically. If you get the same result or better, then you can do it that way. The R language allows you to perform the same action in different ways.

Good luck

The version

  Ens <- foreach(i = 1:n, .packages = "elmNN") %do% {
    set.seed(i);
    idx <- rminer::holdout(Ytrain, ratio = r/10, mode = "random")$tr
    elmtrain(x = Xtrain[idx, ], y = Ytrain[idx], 
             nhid = nh, actfun = "sin")
  }

works. It gives the same network weights on every run. I checked on the second network: I printed env$Ens[2] and compared the output with a Notepad++ plugin.
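A quicker check than diffing text dumps, assuming the ensembles from two separate runs are saved, say, as Ens1 and Ens2:

  identical(Ens1[[2]], Ens2[[2]])   # TRUE if the second network is exactly the same in both runs
  identical(Ens1, Ens2)             # TRUE if the whole ensembles match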

It did not work with multithreading:

Error in setMKLthreads(2) : could not find function "setMKLthreads"

What is this function? It is not in the code of articles 4 and 6. How do I get it?

PS: It would be more convenient if you could post the R session with all functions and source data.