Implementing Dropout in Python

To build models in Python, we have previously used the Keras library for TensorFlow. This library already provides a ready-made implementation of the Dropout layer.

tf.keras.layers.Dropout(
    rate, noise_shape=None, seed=None, **kwargs
)

The Dropout layer randomly sets input units to 0 with a frequency equal to rate at each step during training. This helps prevent the model from overfitting. Inputs that are not set to 0 are scaled up by 1/(1 - rate), so that the expected sum over all inputs remains unchanged.

Note that the Dropout layer applies masking only when its training argument is set to True; otherwise, no values are masked. When the model is being trained, the training flag is set to True automatically. In other cases, the user can explicitly pass training=True when calling the layer.
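
The behavior of the training flag is easy to check with a minimal standalone sketch. The rate, seed, and input values below are arbitrary choices for illustration:

import numpy as np
import tensorflow as tf

# Illustrative values only: rate=0.3, seed=42, a batch of ones
layer = tf.keras.layers.Dropout(rate=0.3, seed=42)
x = np.ones((1, 10), dtype=np.float32)

# training=True: about 30% of the units are zeroed,
# the survivors are scaled by 1 / (1 - 0.3) ≈ 1.4286
print(layer(x, training=True).numpy())

# training=False (the inference default): the input passes through unchanged
print(layer(x, training=False).numpy())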

This is different from setting trainable = False for the Dropout layer. Here, the trainable flag does not affect the layer's behavior at all, since Dropout has no weights that could be frozen during training.
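
This is easy to verify with a small sketch: a freshly created Dropout layer exposes an empty weight list, so the trainable flag has nothing to act on.

import tensorflow as tf

# Dropout holds no weights, so there is nothing to freeze
layer = tf.keras.layers.Dropout(0.3)
layer.trainable = False
print(layer.weights)   # [] regardless of the trainable flag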

The Dropout layer constructor has the following arguments:

  • rate — a floating-point number in the range from 0 to 1, representing the fraction of the input units to be masked during training.
  • noise_shape — a one-dimensional integer tensor representing the shape of the binary dropout mask that will be multiplied with the input. For example, if the input has shape (batch_size, timesteps, features) and you want the dropout mask to be the same for all timesteps, you can use noise_shape=(batch_size, 1, features), as shown in the sketch after this list.
  • seed — an integer to use as a random seed.
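
As a quick illustration of noise_shape (the batch_size, timesteps, and features values below are arbitrary), the following sketch drops entire feature columns for all timesteps at once:

import tensorflow as tf

# Arbitrary example dimensions
batch_size, timesteps, features = 2, 4, 3
x = tf.ones((batch_size, timesteps, features))

# The 1 on the time axis broadcasts one binary mask to every timestep,
# so each feature is either zeroed for the whole sequence
# or scaled by 1 / (1 - 0.5) = 2 for the whole sequence
layer = tf.keras.layers.Dropout(0.5,
                                noise_shape=(batch_size, 1, features),
                                seed=1)
print(layer(x, training=True))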

When calling a layer, two arguments are allowed:

  • inputs — the input data tensor; a tensor of any rank can be used.
  • training — a Boolean flag indicating the operating mode of the layer: masking is applied only in training mode.

To test the effectiveness of the Dropout technique, we will create a script and train several models that use this layer. We will not create overly complex models. Instead, let's take the batch_norm.py script, which was used when testing batch normalization, copy it to a file named dropout.py, and add Dropout layers to each model.

First, we add two Dropout layers to the model with one hidden layer and no batch normalization, inserting a new layer before each fully connected layer.

# Adding a Dropout to a model with one hidden layer
model1do = keras.Sequential([keras.layers.InputLayer(input_shape=inputs),
                           keras.layers.Dropout(0.3),
                           keras.layers.Dense(40, activation=tf.nn.swish), 
                           keras.layers.Dropout(0.3),
                           keras.layers.Dense(targets, activation=tf.nn.tanh) 
                         ])
model1do.compile(optimizer='Adam',
               loss='mean_squared_error',
               metrics=['accuracy'])
model1do.summary()

Please note that in all Dropout layers, we will be masking 30% of the neurons of the previous layer.

Then, in a similar manner, we add two Dropout layers to the model with one hidden layer and batch normalization of the input data. Note that we are cheating a little here: it is currently not recommended to use batch normalization and Dropout simultaneously in one model, as doing so tends to degrade the model's overall results. Let's test this statement with practical examples.

# Adding Dropout to the model with batch normalization of the initial data 
# and one hidden layer
model1bndo = keras.Sequential([keras.layers.InputLayer(input_shape=inputs),
                             keras.layers.BatchNormalization(),
                             keras.layers.Dropout(0.3),
                             keras.layers.Dense(40, activation=tf.nn.swish), 
                             keras.layers.Dropout(0.3),
                             keras.layers.Dense(targets, activation=tf.nn.tanh) 
                            ])
model1bndo.compile(optimizer='Adam',
               loss='mean_squared_error',
               metrics=['accuracy'])
model1bndo.summary()

Similarly, we add Dropout layers to the models with three hidden layers.

# Adding a Dropout to a model with three hidden layers
model2do = keras.Sequential([keras.layers.InputLayer(input_shape=inputs),
                           keras.layers.Dropout(0.3),
                           keras.layers.Dense(40, activation=tf.nn.swish), 
                           keras.layers.Dropout(0.3),
                           keras.layers.Dense(40, activation=tf.nn.swish), 
                           keras.layers.Dropout(0.3),
                           keras.layers.Dense(40, activation=tf.nn.swish), 
                           keras.layers.Dropout(0.3),
                           keras.layers.Dense(targets, activation=tf.nn.tanh) 
                         ])
model2do.compile(optimizer='Adam',
               loss='mean_squared_error',
               metrics=['accuracy'])
model2do.summary()

# Adding Dropout to the model with batch normalization of the initial data 
# and three hidden layers
model2bndo = keras.Sequential([keras.layers.InputLayer(input_shape=inputs),
                             keras.layers.BatchNormalization(),
                             keras.layers.Dropout(0.3),
                             keras.layers.Dense(40, activation=tf.nn.swish), 
                             keras.layers.BatchNormalization(),
                             keras.layers.Dropout(0.3),
                             keras.layers.Dense(40, activation=tf.nn.swish), 
                             keras.layers.BatchNormalization(),
                             keras.layers.Dropout(0.3),
                             keras.layers.Dense(40, activation=tf.nn.swish), 
                             keras.layers.Dropout(0.3),
                             keras.layers.Dense(targets, activation=tf.nn.tanh) 
                            ])
model2bndo.compile(optimizer='Adam',
               loss='mean_squared_error',
               metrics=['accuracy'])
model2bndo.summary()

After creating the models, we add code to launch the training process for the new models.

history1do = model1do.fit(train_data, train_target,
                      epochs=500, batch_size=1000,
                      callbacks=[callback],
                      verbose=2,
                      validation_split=0.1,
                      shuffle=True)
model1do.save(os.path.join(path,'perceptron1do.h5'))

history1bndo = model1bndo.fit(train_nn_data, train_nn_target,
                      epochs=500, batch_size=1000,
                      callbacks=[callback],
                      verbose=2,
                      validation_split=0.1,
                      shuffle=True)
model1bndo.save(os.path.join(path,'perceptron1bndo.h5'))

history2do = model2do.fit(train_data, train_target,
                      epochs=500, batch_size=1000,
                      callbacks=[callback],
                      verbose=2,
                      validation_split=0.1,
                      shuffle=True)
model2do.save(os.path.join(path,'perceptron2do.h5'))

history2bndo = model2bndo.fit(train_nn_data, train_nn_target,
                      epochs=500, batch_size=1000,
                      callbacks=[callback],
                      verbose=2,
                      validation_split=0.1,
                      shuffle=True)
model2bndo.save(os.path.join(path,'perceptron2bndo.h5'))

We also add code to evaluate the models on the test dataset.

test_loss1do, test_acc1do = model1do.evaluate(test_data, test_target,
                                                            verbose=2)
test_loss1bndo, test_acc1bndo = model1bndo.evaluate(test_nn_data, 
                                                    test_nn_target,
                                                    verbose=2)
test_loss2do, test_acc2do = model2do.evaluate(test_data, test_target, 
                                                            verbose=2)
test_loss2bndo, test_acc2bndo = model2bndo.evaluate(test_nn_data,
                                                    test_nn_target,
                                                    verbose=2)

In addition to the changes related to training and testing the models, we will also extend the block that visualizes model results. First, let's update the code that plots the dynamics of the mean squared error and Accuracy during training. The changes here are not global: we are just adding the new variables to the graphs.

# Rendering the results of training models with one hidden layer
plt.figure()
plt.plot(history1.history['loss'], label='Normalized inputs train')
plt.plot(history1.history['val_loss'], label='Normalized inputs validation')
plt.plot(history1do.history['loss'], label='Normalized inputs\nvs Dropout train')
plt.plot(history1do.history['val_loss'],
                                label='Normalized inputs\nvs Dropout validation')
plt.plot(history1bn.history['loss'],
                        label='Unnormalized inputs\nvs BatchNormalization train')
plt.plot(history1bn.history['val_loss'],
                   label='Unnormalized inputs\nvs BatchNormalization validation')
plt.plot(history1bndo.history['loss'],
            label='Unnormalized inputs\nvs BatchNormalization and Dropout train')
plt.plot(history1bndo.history['val_loss'],
       label='Unnormalized inputs\nvs BatchNormalization and Dropout validation')
plt.ylabel('$MSE$ $loss$')
plt.xlabel('$Epochs$')
plt.title('Model training dynamics\n1 hidden layer')
plt.legend(loc='upper right',ncol=2)

plt.figure()
plt.plot(history1.history['accuracy'], label='Normalized inputs train')
plt.plot(history1.history['val_accuracy'], label='Normalized inputs validation')
plt.plot(history1do.history['accuracy'],
                                    label='Normalized inputs\nvs Dropout train')
plt.plot(history1do.history['val_accuracy'],
                               label='Normalized inputs\nvs Dropout validation')
plt.plot(history1bn.history['accuracy'],
                       label='Unnormalized inputs\nvs BatchNormalization train')
plt.plot(history1bn.history['val_accuracy'],
                  label='Unnormalized inputs\nvs BatchNormalization validation')
plt.plot(history1bndo.history['accuracy'],
           label='Unnormalized inputs\nvs BatchNormalization and Dropout train')
plt.plot(history1bndo.history['val_accuracy'],
      label='Unnormalized inputs\nvs BatchNormalization and Dropout validation')
plt.ylabel('$Accuracy$')
plt.xlabel('$Epochs$')
plt.title('Model training dynamics\n1 hidden layer')
plt.legend(loc='lower right',ncol=2)

# Rendering the results of training models with three hidden layers
plt.figure()
plt.plot(history2.history['loss'], label='Normalized inputs train')
plt.plot(history2.history['val_loss'], label='Normalized inputs validation')
plt.plot(history2do.history['loss'], label='Normalized inputs\nvs Dropout train')
plt.plot(history2do.history['val_loss'], 
                                 label='Normalized inputs\nvs Dropout validation')
plt.plot(history2bn.history['loss'],
                        label='Unnormalized inputs\nvs BatchNormalization train')
plt.plot(history2bn.history['val_loss'],
                   label='Unnormalized inputs\nvs BatchNormalization validation')
plt.plot(history2bndo.history['loss'],
            label='Unnormalized inputs\nvs BatchNormalization and Dropout train')
plt.plot(history2bndo.history['val_loss'],
       label='Unnormalized inputs\nvs BatchNormalization and Dropout validation')
plt.ylabel('$MSE$ $loss$')
plt.xlabel('$Epochs$')
plt.title('Model training dynamics\n3 hidden layers')
plt.legend(loc='upper right',ncol=2)

plt.figure()
plt.plot(history2.history['accuracy'], label='Normalized inputs train')
plt.plot(history2.history['val_accuracy'], label='Normalized inputs validation')
plt.plot(history2do.history['accuracy'], label='Normalized inputs\nvs Dropout train')
plt.plot(history2do.history['val_accuracy'],
                                    label='Normalized inputs\nvs Dropout validation')
plt.plot(history2bn.history['accuracy'],
                            label='Unnormalized inputs\nvs BatchNormalization train')
plt.plot(history2bn.history['val_accuracy'],
                       label='Unnormalized inputs\nvs BatchNormalization validation')
plt.plot(history2bndo.history['accuracy'],
                label='Unnormalized inputs\nvs BatchNormalization and Dropout train')
plt.plot(history2bndo.history['val_accuracy'],
           label='Unnormalized inputs\nvs BatchNormalization and Dropout validation')
plt.ylabel('$Accuracy$')
plt.xlabel('$Epochs$')
plt.title('Model training dynamics\n3 hidden layers')
plt.legend(loc='lower right',ncol=2)

The last changes to the script concern the display of model performance on the test dataset. Here, in addition to adding the new data, we split the graphs: the results of the models with one hidden layer are shown separately, while the results of the models with three hidden layers are placed on a new graph.

plt.figure()
plt.bar(['Normalized inputs','\n\nNormalized inputs\nvs Dropout',
         'Unnormalized inputs\nvs BatchNormalization',
         '\n\nUnnormalized inputs\nvs BatchNormalization and Dropout'],
        [test_loss1,test_loss1do,
         test_loss1bn,test_loss1bndo])
plt.ylabel('$MSE$ $loss$')
plt.title('Test results\n1 hidden layer')

plt.figure()
plt.bar(['Normalized inputs','\n\nNormalized inputs\nvs Dropout',
         'Unnormalized inputs\nvs BatchNormalization',
         '\n\nUnnormalized inputs\nvs BatchNormalization and Dropout'],
        [test_loss2,test_loss2do,
         test_loss2bn,test_loss2bndo])
plt.ylabel('$MSE$ $loss$')
plt.title('Test results\n3 hidden layers')

plt.figure()
plt.bar(['Normalized inputs','\n\nNormalized inputs\nvs Dropout',
         'Unnormalized inputs\nvs BatchNormalization',
         '\n\nUnnormalized inputs\nvs BatchNormalization and Dropout'],
        [test_acc1,test_acc1do,
         test_acc1bn,test_acc1bndo])
plt.ylabel('$Accuracy$')
plt.title('Test results\n1 hidden layer')

plt.figure()
plt.bar(['Normalized inputs','\n\nNormalized inputs\nvs Dropout',
         'Unnormalized inputs\nvs BatchNormalization',
         '\n\nUnnormalized inputs\nvs BatchNormalization and Dropout'],
        [test_acc2,test_acc2do,
         test_acc2bn,test_acc2bndo])
plt.ylabel('$Accuracy$')
plt.title('Test results\n3 hidden layers')
 
plt.show()

The rest of the script code remains unchanged.

We will learn about the results of testing the models in the next section.