Implementing recurrent models in Python

In the previous sections, we reviewed the principles of organizing a recurrent model architecture and even built a recurrent neural layer based on the LSTM block algorithm. Earlier, we used the Keras library for TensorFlow to build our previous neural network models in Python. The same library offers a number of options for building recurrent neural layers, ranging from basic recurrent layer classes to more complex models:

  • AbstractRNNCell — abstract object representing an RNN cell
  • Bidirectional — bidirectional wrapper for RNN layers
  • ConvLSTM1D — 1D convolutional LSTM block
  • ConvLSTM2D — 2D convolutional LSTM block
  • ConvLSTM3D — 3D convolutional LSTM block
  • GRU — recurrent block by Cho et al. (2014)
  • LSTM — Long Short-Term Memory layer by Hochreiter and Schmidhuber (1997)
  • RNN — base class for the recurrent layer
  • SimpleRNN — fully connected recurrent layer in which the output is fed back to the input

In addition to the basic recurrent layer class, the list contains the already familiar LSTM and GRU models. You can also create bidirectional recurrent layers, which are most often used in text translation tasks. The ConvLSTM models are built on the LSTM block architecture but use convolutional layers instead of fully connected ones for the gates and the new content layer.
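
As a quick illustration, and only a minimal sketch, the layers listed above are created like any other Keras layer. The class names come straight from the list; the specific parameter values below are arbitrary and assume TensorFlow 2.x.

```python
import tensorflow as tf
from tensorflow.keras import layers

# Basic recurrent layers from the list above
simple_rnn = layers.SimpleRNN(32)                     # fully connected recurrent layer
gru_block  = layers.GRU(32)                           # GRU block (Cho et al., 2014)
lstm_block = layers.LSTM(32, return_sequences=True)   # LSTM returning the full sequence

# Bidirectional wrapper around an LSTM layer (often used in translation tasks)
bi_lstm = layers.Bidirectional(layers.LSTM(32))

# Convolutional LSTM for sequences of 2D feature maps (e.g., images)
conv_lstm = layers.ConvLSTM2D(filters=16, kernel_size=(3, 3))
```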

Additionally, there is an abstract recurrent cell class for creating custom architectural solutions for recurrent models.
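
To show how the abstract cell can be used, here is a minimal hypothetical example, assuming TensorFlow 2.x: a custom cell with a simple tanh update rule, wrapped in the generic RNN layer. The cell name and its update rule are purely illustrative.

```python
import tensorflow as tf

class MinimalRNNCell(tf.keras.layers.AbstractRNNCell):
    """Hypothetical cell: h_t = tanh(x_t * W + h_{t-1} * U)."""

    def __init__(self, units, **kwargs):
        super().__init__(**kwargs)
        self.units = units

    @property
    def state_size(self):
        # Size of the hidden state carried between timesteps
        return self.units

    def build(self, input_shape):
        # Weights for the input and for the recurrent (hidden) state
        self.kernel = self.add_weight(shape=(input_shape[-1], self.units),
                                      initializer="glorot_uniform",
                                      name="kernel")
        self.recurrent_kernel = self.add_weight(shape=(self.units, self.units),
                                                initializer="orthogonal",
                                                name="recurrent_kernel")
        self.built = True

    def call(self, inputs, states):
        prev_h = states[0]
        h = tf.tanh(tf.matmul(inputs, self.kernel) +
                    tf.matmul(prev_h, self.recurrent_kernel))
        return h, [h]

# The custom cell is wrapped in the generic RNN layer
custom_layer = tf.keras.layers.RNN(MinimalRNNCell(32))
```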

We won't go deep into the Keras library API right now. We will use the LSTM block to create our test recurrent models. This is exactly the kind of model we recreated in MQL5, so we will be able to compare the performance of models created in different programming languages.

Based on the available hardware and the layer's configuration, the LSTM class automatically chooses between a cuDNN-based implementation and a pure TensorFlow one in order to maximize performance.
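
As a rough illustration of this behavior (assuming TensorFlow 2.x on a machine with a GPU; the variable names are arbitrary), keeping the default arguments allows the fast fused kernel to be used, while changing certain parameters forces the generic implementation:

```python
import tensorflow as tf

# With default arguments (tanh activation, sigmoid recurrent activation,
# no recurrent dropout, unroll=False) the layer can use the fused cuDNN
# kernel when a GPU is available
fast_lstm = tf.keras.layers.LSTM(64)

# A non-default recurrent activation disables the cuDNN path, so the slower
# but portable pure TensorFlow implementation is used instead
generic_lstm = tf.keras.layers.LSTM(64, recurrent_activation="hard_sigmoid")
```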

Users have access to an extensive range of parameters for fine-tuning the recurrent block (a short usage sketch follows the list):

  • units — dimensionality of the output space
  • activation — activation function
  • recurrent_activation — activation function for the recurrent step (gate)
  • use_bias — flag indicating whether to use a bias vector
  • kernel_initializer — initializer for the weights matrix used in the linear transformation of the inputs
  • recurrent_initializer — initializer for the weights matrix used in the linear transformation of the recurrent state
  • bias_initializer — initializer for the bias vector
  • kernel_regularizer — regularization function for the input weights matrix
  • recurrent_regularizer — regularization function for the recurrent weights matrix
  • bias_regularizer — regularization function for the bias vector
  • activity_regularizer — regularization function applied to the output of the layer
  • kernel_constraint — constraint function for the input weights matrix
  • recurrent_constraint — constraint function for the recurrent weights matrix
  • bias_constraint — constraint function for the bias vector
  • dropout — floating-point number from 0 to 1, defining the share of elements to be dropped out during linear transformation of input data
  • recurrent_dropout — floating-point number from 0 to 1, determining the share of elements to be dropped out during linear transformation of the recurrent state
  • return_sequences — boolean flag to specify whether to return the last result in the output sequence or the results of the whole sequence
  • return_state — boolean flag to indicate whether to return the last state in addition to the output
  • go_backwards — boolean flag to instruct the processing of the input sequence in the backward order and return the reverse sequence
  • stateful — boolean flag to indicate the use of the last state for each sample with the i index in the batch as the initial state for the sample with the i index in the next batch
  • time_major — the format of the input and output sequence tensor shapes
  • unroll — boolean flag indicating whether to unroll the recurrent network or use a symbolic loop; unrolling can speed up training but requires more memory and is only suitable for short sequences
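
The sketch below (an illustrative configuration, not a recommendation, assuming TensorFlow 2.x) shows how several of these parameters are passed when the layer is created:

```python
import tensorflow as tf

# An LSTM layer configured with some of the parameters listed above
lstm_layer = tf.keras.layers.LSTM(
    units=40,                        # dimensionality of the output space
    activation="tanh",               # activation of the new content layer
    recurrent_activation="sigmoid",  # activation of the gates
    use_bias=True,                   # include bias vectors
    dropout=0.1,                     # dropout on the input transformation
    return_sequences=True,           # return the output for every timestep
    stateful=False                   # do not carry state between batches
)
```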

After acquainting ourselves with the control parameters of the LSTM layer class, we will proceed to the practical implementation of various models using the recurrence layer.