AIfES 2 (2.0.0)
Base layer implementation of the Batch Normalization layer.
Data Structures

struct ailayer_batch_norm
    General Batch Normalization layer structure.
Typedefs

typedef struct ailayer_batch_norm ailayer_batch_norm_t
Functions

ailayer_t *ailayer_batch_norm(ailayer_batch_norm_t *layer, ailayer_t *input_layer)
    Initialize and connect the given Batch Normalization layer.
void ailayer_batch_norm_forward(ailayer_t *self)
    Calculate the forward pass for the given Batch Normalization layer.
void ailayer_batch_norm_backward(ailayer_t *self)
    Calculate the backward pass for the given Batch Normalization layer.
void ailayer_batch_norm_calc_result_shape(ailayer_t *self)
    Calculate the shape of the result tensor (ailayer.result).
uint32_t ailayer_batch_norm_sizeof_paramem(const ailayer_t *self)
    Calculate and return the parameter memory size needed for this layer.
void ailayer_batch_norm_set_paramem(ailayer_t *self, void *memory_ptr)
    Distribute provided memory to the parameter pointers.
uint32_t ailayer_batch_norm_sizeof_trainmem(const ailayer_t *self)
    Calculate and return the memory size needed by this layer for training.
void ailayer_batch_norm_set_trainmem(ailayer_t *self, void *memory_ptr)
    Distribute provided memory to the gradient pointers.
void ailayer_batch_norm_print_specs(const ailayer_t *self)
    Print the layer specification.
Variables

extern const aicore_layertype_t *ailayer_batch_norm_type
    Batch Normalization layer type.
Base layer implementation of the Batch Normalization layer.
This is an "abstract", data-type independent implementation. To use the layer, use one of the provided implementations for a specific hardware and data-type (for example from ailayer_batch_normalization_default.h) or set the required math functions on your own.
The Batch Normalization layer can increase the training speed in deep neural networks by normalizing intermediate activations. For every element \( j \) of neuron / channel \( i \) in the batch, the transformation is defined as
\[ y_{i,j} = \mathit{BN}(x_{i,j}) = \gamma_i \cdot \frac{x_{i,j} - \mu_{i}}{\sqrt{\sigma_{i}^2+\epsilon}} + \beta_i \]
\[ \mu_i = \frac{1}{m} \sum_{j=1}^{m} x_{i,j} \]
\[ \sigma_i^2 = \frac{1}{m} \sum_{j=1}^{m} (x_{i,j} - \mu_i)^2 \]
\( \beta_i \) and \( \gamma_i \) are trainable parameters of the layer.
Batch Normalization behaves differently during training and during inference.
When in training mode (ailayer.settings[AILAYER_SETTINGS_TRAINING_MODE] = TRUE), the means and variances ( \( \mu_i \) and \( \sigma_i^2 \) ) are calculated over the whole batch during the forward pass. Additionally, exponential moving averages of \( \mu_i \) and \( \sigma_i^2 \) are maintained to estimate these values for the inference mode.
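Such a moving average is typically updated once per training step. A minimal sketch in plain C, independent of the AIfES internals (the function name, variable names, and the role of the momentum factor are assumptions for illustration; conventions for which term the momentum weights differ between frameworks):

    #include <stdint.h>

    /* Sketch only, not AIfES code: update the stored moving averages
     * with the statistics of the current batch. Here `momentum`
     * weights the new batch statistics. */
    static void ema_update(float *moving_mean, float *moving_variance,
                           const float *batch_mean, const float *batch_variance,
                           uint32_t channels, float momentum)
    {
        for (uint32_t i = 0; i < channels; i++) {
            moving_mean[i]     = (1.0f - momentum) * moving_mean[i]
                                 + momentum * batch_mean[i];
            moving_variance[i] = (1.0f - momentum) * moving_variance[i]
                                 + momentum * batch_variance[i];
        }
    }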
In inference mode (ailayer.settings[AILAYER_SETTINGS_TRAINING_MODE] = FALSE), \( \mu_i \) and \( \sigma_i^2 \) are taken as fixed parameters from the moving averages collected during training.
Batch Normalization works best if a whole batch is processed at once in a forward pass, i.e. if the first dimension of the input shape of the model equals the batch size. In this case, the layer runs in batch mode (ailayer.settings[AILAYER_SETTINGS_BATCH_MODE] = TRUE) and calculates the means and variances over the batch as described above. Otherwise (if the first input shape dimension is smaller than the batch size and therefore ailayer.settings[AILAYER_SETTINGS_BATCH_MODE] = FALSE), the layer uses the exponential moving averages for \( \mu_i \) and \( \sigma_i^2 \) during training as well. This reduces the memory required for intermediate activations, but may decrease the training speed.
The results of the forward pass of this layer are written to the result tensor of the base ailayer_t struct.
ailayer_t *ailayer_batch_norm(ailayer_batch_norm_t *layer, ailayer_t *input_layer)
Initialize and connect the given Batch Normalization layer.
This function represents the "constructor" of the abstract Batch Normalization layer. It initializes the layer structure and connects it to the previous layer.
This function is not intended to be called directly. Instead, use one of the data type specific implementations (for example ailayer_batch_norm_f32_default()).
Parameters:
    *layer: The layer to initialize.
    *input_layer: The previous layer that provides the inputs to the layer.
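For orientation, a hedged sketch of how such a data type specific constructor is typically chained behind a previous layer. Only ailayer_batch_norm_f32_default() is named in this documentation; the struct type ailayer_batch_norm_f32_t is an assumed name and may differ from the actual API:

    /* Sketch only: wiring a Batch Normalization layer behind a
     * previous layer. ailayer_batch_norm_f32_t is an assumed type
     * name; the call pattern mirrors the constructor described above. */
    ailayer_t *add_batch_norm(ailayer_batch_norm_f32_t *bn, ailayer_t *prev)
    {
        return ailayer_batch_norm_f32_default(bn, prev);
    }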
void ailayer_batch_norm_backward(ailayer_t *self)
Calculate the backward pass for the given Batch Normalization layer.
Implementation of ailayer.backward.
It uses the deltas tensor of the next layer as input and writes the result of the backward pass to the deltas tensor (ailayer.deltas) of the given layer.
Calculates the gradients of \( \beta \) and \( \gamma \) and adds them to the corresponding gradients tensor.
Calculates the gradients for backpropagation to the previous layer and writes them to \( \delta_{in} \).
Please refer to the paper by Ioffe and Szegedy (https://arxiv.org/abs/1502.03167) for the equations of the gradients.
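For convenience, the gradients from that paper, rewritten in the notation used here (with \( \hat{x}_{i,j} = (x_{i,j} - \mu_i) / \sqrt{\sigma_i^2+\epsilon} \) and \( \delta_{out;i,j} \) denoting the deltas of the next layer):
\[ \frac{\partial L}{\partial \gamma_i} = \sum_{j=1}^{m} \delta_{out;i,j} \, \hat{x}_{i,j} \qquad \frac{\partial L}{\partial \beta_i} = \sum_{j=1}^{m} \delta_{out;i,j} \]
\[ \frac{\partial L}{\partial \sigma_i^2} = -\frac{\gamma_i}{2} \left( \sigma_i^2+\epsilon \right)^{-3/2} \sum_{j=1}^{m} \delta_{out;i,j} \, (x_{i,j} - \mu_i) \]
\[ \frac{\partial L}{\partial \mu_i} = -\frac{\gamma_i}{\sqrt{\sigma_i^2+\epsilon}} \sum_{j=1}^{m} \delta_{out;i,j} - \frac{2}{m} \frac{\partial L}{\partial \sigma_i^2} \sum_{j=1}^{m} (x_{i,j} - \mu_i) \]
\[ \delta_{in;i,j} = \frac{\gamma_i \, \delta_{out;i,j}}{\sqrt{\sigma_i^2+\epsilon}} + \frac{2 (x_{i,j} - \mu_i)}{m} \frac{\partial L}{\partial \sigma_i^2} + \frac{1}{m} \frac{\partial L}{\partial \mu_i} \]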
Used math functions:
Parameters:
    *self: Layer to calculate the backward pass for.
void ailayer_batch_norm_calc_result_shape(ailayer_t *self)
Calculate the shape of the result tensor (ailayer.result)
Implementation of ailayer.calc_result_shape.
Resulting shape equals input shape.
Parameters:
    *self: Layer to calculate the resulting shape for.
void ailayer_batch_norm_forward(ailayer_t *self)
Calculate the forward pass for the given Batch Normalization layer.
Implementation of ailayer.forward.
It uses the result tensor of the previous layer as input and writes the result of the forward pass to the result tensor (ailayer.result) of the given layer.
Calculation of the forward pass result:
For every element \( j \) of neuron / channel \( i \) in the batch, the transformation is defined as
\[ x_{out;i,j} = \mathit{BN}(x_{in;i,j}) = \gamma_i \cdot \frac{x_{in;i,j} - \mu_{i}}{\sqrt{\sigma_{i}^2+\epsilon}} + \beta_i \]
\[ \mu_i = \frac{1}{m} \sum_{j=1}^{m} x_{in;i,j} \]
\[ \sigma_i^2 = \frac{1}{m} \sum_{j=1}^{m} (x_{in;i,j} - \mu_i)^2 \]
\( \gamma \): Scaling vector
\( \beta \): Offset vector
\( \epsilon \): Small constant for numerical stability
\( x_{in} \): Result of the forward pass of the previous layer
\( x_{out} \): Result of the forward pass of this layer
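To make the computation concrete, a minimal, data type independent sketch of this forward pass in plain C, assuming a row-major input of shape [m, channels]. This is an illustration only, not the AIfES implementation, which delegates these steps to the configured math functions:

    #include <math.h>
    #include <stdint.h>

    /* Illustration only: batch mode forward pass for an input of
     * shape [m, channels] in row-major layout. */
    static void batch_norm_forward_sketch(const float *x_in, float *x_out,
                                          const float *gamma, const float *beta,
                                          uint32_t m, uint32_t channels,
                                          float eps)
    {
        for (uint32_t i = 0; i < channels; i++) {
            /* Mean over the batch for channel i */
            float mu = 0.0f;
            for (uint32_t j = 0; j < m; j++) {
                mu += x_in[j * channels + i];
            }
            mu /= (float) m;

            /* Variance over the batch for channel i */
            float var = 0.0f;
            for (uint32_t j = 0; j < m; j++) {
                float d = x_in[j * channels + i] - mu;
                var += d * d;
            }
            var /= (float) m;

            /* Normalize, then scale and shift */
            float inv_std = 1.0f / sqrtf(var + eps);
            for (uint32_t j = 0; j < m; j++) {
                uint32_t idx = j * channels + i;
                x_out[idx] = gamma[i] * (x_in[idx] - mu) * inv_std + beta[i];
            }
        }
    }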
Used math functions:
Parameters:
    *self: Layer to calculate the forward pass for.
void ailayer_batch_norm_print_specs(const ailayer_t *self)
Print the layer specification.
Parameters:
    *self: The layer to print the specification for.
void ailayer_batch_norm_set_paramem(ailayer_t *self, void *memory_ptr)
Distribute provided memory to the parameter pointers.
Implementation of ailayer.set_paramem.
Distributes the given buffer to the parameter pointers and sets the tensor parameters.
The required parameter size can be calculated with ailayer_batch_norm_sizeof_paramem().
Parameters:
    *self: The layer to set the memory fields for.
    *memory_ptr: The memory that can be used for the parameters.
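The distribution itself amounts to pointer arithmetic over the provided buffer. A hedged sketch of the idea for a float32 layer (the actual tensor order and any alignment handling in the real implementation may differ):

    #include <stdint.h>

    /* Illustration only: carve one contiguous buffer into the four
     * per-channel parameter vectors of a float32 layer. */
    static void set_paramem_sketch(float **gamma, float **beta,
                                   float **moving_mean, float **moving_variance,
                                   void *memory_ptr, uint32_t channels)
    {
        float *mem = (float *) memory_ptr;
        *gamma           = mem;                /* channels elements */
        *beta            = mem + channels;     /* channels elements */
        *moving_mean     = mem + 2 * channels; /* channels elements */
        *moving_variance = mem + 3 * channels; /* channels elements */
    }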
void ailayer_batch_norm_set_trainmem(ailayer_t *self, void *memory_ptr)
Distribute provided memory to the gradients pointers.
Implementation of ailayer.set_trainmem.
The required memory size can be calculated with ailayer_batch_norm_sizeof_trainmem().
Parameters:
    *self: The layer to set the memory fields for.
    *memory_ptr: The memory that can be used for the gradients.
uint32_t ailayer_batch_norm_sizeof_paramem(const ailayer_t *self)
Calculate and return the parameter memory size needed for this layer.
Implementation of ailayer.sizeof_paramem.
The parameter size is calculated for the gammas, betas, moving means and moving variances tensors.
Parameters:
    *self: The layer to calculate the parameter memory size for.
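Given those four per-channel vectors, a plausible estimate for a float32 layer, ignoring any alignment padding the implementation may add, is:

    #include <stdint.h>

    /* Rough estimate, not the AIfES implementation: gamma, beta,
     * moving mean and moving variance, one element per channel each. */
    static uint32_t sizeof_paramem_estimate(uint32_t channels)
    {
        return 4u * channels * (uint32_t) sizeof(float);
    }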
uint32_t ailayer_batch_norm_sizeof_trainmem(const ailayer_t *self)
Calculate and return the memory size needed by this layer for training.
Implementation of ailayer.sizeof_trainmem.
The memory size is calculated for the means and variances and for the gradient tensors of the gammas and betas.
Parameters:
    *self: The layer to calculate the gradient memory size for.
extern const aicore_layertype_t *ailayer_batch_norm_type
Batch Normalization layer type.
Defines the type of the layer (for example for type checks and debug prints). See aicore_layertype for more information about the layer type.