AIfES 2  2.0.0
ailayer_batch_normalization.h File Reference

Base layer implementation of the Batch Normalization layer. More...

Go to the source code of this file.

Data Structures

struct  ailayer_batch_norm
 General Batch Normalization layer structure. More...
 

Typedefs

typedef struct ailayer_batch_norm ailayer_batch_norm_t
 

Functions

ailayer_t * ailayer_batch_norm (ailayer_batch_norm_t *layer, ailayer_t *input_layer)
 Initialize and connect the given Batch Normalization layer. More...
 
void ailayer_batch_norm_forward (ailayer_t *self)
 Calculate the forward pass for the given Batch Normalization layer. More...
 
void ailayer_batch_norm_backward (ailayer_t *self)
 Calculate the backward pass for the given Batch Normalization layer. More...
 
void ailayer_batch_norm_calc_result_shape (ailayer_t *self)
 Calculate the shape of the result tensor (ailayer.result) More...
 
uint32_t ailayer_batch_norm_sizeof_paramem (const ailayer_t *self)
 Calculate and return the parameter memory size needed for this layer. More...
 
void ailayer_batch_norm_set_paramem (ailayer_t *self, void *memory_ptr)
 Distribute provided memory to the parameter pointers. More...
 
uint32_t ailayer_batch_norm_sizeof_trainmem (const ailayer_t *self)
 Calculate and return the memory size needed by this layer for training. More...
 
void ailayer_batch_norm_set_trainmem (ailayer_t *self, void *memory_ptr)
 Distribute provided memory to the gradient pointers. More...
 
void ailayer_batch_norm_print_specs (const ailayer_t *self)
 Print the layer specification. More...
 

Variables

const aicore_layertype_t * ailayer_batch_norm_type
 Batch Normalization layer type. More...
 

Detailed Description

Base layer implementation of the Batch Normalization layer.

Version
2.2.0

This is an "abstract", data-type independent implementation. To use the layer, pick one of the provided implementations for a specific hardware and data type (for example from ailayer_batch_normalization_default.h) or set the required math functions on your own.

The Batch Normalization layer can increase the training speed of deep neural networks by normalizing intermediate activations. For every element \( j \) of neuron / channel \( i \) in the batch, the transformation is defined as

\[ y_{i,j} = \mathit{BN}(x_{i,j}) = \gamma_i \cdot \frac{x_{i,j} - \mu_{i}}{\sqrt{\sigma_{i}^2+\epsilon}} + \beta_i \]

\[ \mu_i = \frac{1}{m} \sum_{j=1}^{m} x_{i,j} \]

\[ \sigma_i^2 = \frac{1}{m} \sum_{j=1}^{m} (x_{i,j} - \mu_i)^2 \]

\( \beta_i \) and \( \gamma_i \) are trainable parameters of the layer.
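
As a small worked illustration (values chosen for this example only), consider a single channel \( i \) with batch values \( x_{i,\cdot} = (1, 2, 3) \), \( \gamma_i = 1 \), \( \beta_i = 0 \) and \( \epsilon = 0 \):

\[ \mu_i = \frac{1+2+3}{3} = 2, \qquad \sigma_i^2 = \frac{(1-2)^2+(2-2)^2+(3-2)^2}{3} = \frac{2}{3} \]

\[ y_{i,\cdot} = \frac{(1,2,3) - 2}{\sqrt{2/3}} \approx (-1.225,\ 0,\ 1.225) \]

Before scaling by \( \gamma_i \) and shifting by \( \beta_i \), the normalized activations thus have zero mean and unit variance.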

Batch Normalization behaves differently during training and during inference.

When in training mode (ailayer.settings[AILAYER_SETTINGS_TRAINING_MODE] = TRUE), the means and variances ( \( \mu_i \) and \( \sigma_i^2 \)) are calculated for the whole batch during the forward pass. Additionally, exponential moving averages of \( \mu_i \) and \( \sigma_i^2 \) are calculated to estimate these values for the inference mode.

In inference mode (ailayer.settings[AILAYER_SETTINGS_TRAINING_MODE] = FALSE), \( \mu_i \) and \( \sigma_i^2 \) are taken as fixed parameters from the averages, which were collected during training.

Batch Normalization works best if a whole batch is processed at once in a forward pass, i.e. if the first dimension of the input shape of the model equals the batch size. In this case, the layer runs in batch mode (ailayer.settings[AILAYER_SETTINGS_BATCH_MODE] = TRUE) and calculates the means and variances for the batch as described above. Otherwise (if the first input shape dimension is smaller than the batch size and therefore ailayer.settings[AILAYER_SETTINGS_BATCH_MODE] = FALSE), the layer uses the exponential moving averages for \( \mu_i \) and \( \sigma_i^2 \) during training as well. This reduces the memory required for intermediate activations, but it may decrease the training speed. A sketch of this statistics selection follows below.
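
A minimal sketch of the statistics selection just described, assuming an exponential-moving-average update of the form shown; the helper and its parameter names (e.g. momentum) are illustrative and do not claim to match the internals of ailayer_batch_norm_t:

#include <stdbool.h>

/* Illustrative helper: selects the statistics used for normalizing one
 * channel and, in batch mode, updates the moving averages for inference.
 * The EMA convention (weighting of old vs. new values) is an assumption. */
static void select_stats(bool training_mode, bool batch_mode, float momentum,
                         float batch_mean, float batch_var,
                         float *moving_mean, float *moving_var,
                         float *mean_out, float *var_out)
{
    if (training_mode && batch_mode) {
        /* Full batch available: normalize with the batch statistics. */
        *mean_out = batch_mean;
        *var_out  = batch_var;
        /* Keep the moving averages up to date for later inference. */
        *moving_mean = (1.0f - momentum) * (*moving_mean) + momentum * batch_mean;
        *moving_var  = (1.0f - momentum) * (*moving_var)  + momentum * batch_var;
    } else {
        /* Inference, or training without a full batch: use the averages. */
        *mean_out = *moving_mean;
        *var_out  = *moving_var;
    }
}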

The results of the forward pass of this layer are written to the result tensor of the base ailayer_t struct.

Function Documentation

◆ ailayer_batch_norm()

ailayer_t* ailayer_batch_norm (ailayer_batch_norm_t *layer,
ailayer_t *input_layer
)

Initialize and connect the given Batch Normalization layer.

This function represents the "constructor" of the abstract Batch Normalization layer. It initializes the layer structure and connects it to the previous layer.
This function is not intended to be called directly. Instead, use one of the data-type specific implementations (for example ailayer_batch_norm_f32_default()).

Parameters
*layer The layer to initialize.
*input_layer The previous layer that provides the inputs to the layer.
Returns
Pointer to the (successfully) initialized general layer structure (ailayer_batch_norm.base)
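
A hedged wiring sketch using this abstract constructor (the include path, the layer's configuration fields and the surrounding model setup are project-specific and elided here; in practice the data-type specific variant is usually called instead):

#include "ailayer_batch_normalization.h" /* path may differ per project */

static ailayer_batch_norm_t bn_layer; /* configuration fields elided */

ailayer_t *connect_bn(ailayer_t *x_prev)
{
    /* Returns a pointer to bn_layer.base on success, as documented above,
     * so the result can directly feed the next layer's constructor. */
    return ailayer_batch_norm(&bn_layer, x_prev);
}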

◆ ailayer_batch_norm_backward()

void ailayer_batch_norm_backward (ailayer_t *self)

Calculate the backward pass for the given Batch Normalization layer.

Implementation of ailayer.backward.

It uses the deltas tensor of the next layer as input and writes the result of the backward pass to the deltas tensor (ailayer.deltas) of the given layer.

Calculates the gradients of \( \beta \) and \( \gamma \) and adds them to the corresponding gradient tensors.
Calculates the gradients for backpropagation to the previous layer and writes them to \( \delta_{in} \).

Please refer to the paper by Ioffe and Szegedy (https://arxiv.org/abs/1502.03167) for the equations of the gradients.
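
For reference, with \( \hat{x}_{i,j} = (x_{i,j} - \mu_i)/\sqrt{\sigma_i^2+\epsilon} \) and \( \delta_{out} \) denoting the incoming deltas, the gradients from the paper can be written in the common compact form

\[ \frac{\partial L}{\partial \gamma_i} = \sum_{j=1}^{m} \delta_{out;i,j} \, \hat{x}_{i,j}, \qquad \frac{\partial L}{\partial \beta_i} = \sum_{j=1}^{m} \delta_{out;i,j} \]

\[ \delta_{in;i,j} = \frac{\gamma_i}{m \sqrt{\sigma_i^2+\epsilon}} \left( m \, \delta_{out;i,j} - \sum_{k=1}^{m} \delta_{out;i,k} - \hat{x}_{i,j} \sum_{k=1}^{m} \delta_{out;i,k} \, \hat{x}_{i,k} \right) \]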

Used math functions:

Parameters
*self The layer to calculate the backward pass for.

◆ ailayer_batch_norm_calc_result_shape()

void ailayer_batch_norm_calc_result_shape (ailayer_t *self)

Calculate the shape of the result tensor (ailayer.result)

Implementation of ailayer.calc_result_shape.

Resulting shape equals input shape.

Parameters
*self The layer to calculate the resulting shape for.

◆ ailayer_batch_norm_forward()

void ailayer_batch_norm_forward (ailayer_t *self)

Calculate the forward pass for the given Batch Normalization layer.

Implementation of ailayer.forward.

It uses the result tensor of the previous layer as input and writes the result of the forward pass to the result tensor (ailayer.result) of the given layer.

Calculation of the forward pass result:
For every element \( j \) of neuron / channel \( i \) in the batch, the transformation is defined as

\[ x_{out;i,j} = \mathit{BN}(x_{in;i,j}) = \gamma_i \cdot \frac{x_{in;i,j} - \mu_{i}}{\sqrt{\sigma_{i}^2+\epsilon}} + \beta_i \]

\[ \mu_i = \frac{1}{m} \sum_{j=1}^{m} x_{in;i,j} \]

\[ \sigma_i^2 = \frac{1}{m} \sum_{j=1}^{m} (x_{in;i,j} - \mu_i)^2 \]

\( \gamma \): Scaling vector
\( \beta \): Offset vector
\( \epsilon \): Small constant for numerical stability
\( x_{in} \): Result of the forward pass of the previous layer
\( x_{out} \): Result of the forward pass of this layer
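
A minimal, data-type independent sketch of this per-channel computation (plain float32 C for illustration only; the actual layer dispatches to the configured math functions instead):

#include <math.h>

/* Normalize m batch elements of one channel:
 * x_out[j] = gamma * (x_in[j] - mu) / sqrt(var + eps) + beta */
static void batch_norm_channel(const float *x_in, float *x_out, int m,
                               float gamma, float beta, float eps)
{
    float mu = 0.0f, var = 0.0f;
    for (int j = 0; j < m; j++) mu += x_in[j];
    mu /= (float) m;
    for (int j = 0; j < m; j++) {
        float d = x_in[j] - mu;
        var += d * d;
    }
    var /= (float) m;
    float inv_std = 1.0f / sqrtf(var + eps);
    for (int j = 0; j < m; j++) {
        x_out[j] = gamma * (x_in[j] - mu) * inv_std + beta;
    }
}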

Used math functions:

Parameters
*self The layer to calculate the forward pass for.

◆ ailayer_batch_norm_print_specs()

void ailayer_batch_norm_print_specs (const ailayer_t *self)

Print the layer specification.

Parameters
*self The layer to print the specification for.

◆ ailayer_batch_norm_set_paramem()

void ailayer_batch_norm_set_paramem (ailayer_t *self,
void *memory_ptr
)

Distribute provided memory to the parameter pointers.

Implementation of ailayer.set_paramem.

Distributes the given buffer to the parameter pointers and sets the tensor parameters.
The required parameter memory size can be calculated with ailayer_batch_norm_sizeof_paramem().

Parameters
*self The layer to set the memory fields for.
*memory_ptr The memory that can be used for the parameters.
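
A hedged usage sketch of the documented memory workflow (manual allocation shown for illustration only; in a complete AIfES model the framework's scheduling utilities normally perform this step):

#include <stdint.h>
#include <stdlib.h>

/* `bn` must already be initialized and connected; bn->base is the general
 * layer structure inside ailayer_batch_norm_t (see the constructor above). */
int allocate_bn_parameter_memory(ailayer_batch_norm_t *bn)
{
    uint32_t psize = ailayer_batch_norm_sizeof_paramem(&bn->base);
    void *pmem = malloc(psize);
    if (pmem == NULL) {
        return -1; /* allocation failed */
    }
    ailayer_batch_norm_set_paramem(&bn->base, pmem);
    return 0;
}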

◆ ailayer_batch_norm_set_trainmem()

void ailayer_batch_norm_set_trainmem (ailayer_t *self,
void *memory_ptr
)

Distribute provided memory to the gradient pointers.

Implementation of ailayer.set_trainmem.

The required memory size can be calculated with ailayer_batch_norm_sizeof_trainmem().

Parameters
*self The layer to set the memory fields for.
*memory_ptr The memory that can be used for the gradients.

◆ ailayer_batch_norm_sizeof_paramem()

uint32_t ailayer_batch_norm_sizeof_paramem (const ailayer_t *self)

Calculate and return the parameter memory size needed for this layer.

Implementation of ailayer.sizeof_paramem.

The parameter size is calculated for the gammas, betas, moving means and moving variances tensors.

Parameters
*self The layer to calculate the parameter memory size for.
Returns
Calculated parameter memory size in bytes.
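
Since the parameter memory covers the four per-channel tensors listed above, a rough estimate for a float32 implementation with \( C \) channels, ignoring any alignment padding (both of which are assumptions here, not guarantees of this function), is

\[ \text{paramem} \approx 4 \cdot C \cdot \operatorname{sizeof}(\text{float}) = 16\,C \text{ bytes} \]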

◆ ailayer_batch_norm_sizeof_trainmem()

uint32_t ailayer_batch_norm_sizeof_trainmem (const ailayer_t *self)

Calculate and return the memory size needed by this layer for training.

Implementation of ailayer.sizeof_trainmem.

The memory size is calculated for the means and variances and for the gradient tensors of the gammas and betas.

Parameters
*self The layer to calculate the gradient memory size for.
Returns
Calculated gradient memory size in bytes.
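
Analogously, under the same float32 and no-padding assumptions, the four per-channel training tensors listed above suggest the estimate

\[ \text{trainmem} \approx \big( \underbrace{2}_{\mu_i,\ \sigma_i^2} + \underbrace{2}_{\nabla\gamma,\ \nabla\beta} \big) \cdot C \cdot \operatorname{sizeof}(\text{float}) = 16\,C \text{ bytes} \]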

Variable Documentation

◆ ailayer_batch_norm_type

const aicore_layertype_t* ailayer_batch_norm_type
extern

Batch Normalization layer type.

Defines the type of the layer (for example for type checks and debug prints). See aicore_layertype for more information about the layer type.