AIfES 2 (2.0.0)
Base layer implementation of the Batch Normalization layer.
Data Structures

struct ailayer_batch_norm
    General Batch Normalization layer structure.
Typedefs

typedef struct ailayer_batch_norm ailayer_batch_norm_t
Functions

ailayer_t *ailayer_batch_norm(ailayer_batch_norm_t *layer, ailayer_t *input_layer)
    Initialize and connect the given Batch Normalization layer.
void ailayer_batch_norm_forward(ailayer_t *self)
    Calculate the forward pass for the given Batch Normalization layer.
void ailayer_batch_norm_backward(ailayer_t *self)
    Calculate the backward pass for the given Batch Normalization layer.
void ailayer_batch_norm_calc_result_shape(ailayer_t *self)
    Calculate the shape of the result tensor (ailayer.result).
uint32_t ailayer_batch_norm_sizeof_paramem(const ailayer_t *self)
    Calculate and return the parameter memory size needed for this layer.
void ailayer_batch_norm_set_paramem(ailayer_t *self, void *memory_ptr)
    Distribute provided memory to the parameter pointers.
uint32_t ailayer_batch_norm_sizeof_trainmem(const ailayer_t *self)
    Calculate and return the memory size needed by this layer for training.
void ailayer_batch_norm_set_trainmem(ailayer_t *self, void *memory_ptr)
    Distribute provided memory to the gradient pointers.
void ailayer_batch_norm_print_specs(const ailayer_t *self)
    Print the layer specification.
Variables

extern const aicore_layertype_t *ailayer_batch_norm_type
    Batch Normalization layer type.
Base layer implementation of the Batch Normalization layer.
This is an "abstract", data-type independent implementation. To use the layer, use one of the provided implementations for a specific hardware and data-type (for example from ailayer_batch_normalization_default.h) or set the required math functions on your own.
The Batch Normalization layer can increase the training speed in deep neural networks by normalizing intermediate activations. For every element \( j \) of neuron / channel \( i \) in the batch, the transformation is defined as
\[ y_{i,j} = \mathit{BN}(x_{i,j}) = \gamma_i \cdot \frac{x_{i,j} - \mu_{i}}{\sqrt{\sigma_{i}^2+\epsilon}} + \beta_i \]
\[ \mu_i = \frac{1}{m} \sum_{j=1}^{m} x_{i,j} \]
\[ \sigma_i^2 = \frac{1}{m} \sum_{j=1}^{m} (x_{i,j} - \mu_i)^2 \]
\( \beta_i \) and \( \gamma_i \) are trainable parameters of the layer.
Batch Normalization behaves differently during training and during inference.
When in training mode (ailayer.settings[AILAYER_SETTINGS_TRAINING_MODE] = TRUE), the means and variances ( \( \mu_i \) and \( \sigma_i^2 \) ) are calculated over the whole batch during the forward pass. Additionally, exponential moving averages of \( \mu_i \) and \( \sigma_i^2 \) are maintained to estimate these values for the inference mode.
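Such a moving average is typically updated once per training step. A minimal sketch in plain C, independent of the AIfES internals (the function name, variable names, and the role of the momentum factor are assumptions for illustration; conventions for which term the momentum weights differ between frameworks):

    #include <stdint.h>

    /* Sketch only, not AIfES code: update the stored moving averages
     * with the statistics of the current batch. Here `momentum`
     * weights the new batch statistics. */
    static void ema_update(float *moving_mean, float *moving_variance,
                           const float *batch_mean, const float *batch_variance,
                           uint32_t channels, float momentum)
    {
        for (uint32_t i = 0; i < channels; i++) {
            moving_mean[i]     = (1.0f - momentum) * moving_mean[i]
                                 + momentum * batch_mean[i];
            moving_variance[i] = (1.0f - momentum) * moving_variance[i]
                                 + momentum * batch_variance[i];
        }
    }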
In inference mode (ailayer.settings[AILAYER_SETTINGS_TRAINING_MODE] = FALSE), \( \mu_i \) and \( \sigma_i^2 \) are taken as fixed parameters from the moving averages collected during training.
Batch Normalization works best if a whole batch is processed at once in a forward pass, i.e. if the first dimension of the input shape of the model equals the batch size. In this case, the layer runs in batch mode (ailayer.settings[AILAYER_SETTINGS_BATCH_MODE] = TRUE) and calculates the means and variances over the batch as described above. Otherwise (if the first input shape dimension is smaller than the batch size and therefore ailayer.settings[AILAYER_SETTINGS_BATCH_MODE] = FALSE), the layer uses the exponential moving averages for \( \mu_i \) and \( \sigma_i^2 \) during training as well. This reduces the memory required for intermediate activations, but may decrease the training speed.
The results of the forward pass of this layer are written to the result tensor of the base ailayer_t struct.
ailayer_t *ailayer_batch_norm(ailayer_batch_norm_t *layer, ailayer_t *input_layer)
Initialize and connect the given Batch Normalization layer.
This function represents the "constructor" of the abstract Batch Normalization layer. It initializes the layer structure and connects it to the previous layer.
This function is not intended to be called directly. Instead, use one of the data type specific implementations (for example ailayer_batch_norm_f32_default()).
Parameters:
    *layer: The layer to initialize.
    *input_layer: The previous layer that provides the inputs to the layer.
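For orientation, a hedged sketch of how such a data type specific constructor is typically chained behind a previous layer. Only ailayer_batch_norm_f32_default() is named in this documentation; the struct type ailayer_batch_norm_f32_t is an assumed name and may differ from the actual API:

    /* Sketch only: wiring a Batch Normalization layer behind a
     * previous layer. ailayer_batch_norm_f32_t is an assumed type
     * name; the call pattern mirrors the constructor described above. */
    ailayer_t *add_batch_norm(ailayer_batch_norm_f32_t *bn, ailayer_t *prev)
    {
        return ailayer_batch_norm_f32_default(bn, prev);
    }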
void ailayer_batch_norm_backward(ailayer_t *self)
Calculate the backward pass for the given Batch Normalization layer.
Implementation of ailayer.backward.
It uses the deltas tensor of the next layer as input and writes the result of the backward pass to the deltas tensor (ailayer.deltas) of the given layer.
Calculates the gradients of \( \beta \) and \( \gamma \) and adds them to the corresponding gradients tensor.
Calculates the gradients for backpropagation to the previous layer and writes them to \( \delta_{in} \).
Please refer to the paper by Ioffe and Szegedy (https://arxiv.org/abs/1502.03167) for the equations of the gradients.
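For convenience, the gradients from that paper, rewritten in the notation used here (with \( \hat{x}_{i,j} = (x_{i,j} - \mu_i) / \sqrt{\sigma_i^2+\epsilon} \) and \( \delta_{out;i,j} \) denoting the deltas of the next layer):
\[ \frac{\partial L}{\partial \gamma_i} = \sum_{j=1}^{m} \delta_{out;i,j} \, \hat{x}_{i,j} \qquad \frac{\partial L}{\partial \beta_i} = \sum_{j=1}^{m} \delta_{out;i,j} \]
\[ \frac{\partial L}{\partial \sigma_i^2} = -\frac{\gamma_i}{2} \left( \sigma_i^2+\epsilon \right)^{-3/2} \sum_{j=1}^{m} \delta_{out;i,j} \, (x_{i,j} - \mu_i) \]
\[ \frac{\partial L}{\partial \mu_i} = -\frac{\gamma_i}{\sqrt{\sigma_i^2+\epsilon}} \sum_{j=1}^{m} \delta_{out;i,j} - \frac{2}{m} \frac{\partial L}{\partial \sigma_i^2} \sum_{j=1}^{m} (x_{i,j} - \mu_i) \]
\[ \delta_{in;i,j} = \frac{\gamma_i \, \delta_{out;i,j}}{\sqrt{\sigma_i^2+\epsilon}} + \frac{2 (x_{i,j} - \mu_i)}{m} \frac{\partial L}{\partial \sigma_i^2} + \frac{1}{m} \frac{\partial L}{\partial \mu_i} \]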
Used math functions:
Parameters:
    *self: Layer to calculate the backward pass for.
void ailayer_batch_norm_calc_result_shape(ailayer_t *self)
Calculate the shape of the result tensor (ailayer.result)
Implementation of ailayer.calc_result_shape.
Resulting shape equals input shape.
Parameters:
    *self: Layer to calculate the resulting shape for.
void ailayer_batch_norm_forward(ailayer_t *self)
Calculate the forward pass for the given Batch Normalization layer.
Implementation of ailayer.forward.
It uses the result tensor of the previous layer as input and writes the result of the forward pass to the result tensor (ailayer.result) of the given layer.
Calculation of the forward pass result:
For every element \( j \) of neuron / channel \( i \) in the batch, the transformation is defined as
\[ x_{out;i,j} = \mathit{BN}(x_{in;i,j}) = \gamma_i \cdot \frac{x_{in;i,j} - \mu_{i}}{\sqrt{\sigma_{i}^2+\epsilon}} + \beta_i \]
\[ \mu_i = \frac{1}{m} \sum_{j=1}^{m} x_{in;i,j} \]
\[ \sigma_i^2 = \frac{1}{m} \sum_{j=1}^{m} (x_{in;i,j} - \mu_i)^2 \]
\( \gamma \): Scaling vector
\( \beta \): Offset vector
\( \epsilon \): Small constant for numerical stability
\( x_{in} \): Result of the forward pass of the previous layer
\( x_{out} \): Result of the forward pass of this layer
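To make the computation concrete, a minimal, data type independent sketch of this forward pass in plain C, assuming a row-major input of shape [m, channels]. This is an illustration only, not the AIfES implementation, which delegates these steps to the configured math functions:

    #include <math.h>
    #include <stdint.h>

    /* Illustration only: batch mode forward pass for an input of
     * shape [m, channels] in row-major layout. */
    static void batch_norm_forward_sketch(const float *x_in, float *x_out,
                                          const float *gamma, const float *beta,
                                          uint32_t m, uint32_t channels,
                                          float eps)
    {
        for (uint32_t i = 0; i < channels; i++) {
            /* Mean over the batch for channel i */
            float mu = 0.0f;
            for (uint32_t j = 0; j < m; j++) {
                mu += x_in[j * channels + i];
            }
            mu /= (float) m;

            /* Variance over the batch for channel i */
            float var = 0.0f;
            for (uint32_t j = 0; j < m; j++) {
                float d = x_in[j * channels + i] - mu;
                var += d * d;
            }
            var /= (float) m;

            /* Normalize, then scale and shift */
            float inv_std = 1.0f / sqrtf(var + eps);
            for (uint32_t j = 0; j < m; j++) {
                uint32_t idx = j * channels + i;
                x_out[idx] = gamma[i] * (x_in[idx] - mu) * inv_std + beta[i];
            }
        }
    }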
Used math functions:
Parameters:
    *self: Layer to calculate the forward pass for.
void ailayer_batch_norm_print_specs(const ailayer_t *self)
Print the layer specification.
Parameters:
    *self: The layer to print the specification for.
void ailayer_batch_norm_set_paramem(ailayer_t *self, void *memory_ptr)
Distribute provided memory to the parameter pointers.
Implementation of ailayer.set_paramem.
Distributes the given buffer to the parameter pointers and sets the tensor parameters.
The required parameter size can be calculated with ailayer_batch_norm_sizeof_paramem().
Parameters:
    *self: The layer to set the memory fields for.
    *memory_ptr: The memory that can be used for the parameters.
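The distribution itself amounts to pointer arithmetic over the provided buffer. A hedged sketch of the idea for a float32 layer (the actual tensor order and any alignment handling in the real implementation may differ):

    #include <stdint.h>

    /* Illustration only: carve one contiguous buffer into the four
     * per-channel parameter vectors of a float32 layer. */
    static void set_paramem_sketch(float **gamma, float **beta,
                                   float **moving_mean, float **moving_variance,
                                   void *memory_ptr, uint32_t channels)
    {
        float *mem = (float *) memory_ptr;
        *gamma           = mem;                /* channels elements */
        *beta            = mem + channels;     /* channels elements */
        *moving_mean     = mem + 2 * channels; /* channels elements */
        *moving_variance = mem + 3 * channels; /* channels elements */
    }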
void ailayer_batch_norm_set_trainmem(ailayer_t *self, void *memory_ptr)
Distribute provided memory to the gradients pointers.
Implementation of ailayer.set_trainmem.
The required memory size can be calculated with ailayer_batch_norm_sizeof_trainmem().
Parameters:
    *self: The layer to set the memory fields for.
    *memory_ptr: The memory that can be used for the gradients.
uint32_t ailayer_batch_norm_sizeof_paramem(const ailayer_t *self)
Calculate and return the parameter memory size needed for this layer.
Implementation of ailayer.sizeof_paramem.
The parameter size is calculated for the gammas, betas, moving means and moving variances tensors.
Parameters:
    *self: The layer to calculate the parameter memory size for.
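Given those four per-channel vectors, a plausible estimate for a float32 layer, ignoring any alignment padding the implementation may add, is:

    #include <stdint.h>

    /* Rough estimate, not the AIfES implementation: gamma, beta,
     * moving mean and moving variance, one element per channel each. */
    static uint32_t sizeof_paramem_estimate(uint32_t channels)
    {
        return 4u * channels * (uint32_t) sizeof(float);
    }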
uint32_t ailayer_batch_norm_sizeof_trainmem(const ailayer_t *self)
Calculate and return the memory size needed by this layer for training.
Implementation of ailayer.sizeof_trainmem.
The memory size is calculated for the means and variances and for the gradient tensors of the gammas and betas.
Parameters:
    *self: The layer to calculate the gradient memory size for.
extern const aicore_layertype_t *ailayer_batch_norm_type
Batch Normalization layer type.
Defines the type of the layer (for example for type checks and debug prints). See aicore_layertype for more information about the layer type.