AIfES 2  2.0.0
aimath_cnn_f32_default.h File Reference

Math functions for F32 data type, CNN-specific implementation. More...


Functions

void aimath_f32_default_conv2d_add (const aitensor_t *input, const uint16_t stride[2], const uint16_t dilation[2], const int16_t padding[2][2], const aitensor_t *kernel, const void *bias, const uint8_t rotated_kernel, const int16_t *input_use_dims, const int16_t *output_use_dims, const int16_t *kernel_use_dims, aitensor_t *output)
 Performs 2D convolution on slices of 4D F32 tensors and adds an optional bias. More...
 
void aimath_f32_default_conv_transpose2d_add (const aitensor_t *input, const uint16_t stride[2], const uint16_t dilation[2], const int16_t padding[2][2], const aitensor_t *kernel, const void *bias, const uint8_t rotated_kernel, const int16_t *input_use_dims, const int16_t *output_use_dims, const int16_t *kernel_use_dims, aitensor_t *output)
 Performs 2D transposed convolution (or deconvolution) on slices of 4D F32 tensors and adds an optional bias. More...
 
void aimath_f32_default_conv2d_fwd (const aitensor_t *input, const uint16_t stride[2], const uint16_t dilation[2], const uint16_t padding[2], const aitensor_t *weights, const aitensor_t *bias, int8_t channel_axis, void *work_space, aitensor_t *output)
 Performs 2D convolutions with the given 4D F32 tensors and adds a bias (forward pass of the Conv2D layer) More...
 
void aimath_f32_default_conv2d_bwd (const aitensor_t *x_in, const uint16_t stride[2], const uint16_t dilation[2], const uint16_t padding[2], const aitensor_t *delta_out, int8_t channel_axis, void *work_space, aitensor_t *d_weights)
 Calculates the gradients of the Conv2D layer with respect to the weights in F32 data type. More...
 
void aimath_f32_default_conv2d_bwd_full (const aitensor_t *delta_out, const uint16_t stride[2], const uint16_t dilation[2], const uint16_t padding[2], const aitensor_t *weights, int8_t channel_axis, void *work_space, aitensor_t *delta_in)
 Calculates the gradients of the Conv2D layer with respect to the input in F32 data type. More...
 
void aimath_f32_default_conv_transpose2d_fwd (const aitensor_t *input, const uint16_t stride[2], const uint16_t dilation[2], const uint16_t padding[2], const uint16_t output_padding[2], const aitensor_t *weights, const aitensor_t *bias, int8_t channel_axis, void *work_space, aitensor_t *output)
 Performs 2D transposed convolutions with the given 4D F32 tensors and adds a bias (forward pass of the ConvTranspose2D layer) More...
 
void aimath_f32_default_maxpool2d_fwd (const aitensor_t *input, const uint16_t pool_size[2], const uint16_t stride[2], const uint16_t padding[2], int8_t channel_axis, void *work_space, uint32_t *max_locations, aitensor_t *output)
 2D max-pooling on 4D F32 tensors More...
 
void aimath_f32_default_maxpool2d_bwd (const aitensor_t *delta_out, const uint16_t pool_size[2], const uint16_t stride[2], const uint16_t padding[2], int8_t channel_axis, void *work_space, const uint32_t *max_locations, aitensor_t *delta_in)
 Calculates the gradients of the MaxPool2D layer with respect to the input in F32 data type. More...
 
void aimath_f32_default_batch_norm (const aitensor_t *x, int8_t axis, const aitensor_t *means, const aitensor_t *variances, const aitensor_t *offsets, const aitensor_t *scales, const void *eps, aitensor_t *result)
 Batch Normalization on F32 tensors. More...
 
void aimath_f32_default_d_batch_norm (const aitensor_t *x_in, int8_t axis, const aitensor_t *means, const aitensor_t *vars, const aitensor_t *betas, const aitensor_t *gammas, const aitensor_t *delta_out, const void *eps, aitensor_t *delta_in, aitensor_t *d_betas, aitensor_t *d_gammas)
 Calculates the gradients of Batch Normalization with respect to betas, gammas and the input in F32 data type. More...
 
void aimath_f32_default_pad_zeros (const aitensor_t *x, const uint16_t(*padding)[2], aitensor_t *result)
 Pads an F32 tensor with zeros. More...
 

Detailed Description

Math functions for F32 data type, CNN-specific implementation.

Version
2.2.0

These functions can be used when no hardware-specific implementation is available.

Function Documentation

◆ aimath_f32_default_batch_norm()

void aimath_f32_default_batch_norm (
    const aitensor_t *x,
    int8_t axis,
    const aitensor_t *means,
    const aitensor_t *variances,
    const aitensor_t *offsets,
    const aitensor_t *scales,
    const void *eps,
    aitensor_t *result
)

Batch Normalization on F32 tensors.

Performs the Batch Normalization operation (proposed by Ioffe and Szegedy, https://arxiv.org/abs/1502.03167):

\[ y_{i,j} = \mathit{BN}(x_{i,j}) = \gamma_i \cdot \frac{x_{i,j} - \mu_{i}}{\sqrt{\sigma_{i}^2+\epsilon}} + \beta_i \]

Parameters
    x          Input tensor (N-D)
    axis       Axis of the input tensor that stores the channel dimension.
    means      1D vector with the means ( \( \mu_i \)) of every channel.
    variances  1D vector with the variances ( \( \sigma^2_i \)) of every channel.
    offsets    1D vector with the offset parameters ( \( \beta_i \)) of every channel.
    scales     1D vector with the scaling parameters ( \( \gamma_i \)) of every channel.
    eps        Small constant for numerical stability.
    result     The resulting normalized tensor (N-D)
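
A minimal usage sketch (illustrative only): the tensor shapes are hypothetical, the aitensor_t objects are assumed to be created and filled elsewhere (e.g. by the calling layer), the include path depends on the project layout, and axis = -1 for channels last is an assumption carried over from the channel_axis convention of the other functions on this page.

    #include "aimath_cnn_f32_default.h"   /* adjust the path to your AIfES setup */

    /* Batch Normalization over the channel axis of a channels-last activation tensor. */
    aitensor_t x;          /* input, e.g. [N, H, W, C], assumed initialized elsewhere */
    aitensor_t means;      /* [C] channel means (mu_i) */
    aitensor_t variances;  /* [C] channel variances (sigma_i^2) */
    aitensor_t offsets;    /* [C] beta_i parameters */
    aitensor_t scales;     /* [C] gamma_i parameters */
    aitensor_t result;     /* same shape as x */

    float eps = 1e-6f;     /* passed via the const void *eps parameter */

    /* axis = -1: the last dimension holds the channels (channels last). */
    aimath_f32_default_batch_norm(&x, -1, &means, &variances, &offsets, &scales, &eps, &result);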

◆ aimath_f32_default_conv2d_add()

void aimath_f32_default_conv2d_add (
    const aitensor_t *input,
    const uint16_t stride[2],
    const uint16_t dilation[2],
    const int16_t padding[2][2],
    const aitensor_t *kernel,
    const void *bias,
    const uint8_t rotated_kernel,
    const int16_t *input_use_dims,
    const int16_t *output_use_dims,
    const int16_t *kernel_use_dims,
    aitensor_t *output
)

Performs 2D convolution on slices of 4D F32 tensors and adds an optional bias.

The function takes 2D slices out of 4D tensors and performs a 2D convolution. The result is then added to the output slice. This technique allows the use of one function for all use-cases (different order like HWC or CHW; different cases in forward and backward pass).

To configure the slices for the convolution, an index array has to be passed for input, kernel and output tensor. The dimension of H is indicated with -1 and W with -2.
Example: [1, 5, -1, -2] -> Perform a 2D convolution on the last two dimensions. The other two dimensions have indices 1 and 5.

The output dimensions of the height and width are given as:

\[ H_{out} = floor \left( \frac{H_{in} + P_{h,top} + P_{h,bottom} - D_h \times (H_{kernel} - 1) - 1}{S_h} \right) + 1 \]

\[ W_{out} = floor \left( \frac{W_{in} + P_{w,left} + P_{w,right} - D_w \times (W_{kernel} - 1) - 1}{S_w} \right) + 1 \]

Parameters
    input            Input tensor (4D)
    stride           Stride in the direction of height and width
    dilation         Dilation of the kernel tensor in the direction of height and width
    padding          Asymmetric zero padding in the direction of height and width \( [[P_{h,top},P_{h,bottom}],[P_{w,left},P_{w,right}]] \)
    kernel           Kernel / Weights tensor (4D)
    bias             Optional bias tensor (1D; set to null if not in use)
    rotated_kernel   Determines whether or not to rotate the kernel by 180°; default is TRUE (1)
    input_use_dims   Indices for the 2D slice of the input tensor (-1 for the height dimension and -2 for the width dimension)
    output_use_dims  Indices for the 2D slice of the output tensor (-1 for the height dimension and -2 for the width dimension)
    kernel_use_dims  Indices for the 2D slice of the kernel tensor (-1 for the height dimension and -2 for the width dimension)
    output           Output tensor to add the result to (4D)
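
A sketch of how the slice indices might be set up for one channels-first input/output channel pair. All shapes, index values and the null bias are illustrative assumptions; the tensors are assumed to be initialized elsewhere.

    /* Convolve the H/W plane of batch 0, input channel 0 with the H/W plane of
       output channel 0, input channel 0 of the kernel; add the result into the
       matching output slice. */
    aitensor_t input;   /* e.g. [N, C_in, H_in, W_in] */
    aitensor_t kernel;  /* e.g. [C_out, C_in, H_kernel, W_kernel] */
    aitensor_t output;  /* e.g. [N, C_out, H_out, W_out] */

    uint16_t stride[2]           = {1, 1};
    uint16_t dilation[2]         = {1, 1};
    const int16_t padding[2][2]  = {{0, 0}, {0, 0}};  /* [[top, bottom], [left, right]] */

    /* -1 marks the height dimension, -2 the width dimension; the other entries are fixed indices. */
    int16_t input_use_dims[4]  = {0, 0, -1, -2};
    int16_t kernel_use_dims[4] = {0, 0, -1, -2};
    int16_t output_use_dims[4] = {0, 0, -1, -2};

    /* bias = 0 (null): no bias is added; rotated_kernel = 1 rotates the kernel by 180 degrees.
       With a 5x5 input slice, 3x3 kernel, stride 1, dilation 1 and no padding, the output slice is 3x3. */
    aimath_f32_default_conv2d_add(&input, stride, dilation, padding, &kernel, 0, 1,
                                  input_use_dims, output_use_dims, kernel_use_dims, &output);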

◆ aimath_f32_default_conv2d_bwd()

void aimath_f32_default_conv2d_bwd (
    const aitensor_t *x_in,
    const uint16_t stride[2],
    const uint16_t dilation[2],
    const uint16_t padding[2],
    const aitensor_t *delta_out,
    int8_t channel_axis,
    void *work_space,
    aitensor_t *d_weights
)

Calculates the gradients of the Conv2D layer with respect to the weights in F32 data type.

Calculates the gradients with respect to the weights \( \partial w = \mathrm{d} L / \mathrm{d} w \).

\[ \partial w = x_{in} \ast delta_{out} \]

This function wraps the aimath_f32_default_conv2d_add() function to perform the backward pass of a Conv2D layer.

Parameters
    x_in          Input data with dimension \( [N,C_{in},H_{in},W_{in}] \) (channels first) or \( [N,H_{in},W_{in},C_{in}] \) (channels last)
    stride        The stride in the direction of height and width
    dilation      The dilation in the direction of height and width
    padding       The (symmetric) zero padding in the direction of height and width
    delta_out     Gradients backpropagated from the following layer with dimension \( [N,C_{out},H_{out},W_{out}] \) (channels first) or \( [N,H_{out},W_{out},C_{out}] \) (channels last)
    channel_axis  Index of the channel axis (1 for channels first and -1 or 3 for channels last).
    work_space    Pointer to a work space buffer for intermediate results (Not in use)
    d_weights     Output gradients of the weights with dimension \( [C_{out},C_{in},H_{kernel},W_{kernel}] \) (channels first) or \( [C_{out},H_{kernel},W_{kernel},C_{in}] \) (channels last)
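
A sketch of the weight-gradient computation for a hypothetical channels-first Conv2D layer; the shapes are assumptions and the tensors are expected to be managed by the surrounding training code.

    aitensor_t x_in;       /* forward-pass input, e.g. [1, 3, 8, 8] */
    aitensor_t delta_out;  /* gradients from the following layer, e.g. [1, 4, 8, 8] */
    aitensor_t d_weights;  /* resulting weight gradients, e.g. [4, 3, 3, 3] */

    uint16_t stride[2]   = {1, 1};
    uint16_t dilation[2] = {1, 1};
    uint16_t padding[2]  = {1, 1};

    /* channel_axis = 1 (channels first); the work space is not used by this implementation. */
    aimath_f32_default_conv2d_bwd(&x_in, stride, dilation, padding, &delta_out, 1, 0, &d_weights);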

◆ aimath_f32_default_conv2d_bwd_full()

void aimath_f32_default_conv2d_bwd_full (
    const aitensor_t *delta_out,
    const uint16_t stride[2],
    const uint16_t dilation[2],
    const uint16_t padding[2],
    const aitensor_t *weights,
    int8_t channel_axis,
    void *work_space,
    aitensor_t *delta_in
)

Calculates the gradients of the Conv2D layer with respect to the input in F32 data type.

Calculates the gradients with respect to the input \( delta_{in} = \mathrm{d} L / \mathrm{d} x_{in} \).

\[ delta_{in} = delta_{out} \ast' w \]

Here, \( \ast' \) denotes a transposed convolution.

This function wraps the aimath_f32_default_conv_transpose2d_add() function to perform the backward pass of a Conv2D layer.

Parameters
    delta_out     Gradients backpropagated from the following layer with dimension \( [N,C_{out},H_{out},W_{out}] \) (channels first) or \( [N,H_{out},W_{out},C_{out}] \) (channels last)
    stride        The stride in the direction of height and width
    dilation      The dilation in the direction of height and width
    padding       The (symmetric) zero padding in the direction of height and width
    weights       Convolution kernels with dimension \( [C_{out},C_{in},H_{kernel},W_{kernel}] \) (channels first) or \( [C_{out},H_{kernel},W_{kernel},C_{in}] \) (channels last)
    channel_axis  Index of the channel axis (1 for channels first and -1 or 3 for channels last).
    work_space    Pointer to a work space buffer for intermediate results.
    delta_in      Resulting input gradients for backpropagation to the previous layer with dimension \( [N,C_{in},H_{in},W_{in}] \) (channels first) or \( [N,H_{in},W_{in},C_{in}] \) (channels last)
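
A sketch for the matching input-gradient computation; the shapes are hypothetical and, since this page does not state the required work-space size, the buffer below is only a placeholder.

    aitensor_t delta_out;  /* gradients from the following layer, e.g. [1, 4, 8, 8] */
    aitensor_t weights;    /* convolution kernels, e.g. [4, 3, 3, 3] */
    aitensor_t delta_in;   /* resulting input gradients, e.g. [1, 3, 8, 8] */

    uint16_t stride[2]   = {1, 1};
    uint16_t dilation[2] = {1, 1};
    uint16_t padding[2]  = {1, 1};

    /* Placeholder work space; its required size is not specified here and must be
       chosen large enough for the intermediate results of the transposed convolution. */
    float work_space[256];

    aimath_f32_default_conv2d_bwd_full(&delta_out, stride, dilation, padding, &weights,
                                       1, work_space, &delta_in);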

◆ aimath_f32_default_conv2d_fwd()

void aimath_f32_default_conv2d_fwd (
    const aitensor_t *input,
    const uint16_t stride[2],
    const uint16_t dilation[2],
    const uint16_t padding[2],
    const aitensor_t *weights,
    const aitensor_t *bias,
    int8_t channel_axis,
    void *work_space,
    aitensor_t *output
)

Performs 2D convolutions with the given 4D F32 tensors and adds a bias (forward pass of the Conv2D layer)

\[ x_{out} = x_{in} \ast w + b \]

This function wraps the aimath_f32_default_conv2d_add() function to perform the forward pass of a Conv2D layer.

Parameters
    input         Input ( \( x_{in} \)) data with dimension \( [N,C_{in},H_{in},W_{in}] \) (channels first) or \( [N,H_{in},W_{in},C_{in}] \) (channels last)
    stride        The stride in the direction of height and width
    dilation      The dilation in the direction of height and width
    padding       The (symmetric) zero padding in the direction of height and width
    weights       Convolution kernels with dimension \( [C_{out},C_{in},H_{kernel},W_{kernel}] \) (channels first) or \( [C_{out},H_{kernel},W_{kernel},C_{in}] \) (channels last)
    bias          Bias with dimension \( C_{out} \)
    channel_axis  Index of the channel axis (1 for channels first and -1 or 3 for channels last).
    work_space    Pointer to a work space buffer for intermediate results (Not in use)
    output        Output ( \( x_{out} \)) after convolution with dimension \( [N,C_{out},H_{out},W_{out}] \) (channels first) or \( [N,H_{out},W_{out},C_{out}] \) (channels last)
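
A forward-pass sketch with hypothetical shapes; the output size below is derived from the output-size relation given for aimath_f32_default_conv2d_add() above, assuming symmetric padding.

    /* 3x3 convolution, channels first. With H_in = W_in = 8, padding 1, stride 1, dilation 1:
       H_out = floor((8 + 1 + 1 - 1*(3 - 1) - 1) / 1) + 1 = 8, so the spatial size is preserved. */
    aitensor_t input;    /* e.g. [1, 3, 8, 8] */
    aitensor_t weights;  /* e.g. [4, 3, 3, 3] */
    aitensor_t bias;     /* e.g. [4] */
    aitensor_t output;   /* e.g. [1, 4, 8, 8] */

    uint16_t stride[2]   = {1, 1};
    uint16_t dilation[2] = {1, 1};
    uint16_t padding[2]  = {1, 1};

    /* channel_axis = 1 (channels first); the work space is not used by this implementation. */
    aimath_f32_default_conv2d_fwd(&input, stride, dilation, padding, &weights, &bias, 1, 0, &output);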

◆ aimath_f32_default_conv_transpose2d_add()

void aimath_f32_default_conv_transpose2d_add (
    const aitensor_t *input,
    const uint16_t stride[2],
    const uint16_t dilation[2],
    const int16_t padding[2][2],
    const aitensor_t *kernel,
    const void *bias,
    const uint8_t rotated_kernel,
    const int16_t *input_use_dims,
    const int16_t *output_use_dims,
    const int16_t *kernel_use_dims,
    aitensor_t *output
)

Performs 2D transposed convolution (or deconvolution) on slices of 4D F32 tensors and adds an optional bias.

The function takes 2D slices out of 4D tensors and performs a 2D transposed convolution. The result is then added to the output slice. This technique allows the use of one function for all use-cases (different order like HWC or CHW; different cases in forward and backward pass).

To configure the slices for the transposed convolution, an index array has to be passed for input, kernel and output tensor. The dimension of H is indicated with -1 and W with -2.
Example: [1, 5, -1, -2] -> Perform a 2D transposed convolution on the last two dimensions. The other two dimensions have indices 1 and 5.

The output dimensions of the height and width are given as:

\[ H_{out} = S_h \times (H_{in}-1)+1 + P_{h,top} + P_{h,bottom} - D_h \times (H_{kernel} - 1) \]

\[ W_{out} = S_w \times (W_{in}-1)+1 + P_{w,left} + P_{w,right} - D_w \times (W_{kernel} - 1) \]

Parameters
    input            Input tensor (4D)
    stride           Dilation of the input tensor in the direction of height and width
    dilation         Dilation of the kernel tensor in the direction of height and width
    padding          Asymmetric zero padding in the direction of height and width \( [[P_{h,top},P_{h,bottom}],[P_{w,left},P_{w,right}]] \)
    kernel           Kernel / Weights tensor (4D)
    bias             Optional bias tensor (1D; set to null if not in use)
    rotated_kernel   Determines whether or not to rotate the kernel by 180°; default is TRUE (1)
    input_use_dims   Indices for the 2D slice of the input tensor (-1 for the height dimension and -2 for the width dimension)
    output_use_dims  Indices for the 2D slice of the output tensor (-1 for the height dimension and -2 for the width dimension)
    kernel_use_dims  Indices for the 2D slice of the kernel tensor (-1 for the height dimension and -2 for the width dimension)
    output           Output tensor to add the result to (4D)
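
A sketch with hypothetical shapes showing the upsampling effect of the stride; the slice indices follow the same -1/-2 convention as above, and the tensors are assumed to be initialized elsewhere.

    /* With a 4x4 input slice, stride 2, a 3x3 kernel, dilation 1 and no padding, the
       formulas above give H_out = 2*(4 - 1) + 1 - 1*(3 - 1) = 5 (and likewise W_out = 5). */
    aitensor_t input;   /* 4D input tensor, H/W slice e.g. 4x4 */
    aitensor_t kernel;  /* 4D kernel tensor, H/W slice e.g. 3x3 */
    aitensor_t output;  /* 4D output tensor, H/W slice 5x5 for these values */

    uint16_t stride[2]           = {2, 2};   /* acts as a dilation of the input (upsampling) */
    uint16_t dilation[2]         = {1, 1};
    const int16_t padding[2][2]  = {{0, 0}, {0, 0}};

    int16_t input_use_dims[4]  = {0, 0, -1, -2};
    int16_t kernel_use_dims[4] = {0, 0, -1, -2};
    int16_t output_use_dims[4] = {0, 0, -1, -2};

    /* bias = 0 (null): no bias; rotated_kernel = 1 rotates the kernel by 180 degrees. */
    aimath_f32_default_conv_transpose2d_add(&input, stride, dilation, padding, &kernel, 0, 1,
                                            input_use_dims, output_use_dims, kernel_use_dims, &output);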

◆ aimath_f32_default_conv_transpose2d_fwd()

void aimath_f32_default_conv_transpose2d_fwd (
    const aitensor_t *input,
    const uint16_t stride[2],
    const uint16_t dilation[2],
    const uint16_t padding[2],
    const uint16_t output_padding[2],
    const aitensor_t *weights,
    const aitensor_t *bias,
    int8_t channel_axis,
    void *work_space,
    aitensor_t *output
)

Performs 2D transposed convolutions with the given 4D F32 tensors and adds a bias (forward pass of the ConvTranspose2D layer)

\[ x_{out} = x_{in} \ast' w + b \]

Here, \( \ast' \) denotes a transposed convolution.

This function wraps the aimath_f32_default_conv_transpose2d_add() function to perform the forward pass of a ConvTranspose2D layer.

Parameters
    input           Input ( \( x_{in} \)) data with dimension \( [N,C_{in},H_{in},W_{in}] \) (channels first) or \( [N,H_{in},W_{in},C_{in}] \) (channels last)
    stride          Dilation of the input in the direction of height and width
    dilation        Dilation of the kernel in the direction of height and width
    padding         The (symmetric) zero padding in the direction of height and width
    output_padding  Additional asymmetric zero padding on one side in the direction of height and width
    weights         Convolution kernels with dimension \( [C_{out},C_{in},H_{kernel},W_{kernel}] \) (channels first) or \( [C_{out},H_{kernel},W_{kernel},C_{in}] \) (channels last)
    bias            Bias with dimension \( C_{out} \)
    channel_axis    Index of the channel axis (1 for channels first and -1 or 3 for channels last).
    work_space      Pointer to a work space buffer for intermediate results (Not in use)
    output          Output ( \( x_{out} \)) after convolution with dimension \( [N,C_{out},H_{out},W_{out}] \) (channels first) or \( [N,H_{out},W_{out},C_{out}] \) (channels last)
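
A forward-pass sketch for a hypothetical channels-first ConvTranspose2D layer; the shapes are assumptions, and the exact output size (which depends on stride, padding and output_padding) is not recomputed here.

    aitensor_t input;    /* e.g. [1, 4, 4, 4]  ([N, C_in, H_in, W_in]) */
    aitensor_t weights;  /* e.g. [2, 4, 3, 3]  ([C_out, C_in, H_kernel, W_kernel]) */
    aitensor_t bias;     /* [C_out], e.g. [2] */
    aitensor_t output;   /* [N, C_out, H_out, W_out], sized according to the chosen parameters */

    uint16_t stride[2]         = {2, 2};  /* acts as a dilation of the input */
    uint16_t dilation[2]       = {1, 1};
    uint16_t padding[2]        = {0, 0};
    uint16_t output_padding[2] = {0, 0};

    /* channel_axis = 1 (channels first); the work space is not used by this implementation. */
    aimath_f32_default_conv_transpose2d_fwd(&input, stride, dilation, padding, output_padding,
                                            &weights, &bias, 1, 0, &output);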

◆ aimath_f32_default_d_batch_norm()

void aimath_f32_default_d_batch_norm (
    const aitensor_t *x_in,
    int8_t axis,
    const aitensor_t *means,
    const aitensor_t *vars,
    const aitensor_t *betas,
    const aitensor_t *gammas,
    const aitensor_t *delta_out,
    const void *eps,
    aitensor_t *delta_in,
    aitensor_t *d_betas,
    aitensor_t *d_gammas
)

Calculates the gradients of Batch Normalization with respect to betas, gammas and the input in F32 data type.

Calculates the derivative of the Batch Normalization with respect to the input and the trainable parameters ( \( \beta \) and \( \gamma \)).
Please refer to the paper by Ioffe and Szegedy (https://arxiv.org/abs/1502.03167) for the equations of the gradients.

Parameters
    x_in       Input tensor (N-D)
    axis       Axis of the input tensor that stores the channel dimension.
    means      1D vector with the means ( \( \mu_i \)) of every channel.
    vars       1D vector with the variances ( \( \sigma^2_i \)) of every channel.
    betas      1D vector with the offset parameters ( \( \beta_i \)) of every channel.
    gammas     1D vector with the scaling parameters ( \( \gamma_i \)) of every channel.
    delta_out  Gradient calculated by the output layer for gradient backpropagation (N-D)
    eps        Small constant for numerical stability.
    delta_in   The resulting gradients of the input ( \( \mathrm{d}\mathcal{L} / \mathrm{d}x \)).
    d_betas    The resulting gradients of the \( \beta \) parameter ( \( \mathrm{d}\mathcal{L} / \mathrm{d}\beta \)).
    d_gammas   The resulting gradients of the \( \gamma \) parameter ( \( \mathrm{d}\mathcal{L} / \mathrm{d}\gamma \)).
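
A backward-pass sketch matching the Batch Normalization example above; the shapes are hypothetical, and axis = -1 (channels last) is an assumption.

    aitensor_t x_in;       /* forward-pass input, e.g. [N, H, W, C] */
    aitensor_t means;      /* [C] */
    aitensor_t vars;       /* [C] */
    aitensor_t betas;      /* [C] */
    aitensor_t gammas;     /* [C] */
    aitensor_t delta_out;  /* gradients from the following layer, same shape as x_in */
    aitensor_t delta_in;   /* resulting input gradients, same shape as x_in */
    aitensor_t d_betas;    /* resulting beta gradients, [C] */
    aitensor_t d_gammas;   /* resulting gamma gradients, [C] */

    float eps = 1e-6f;

    aimath_f32_default_d_batch_norm(&x_in, -1, &means, &vars, &betas, &gammas,
                                    &delta_out, &eps, &delta_in, &d_betas, &d_gammas);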

◆ aimath_f32_default_maxpool2d_bwd()

void aimath_f32_default_maxpool2d_bwd (
    const aitensor_t *delta_out,
    const uint16_t pool_size[2],
    const uint16_t stride[2],
    const uint16_t padding[2],
    int8_t channel_axis,
    void *work_space,
    const uint32_t *max_locations,
    aitensor_t *delta_in
)

Calculates the gradients of the MaxPool2D layer with respect to the input in F32 data type.

Calculates the gradients with respect to the input \( delta_{in} = \mathrm{d} L / \mathrm{d} x_{in} \).

This is done by copying each output gradient to the position in the input gradients indicated by max_locations.
An element of max_locations consists of the concatenated 16-bit indices for height and width within the pooling window.

Parameters
    delta_out      Gradients backpropagated from the following layer with dimension \( [N,C,H_{out},W_{out}] \) (channels first) or \( [N,H_{out},W_{out},C] \) (channels last)
    pool_size      The size of the pooling window (height and width)
    stride         The stride in the direction of height and width.
    padding        The (symmetric) minus infinity padding in the direction of height and width
    channel_axis   Index of the channel axis (1 for channels first and -1 or 3 for channels last).
    work_space     Pointer to a work space buffer for intermediate results.
    max_locations  Pointer to memory section where the indices of the maximum values per pooling window are stored.
    delta_in       Resulting input gradients for backpropagation to the previous layer \( [N,C,H_{in},W_{in}] \) (channels first) or \( [N,H_{in},W_{in},C] \) (channels last)
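
A sketch that routes the output gradients back to the recorded maximum positions; the shapes match the forward-pass sketch further down and are assumptions, and max_locations is assumed to have been filled during the forward pass.

    aitensor_t delta_out;      /* e.g. [1, 1, 2, 2], gradients from the following layer */
    aitensor_t delta_in;       /* e.g. [1, 1, 4, 4], resulting input gradients */
    uint32_t max_locations[4]; /* filled by aimath_f32_default_maxpool2d_fwd() in the forward pass */

    uint16_t pool_size[2] = {2, 2};
    uint16_t stride[2]    = {2, 2};
    uint16_t padding[2]   = {0, 0};

    /* channel_axis = 1 (channels first). */
    aimath_f32_default_maxpool2d_bwd(&delta_out, pool_size, stride, padding, 1, 0,
                                     max_locations, &delta_in);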

◆ aimath_f32_default_maxpool2d_fwd()

void aimath_f32_default_maxpool2d_fwd (
    const aitensor_t *input,
    const uint16_t pool_size[2],
    const uint16_t stride[2],
    const uint16_t padding[2],
    int8_t channel_axis,
    void *work_space,
    uint32_t *max_locations,
    aitensor_t *output
)

2D max-pooling on 4D F32 tensors

Performs a 2D max-pooling operation on 2D slices of a 4D input tensor. This function is used as the forward pass of the MaxPool2D layer.

For training (max_locations != 0), the index of the maximum value within each pooling window is stored in max_locations.
An element of max_locations consists of the concatenated 16-bit indices for height and width within the pooling window.

The output dimensions of the height and width are given as:

\[ H_{out} = floor \left( \frac{H_{in} + 2 * P_h - H_{pool}}{S_h} \right) + 1 \]

\[ W_{out} = floor \left( \frac{W_{in} + 2 * P_w - W_{pool}}{S_w} \right) + 1 \]

Parameters
    input          Input data with dimension \( [N,C,H_{in},W_{in}] \) (channels first) or \( [N,H_{in},W_{in},C] \) (channels last)
    pool_size      The size of the pooling window (height and width)
    stride         The stride in the direction of height and width.
    padding        The (symmetric) minus infinity padding in the direction of height and width
    channel_axis   Index of the channel axis (1 for channels first and -1 or 3 for channels last).
    work_space     Pointer to a work space buffer for intermediate results.
    max_locations  Pointer to memory section where the indices of the maximum values per pooling window are stored.
    output         Output after max-pooling with dimension \( [N,C,H_{out},W_{out}] \) (channels first) or \( [N,H_{out},W_{out},C] \) (channels last)
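
A forward-pass sketch with hypothetical shapes; the output size follows from the formulas above, and the max_locations buffer is assumed to hold one entry per output element.

    /* 2x2 max-pooling with stride 2, channels first.
       With H_in = W_in = 4 and no padding: H_out = floor((4 + 2*0 - 2) / 2) + 1 = 2. */
    aitensor_t input;   /* e.g. [1, 1, 4, 4] */
    aitensor_t output;  /* e.g. [1, 1, 2, 2] */

    uint16_t pool_size[2] = {2, 2};
    uint16_t stride[2]    = {2, 2};
    uint16_t padding[2]   = {0, 0};

    /* One entry per output element; pass 0 instead when the locations are not needed (inference). */
    uint32_t max_locations[1 * 1 * 2 * 2];

    aimath_f32_default_maxpool2d_fwd(&input, pool_size, stride, padding, 1, 0,
                                     max_locations, &output);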

◆ aimath_f32_default_pad_zeros()

void aimath_f32_default_pad_zeros (
    const aitensor_t *x,
    const uint16_t (*padding)[2],
    aitensor_t *result
)

Pads an F32 tensor with zeros.

Parameters
    x        Input F32 tensor (N-D)
    padding  Array of the asymmetric zero paddings for each dimension of the input tensor
    result   Resulting padded F32 tensor (N-D)
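
A sketch with hypothetical shapes; one [before, after] pair is given per dimension of the input tensor, and both tensors are assumed to be initialized elsewhere.

    aitensor_t x;       /* e.g. [4, 4] */
    aitensor_t result;  /* [1 + 4 + 2, 0 + 4 + 3] = [7, 7] */

    /* padding[d] = {before, after} for dimension d of x. */
    const uint16_t padding[2][2] = {{1, 2}, {0, 3}};

    aimath_f32_default_pad_zeros(&x, padding, &result);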