AIfES 2  2.0.0
aimath_cnn_f32_default.h File Reference

Math functions for F32 data type, CNN-specific implementation. More...


Functions

void aimath_f32_default_conv2d_add (const aitensor_t *input, const uint16_t stride[2], const uint16_t dilation[2], const int16_t padding[2][2], const aitensor_t *kernel, const void *bias, const uint8_t rotated_kernel, const int16_t *input_use_dims, const int16_t *output_use_dims, const int16_t *kernel_use_dims, aitensor_t *output)
 Performs 2D convolution on slices of 4D F32 tensors and adds an optional bias. More...
 
void aimath_f32_default_conv_transpose2d_add (const aitensor_t *input, const uint16_t stride[2], const uint16_t dilation[2], const int16_t padding[2][2], const aitensor_t *kernel, const void *bias, const uint8_t rotated_kernel, const int16_t *input_use_dims, const int16_t *output_use_dims, const int16_t *kernel_use_dims, aitensor_t *output)
 Performs 2D transposed convolution (or deconvolution) on slices of 4D F32 tensors and adds an optional bias. More...
 
void aimath_f32_default_conv2d_fwd (const aitensor_t *input, const uint16_t stride[2], const uint16_t dilation[2], const uint16_t padding[2], const aitensor_t *weights, const aitensor_t *bias, int8_t channel_axis, void *work_space, aitensor_t *output)
 Performs 2D convolutions with the given 4D F32 tensors and adds a bias (forward pass of the Conv2D layer) More...
 
void aimath_f32_default_conv2d_bwd (const aitensor_t *x_in, const uint16_t stride[2], const uint16_t dilation[2], const uint16_t padding[2], const aitensor_t *delta_out, int8_t channel_axis, void *work_space, aitensor_t *d_weights)
 Calculates the gradients of the Conv2D layer with respect to the weights in F32 data type. More...
 
void aimath_f32_default_conv2d_bwd_full (const aitensor_t *delta_out, const uint16_t stride[2], const uint16_t dilation[2], const uint16_t padding[2], const aitensor_t *weights, int8_t channel_axis, void *work_space, aitensor_t *delta_in)
 Calculates the gradients of the Conv2D layer with respect to the input in F32 data type. More...
 
void aimath_f32_default_conv_transpose2d_fwd (const aitensor_t *input, const uint16_t stride[2], const uint16_t dilation[2], const uint16_t padding[2], const uint16_t output_padding[2], const aitensor_t *weights, const aitensor_t *bias, int8_t channel_axis, void *work_space, aitensor_t *output)
 Performs 2D transposed convolutions with the given 4D F32 tensors and adds a bias (forward pass of the ConvTranspose2D layer) More...
 
void aimath_f32_default_maxpool2d_fwd (const aitensor_t *input, const uint16_t pool_size[2], const uint16_t stride[2], const uint16_t padding[2], int8_t channel_axis, void *work_space, uint32_t *max_locations, aitensor_t *output)
 2D max-pooling on 4D F32 tensors More...
 
void aimath_f32_default_maxpool2d_bwd (const aitensor_t *delta_out, const uint16_t pool_size[2], const uint16_t stride[2], const uint16_t padding[2], int8_t channel_axis, void *work_space, const uint32_t *max_locations, aitensor_t *delta_in)
 Calculates the gradients of the MaxPool2D layer with respect to the input in F32 data type. More...
 
void aimath_f32_default_batch_norm (const aitensor_t *x, int8_t axis, const aitensor_t *means, const aitensor_t *variances, const aitensor_t *offsets, const aitensor_t *scales, const void *eps, aitensor_t *result)
 Batch Normalization on F32 tensors. More...
 
void aimath_f32_default_d_batch_norm (const aitensor_t *x_in, int8_t axis, const aitensor_t *means, const aitensor_t *vars, const aitensor_t *betas, const aitensor_t *gammas, const aitensor_t *delta_out, const void *eps, aitensor_t *delta_in, aitensor_t *d_betas, aitensor_t *d_gammas)
 Calculates the gradients of Batch Normalization with respect to betas, gammas and the input in F32 data type. More...
 
void aimath_f32_default_pad_zeros (const aitensor_t *x, const uint16_t(*padding)[2], aitensor_t *result)
 Pads an F32 tensor with zeros. More...
 

Detailed Description

Math functions for F32 data type, CNN-specific implementation.

Version
2.2.0

These functions can be used when no hardware-specific implementation is available.

Function Documentation

◆ aimath_f32_default_batch_norm()

void aimath_f32_default_batch_norm (
    const aitensor_t *x,
    int8_t axis,
    const aitensor_t *means,
    const aitensor_t *variances,
    const aitensor_t *offsets,
    const aitensor_t *scales,
    const void *eps,
    aitensor_t *result
)

Batch Normalization on F32 tensors.

Performs the Batch Normalization operation (proposed by Ioffe and Szegedy, https://arxiv.org/abs/1502.03167):

\[ y_{i,j} = \mathit{BN}(x_{i,j}) = \gamma_i \cdot \frac{x_{i,j} - \mu_{i}}{\sqrt{\sigma_{i}^2+\epsilon}} + \beta_i \]

Parameters
    x          Input tensor (N-D)
    axis       Axis of the input tensor that stores the channel dimension.
    means      1D vector with the means ( \( \mu_i \)) of every channel.
    variances  1D vector with the variances ( \( \sigma^2_i \)) of every channel.
    offsets    1D vector with the offset parameters ( \( \beta_i \)) of every channel.
    scales     1D vector with the scaling parameters ( \( \gamma_i \)) of every channel.
    eps        Small constant for numerical stability.
    result     The resulting normalized tensor (N-D)
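
A minimal usage sketch (illustrative only): the tensor shapes are hypothetical, the aitensor_t objects are assumed to be created and filled elsewhere (e.g. by the calling layer), the include path depends on the project layout, and axis = -1 for channels last is an assumption carried over from the channel_axis convention of the other functions on this page.

    #include "aimath_cnn_f32_default.h"   /* adjust the path to your AIfES setup */

    /* Batch Normalization over the channel axis of a channels-last activation tensor. */
    aitensor_t x;          /* input, e.g. [N, H, W, C], assumed initialized elsewhere */
    aitensor_t means;      /* [C] channel means (mu_i) */
    aitensor_t variances;  /* [C] channel variances (sigma_i^2) */
    aitensor_t offsets;    /* [C] beta_i parameters */
    aitensor_t scales;     /* [C] gamma_i parameters */
    aitensor_t result;     /* same shape as x */

    float eps = 1e-6f;     /* passed via the const void *eps parameter */

    /* axis = -1: the last dimension holds the channels (channels last). */
    aimath_f32_default_batch_norm(&x, -1, &means, &variances, &offsets, &scales, &eps, &result);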

◆ aimath_f32_default_conv2d_add()

void aimath_f32_default_conv2d_add (
    const aitensor_t *input,
    const uint16_t stride[2],
    const uint16_t dilation[2],
    const int16_t padding[2][2],
    const aitensor_t *kernel,
    const void *bias,
    const uint8_t rotated_kernel,
    const int16_t *input_use_dims,
    const int16_t *output_use_dims,
    const int16_t *kernel_use_dims,
    aitensor_t *output
)

Performs 2D convolution on slices of 4D F32 tensors and adds an optional bias.

The function takes 2D slices out of 4D tensors and performs a 2D convolution. The result is then added to the output slice. This technique allows the use of one function for all use-cases (different order like HWC or CHW; different cases in forward and backward pass).

To configure the slices for the convolution, an index array has to be passed for input, kernel and output tensor. The dimension of H is indicated with -1 and W with -2.
Example: [1, 5, -1, -2] -> Perform a 2D convolution on the last two dimensions. The other two dimensions have indices 1 and 5.

The output dimensions of the height and width are given as:

\[ H_{out} = floor \left( \frac{H_{in} + P_{h,top} + P_{h,bottom} - D_h \times (H_{kernel} - 1) - 1}{S_h} \right) + 1 \]

\[ W_{out} = floor \left( \frac{W_{in} + P_{w,left} + P_{w,right} - D_w \times (W_{kernel} - 1) - 1}{S_w} \right) + 1 \]

Parameters
    input            Input tensor (4D)
    stride           Stride in the direction of height and width
    dilation         Dilation of the kernel tensor in the direction of height and width
    padding          Asymmetric zero padding in the direction of height and width \( [[P_{h,top},P_{h,bottom}],[P_{w,left},P_{w,right}]] \)
    kernel           Kernel / Weights tensor (4D)
    bias             Optional bias tensor (1D; set to null if not in use)
    rotated_kernel   Determines whether or not to rotate the kernel by 180°; default is TRUE (1)
    input_use_dims   Indices for the 2D slice of the input tensor (-1 for the height dimension and -2 for the width dimension)
    output_use_dims  Indices for the 2D slice of the output tensor (-1 for the height dimension and -2 for the width dimension)
    kernel_use_dims  Indices for the 2D slice of the kernel tensor (-1 for the height dimension and -2 for the width dimension)
    output           Output tensor to add the result to (4D)
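
A sketch of how the slice indices might be set up for one channels-first input/output channel pair. All shapes, index values and the null bias are illustrative assumptions; the tensors are assumed to be initialized elsewhere.

    /* Convolve the H/W plane of batch 0, input channel 0 with the H/W plane of
       output channel 0, input channel 0 of the kernel; add the result into the
       matching output slice. */
    aitensor_t input;   /* e.g. [N, C_in, H_in, W_in] */
    aitensor_t kernel;  /* e.g. [C_out, C_in, H_kernel, W_kernel] */
    aitensor_t output;  /* e.g. [N, C_out, H_out, W_out] */

    uint16_t stride[2]           = {1, 1};
    uint16_t dilation[2]         = {1, 1};
    const int16_t padding[2][2]  = {{0, 0}, {0, 0}};  /* [[top, bottom], [left, right]] */

    /* -1 marks the height dimension, -2 the width dimension; the other entries are fixed indices. */
    int16_t input_use_dims[4]  = {0, 0, -1, -2};
    int16_t kernel_use_dims[4] = {0, 0, -1, -2};
    int16_t output_use_dims[4] = {0, 0, -1, -2};

    /* bias = 0 (null): no bias is added; rotated_kernel = 1 rotates the kernel by 180 degrees.
       With a 5x5 input slice, 3x3 kernel, stride 1, dilation 1 and no padding, the output slice is 3x3. */
    aimath_f32_default_conv2d_add(&input, stride, dilation, padding, &kernel, 0, 1,
                                  input_use_dims, output_use_dims, kernel_use_dims, &output);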

◆ aimath_f32_default_conv2d_bwd()

void aimath_f32_default_conv2d_bwd (
    const aitensor_t *x_in,
    const uint16_t stride[2],
    const uint16_t dilation[2],
    const uint16_t padding[2],
    const aitensor_t *delta_out,
    int8_t channel_axis,
    void *work_space,
    aitensor_t *d_weights
)

Calculates the gradients of the Conv2D layer with respect to the weights in F32 data type.

Calculates the gradients with respect to the weights \( \partial w = \mathrm{d} L / \mathrm{d} w \).

\[ \partial w = x_{in} \ast delta_{out} \]

This function wraps the aimath_f32_default_conv2d_add() function to perform the backward pass of a Conv2D layer.

Parameters
    x_in          Input data with dimension \( [N,C_{in},H_{in},W_{in}] \) (channels first) or \( [N,H_{in},W_{in},C_{in}] \) (channels last)
    stride        The stride in the direction of height and width
    dilation      The dilation in the direction of height and width
    padding       The (symmetric) zero padding in the direction of height and width
    delta_out     Gradients backpropagated from the following layer with dimension \( [N,C_{out},H_{out},W_{out}] \) (channels first) or \( [N,H_{out},W_{out},C_{out}] \) (channels last)
    channel_axis  Index of the channel axis (1 for channels first and -1 or 3 for channels last).
    work_space    Pointer to a work space buffer for intermediate results (Not in use)
    d_weights     Output gradients of the weights with dimension \( [C_{out},C_{in},H_{kernel},W_{kernel}] \) (channels first) or \( [C_{out},H_{kernel},W_{kernel},C_{in}] \) (channels last)
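
A sketch of the weight-gradient computation for a hypothetical channels-first Conv2D layer; the shapes are assumptions and the tensors are expected to be managed by the surrounding training code.

    aitensor_t x_in;       /* forward-pass input, e.g. [1, 3, 8, 8] */
    aitensor_t delta_out;  /* gradients from the following layer, e.g. [1, 4, 8, 8] */
    aitensor_t d_weights;  /* resulting weight gradients, e.g. [4, 3, 3, 3] */

    uint16_t stride[2]   = {1, 1};
    uint16_t dilation[2] = {1, 1};
    uint16_t padding[2]  = {1, 1};

    /* channel_axis = 1 (channels first); the work space is not used by this implementation. */
    aimath_f32_default_conv2d_bwd(&x_in, stride, dilation, padding, &delta_out, 1, 0, &d_weights);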

◆ aimath_f32_default_conv2d_bwd_full()

void aimath_f32_default_conv2d_bwd_full (
    const aitensor_t *delta_out,
    const uint16_t stride[2],
    const uint16_t dilation[2],
    const uint16_t padding[2],
    const aitensor_t *weights,
    int8_t channel_axis,
    void *work_space,
    aitensor_t *delta_in
)

Calculates the gradients of the Conv2D layer with respect to the input in F32 data type.

Calculates the gradients with respect to the input \( delta_{in} = \mathrm{d} L / \mathrm{d} x_{in} \).

\[ delta_{in} = delta_{out} \ast' w \]

Here, \( \ast' \) denotes a transposed convolution.

This function wraps the aimath_f32_default_conv_transpose2d_add() function to perform the backward pass of a Conv2D layer.

Parameters
    delta_out     Gradients backpropagated from the following layer with dimension \( [N,C_{out},H_{out},W_{out}] \) (channels first) or \( [N,H_{out},W_{out},C_{out}] \) (channels last)
    stride        The stride in the direction of height and width
    dilation      The dilation in the direction of height and width
    padding       The (symmetric) zero padding in the direction of height and width
    weights       Convolution kernels with dimension \( [C_{out},C_{in},H_{kernel},W_{kernel}] \) (channels first) or \( [C_{out},H_{kernel},W_{kernel},C_{in}] \) (channels last)
    channel_axis  Index of the channel axis (1 for channels first and -1 or 3 for channels last).
    work_space    Pointer to a work space buffer for intermediate results.
    delta_in      Resulting input gradients for backpropagation to the previous layer with dimension \( [N,C_{in},H_{in},W_{in}] \) (channels first) or \( [N,H_{in},W_{in},C_{in}] \) (channels last)
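
A sketch for the matching input-gradient computation; the shapes are hypothetical and, since this page does not state the required work-space size, the buffer below is only a placeholder.

    aitensor_t delta_out;  /* gradients from the following layer, e.g. [1, 4, 8, 8] */
    aitensor_t weights;    /* convolution kernels, e.g. [4, 3, 3, 3] */
    aitensor_t delta_in;   /* resulting input gradients, e.g. [1, 3, 8, 8] */

    uint16_t stride[2]   = {1, 1};
    uint16_t dilation[2] = {1, 1};
    uint16_t padding[2]  = {1, 1};

    /* Placeholder work space; its required size is not specified here and must be
       chosen large enough for the intermediate results of the transposed convolution. */
    float work_space[256];

    aimath_f32_default_conv2d_bwd_full(&delta_out, stride, dilation, padding, &weights,
                                       1, work_space, &delta_in);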

◆ aimath_f32_default_conv2d_fwd()

void aimath_f32_default_conv2d_fwd (
    const aitensor_t *input,
    const uint16_t stride[2],
    const uint16_t dilation[2],
    const uint16_t padding[2],
    const aitensor_t *weights,
    const aitensor_t *bias,
    int8_t channel_axis,
    void *work_space,
    aitensor_t *output
)

Performs 2D convolutions with the given 4D F32 tensors and adds a bias (forward pass of the Conv2D layer)

\[ x_{out} = x_{in} \ast w + b \]

This function wraps the aimath_f32_default_conv2d_add() function to perform the forward pass of a Conv2D layer.

Parameters
    input         Input ( \( x_{in} \)) data with dimension \( [N,C_{in},H_{in},W_{in}] \) (channels first) or \( [N,H_{in},W_{in},C_{in}] \) (channels last)
    stride        The stride in the direction of height and width
    dilation      The dilation in the direction of height and width
    padding       The (symmetric) zero padding in the direction of height and width
    weights       Convolution kernels with dimension \( [C_{out},C_{in},H_{kernel},W_{kernel}] \) (channels first) or \( [C_{out},H_{kernel},W_{kernel},C_{in}] \) (channels last)
    bias          Bias with dimension \( C_{out} \)
    channel_axis  Index of the channel axis (1 for channels first and -1 or 3 for channels last).
    work_space    Pointer to a work space buffer for intermediate results (Not in use)
    output        Output ( \( x_{out} \)) after convolution with dimension \( [N,C_{out},H_{out},W_{out}] \) (channels first) or \( [N,H_{out},W_{out},C_{out}] \) (channels last)
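
A forward-pass sketch with hypothetical shapes; the output size below is derived from the output-size relation given for aimath_f32_default_conv2d_add() above, assuming symmetric padding.

    /* 3x3 convolution, channels first. With H_in = W_in = 8, padding 1, stride 1, dilation 1:
       H_out = floor((8 + 1 + 1 - 1*(3 - 1) - 1) / 1) + 1 = 8, so the spatial size is preserved. */
    aitensor_t input;    /* e.g. [1, 3, 8, 8] */
    aitensor_t weights;  /* e.g. [4, 3, 3, 3] */
    aitensor_t bias;     /* e.g. [4] */
    aitensor_t output;   /* e.g. [1, 4, 8, 8] */

    uint16_t stride[2]   = {1, 1};
    uint16_t dilation[2] = {1, 1};
    uint16_t padding[2]  = {1, 1};

    /* channel_axis = 1 (channels first); the work space is not used by this implementation. */
    aimath_f32_default_conv2d_fwd(&input, stride, dilation, padding, &weights, &bias, 1, 0, &output);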

◆ aimath_f32_default_conv_transpose2d_add()

void aimath_f32_default_conv_transpose2d_add (
    const aitensor_t *input,
    const uint16_t stride[2],
    const uint16_t dilation[2],
    const int16_t padding[2][2],
    const aitensor_t *kernel,
    const void *bias,
    const uint8_t rotated_kernel,
    const int16_t *input_use_dims,
    const int16_t *output_use_dims,
    const int16_t *kernel_use_dims,
    aitensor_t *output
)

Performs 2D transposed convolution (or deconvolution) on slices of 4D F32 tensors and adds an optional bias.

The function takes 2D slices out of 4D tensors and performs a 2D transposed convolution. The result is then added to the output slice. This technique allows the use of one function for all use-cases (different order like HWC or CHW; different cases in forward and backward pass).

To configure the slices for the transposed convolution, an index array has to be passed for input, kernel and output tensor. The dimension of H is indicated with -1 and W with -2.
Example: [1, 5, -1, -2] -> Perform a 2D transposed convolution on the last two dimensions. The other two dimensions have indices 1 and 5.

The output dimensions of the height and width are given as:

\[ H_{out} = S_h \times (H_{in}-1)+1 + P_{h,top} + P_{h,bottom} - D_h \times (H_{kernel} - 1) \]

\[ W_{out} = S_w \times (W_{in}-1)+1 + P_{w,left} + P_{w,right} - D_w \times (W_{kernel} - 1) \]

Parameters
    input            Input tensor (4D)
    stride           Dilation of the input tensor in the direction of height and width
    dilation         Dilation of the kernel tensor in the direction of height and width
    padding          Asymmetric zero padding in the direction of height and width \( [[P_{h,top},P_{h,bottom}],[P_{w,left},P_{w,right}]] \)
    kernel           Kernel / Weights tensor (4D)
    bias             Optional bias tensor (1D; set to null if not in use)
    rotated_kernel   Determines whether or not to rotate the kernel by 180°; default is TRUE (1)
    input_use_dims   Indices for the 2D slice of the input tensor (-1 for the height dimension and -2 for the width dimension)
    output_use_dims  Indices for the 2D slice of the output tensor (-1 for the height dimension and -2 for the width dimension)
    kernel_use_dims  Indices for the 2D slice of the kernel tensor (-1 for the height dimension and -2 for the width dimension)
    output           Output tensor to add the result to (4D)
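
A sketch with hypothetical shapes showing the upsampling effect of the stride; the slice indices follow the same -1/-2 convention as above, and the tensors are assumed to be initialized elsewhere.

    /* With a 4x4 input slice, stride 2, a 3x3 kernel, dilation 1 and no padding, the
       formulas above give H_out = 2*(4 - 1) + 1 - 1*(3 - 1) = 5 (and likewise W_out = 5). */
    aitensor_t input;   /* 4D input tensor, H/W slice e.g. 4x4 */
    aitensor_t kernel;  /* 4D kernel tensor, H/W slice e.g. 3x3 */
    aitensor_t output;  /* 4D output tensor, H/W slice 5x5 for these values */

    uint16_t stride[2]           = {2, 2};   /* acts as a dilation of the input (upsampling) */
    uint16_t dilation[2]         = {1, 1};
    const int16_t padding[2][2]  = {{0, 0}, {0, 0}};

    int16_t input_use_dims[4]  = {0, 0, -1, -2};
    int16_t kernel_use_dims[4] = {0, 0, -1, -2};
    int16_t output_use_dims[4] = {0, 0, -1, -2};

    /* bias = 0 (null): no bias; rotated_kernel = 1 rotates the kernel by 180 degrees. */
    aimath_f32_default_conv_transpose2d_add(&input, stride, dilation, padding, &kernel, 0, 1,
                                            input_use_dims, output_use_dims, kernel_use_dims, &output);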

◆ aimath_f32_default_conv_transpose2d_fwd()

void aimath_f32_default_conv_transpose2d_fwd (
    const aitensor_t *input,
    const uint16_t stride[2],
    const uint16_t dilation[2],
    const uint16_t padding[2],
    const uint16_t output_padding[2],
    const aitensor_t *weights,
    const aitensor_t *bias,
    int8_t channel_axis,
    void *work_space,
    aitensor_t *output
)

Performs 2D transposed convolutions with the given 4D F32 tensors and adds a bias (forward pass of the ConvTranspose2D layer)

\[ x_{out} = x_{in} \ast' w + b \]

Here, \( \ast' \) denotes a transposed convolution.

This function wraps the aimath_f32_default_conv_transpose2d_add() function to perform the forward pass of a ConvTranspose2D layer.

Parameters
    input           Input ( \( x_{in} \)) data with dimension \( [N,C_{in},H_{in},W_{in}] \) (channels first) or \( [N,H_{in},W_{in},C_{in}] \) (channels last)
    stride          Dilation of the input in the direction of height and width
    dilation        Dilation of the kernel in the direction of height and width
    padding         The (symmetric) zero padding in the direction of height and width
    output_padding  Additional asymmetric zero padding on one side in the direction of height and width
    weights         Convolution kernels with dimension \( [C_{out},C_{in},H_{kernel},W_{kernel}] \) (channels first) or \( [C_{out},H_{kernel},W_{kernel},C_{in}] \) (channels last)
    bias            Bias with dimension \( C_{out} \)
    channel_axis    Index of the channel axis (1 for channels first and -1 or 3 for channels last).
    work_space      Pointer to a work space buffer for intermediate results (Not in use)
    output          Output ( \( x_{out} \)) after convolution with dimension \( [N,C_{out},H_{out},W_{out}] \) (channels first) or \( [N,H_{out},W_{out},C_{out}] \) (channels last)
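
A forward-pass sketch for a hypothetical channels-first ConvTranspose2D layer; the shapes are assumptions, and the exact output size (which depends on stride, padding and output_padding) is not recomputed here.

    aitensor_t input;    /* e.g. [1, 4, 4, 4]  ([N, C_in, H_in, W_in]) */
    aitensor_t weights;  /* e.g. [2, 4, 3, 3]  ([C_out, C_in, H_kernel, W_kernel]) */
    aitensor_t bias;     /* [C_out], e.g. [2] */
    aitensor_t output;   /* [N, C_out, H_out, W_out], sized according to the chosen parameters */

    uint16_t stride[2]         = {2, 2};  /* acts as a dilation of the input */
    uint16_t dilation[2]       = {1, 1};
    uint16_t padding[2]        = {0, 0};
    uint16_t output_padding[2] = {0, 0};

    /* channel_axis = 1 (channels first); the work space is not used by this implementation. */
    aimath_f32_default_conv_transpose2d_fwd(&input, stride, dilation, padding, output_padding,
                                            &weights, &bias, 1, 0, &output);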

◆ aimath_f32_default_d_batch_norm()

void aimath_f32_default_d_batch_norm (
    const aitensor_t *x_in,
    int8_t axis,
    const aitensor_t *means,
    const aitensor_t *vars,
    const aitensor_t *betas,
    const aitensor_t *gammas,
    const aitensor_t *delta_out,
    const void *eps,
    aitensor_t *delta_in,
    aitensor_t *d_betas,
    aitensor_t *d_gammas
)

Calculates the gradients of Batch Normalization with respect to betas, gammas and the input in F32 data type.

Calculates the derivative of the Batch Normalization with respect to the input and the trainable parameters ( \( \beta \) and \( \gamma \)).
Please refer to the paper by Ioffe and Szegedy (https://arxiv.org/abs/1502.03167) for the equations of the gradients.

Parameters
    x_in       Input tensor (N-D)
    axis       Axis of the input tensor that stores the channel dimension.
    means      1D vector with the means ( \( \mu_i \)) of every channel.
    vars       1D vector with the variances ( \( \sigma^2_i \)) of every channel.
    betas      1D vector with the offset parameters ( \( \beta_i \)) of every channel.
    gammas     1D vector with the scaling parameters ( \( \gamma_i \)) of every channel.
    delta_out  Gradient calculated by the output layer for gradient backpropagation (N-D)
    eps        Small constant for numerical stability.
    delta_in   The resulting gradients of the input ( \( \mathrm{d}\mathcal{L} / \mathrm{d}x \)).
    d_betas    The resulting gradients of the \( \beta \) parameter ( \( \mathrm{d}\mathcal{L} / \mathrm{d}\beta \)).
    d_gammas   The resulting gradients of the \( \gamma \) parameter ( \( \mathrm{d}\mathcal{L} / \mathrm{d}\gamma \)).
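
A backward-pass sketch matching the Batch Normalization example above; the shapes are hypothetical, and axis = -1 (channels last) is an assumption.

    aitensor_t x_in;       /* forward-pass input, e.g. [N, H, W, C] */
    aitensor_t means;      /* [C] */
    aitensor_t vars;       /* [C] */
    aitensor_t betas;      /* [C] */
    aitensor_t gammas;     /* [C] */
    aitensor_t delta_out;  /* gradients from the following layer, same shape as x_in */
    aitensor_t delta_in;   /* resulting input gradients, same shape as x_in */
    aitensor_t d_betas;    /* resulting beta gradients, [C] */
    aitensor_t d_gammas;   /* resulting gamma gradients, [C] */

    float eps = 1e-6f;

    aimath_f32_default_d_batch_norm(&x_in, -1, &means, &vars, &betas, &gammas,
                                    &delta_out, &eps, &delta_in, &d_betas, &d_gammas);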

◆ aimath_f32_default_maxpool2d_bwd()

void aimath_f32_default_maxpool2d_bwd (
    const aitensor_t *delta_out,
    const uint16_t pool_size[2],
    const uint16_t stride[2],
    const uint16_t padding[2],
    int8_t channel_axis,
    void *work_space,
    const uint32_t *max_locations,
    aitensor_t *delta_in
)

Calculates the gradients of the MaxPool2D layer with respect to the input in F32 data type.

Calculates the gradients with respect to the input \( delta_{in} = \mathrm{d} L / \mathrm{d} x_{in} \).

This is done by copying each output gradient to the position in the input gradients indicated by max_locations.
An element of max_locations consists of the concatenated 16-bit indices for height and width within the pooling window.

Parameters
    delta_out      Gradients backpropagated from the following layer with dimension \( [N,C,H_{out},W_{out}] \) (channels first) or \( [N,H_{out},W_{out},C] \) (channels last)
    pool_size      The size of the pooling window (height and width)
    stride         The stride in the direction of height and width.
    padding        The (symmetric) minus infinity padding in the direction of height and width
    channel_axis   Index of the channel axis (1 for channels first and -1 or 3 for channels last).
    work_space     Pointer to a work space buffer for intermediate results.
    max_locations  Pointer to memory section where the indices of the maximum values per pooling window are stored.
    delta_in       Resulting input gradients for backpropagation to the previous layer \( [N,C,H_{in},W_{in}] \) (channels first) or \( [N,H_{in},W_{in},C] \) (channels last)
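
A sketch that routes the output gradients back to the recorded maximum positions; the shapes match the forward-pass sketch further down and are assumptions, and max_locations is assumed to have been filled during the forward pass.

    aitensor_t delta_out;      /* e.g. [1, 1, 2, 2], gradients from the following layer */
    aitensor_t delta_in;       /* e.g. [1, 1, 4, 4], resulting input gradients */
    uint32_t max_locations[4]; /* filled by aimath_f32_default_maxpool2d_fwd() in the forward pass */

    uint16_t pool_size[2] = {2, 2};
    uint16_t stride[2]    = {2, 2};
    uint16_t padding[2]   = {0, 0};

    /* channel_axis = 1 (channels first). */
    aimath_f32_default_maxpool2d_bwd(&delta_out, pool_size, stride, padding, 1, 0,
                                     max_locations, &delta_in);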

◆ aimath_f32_default_maxpool2d_fwd()

void aimath_f32_default_maxpool2d_fwd (
    const aitensor_t *input,
    const uint16_t pool_size[2],
    const uint16_t stride[2],
    const uint16_t padding[2],
    int8_t channel_axis,
    void *work_space,
    uint32_t *max_locations,
    aitensor_t *output
)

2D max-pooling on 4D F32 tensors

Performs a 2D max-pooling operation on 2D slices of a 4D input tensor. This function is used as the forward pass of the MaxPool2D layer.

For training (max_locations != 0), the index of the maximum value within each pooling window is stored in max_locations.
An element of max_locations consists of the concatenated 16-bit indices for height and width within the pooling window.

The output dimensions of the height and width are given as:

\[ H_{out} = floor \left( \frac{H_{in} + 2 * P_h - H_{pool}}{S_h} \right) + 1 \]

\[ W_{out} = floor \left( \frac{W_{in} + 2 * P_w - W_{pool}}{S_w} \right) + 1 \]

Parameters
    input          Input data with dimension \( [N,C,H_{in},W_{in}] \) (channels first) or \( [N,H_{in},W_{in},C] \) (channels last)
    pool_size      The size of the pooling window (height and width)
    stride         The stride in the direction of height and width.
    padding        The (symmetric) minus infinity padding in the direction of height and width
    channel_axis   Index of the channel axis (1 for channels first and -1 or 3 for channels last).
    work_space     Pointer to a work space buffer for intermediate results.
    max_locations  Pointer to memory section where the indices of the maximum values per pooling window are stored.
    output         Output after max-pooling with dimension \( [N,C,H_{out},W_{out}] \) (channels first) or \( [N,H_{out},W_{out},C] \) (channels last)
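
A forward-pass sketch with hypothetical shapes; the output size follows from the formulas above, and the max_locations buffer is assumed to hold one entry per output element.

    /* 2x2 max-pooling with stride 2, channels first.
       With H_in = W_in = 4 and no padding: H_out = floor((4 + 2*0 - 2) / 2) + 1 = 2. */
    aitensor_t input;   /* e.g. [1, 1, 4, 4] */
    aitensor_t output;  /* e.g. [1, 1, 2, 2] */

    uint16_t pool_size[2] = {2, 2};
    uint16_t stride[2]    = {2, 2};
    uint16_t padding[2]   = {0, 0};

    /* One entry per output element; pass 0 instead when the locations are not needed (inference). */
    uint32_t max_locations[1 * 1 * 2 * 2];

    aimath_f32_default_maxpool2d_fwd(&input, pool_size, stride, padding, 1, 0,
                                     max_locations, &output);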

◆ aimath_f32_default_pad_zeros()

void aimath_f32_default_pad_zeros (
    const aitensor_t *x,
    const uint16_t (*padding)[2],
    aitensor_t *result
)

Pads an F32 tensor with zeros.

Parameters
    x        Input F32 tensor (N-D)
    padding  Array of the asymmetric zero paddings for each dimension of the input tensor
    result   Resulting padded F32 tensor (N-D)
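
A sketch with hypothetical shapes; one [before, after] pair is given per dimension of the input tensor, and both tensors are assumed to be initialized elsewhere.

    aitensor_t x;       /* e.g. [4, 4] */
    aitensor_t result;  /* [1 + 4 + 2, 0 + 4 + 3] = [7, 7] */

    /* padding[d] = {before, after} for dimension d of x. */
    const uint16_t padding[2][2] = {{1, 2}, {0, 3}};

    aimath_f32_default_pad_zeros(&x, padding, &result);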