AIfES 2
2.0.0
Math functions for F32 data type, CNN-specific implementation.
Functions
void aimath_f32_default_conv2d_add (const aitensor_t *input, const uint16_t stride[2], const uint16_t dilation[2], const int16_t padding[2][2], const aitensor_t *kernel, const void *bias, const uint8_t rotated_kernel, const int16_t *input_use_dims, const int16_t *output_use_dims, const int16_t *kernel_use_dims, aitensor_t *output)
Performs 2D convolution on slices of 4D F32 tensors and adds an optional bias.
void aimath_f32_default_conv_transpose2d_add (const aitensor_t *input, const uint16_t stride[2], const uint16_t dilation[2], const int16_t padding[2][2], const aitensor_t *kernel, const void *bias, const uint8_t rotated_kernel, const int16_t *input_use_dims, const int16_t *output_use_dims, const int16_t *kernel_use_dims, aitensor_t *output)
Performs 2D transposed convolution (or deconvolution) on slices of 4D F32 tensors and adds an optional bias.
void aimath_f32_default_conv2d_fwd (const aitensor_t *input, const uint16_t stride[2], const uint16_t dilation[2], const uint16_t padding[2], const aitensor_t *weights, const aitensor_t *bias, int8_t channel_axis, void *work_space, aitensor_t *output)
Performs 2D convolutions with the given 4D F32 tensors and adds a bias (forward pass of the Conv2D layer).
void aimath_f32_default_conv2d_bwd (const aitensor_t *x_in, const uint16_t stride[2], const uint16_t dilation[2], const uint16_t padding[2], const aitensor_t *delta_out, int8_t channel_axis, void *work_space, aitensor_t *d_weights)
Calculates the gradients of the Conv2D layer with respect to the weights in F32 data type.
void aimath_f32_default_conv2d_bwd_full (const aitensor_t *delta_out, const uint16_t stride[2], const uint16_t dilation[2], const uint16_t padding[2], const aitensor_t *weights, int8_t channel_axis, void *work_space, aitensor_t *delta_in)
Calculates the gradients of the Conv2D layer with respect to the input in F32 data type.
void aimath_f32_default_conv_transpose2d_fwd (const aitensor_t *input, const uint16_t stride[2], const uint16_t dilation[2], const uint16_t padding[2], const uint16_t output_padding[2], const aitensor_t *weights, const aitensor_t *bias, int8_t channel_axis, void *work_space, aitensor_t *output)
Performs 2D transposed convolutions with the given 4D F32 tensors and adds a bias (forward pass of the ConvTranspose2D layer).
void aimath_f32_default_maxpool2d_fwd (const aitensor_t *input, const uint16_t pool_size[2], const uint16_t stride[2], const uint16_t padding[2], int8_t channel_axis, void *work_space, uint32_t *max_locations, aitensor_t *output)
2D max-pooling on 4D F32 tensors.
void aimath_f32_default_maxpool2d_bwd (const aitensor_t *delta_out, const uint16_t pool_size[2], const uint16_t stride[2], const uint16_t padding[2], int8_t channel_axis, void *work_space, const uint32_t *max_locations, aitensor_t *delta_in)
Calculates the gradients of the MaxPool2D layer with respect to the input in F32 data type.
void aimath_f32_default_batch_norm (const aitensor_t *x, int8_t axis, const aitensor_t *means, const aitensor_t *variances, const aitensor_t *offsets, const aitensor_t *scales, const void *eps, aitensor_t *result)
Batch Normalization on F32 tensors.
void aimath_f32_default_d_batch_norm (const aitensor_t *x_in, int8_t axis, const aitensor_t *means, const aitensor_t *vars, const aitensor_t *betas, const aitensor_t *gammas, const aitensor_t *delta_out, const void *eps, aitensor_t *delta_in, aitensor_t *d_betas, aitensor_t *d_gammas)
Calculates the gradients of Batch Normalization with respect to betas, gammas and the input in F32 data type.
void aimath_f32_default_pad_zeros (const aitensor_t *x, const uint16_t(*padding)[2], aitensor_t *result)
Pads a F32 tensor with zeros.
Math functions for F32 data type, CNN-specific implementation.
These functions can be used when no hardware-specific implementation is available.
void aimath_f32_default_batch_norm (const aitensor_t *x, int8_t axis, const aitensor_t *means, const aitensor_t *variances, const aitensor_t *offsets, const aitensor_t *scales, const void *eps, aitensor_t *result)
Batch Normalization on F32 tensors.
Performs the Batch Normalization operation (proposed by Ioffe and Szegedy, https://arxiv.org/abs/1502.03167):
\[ y_{i,j} = \mathit{BN}(x_{i,j}) = \gamma_i \cdot \frac{x_{i,j} - \mu_{i}}{\sqrt{\sigma_{i}^2+\epsilon}} + \beta_i \]
x | Input tensor (N-D) |
axis | Axis of the input tensor that stores the channel dimension. |
means | 1D vector with the means ( \( \mu_i \)) of every channel. |
variances | 1D vector with the variances ( \( \sigma^2_i \)) of every channel. |
offsets | 1D vector with the offset parameters ( \( \beta_i \)) of every channel. |
scales | 1D vector with the scaling parameters ( \( \gamma_i \)) of every channel. |
eps | Small constant for numerical stability. |
result | The resulting normalized tensor (N-D) |
void aimath_f32_default_conv2d_add (const aitensor_t *input, const uint16_t stride[2], const uint16_t dilation[2], const int16_t padding[2][2], const aitensor_t *kernel, const void *bias, const uint8_t rotated_kernel, const int16_t *input_use_dims, const int16_t *output_use_dims, const int16_t *kernel_use_dims, aitensor_t *output)
Performs 2D convolution on slices of 4D F32 tensors and adds an optional bias.
The function takes 2D slices out of 4D tensors and performs a 2D convolution. The result is then added to the output slice. This technique allows the use of one function for all use-cases (different order like HWC or CHW; different cases in forward and backward pass).
To configure the slices for the convolution, an index array has to be passed for input, kernel and output tensor. The dimension of H is indicated with -1 and W with -2.
Example: [1, 5, -1, -2] -> Perform a 2D convolution on the last two dimensions. The other two dimensions have indices 1 and 5.
The output dimensions of the height and width are given as:
\[ H_{out} = floor \left( \frac{H_{in} + P_{h,top} + P_{h,bottom} - D_h \times (H_{kernel} - 1) - 1}{S_h} \right) + 1 \]
\[ W_{out} = floor \left( \frac{W_{in} + P_{w,left} + P_{w,right} - D_w \times (W_{kernel} - 1) - 1}{S_w} \right) + 1 \]
input | Input tensor (4D) |
stride | Stride in the direction of height and width |
dilation | Dilation of the kernel tensor in the direction of height and width |
padding | Asymmetric zero padding in the direction of height and width \( [[P_{h,top},P_{h,bottom}],[P_{w,left},P_{w,right}]] \) |
kernel | Kernel / Weights tensor (4D) |
bias | Optional bias tensor (1D; set to null if not in use) |
rotated_kernel | Determines whether or not to rotate the kernel by 180°; default is TRUE (1) |
input_use_dims | Indices for the 2D slice of the input tensor (-1 for the height dimension and -2 for the width dimension) |
output_use_dims | Indices for the 2D slice of the output tensor (-1 for the height dimension and -2 for the width dimension) |
kernel_use_dims | Indices for the 2D slice of the kernel tensor (-1 for the height dimension and -2 for the width dimension) |
output | Output tensor to add the result to (4D) |
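The output-size formulas above apply the same rule to height and width, so a single per-axis helper is enough to size the output tensor before the call. The sketch below is an illustration only; `conv2d_out_len` is a hypothetical name and not part of AIfES.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper (not an AIfES function): output length along one axis,
 *   floor((in + pad_before + pad_after - dilation * (kernel - 1) - 1) / stride) + 1 */
static uint16_t conv2d_out_len(uint16_t in, int16_t pad_before, int16_t pad_after,
                               uint16_t dilation, uint16_t kernel, uint16_t stride)
{
    assert(stride > 0 && kernel > 0);
    int32_t span = (int32_t)in + pad_before + pad_after
                 - (int32_t)dilation * ((int32_t)kernel - 1) - 1;
    assert(span >= 0); /* the dilated kernel must fit into the padded input */
    /* span is non-negative, so integer division is the floor. */
    return (uint16_t)(span / stride + 1);
}
```

For example, a 28-element axis with a 3-tap kernel, no padding, and unit stride and dilation gives 26 output positions; with padding 1 on both sides and stride 2 it gives 14.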
void aimath_f32_default_conv2d_bwd (const aitensor_t *x_in, const uint16_t stride[2], const uint16_t dilation[2], const uint16_t padding[2], const aitensor_t *delta_out, int8_t channel_axis, void *work_space, aitensor_t *d_weights)
Calculates the gradients of the Conv2D layer with respect to the weights in F32 data type.
Calculates the gradients with respect to the weights \( \partial w = \mathrm{d} L / \mathrm{d} w \).
\[ \partial w = x_{in} \ast delta_{out} \]
This function wraps the aimath_f32_default_conv2d_add() function to perform the backward pass of a Conv2D layer.
x_in | Input data with dimension \( [N,C_{in},H_{in},W_{in}] \) (channels first) or \( [N,H_{in},W_{in},C_{in}] \) (channels last) |
stride | The stride in the direction of height and width |
dilation | The dilation in the direction of height and width |
padding | The (symmetric) zero padding in the direction of height and width |
delta_out | Gradients backpropagated from the following layer with dimension \( [N,C_{out},H_{out},W_{out}] \) (channels first) or \( [N,H_{out},W_{out},C_{out}] \) (channels last) |
channel_axis | Index of the channel axis (1 for channels first and -1 or 3 for channels last). |
work_space | Pointer to a work space buffer for intermediate results (Not in use) |
d_weights | Output gradients of the weights with dimension \( [C_{out},C_{in},H_{kernel},W_{kernel}] \) (channels first) or \( [C_{out},H_{kernel},W_{kernel},C_{in}] \) (channels last) |
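To make the weight-gradient formula concrete, the following sketch computes it for a single input channel and a single output channel on raw float buffers. It is not the AIfES implementation: `conv2d_weight_grad_single_f32` is a hypothetical name, the batch and channel loops are omitted, and the result overwrites `d_weights` instead of accumulating.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical single-channel sketch (not an AIfES function):
 *   d_w[ky][kx] = sum over (oy, ox) of x_in[oy*S + ky*D - P][ox*S + kx*D - P] * delta_out[oy][ox]
 * Positions that fall into the zero padding contribute nothing. */
static void conv2d_weight_grad_single_f32(const float *x_in, uint16_t h_in, uint16_t w_in,
                                          const float *delta_out, uint16_t h_out, uint16_t w_out,
                                          const uint16_t stride[2], const uint16_t dilation[2],
                                          const uint16_t padding[2],
                                          float *d_weights, uint16_t h_k, uint16_t w_k)
{
    assert(x_in != NULL && delta_out != NULL && d_weights != NULL);
    for (int32_t ky = 0; ky < h_k; ky++) {
        for (int32_t kx = 0; kx < w_k; kx++) {
            float acc = 0.0f;
            for (int32_t oy = 0; oy < h_out; oy++) {
                for (int32_t ox = 0; ox < w_out; ox++) {
                    int32_t iy = oy * stride[0] + ky * dilation[0] - padding[0];
                    int32_t ix = ox * stride[1] + kx * dilation[1] - padding[1];
                    if (iy < 0 || iy >= h_in || ix < 0 || ix >= w_in) {
                        continue; /* zero padding */
                    }
                    acc += x_in[iy * w_in + ix] * delta_out[oy * w_out + ox];
                }
            }
            d_weights[ky * w_k + kx] = acc;
        }
    }
}
```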
void aimath_f32_default_conv2d_bwd_full (const aitensor_t *delta_out, const uint16_t stride[2], const uint16_t dilation[2], const uint16_t padding[2], const aitensor_t *weights, int8_t channel_axis, void *work_space, aitensor_t *delta_in)
Calculates the gradients of the Conv2D layer with respect to the input in F32 data type.
Calculates the gradients with respect to the input \( delta_{in} = \mathrm{d} L / \mathrm{d} x_{in} \).
\[ delta_{in} = delta_{out} \ast' w \]
\( \cdot \ast' \cdot \) is a transposed convolution.
This function wraps the aimath_f32_default_conv_transpose2d_add() function to perform the backward pass of a Conv2D layer.
delta_out | Gradients backpropagated from the following layer with dimension \( [N,C_{out},H_{out},W_{out}] \) (channels first) or \( [N,H_{out},W_{out},C_{out}] \) (channels last) |
stride | The stride in the direction of height and width |
dilation | The dilation in the direction of height and width |
padding | The (symmetric) zero padding in the direction of height and width |
weights | Convolution kernels with dimension \( [C_{out},C_{in},H_{kernel},W_{kernel}] \) (channels first) or \( [C_{out},H_{kernel},W_{kernel},C_{in}] \) (channels last) |
channel_axis | Index of the channel axis (1 for channels first and -1 or 3 for channels last). |
work_space | Pointer to a work space buffer for intermediate results. |
delta_in | Resulting input gradients for backpropagation to the previous layer with dimension \( [N,C_{in},H_{in},W_{in}] \) (channels first) or \( [N,H_{in},W_{in},C_{in}] \) (channels last) |
void aimath_f32_default_conv2d_fwd (const aitensor_t *input, const uint16_t stride[2], const uint16_t dilation[2], const uint16_t padding[2], const aitensor_t *weights, const aitensor_t *bias, int8_t channel_axis, void *work_space, aitensor_t *output)
Performs 2D convolutions with the given 4D F32 tensors and adds a bias (forward pass of the Conv2D layer)
\[ x_{out} = x_{in} \ast w + b \]
This function wraps the aimath_f32_default_conv2d_add() function to perform the forward pass of a Conv2D layer.
input | Input ( \( x_{in} \)) data with dimension \( [N,C_{in},H_{in},W_{in}] \) (channels first) or \( [N,H_{in},W_{in},C_{in}] \) (channels last) |
stride | The stride in the direction of height and width |
dilation | The dilation in the direction of height and width |
padding | The (symmetric) zero padding in the direction of height and width |
weights | Convolution kernels with dimension \( [C_{out},C_{in},H_{kernel},W_{kernel}] \) (channels first) or \( [C_{out},H_{kernel},W_{kernel},C_{in}] \) (channels last) |
bias | Bias with dimension \( C_{out} \) |
channel_axis | Index of the channel axis (1 for channels first and -1 or 3 for channels last). |
work_space | Pointer to a work space buffer for intermediate results (Not in use) |
output | Output ( \( x_{out} \)) after convolution with dimension \( [N,C_{out},H_{out},W_{out}] \) (channels first) or \( [N,H_{out},W_{out},C_{out}] \) (channels last) |
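As an illustration of the forward formula, the sketch below computes one output channel from one input channel on raw float buffers. It is not the AIfES implementation: `conv2d_single_f32` is a hypothetical name, the batch and channel loops are omitted, and it assumes the deep-learning convention of a cross-correlation (no kernel flip), which may differ from the kernel-rotation handling inside the library.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical single-channel sketch (not an AIfES function):
 * out = in * kernel + bias with symmetric zero padding, stride and dilation. */
static void conv2d_single_f32(const float *in, uint16_t h_in, uint16_t w_in,
                              const float *kernel, uint16_t h_k, uint16_t w_k,
                              const uint16_t stride[2], const uint16_t dilation[2],
                              const uint16_t padding[2], float bias,
                              float *out, uint16_t h_out, uint16_t w_out)
{
    assert(stride[0] > 0 && stride[1] > 0);
    for (int32_t oy = 0; oy < h_out; oy++) {
        for (int32_t ox = 0; ox < w_out; ox++) {
            float acc = bias;
            for (int32_t ky = 0; ky < h_k; ky++) {
                for (int32_t kx = 0; kx < w_k; kx++) {
                    /* Position in the (virtually) zero-padded input. */
                    int32_t iy = oy * stride[0] + ky * dilation[0] - padding[0];
                    int32_t ix = ox * stride[1] + kx * dilation[1] - padding[1];
                    if (iy < 0 || iy >= h_in || ix < 0 || ix >= w_in) {
                        continue; /* padded zeros contribute nothing */
                    }
                    acc += in[iy * w_in + ix] * kernel[ky * w_k + kx];
                }
            }
            out[oy * w_out + ox] = acc;
        }
    }
}
```

The caller is expected to pass `h_out` and `w_out` that match the output-size formula of the convolution for the chosen stride, dilation and padding.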
void aimath_f32_default_conv_transpose2d_add (const aitensor_t *input, const uint16_t stride[2], const uint16_t dilation[2], const int16_t padding[2][2], const aitensor_t *kernel, const void *bias, const uint8_t rotated_kernel, const int16_t *input_use_dims, const int16_t *output_use_dims, const int16_t *kernel_use_dims, aitensor_t *output)
Performs 2D transposed convolution (or deconvolution) on slices of 4D F32 tensors and adds an optional bias.
The function takes 2D slices out of 4D tensors and performs a 2D transposed convolution. The result is then added to the output slice. This technique allows the use of one function for all use-cases (different order like HWC or CHW; different cases in forward and backward pass).
To configure the slices for the transposed convolution, an index array has to be passed for input, kernel and output tensor. The dimension of H is indicated with -1 and W with -2.
Example: [1, 5, -1, -2] -> Perform a 2D transposed convolution on the last two dimensions. The other two dimensions have indices 1 and 5.
The output dimensions of the height and width are given as:
\[ H_{out} = S_h \times (H_{in}-1)+1 + P_{h,top} + P_{h,bottom} - D_h \times (H_{kernel} - 1) \]
\[ W_{out} = S_w \times (W_{in}-1)+1 + P_{w,left} + P_{w,right} - D_w \times (W_{kernel} - 1) \]
input | Input tensor (4D) |
stride | Dilation of the input tensor in the direction of height and width |
dilation | Dilation of the kernel tensor in the direction of height and width |
padding | Asymmetric zero padding in the direction of height and width \( [[P_{h,top},P_{h,bottom}],[P_{w,left},P_{w,right}]] \) |
kernel | Kernel / Weights tensor (4D) |
bias | Optional bias tensor (1D; set to null if not in use) |
rotated_kernel | Determines whether or not to rotate the kernel by 180°; default is TRUE (1) |
input_use_dims | Indices for the 2D slice of the input tensor (-1 for the height dimension and -2 for the width dimension) |
output_use_dims | Indices for the 2D slice of the output tensor (-1 for the height dimension and -2 for the width dimension) |
kernel_use_dims | Indices for the 2D slice of the kernel tensor (-1 for the height dimension and -2 for the width dimension) |
output | Output tensor to add the result to (4D) |
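The output-size formulas above can be evaluated per axis with a small helper. This is a sketch only; `conv_transpose2d_out_len` is a hypothetical name and not part of AIfES.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper (not an AIfES function): output length along one axis,
 *   stride * (in - 1) + 1 + pad_before + pad_after - dilation * (kernel - 1)
 * The first term is the length of the input after dilating it by the stride. */
static uint16_t conv_transpose2d_out_len(uint16_t in, int16_t pad_before, int16_t pad_after,
                                         uint16_t dilation, uint16_t kernel, uint16_t stride)
{
    assert(in > 0 && kernel > 0 && stride > 0);
    int32_t dilated_in = (int32_t)stride * ((int32_t)in - 1) + 1;
    int32_t out = dilated_in + pad_before + pad_after
                - (int32_t)dilation * ((int32_t)kernel - 1);
    assert(out > 0); /* the dilated kernel must fit into the padded, dilated input */
    return (uint16_t)out;
}
```

With unit dilation and a padding of kernel - 1 on both sides, the expression reduces to stride * (in - 1) + kernel. For in = 4, stride = 2 and kernel = 3 that is 9.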
void aimath_f32_default_conv_transpose2d_fwd (const aitensor_t *input, const uint16_t stride[2], const uint16_t dilation[2], const uint16_t padding[2], const uint16_t output_padding[2], const aitensor_t *weights, const aitensor_t *bias, int8_t channel_axis, void *work_space, aitensor_t *output)
Performs 2D transposed convolutions with the given 4D F32 tensors and adds a bias (forward pass of the ConvTranspose2D layer)
\[ x_{out} = x_{in} \ast' w + b \]
\( \cdot \ast' \cdot \) is a transposed convolution.
This function wraps the aimath_f32_default_conv_transpose2d_add() function to perform the forward pass of a ConvTranspose2D layer.
input | Input ( \( x_{in} \)) data with dimension \( [N,C_{in},H_{in},W_{in}] \) (channels first) or \( [N,H_{in},W_{in},C_{in}] \) (channels last) |
stride | Dilation of the input in the direction of height and width |
dilation | Dilation of the kernel in the direction of height and width |
padding | The (symmetric) zero padding in the direction of height and width |
output_padding | Additional asymmetric zero padding on one side in the direction of height and width |
weights | Convolution kernels with dimension \( [C_{out},C_{in},H_{kernel},W_{kernel}] \) (channels first) or \( [C_{out},H_{kernel},W_{kernel},C_{in}] \) (channels last) |
bias | Bias with dimension \( C_{out} \) |
channel_axis | Index of the channel axis (1 for channels first and -1 or 3 for channels last). |
work_space | Pointer to a work space buffer for intermediate results (Not in use) |
output | Output ( \( x_{out} \)) after convolution with dimension \( [N,C_{out},H_{out},W_{out}] \) (channels first) or \( [N,H_{out},W_{out},C_{out}] \) (channels last) |
void aimath_f32_default_d_batch_norm (const aitensor_t *x_in, int8_t axis, const aitensor_t *means, const aitensor_t *vars, const aitensor_t *betas, const aitensor_t *gammas, const aitensor_t *delta_out, const void *eps, aitensor_t *delta_in, aitensor_t *d_betas, aitensor_t *d_gammas)
Calculates the gradients of Batch Normalization with respect to betas, gammas and the input in F32 data type.
Calculates the derivative of the Batch Normalization with respect to the input and the trainable parameters ( \( \beta \) and \( \gamma \)).
Please refer to the paper by Ioffe and Szegedy (https://arxiv.org/abs/1502.03167) for the equations of the gradients.
x_in | Input tensor (N-D) |
axis | Axis of the input tensor that stores the channel dimension. |
means | 1D vector with the means ( \( \mu_i \)) of every channel. |
vars | 1D vector with the variances ( \( \sigma^2_i \)) of every channel. |
betas | 1D vector with the offset parameters ( \( \beta_i \)) of every channel. |
gammas | 1D vector with the scaling parameters ( \( \gamma_i \)) of every channel. |
delta_out | Gradient calculated by the output layer for gradient backpropagation (N-D) |
eps | Small constant for numerical stability. |
delta_in | The resulting gradients of the input ( \( \mathrm{d}\mathcal{L} / \mathrm{d}x \)). |
d_betas | The resulting gradients of the \( \beta \) parameter ( \( \mathrm{d}\mathcal{L} / \mathrm{d}\beta \)). |
d_gammas | The resulting gradients of the \( \gamma \) parameter ( \( \mathrm{d}\mathcal{L} / \mathrm{d}\gamma \)). |
void aimath_f32_default_maxpool2d_bwd (const aitensor_t *delta_out, const uint16_t pool_size[2], const uint16_t stride[2], const uint16_t padding[2], int8_t channel_axis, void *work_space, const uint32_t *max_locations, aitensor_t *delta_in)
Calculates the gradients of the MaxPool2D layer with respect to the input in F32 data type.
Calculates the gradients with respect to the input \( delta_{in} = \mathrm{d} L / \mathrm{d} x_{in} \).
This is done by copying each output gradient to the position in the input gradients given by max_locations.
An element of max_locations consists of the concatenated 16-bit indices for height and width in the pooling window.
delta_out | Gradients backpropagated from the following layer with dimension \( [N,C,H_{out},W_{out}] \) (channels first) or \( [N,H_{out},W_{out},C] \) (channels last) |
pool_size | The size of the pooling window (height and width) |
stride | The stride in the direction of height and width. |
padding | The (symmetric) minus infinity padding in the direction of height and width |
channel_axis | Index of the channel axis (1 for channels first and -1 or 3 for channels last). |
work_space | Pointer to a work space buffer for intermediate results. |
max_locations | Pointer to memory section where the indices of the maximum values per pooling window are stored. |
delta_in | Resulting input gradients for backpropagation to the previous layer \( [N,C,H_{in},W_{in}] \) (channels first) or \( [N,H_{in},W_{in},C] \) (channels last) |
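The routing of gradients through max_locations can be sketched for a single channel as follows. Several details are assumptions: the text does not state which half of the 32-bit element holds which index, so the sketch assumes the height index in the upper 16 bits and the width index in the lower 16 bits. `maxpool2d_bwd_single_f32` is a hypothetical name, not part of AIfES. The sketch clears `delta_in` and then accumulates, which reduces to a plain copy when pooling windows do not overlap.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical single-channel sketch (not an AIfES function).
 * ASSUMED packing: (height index << 16) | width index, both relative to the window. */
static void maxpool2d_bwd_single_f32(const float *delta_out, uint16_t h_out, uint16_t w_out,
                                     const uint16_t stride[2], const uint16_t padding[2],
                                     const uint32_t *max_locations,
                                     float *delta_in, uint16_t h_in, uint16_t w_in)
{
    assert(max_locations != NULL);
    for (size_t i = 0; i < (size_t)h_in * w_in; i++) {
        delta_in[i] = 0.0f;
    }
    for (int32_t oy = 0; oy < h_out; oy++) {
        for (int32_t ox = 0; ox < w_out; ox++) {
            uint32_t loc = max_locations[oy * w_out + ox];
            int32_t py = (int32_t)(loc >> 16);     /* assumed: upper half is height */
            int32_t px = (int32_t)(loc & 0xFFFFu); /* assumed: lower half is width */
            /* Translate the window-relative index to input coordinates. */
            int32_t iy = oy * stride[0] + py - padding[0];
            int32_t ix = ox * stride[1] + px - padding[1];
            assert(iy >= 0 && iy < h_in && ix >= 0 && ix < w_in);
            delta_in[iy * w_in + ix] += delta_out[oy * w_out + ox];
        }
    }
}
```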
void aimath_f32_default_maxpool2d_fwd (const aitensor_t *input, const uint16_t pool_size[2], const uint16_t stride[2], const uint16_t padding[2], int8_t channel_axis, void *work_space, uint32_t *max_locations, aitensor_t *output)
2D max-pooling on 4D F32 tensors
Performs a 2D max-pooling operation on 2D slices of a 4D input tensor. This function is used as the forward pass of the MaxPool2D layer.
For training (max_locations != 0), the index of the max-value in the pooling window is stored in max_locations.
An element of max_locations consists of the concatenated 16-bit indices for height and width in the pooling window.
The output dimensions of the height and width are given as:
\[ H_{out} = floor \left( \frac{H_{in} + 2 \times P_h - H_{pool}}{S_h} \right) + 1 \]
\[ W_{out} = floor \left( \frac{W_{in} + 2 \times P_w - W_{pool}}{S_w} \right) + 1 \]
input | Input data with dimension \( [N,C,H_{in},W_{in}] \) (channels first) or \( [N,H_{in},W_{in},C] \) (channels last) |
pool_size | The size of the pooling window (height and width) |
stride | The stride in the direction of height and width. |
padding | The (symmetric) minus infinity padding in the direction of height and width |
channel_axis | Index of the channel axis (1 for channels first and -1 or 3 for channels last). |
work_space | Pointer to a work space buffer for intermediate results. |
max_locations | Pointer to memory section where the indices of the maximum values per pooling window are stored. |
output | Output after max-pooling with dimension \( [N,C,H_{out},W_{out}] \) (channels first) or \( [N,H_{out},W_{out},C] \) (channels last) |
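A single-channel sketch of the forward pass, including the output-size formula and the optional recording of max locations, could look like this. It is not the AIfES implementation: `maxpool2d_single_f32` is a hypothetical name, and the packing order of the two 16-bit indices is an assumption (height index in the upper 16 bits, width index in the lower 16 bits), since the text only says they are concatenated. The minus-infinity padding is modelled by skipping positions outside the input.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical single-channel sketch (not an AIfES function).
 * ASSUMED packing of max_locations: (height index << 16) | width index. */
static void maxpool2d_single_f32(const float *in, uint16_t h_in, uint16_t w_in,
                                 const uint16_t pool_size[2], const uint16_t stride[2],
                                 const uint16_t padding[2],
                                 uint32_t *max_locations, float *out)
{
    assert(stride[0] > 0 && stride[1] > 0);
    /* Padding smaller than the window guarantees one valid element per window. */
    assert(padding[0] < pool_size[0] && padding[1] < pool_size[1]);
    int32_t h_out = (h_in + 2 * padding[0] - pool_size[0]) / stride[0] + 1;
    int32_t w_out = (w_in + 2 * padding[1] - pool_size[1]) / stride[1] + 1;
    for (int32_t oy = 0; oy < h_out; oy++) {
        for (int32_t ox = 0; ox < w_out; ox++) {
            float best = 0.0f;
            uint32_t best_loc = 0;
            int found = 0;
            for (int32_t py = 0; py < pool_size[0]; py++) {
                for (int32_t px = 0; px < pool_size[1]; px++) {
                    int32_t iy = oy * stride[0] + py - padding[0];
                    int32_t ix = ox * stride[1] + px - padding[1];
                    if (iy < 0 || iy >= h_in || ix < 0 || ix >= w_in) {
                        continue; /* acts like minus-infinity padding */
                    }
                    float v = in[iy * w_in + ix];
                    if (!found || v > best) {
                        best = v;
                        best_loc = ((uint32_t)py << 16) | (uint32_t)px;
                        found = 1;
                    }
                }
            }
            out[oy * w_out + ox] = best;
            if (max_locations != NULL) { /* only needed for training */
                max_locations[oy * w_out + ox] = best_loc;
            }
        }
    }
}
```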
void aimath_f32_default_pad_zeros (const aitensor_t *x, const uint16_t(*padding)[2], aitensor_t *result)
Pads a F32 tensor with zeros.
x | Input F32 tensor (N-D) |
padding | Array of the asymmetric zero paddings for each dimension of the input tensor |
result | Resulting padded F32 tensor (N-D) |
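For a 2D tensor, the padding operation can be sketched as below. This is an illustration under assumptions: `pad_zeros_2d_f32` is a hypothetical helper, not part of AIfES, it handles only two dimensions on raw float buffers, and it assumes that each `padding[d]` pair is ordered as (before, after) for dimension d.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical 2D sketch (not an AIfES function).
 * ASSUMED: padding[d][0] is the padding before and padding[d][1] the padding
 * after the data along dimension d (0 = rows, 1 = columns).
 * result must hold (h + padding[0][0] + padding[0][1]) *
 *                  (w + padding[1][0] + padding[1][1]) floats. */
static void pad_zeros_2d_f32(const float *src, uint16_t h, uint16_t w,
                             const uint16_t (*padding)[2], float *result)
{
    assert(src != NULL && padding != NULL && result != NULL);
    size_t h_out = (size_t)padding[0][0] + h + padding[0][1];
    size_t w_out = (size_t)padding[1][0] + w + padding[1][1];
    for (size_t i = 0; i < h_out * w_out; i++) {
        result[i] = 0.0f;
    }
    /* Copy the original data into the interior of the zeroed buffer. */
    for (size_t r = 0; r < h; r++) {
        for (size_t c = 0; c < w; c++) {
            result[(r + padding[0][0]) * w_out + (c + padding[1][0])] = src[r * w + c];
        }
    }
}
```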