AIfES 2
2.0.0
General Batch Normalization layer structure.
#include <ailayer_batch_normalization.h>
Data Fields

ailayer_t | base
Inherited field members from the general ailayer struct.

Layer configuration
Required configuration parameters for the layer. These fields have to be configured by the user before calling the initializer function.

int8_t | channel_axis
Index of the channel axis (1 for channels first, -1 for channels last).

void * | momentum
Momentum for the exponential moving average of means and variances.

void * | eps
Small constant for numerical stability.
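For illustration, the three configuration fields can be set up as in the sketch below, assuming float32 data. The struct type shown here is a hypothetical stand-in that mirrors only the documented fields; the real type and any setup helpers live in ailayer_batch_normalization.h.

```c
#include <stdint.h>

/* Hypothetical stand-in mirroring only the documented configuration
   fields; the real struct is defined in ailayer_batch_normalization.h. */
typedef struct {
    int8_t channel_axis;  /* 1 = channels first, -1 = channels last */
    void  *momentum;      /* points at a value of the layer's data type */
    void  *eps;           /* points at a value of the layer's data type */
} batch_norm_config_sketch_t;

/* Values the void pointers refer to; they must outlive the layer. */
static float bn_momentum = 0.9f;   /* EMA momentum, range [0, 1] */
static float bn_eps      = 1e-6f;  /* numerical stability constant */

static void configure(batch_norm_config_sketch_t *cfg)
{
    cfg->channel_axis = -1;        /* channels last */
    cfg->momentum     = &bn_momentum;
    cfg->eps          = &bn_eps;
}
```

Because momentum and eps are `void *`, they are stored as pointers to values of the layer's data type rather than as plain scalars, which lets the same struct serve float32 and quantized variants.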
Trainable parameters
Data fields for the trainable parameters (beta, gamma) of the layer.

aitensor_t | betas
Vector of the shift parameters ( \( \beta_i \)).

aitensor_t | gammas
Vector of the scale parameters ( \( \gamma_i \)).

aitensor_t | moving_means
Vector of the moving averages of the means (required for inference).

aitensor_t | moving_variances
Vector of the moving averages of the variances (required for inference).

aitensor_t * | trainable_params [2]
Pointers to \( \beta \) and \( \gamma \) (the trainable parameters).

aitensor_t * | gradients [2]
Gradients of \( \beta \) and \( \gamma \) for the backpropagation algorithm.

void * | optimem [2]
Memory fields used by the training optimizer.
Variables for internal computation
These fields are automatically configured in the initializer function.

uint16_t | parameter_shape [1]
Shape of the parameter vectors ( \( \beta, \gamma, \mu, \sigma^2 \)).

aitensor_t * | means
Vector of the means ( \( \mu_i \)).

aitensor_t * | variances
Vector of the variances ( \( \sigma^2_i \)).

Math functions
Required data-type-specific math functions.
void(* | empirical_mean_channelwise )(const aitensor_t *x, int8_t channel_axis, aitensor_t *means)
Required math function: Channel-wise empirical mean calculation.

void(* | empirical_variance_channelwise )(const aitensor_t *x, int8_t channel_axis, const aitensor_t *means, aitensor_t *variances)
Required math function: Channel-wise empirical variance calculation.

void(* | exponential_moving_average )(const aitensor_t *new_data, const void *momentum, aitensor_t *average)
Required math function: Exponential moving average.

void(* | batch_norm )(const aitensor_t *x, int8_t channel_axis, const aitensor_t *means, const aitensor_t *variances, const aitensor_t *offsets, const aitensor_t *scales, const void *eps, aitensor_t *result)
Required math function: Batch Normalization.

void(* | d_batch_norm )(const aitensor_t *x_in, int8_t axis, const aitensor_t *means, const aitensor_t *vars, const aitensor_t *betas, const aitensor_t *gammas, const aitensor_t *delta_out, const void *eps, aitensor_t *delta_in, aitensor_t *d_betas, aitensor_t *d_gammas)
Required math function: Gradients of Batch Normalization.
General Batch Normalization layer structure.
void(* batch_norm) (const aitensor_t *x, int8_t channel_axis, const aitensor_t *means, const aitensor_t *variances, const aitensor_t *offsets, const aitensor_t *scales, const void *eps, aitensor_t *result)
Required math function: Batch Normalization.
Requires a math function that performs Batch Normalization:
\[ y_{i,j} = \mathit{BN}(x_{i,j}) = \gamma_i \cdot \frac{x_{i,j} - \mu_{i}}{\sqrt{\sigma_{i}^2+\epsilon}} + \beta_i \]
x | Input tensor.
channel_axis | Axis of the input tensor that stores the channel dimension.
means | Vector with the means ( \( \mu_i \)) of every channel.
variances | Vector with the variances ( \( \sigma^2_i \)) of every channel.
offsets | Vector with the offset parameters ( \( \beta_i \)) of every channel.
scales | Vector with the scaling parameters ( \( \gamma_i \)) of every channel.
eps | Small constant for numerical stability.
result | The resulting normalized tensor.
void(* d_batch_norm) (const aitensor_t *x_in, int8_t axis, const aitensor_t *means, const aitensor_t *vars, const aitensor_t *betas, const aitensor_t *gammas, const aitensor_t *delta_out, const void *eps, aitensor_t *delta_in, aitensor_t *d_betas, aitensor_t *d_gammas)
Required math function: Gradients of Batch Normalization.
Requires a math function that calculates the derivatives of Batch Normalization with respect to the input and to the trainable parameters ( \( \beta \) and \( \gamma \)).
Please refer to the paper by Ioffe and Szegedy (https://arxiv.org/abs/1502.03167) for the equations of the gradients.
x_in | Input tensor.
axis | Axis of the input tensor that stores the channel dimension.
means | Vector with the means ( \( \mu_i \)) of every channel.
vars | Vector with the variances ( \( \sigma^2_i \)) of every channel.
betas | Vector with the offset parameters ( \( \beta_i \)) of every channel.
gammas | Vector with the scaling parameters ( \( \gamma_i \)) of every channel.
delta_out | Gradient calculated by the output layer for gradient backpropagation.
eps | Small constant for numerical stability.
delta_in | The resulting gradients of the input ( \( \mathrm{d}\mathcal{L} / \mathrm{d}x \)).
d_betas | The resulting gradients of the \( \beta \) parameter ( \( \mathrm{d}\mathcal{L} / \mathrm{d}\beta \)).
d_gammas | The resulting gradients of the \( \gamma \) parameter ( \( \mathrm{d}\mathcal{L} / \mathrm{d}\gamma \)).
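For convenience, the gradients derived in the paper can be stated in their standard collapsed form. Writing \( \hat{x}_{i,j} = (x_{i,j} - \mu_i)/\sqrt{\sigma_i^2 + \epsilon} \) for the normalized input, \( \delta^{out}_{i,j} \) for the incoming gradient, and \( m \) for the number of samples per channel:

\[ \frac{\partial \mathcal{L}}{\partial \gamma_i} = \sum_{j=1}^{m} \delta^{out}_{i,j} \, \hat{x}_{i,j}, \qquad \frac{\partial \mathcal{L}}{\partial \beta_i} = \sum_{j=1}^{m} \delta^{out}_{i,j} \]

\[ \frac{\partial \mathcal{L}}{\partial x_{i,j}} = \frac{\gamma_i}{\sqrt{\sigma_i^2 + \epsilon}} \left( \delta^{out}_{i,j} - \frac{1}{m} \sum_{k=1}^{m} \delta^{out}_{i,k} - \frac{\hat{x}_{i,j}}{m} \sum_{k=1}^{m} \delta^{out}_{i,k} \, \hat{x}_{i,k} \right) \]

The first two sums map directly onto d_gammas and d_betas; the third expression is what the function writes into delta_in.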
void(* empirical_mean_channelwise) (const aitensor_t *x, int8_t channel_axis, aitensor_t *means)
Required math function: Channel-wise empirical mean calculation.
Requires a math function that calculates the empirical mean for each channel of the given axis:
\[ means_i = \frac{1}{m} \sum_{j=1}^{m} x_{i,j} \]
x | Input tensor.
channel_axis | Axis of the input tensor that stores the channel dimension.
means | Resulting mean vector (1D).
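A minimal sketch of this reduction for a 2D, channels-last float tensor (m samples, c channels), with plain arrays standing in for aitensor_t:

```c
/* Channel-wise empirical mean over the sample axis of a 2D,
   channels-last float tensor (m samples x c channels). */
static void empirical_mean_f32(const float *x, int m, int c, float *means)
{
    for (int i = 0; i < c; i++) {
        float sum = 0.0f;
        for (int j = 0; j < m; j++) {
            sum += x[j * c + i];  /* accumulate over samples of channel i */
        }
        means[i] = sum / (float)m;
    }
}
```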
void(* empirical_variance_channelwise) (const aitensor_t *x, int8_t channel_axis, const aitensor_t *means, aitensor_t *variances)
Required math function: Channel-wise empirical variance calculation.
Requires a math function that calculates the empirical variance for each channel of the given axis:
\[ variances_i = \frac{1}{m} \sum_{j=1}^{m} (x_{i,j} - \mu_i)^2 \]
x | Input tensor.
channel_axis | Axis of the input tensor that stores the channel dimension.
means | Channel-wise mean values (1D).
variances | Resulting variance vector (1D).
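The corresponding variance sketch, again for a 2D, channels-last float tensor and reusing precomputed channel means. Note the formula divides by m (the biased estimator), matching the equation above:

```c
/* Channel-wise empirical variance of a 2D, channels-last float tensor
   (m samples x c channels), given precomputed channel means. */
static void empirical_variance_f32(const float *x, int m, int c,
                                   const float *means, float *variances)
{
    for (int i = 0; i < c; i++) {
        float sum = 0.0f;
        for (int j = 0; j < m; j++) {
            float d = x[j * c + i] - means[i];
            sum += d * d;
        }
        variances[i] = sum / (float)m;  /* biased estimator: divide by m */
    }
}
```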
void(* exponential_moving_average) (const aitensor_t *new_data, const void *momentum, aitensor_t *average)
Required math function: Exponential moving average.
Requires a math function that updates the moving average with a new data point:
\[ average \leftarrow momentum \cdot average + (1 - momentum) \cdot newdata \]
new_data | Input tensor with the new data point.
momentum | aiscalar_t that controls the momentum of the average (range [0, 1]).
average | The average that is updated in place (input and output value).
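The update rule above can be sketched element-wise for a float vector; the real function takes aitensor_t operands and a momentum passed through a void pointer:

```c
/* In-place exponential moving average update for a float vector:
   average <- momentum * average + (1 - momentum) * new_data. */
static void exp_moving_average_f32(const float *new_data, int n,
                                   float momentum, float *average)
{
    for (int i = 0; i < n; i++) {
        average[i] = momentum * average[i]
                     + (1.0f - momentum) * new_data[i];
    }
}
```

A momentum close to 1 makes the moving means and variances change slowly across batches, which is why the stored moving_means and moving_variances are stable enough to use at inference time.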