AIfES 2 2.0.0
General Batch Normalization layer structure.
#include <ailayer_batch_normalization.h>
Data Fields

| ailayer_t base | Inherited field members from the general ailayer struct. |
Layer configuration

Required configuration parameters for the layer. These fields have to be configured by the user before calling the initializer function.

| int8_t channel_axis | Index of the channel axis (1 for channels first, -1 for channels last). |
| void *momentum | Momentum for the exponential moving average of means and variances. |
| void *eps | Small constant for numerical stability. |
Trainable parameters

Data fields for the trainable parameters (beta, gamma) of the layer.

| aitensor_t betas | Vector of the shift parameters ( \( \beta_i \) ). |
| aitensor_t gammas | Vector of the scale parameters ( \( \gamma_i \) ). |
| aitensor_t moving_means | Vector of the moving averages of the means (required for inference). |
| aitensor_t moving_variances | Vector of the moving averages of the variances (required for inference). |
| aitensor_t *trainable_params[2] | Pointers to \( \beta \) and \( \gamma \) (the trainable parameters). |
| aitensor_t *gradients[2] | Gradients of \( \beta \) and \( \gamma \) for the backpropagation algorithm. |
| void *optimem[2] | Memory field used by the training optimizer. |
Variables for internal computation

These fields are automatically configured in the initializer function.

| uint16_t parameter_shape[1] | Shape of the parameter vectors ( \( \beta, \gamma, \mu, \sigma^2 \) ). |
| aitensor_t *means | Vector of the means ( \( \mu_i \) ). |
| aitensor_t *variances | Vector of the variances ( \( \sigma^2_i \) ). |
Math functions

Required data-type-specific math functions.

| void (*empirical_mean_channelwise)(const aitensor_t *x, int8_t channel_axis, aitensor_t *means) | Required math function: Channel-wise empirical mean calculation. |
| void (*empirical_variance_channelwise)(const aitensor_t *x, int8_t channel_axis, const aitensor_t *means, aitensor_t *variances) | Required math function: Channel-wise empirical variance calculation. |
| void (*exponential_moving_average)(const aitensor_t *new_data, const void *momentum, aitensor_t *average) | Required math function: Exponential moving average. |
| void (*batch_norm)(const aitensor_t *x, int8_t channel_axis, const aitensor_t *means, const aitensor_t *variances, const aitensor_t *offsets, const aitensor_t *scales, const void *eps, aitensor_t *result) | Required math function: Batch Normalization. |
| void (*d_batch_norm)(const aitensor_t *x_in, int8_t axis, const aitensor_t *means, const aitensor_t *vars, const aitensor_t *betas, const aitensor_t *gammas, const aitensor_t *delta_out, const void *eps, aitensor_t *delta_in, aitensor_t *d_betas, aitensor_t *d_gammas) | Required math function: Gradients of Batch Normalization. |
General Batch Normalization layer structure.
| void(* batch_norm) (const aitensor_t *x, int8_t channel_axis, const aitensor_t *means, const aitensor_t *variances, const aitensor_t *offsets, const aitensor_t *scales, const void *eps, aitensor_t *result) |
Required math function: Batch Normalization.
Requires a math function that performs Batch Normalization:
\[ y_{i,j} = \mathit{BN}(x_{i,j}) = \gamma_i \cdot \frac{x_{i,j} - \mu_{i}}{\sqrt{\sigma_{i}^2+\epsilon}} + \beta_i \]
| x | Input tensor. |
| channel_axis | Axis of the input tensor that stores the channel dimension. |
| means | Vector with the means ( \( \mu_i \)) of every channel. |
| variances | Vector with the variances ( \( \sigma^2_i \)) of every channel. |
| offsets | Vector with the offset parameters ( \( \beta_i \)) of every channel. |
| scales | Vector with the scaling parameters ( \( \gamma_i \)) of every channel. |
| eps | Small constant for numerical stability. |
| result | The resulting normalized tensor. |
| void(* d_batch_norm) (const aitensor_t *x_in, int8_t axis, const aitensor_t *means, const aitensor_t *vars, const aitensor_t *betas, const aitensor_t *gammas, const aitensor_t *delta_out, const void *eps, aitensor_t *delta_in, aitensor_t *d_betas, aitensor_t *d_gammas) |
Required math function: Gradients of Batch Normalization.
Requires a math function that calculates the derivative of the Batch Normalization with respect to the input and the trainable parameters ( \( \beta \) and \( \gamma \)).
Please refer to the paper by Ioffe and Szegedy (https://arxiv.org/abs/1502.03167) for the equations of the gradients.
| x_in | Input tensor. |
| axis | Axis of the input tensor that stores the channel dimension. |
| means | Vector with the means ( \( \mu_i \) ) of every channel. |
| vars | Vector with the variances ( \( \sigma^2_i \) ) of every channel. |
| betas | Vector with the offset parameters ( \( \beta_i \) ) of every channel. |
| gammas | Vector with the scaling parameters ( \( \gamma_i \) ) of every channel. |
| delta_out | Gradient calculated by the output layer for gradient backpropagation. |
| eps | Small constant for numerical stability. |
| delta_in | The resulting gradients of the input ( \( \mathrm{d}\mathcal{L} / \mathrm{d}x \)). |
| d_betas | The resulting gradients of the \( \beta \) parameter ( \( \mathrm{d}\mathcal{L} / \mathrm{d}\beta \)). |
| d_gammas | The resulting gradients of the \( \gamma \) parameter ( \( \mathrm{d}\mathcal{L} / \mathrm{d}\gamma \)). |
| void(* empirical_mean_channelwise) (const aitensor_t *x, int8_t channel_axis, aitensor_t *means) |
Required math function: Channel-wise empirical mean calculation.
Requires a math function that calculates the empirical mean for each channel of the given axis:
\[ means_i = \frac{1}{m} \sum_{j=1}^{m} x_{i,j} \]
| x | Input tensor. |
| channel_axis | Axis of the input tensor that stores the channel dimension. |
| means | Resulting mean vector (1D). |
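For a float32 tensor of shape `[n][c]` with channels last, this reduces to the following sketch (flat-array interface for illustration only):

```c
/* Sketch: channel-wise empirical mean over a [n][c] float32 tensor
 * (channels last): means[i] = (1/n) * sum_j x[j][i] */
static void empirical_mean_channelwise_f32(const float *x, int n, int c,
                                           float *means)
{
    for (int i = 0; i < c; i++) {
        float sum = 0.0f;
        for (int j = 0; j < n; j++)
            sum += x[j * c + i];
        means[i] = sum / (float)n;
    }
}
```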
| void(* empirical_variance_channelwise) (const aitensor_t *x, int8_t channel_axis, const aitensor_t *means, aitensor_t *variances) |
Required math function: Channel-wise empirical variance calculation.
Requires a math function that calculates the empirical variance for each channel of the given axis:
\[ variances_i = \frac{1}{m} \sum_{j=1}^{m} (x_{i,j} - \mu_i)^2 \]
| x | Input tensor. |
| channel_axis | Axis of the input tensor that stores the channel dimension. |
| means | Channel-wise mean values (1D). |
| variances | Resulting variance vector (1D). |
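As above, for a `[n][c]` channels-last float32 tensor this is a straightforward sketch (illustrative flat-array interface; note this is the biased variance, dividing by \( m \), as in the formula above):

```c
/* Sketch: channel-wise empirical (biased) variance over a [n][c] float32
 * tensor (channels last): variances[i] = (1/n) * sum_j (x[j][i] - means[i])^2 */
static void empirical_variance_channelwise_f32(const float *x, int n, int c,
                                               const float *means,
                                               float *variances)
{
    for (int i = 0; i < c; i++) {
        float sum = 0.0f;
        for (int j = 0; j < n; j++) {
            float d = x[j * c + i] - means[i];
            sum += d * d;
        }
        variances[i] = sum / (float)n;
    }
}
```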
| void(* exponential_moving_average) (const aitensor_t *new_data, const void *momentum, aitensor_t *average) |
Required math function: Exponential moving average.
Requires a math function that updates the moving average with a new data point:
\[ average \leftarrow momentum \cdot average + (1 - momentum) \cdot newdata \]
| new_data | Input tensor with the new data point. |
| momentum | aiscalar_t that controls the momentum of the average (range [0, 1]). |
| average | The average that is updated in place (input and output value). |
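This update rule can be sketched element-wise for a float32 vector as follows (flat-array interface for illustration; the real function takes `aitensor_t` tensors and a `void *` scalar):

```c
/* Sketch: in-place exponential moving average update,
 * average <- momentum * average + (1 - momentum) * new_data */
static void exponential_moving_average_f32(const float *new_data, int len,
                                           float momentum, float *average)
{
    for (int k = 0; k < len; k++)
        average[k] = momentum * average[k] + (1.0f - momentum) * new_data[k];
}
```

A momentum close to 1 makes the moving statistics change slowly and smooths out batch-to-batch noise, which is what the layer relies on for stable inference-time `moving_means` and `moving_variances`.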