AIfES 2 (2.0.0)
AIfES Express functions for weights with Q7 (int8) data type.
Functions

uint32_t AIFES_E_flat_weights_number_fnn_q7(uint32_t *fnn_structure, uint32_t layer_count)
    Calculates the required length of the uint8_t array for the FNN.

int8_t AIFES_E_quantisation_fnn_f32_to_q7(aitensor_t *representative_f32_dataset, AIFES_E_model_parameter_fnn_f32 *AIFES_E_fnn, uint8_t *q7_parameter_dataset)
    Quantizes the weights of an F32 FNN into a Q7 FNN.

int8_t AIFES_E_inference_fnn_q7(aitensor_t *input_tensor, AIFES_E_model_parameter_fnn_f32 *AIFES_E_fnn, aitensor_t *output_tensor)
    Executes the inference of a Q7 FNN.
AIfES Express functions for weights with Q7 (int8) data type.
AIfES Express is a beginner-friendly high-level API for AIfES. This file contains all functions needed for neural networks with int8 weights.
uint32_t AIFES_E_flat_weights_number_fnn_q7(uint32_t *fnn_structure, uint32_t layer_count)
Calculates the required length of the uint8_t array for the FNN.
The returned length covers the number of weights plus the additional parameters needed for the fixed-point shifting.

Parameters:
    fnn_structure   The FNN structure
    layer_count     Number of layers
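For illustration, a minimal sketch of sizing the Q7 buffer. The 2-3-1 network structure is a hypothetical example, and the C99 variable-length array stands in for whatever allocation scheme your target uses:

    #include <stdint.h>
    #include <aifes.h>

    // Hypothetical 3-layer FNN: 2 inputs, 3 hidden neurons, 1 output
    uint32_t fnn_structure[3] = {2, 3, 1};

    void size_q7_buffer(void)
    {
        // Number of uint8_t elements needed for the Q7 weights plus
        // the per-layer fixed-point shift and zero-point parameters
        uint32_t q7_length = AIFES_E_flat_weights_number_fnn_q7(fnn_structure, 3);

        // C99 VLA for brevity; use static allocation or malloc() instead
        uint8_t q7_parameter_dataset[q7_length];
        (void)q7_parameter_dataset;
    }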
int8_t AIFES_E_inference_fnn_q7(aitensor_t *input_tensor, AIFES_E_model_parameter_fnn_f32 *AIFES_E_fnn, aitensor_t *output_tensor)
Executes the inference of a Q7 FNN.
Requires the input tensor, the FNN model parameters, and an output tensor for the results. Use the q7_parameter_dataset produced by the quantization as flat_weights. The function takes float data as the input tensor and converts it to Q7 format, performs the inference with the Q7 FNN, and converts the results back to float for the output tensor.
Possible returns:
Example:
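A minimal sketch of a Q7 inference call, not the verbatim example from the AIfES sources. The AITENSOR_2D_F32 macro, the AIFES_E_activations type, and the AIFES_E_model_parameter_fnn_f32 fields (layer_count, fnn_structure, fnn_activations, flat_weights) are assumed from the AIfES Express API; the 2-3-1 network and its data are hypothetical:

    #include <stdint.h>
    #include <aifes.h>

    // Hypothetical 2-3-1 FNN, already quantized so that
    // q7_parameter_dataset holds the flat Q7 parameters
    extern uint32_t fnn_structure[3];
    extern AIFES_E_activations fnn_activations[2];
    extern uint8_t q7_parameter_dataset[];

    void run_q7_inference(void)
    {
        AIFES_E_model_parameter_fnn_f32 fnn;
        fnn.layer_count     = 3;
        fnn.fnn_structure   = fnn_structure;
        fnn.fnn_activations = fnn_activations;
        fnn.flat_weights    = q7_parameter_dataset;  // Q7 buffer from the quantization step

        float input_data[2]  = {1.0f, 0.0f};  // float input; converted to Q7 internally
        float output_data[1];                 // result is converted back to float

        uint16_t input_shape[2]  = {1, 2};    // one sample, two features
        uint16_t output_shape[2] = {1, 1};

        aitensor_t input_tensor  = AITENSOR_2D_F32(input_shape, input_data);
        aitensor_t output_tensor = AITENSOR_2D_F32(output_shape, output_data);

        int8_t err = AIFES_E_inference_fnn_q7(&input_tensor, &fnn, &output_tensor);
        if (err != 0) {
            // inspect the error code (see "Possible returns")
        }
    }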
Parameters:
    input_tensor    Tensor with the inputs
    AIFES_E_fnn     The FNN model parameters
    output_tensor   Tensor for the results
int8_t AIFES_E_quantisation_fnn_f32_to_q7(aitensor_t *representative_f32_dataset, AIFES_E_model_parameter_fnn_f32 *AIFES_E_fnn, uint8_t *q7_parameter_dataset)
Quantizes the weights of an F32 FNN into a Q7 FNN.
The representative dataset should cover the minimum and maximum values of the inputs. It is needed to calculate the shift and the zero point of each layer.

Warning: The Q7 quantized buffer can only be used on architectures with the same AIFES_MEMORY_ALIGNMENT. The inference might crash if you export the buffer to an architecture with a different AIFES_MEMORY_ALIGNMENT.
Possible returns:
Example:
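A minimal sketch of the quantization step under the same assumptions as above (AITENSOR_2D_F32, the AIfES_E_sigmoid activation constant, and the AIFES_E_model_parameter_fnn_f32 fields are assumed from the AIfES Express API; the network, weights, and dataset are hypothetical):

    #include <stdint.h>
    #include <aifes.h>

    void quantize_to_q7(void)
    {
        // Hypothetical 2-3-1 FNN with trained float weights
        // (2*3 weights + 3 biases + 3*1 weights + 1 bias = 13 parameters)
        uint32_t fnn_structure[3] = {2, 3, 1};
        AIFES_E_activations fnn_activations[2] = {AIfES_E_sigmoid, AIfES_E_sigmoid};
        float flat_weights[13] = {0};  // replace with the trained F32 weights

        AIFES_E_model_parameter_fnn_f32 fnn;
        fnn.layer_count     = 3;
        fnn.fnn_structure   = fnn_structure;
        fnn.fnn_activations = fnn_activations;
        fnn.flat_weights    = flat_weights;  // F32 weights as quantization input

        // Representative dataset covering the input min/max (4 samples, 2 features)
        float rep_data[4 * 2] = {0.0f, 0.0f,  0.0f, 1.0f,  1.0f, 0.0f,  1.0f, 1.0f};
        uint16_t rep_shape[2] = {4, 2};
        aitensor_t representative_f32_dataset = AITENSOR_2D_F32(rep_shape, rep_data);

        // Size and allocate the Q7 buffer, then quantize into it
        uint32_t q7_length = AIFES_E_flat_weights_number_fnn_q7(fnn_structure, 3);
        uint8_t q7_parameter_dataset[q7_length];  // C99 VLA for brevity

        int8_t err = AIFES_E_quantisation_fnn_f32_to_q7(&representative_f32_dataset,
                                                        &fnn, q7_parameter_dataset);
        if (err != 0) {
            // inspect the error code (see "Possible returns")
        }
    }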
Parameters:
    representative_f32_dataset   Tensor with representative input data
    AIFES_E_fnn                  The FNN model parameters
    q7_parameter_dataset         Array that receives the quantized Q7 parameters