AIfES 2  2.0.0
aiopti_sgd.h File Reference

Base optimizer implementation of the Stochastic Gradient Descent (with momentum) optimizer. More...

Go to the source code of this file.

Data Structures

struct  aiopti_sgd
 General Stochastic Gradient Descent (SGD) optimizer struct. More...
 

Typedefs

typedef struct aiopti_sgd aiopti_sgd_t
 Shorter type name for struct aiopti_sgd.
 

Functions

aiopti_t * aiopti_sgd (aiopti_sgd_t *opti)
 Initialize the given SGD optimizer. More...
 
uint32_t aiopti_sgd_sizeof_optimem_with_momentum (aiopti_t *self, const aitensor_t *params)
 Calculates the required memory for the optimization step when the momentum is not zero. More...
 
uint32_t aiopti_sgd_sizeof_optimem_without_momentum (aiopti_t *self, const aitensor_t *params)
 Calculates the required memory for the optimization step when the momentum is zero. More...
 
void aiopti_sgd_init_optimem_with_momentum (aiopti_t *self, const aitensor_t *params, const aitensor_t *gradients, void *optimem)
 Initialization of the optimization memory buffer when the momentum is not zero. More...
 
void aiopti_sgd_init_optimem_without_momentum (aiopti_t *self, const aitensor_t *params, const aitensor_t *gradients, void *optimem)
 Initialization of the optimization memory buffer when the momentum is zero. More...
 
void aiopti_sgd_zero_gradients (aiopti_t *self, aitensor_t *gradients)
 Set the gradients to zero. More...
 
void aiopti_sgd_update_params_with_momentum (aiopti_t *self, aitensor_t *params, const aitensor_t *gradients, void *optimem)
 Update the given parameter tensor with respect to the gradients when the momentum is not zero. More...
 
void aiopti_sgd_update_params_without_momentum (aiopti_t *self, aitensor_t *params, const aitensor_t *gradients, void *optimem)
 Update the given parameter tensor with respect to the gradients when the momentum is zero. More...
 
void aiopti_sgd_print_specs (const aiopti_t *self)
 Print the optimizer specification. More...
 

Variables

const aicore_optitype_t * aiopti_sgd_type
 SGD optimizer type. More...
 

Detailed Description

Base optimizer implementation of the Stochastic Gradient Descent (with momentum) optimizer.

Version
2.2.0

This is an "abstract", data-type-independent implementation. To use the optimizer, choose one of the provided implementations for a specific hardware and data type (for example from aiopti_sgd_default.h), or set the required math functions yourself.

The Stochastic Gradient Descent (SGD) optimizer is the most basic optimizer used in backpropagation-based training. It applies the pre-calculated gradients to optimize the given parameters. In addition, a momentum term can be configured.
For every parameter \( p \) of the parameters to optimize (trainable parameters) and the related gradient \( g \) it calculates

\[ p_t = p_{t-1} - lr \cdot g_t \]

if the momentum \( \mu = 0 \) and

\[ v_t = \mu \cdot v_{t-1} + g_t \]

\[ p_t = p_{t-1} - lr \cdot v_t \]

if the momentum \( \mu \neq 0 \) in every optimization step.
\( lr \) is the learning rate that defines how big the optimization steps should be, and therefore how fast the training will be. \( v \) is the momentum term or velocity related to the parameter and must be stored in the optimization memory for every parameter when momentum is set.

Function Documentation

◆ aiopti_sgd()

Initialize the given SGD optimizer.

This function represents the "constructor" of the abstract SGD optimizer.
It is not intended to be called directly. Instead, use one of the data-type-specific implementations (for example aiopti_sgd_f32_default()).

Parameters
*opti - The optimizer to initialize.
Returns
Pointer to the (successfully) initialized general optimizer structure (aiopti_sgd.base)

◆ aiopti_sgd_init_optimem_with_momentum()

void aiopti_sgd_init_optimem_with_momentum ( aiopti_t *self,
const aitensor_t *params,
const aitensor_t *gradients,
void *optimem 
)

Initialization of the optimization memory buffer when the momentum is not zero.

Implementation of aiopti.init_optimem.

Initialize the velocity tensor with zeros:

\[ v_{0,i} \leftarrow 0 \]

Used math functions:

Parameters
*self - The optimizer
*params - The tensor of trainable parameters
*gradients - The gradients associated with the parameters
*optimem - The optimization memory (containing the velocities) associated with the parameters

◆ aiopti_sgd_init_optimem_without_momentum()

void aiopti_sgd_init_optimem_without_momentum ( aiopti_t *self,
const aitensor_t *params,
const aitensor_t *gradients,
void *optimem 
)

Initialization of the optimization memory buffer when the momentum is zero.

Implementation of aiopti.init_optimem.

Does nothing because no optimization memory is needed in this case.

Parameters
*self - The optimizer
*params - The tensor of trainable parameters
*gradients - The gradients associated with the parameters
*optimem - The optimization memory associated with the parameters

◆ aiopti_sgd_print_specs()

void aiopti_sgd_print_specs ( const aiopti_t *self )

Print the optimizer specification.

Parameters
*self - The optimizer to print the specification for

◆ aiopti_sgd_sizeof_optimem_with_momentum()

uint32_t aiopti_sgd_sizeof_optimem_with_momentum ( aiopti_t *self,
const aitensor_t *params 
)

Calculates the required memory for the optimization step when the momentum is not zero.

Implementation of aiopti.sizeof_optimem.

Calculates the size of the memory space that must be reserved. The memory is used for the velocity tensor (momentum term) and is calculated by:

sizeof(aitensor) + sizeof(params.data)
Parameters
*self - The optimizer
*params - The tensor of trainable parameters to calculate the memory for

◆ aiopti_sgd_sizeof_optimem_without_momentum()

uint32_t aiopti_sgd_sizeof_optimem_without_momentum ( aiopti_t *self,
const aitensor_t *params 
)

Calculates the required memory for the optimization step when the momentum is zero.

Implementation of aiopti.sizeof_optimem.

Calculates the size of the memory space that must be reserved. The required memory is zero because no velocity term is needed.

Parameters
*self - The optimizer
*params - The tensor of trainable parameters to calculate the memory for

◆ aiopti_sgd_update_params_with_momentum()

void aiopti_sgd_update_params_with_momentum ( aiopti_t *self,
aitensor_t *params,
const aitensor_t *gradients,
void *optimem 
)

Update the given parameter tensor with respect to the gradients when the momentum is not zero.

Implementation of aiopti.update_params.

Calculate and update the values of the trainable parameters (perform one update step):

\[ v_t \leftarrow \mu \cdot v_{t-1} + g_t \]

\[ p_t \leftarrow p_{t-1} - lr \cdot v_t \]

\( v \): Velocity tensor (momentum term), stored in the optimem
\( p \): Tensor of trainable parameters to update (params)
\( g \): Gradients
\( lr \): Learning rate / Optimization step size
\( \mu \): Momentum

Used math functions:

Parameters
*self - The optimizer
*params - The tensor of trainable parameters \( p \) to update
*gradients - The gradients \( g \) associated with the parameters
*optimem - The buffer to store the velocity \( v \)

◆ aiopti_sgd_update_params_without_momentum()

void aiopti_sgd_update_params_without_momentum ( aiopti_t *self,
aitensor_t *params,
const aitensor_t *gradients,
void *optimem 
)

Update the given parameter tensor with respect to the gradients when the momentum is zero.

Implementation of aiopti.update_params.

Calculate and update the values of the trainable parameters (perform one update step):

\[ p_t \leftarrow p_{t-1} - lr \cdot g_t \]

Used math functions:

  • aiopti_sgd.scalar_mul
  • aiopti_sgd.tensor_sub

\( p \): Tensor of trainable parameters to update (params)
\( g \): Gradients
\( lr \): Learning rate / Optimization step size

Parameters
*self - The optimizer
*params - The tensor of trainable parameters \( p \) to update
*gradients - The gradients \( g \) associated with the parameters
*optimem - Not required because no velocity is stored

◆ aiopti_sgd_zero_gradients()

void aiopti_sgd_zero_gradients ( aiopti_t *self,
aitensor_t *gradients 
)

Set the gradients to zero.

Implementation of aiopti.zero_gradients.

\[ g_{i} \leftarrow 0 \]

Used math functions:

Parameters
*self - The optimizer
*gradients - The gradients to set to zero

Variable Documentation

◆ aiopti_sgd_type

const aicore_optitype_t* aiopti_sgd_type
extern

SGD optimizer type.

Defines the type of the optimizer (for example for type checks and debug prints). See aicore_optitype for more information about the optimizer type.