Mixed Precision Training
DynamicLossScalingUpdater
- class nnabla.experimental.mixed_precision_training.DynamicLossScalingUpdater(solver, loss, data_feeder=<function DynamicLossScalingUpdater.<lambda>>, scale=8.0, scaling_factor=2.0, N=2000, clear_buffer=True, accum_grad=1, weight_decay=None, comm=None, grads=[])
Dynamic Loss Scaling Updater for the mixed precision training.
- Parameters:
  - solver (nnabla.solvers.Solver) – Solver object, e.g., Momentum or Adam.
  - loss (nnabla.Variable) – Loss variable from which the forward and backward passes are called.
  - data_feeder (callable object, function, or lambda) – Data feeder.
  - scale (float) – Loss scale constant. This changes dynamically during training.
  - scaling_factor (float) – Scaling factor for the dynamic loss scaling (see the sketch after this list).
  - N (int) – Interval, the number of training iterations after which the loss scale is increased by scaling_factor.
  - clear_buffer (bool) – Clears the no longer referenced variables during backpropagation to save memory.
  - accum_grad (int) – Number of gradient accumulations. The update method of the solver is called after the forward and backward passes have been called accum_grad times.
  - weight_decay (float) – Decay constant. Default is None, meaning weight decay is not applied.
  - comm (nnabla.communicators.Communicator) – Communicator used for distributed training. Default is None.
  - grads (list of nnabla.NdArray) – The list of gradients to be exchanged in distributed training. Default is the empty list.
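The scale, scaling_factor, and N parameters together follow the usual dynamic loss scaling rule: the loss is multiplied by scale before the backward pass, the gradients are divided by scale afterwards, and the scale itself is adjusted depending on whether the scaled gradients overflowed. The following is a minimal NumPy sketch of that adjustment rule only, not NNabla's actual implementation; the function name and return convention are illustrative.

```python
import numpy as np

def adjust_loss_scale(grads, scale, scaling_factor, N, good_iters):
    """Sketch of the dynamic loss scaling adjustment rule.

    grads: gradients already divided by `scale`.
    Returns (apply_update, new_scale, new_good_iters).
    """
    overflow = any(not np.all(np.isfinite(g)) for g in grads)
    if overflow:
        # Inf/NaN in the gradients: shrink the scale and skip this update.
        return False, scale / scaling_factor, 0
    good_iters += 1
    if good_iters >= N:
        # N consecutive clean iterations: grow the scale again.
        return True, scale * scaling_factor, 0
    return True, scale, good_iters
```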
- solver
  Solver object, e.g., Momentum or Adam.
  - Type: nnabla.solvers.Solver
- loss
  Loss variable from which the forward and backward passes are called.
  - Type: nnabla.Variable
- N
  Interval, the number of training iterations after which the loss scale is increased by scaling_factor.
  - Type: int
- clear_buffer
  Clears the no longer referenced variables during backpropagation to save memory.
  - Type: bool
- accum_grad
  Number of gradient accumulations. The update method of the solver is called after the forward and backward passes have been called accum_grad times.
  - Type: int
- comm
  Communicator used for distributed training.
  - Type: nnabla.communicators.Communicator
- grads
  The list of gradients to be exchanged in distributed training.
  - Type: list of nnabla.NdArray
Example
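A minimal usage sketch, assuming a toy regression graph, random data, and that updater.update() drives one training iteration as the attribute descriptions above suggest; the network, data feeder, and iteration count are illustrative. In real mixed precision training you would also select a half-precision extension context (e.g., nnabla.ext_utils.get_extension_context('cudnn', type_config='half')) before building the graph.

```python
import numpy as np
import nnabla as nn
import nnabla.functions as F
import nnabla.parametric_functions as PF
import nnabla.solvers as S
from nnabla.experimental.mixed_precision_training import DynamicLossScalingUpdater

# Toy graph: linear regression on random data (illustrative only).
x = nn.Variable((32, 16))
t = nn.Variable((32, 1))
y = PF.affine(x, 1)
loss = F.mean(F.squared_error(y, t))

solver = S.Momentum(lr=0.01)
solver.set_parameters(nn.get_parameters())

def data_feeder():
    # Fill the input variables before each forward pass.
    x.d = np.random.randn(*x.shape)
    t.d = np.random.randn(*t.shape)

updater = DynamicLossScalingUpdater(solver, loss, data_feeder=data_feeder)

for itr in range(1000):
    # Each call runs zero_grad, data_feeder, forward, backward, and
    # solver.update, applying and adjusting the loss scale dynamically.
    updater.update()
```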