Mixed Precision Training
DynamicLossScalingUpdater
- class nnabla.experimental.mixed_precision_training.DynamicLossScalingUpdater(solver, loss, data_feeder=<function DynamicLossScalingUpdater.<lambda>>, scale=8.0, scaling_factor=2.0, N=2000, clear_buffer=True, accum_grad=1, weight_decay=None, comm=None, grads=[])[source]
Dynamic Loss Scaling Updater for mixed precision training.
- Parameters
  - solver (nnabla.solvers.Solver) -- Solver object, e.g., Momentum or Adam.
  - loss (nnabla.Variable) -- Loss variable from which the forward and the backward passes are called.
  - data_feeder (callable object, function, or lambda) -- Data feeder.
  - scale (float) -- Loss scale constant. This changes dynamically during training.
  - scaling_factor (float) -- Scaling factor for the dynamic loss scaling.
  - N (int) -- Interval; the number of training iterations after which the loss scale is increased by scaling_factor.
  - clear_buffer (bool) -- Clears the no-longer-referenced variables during backpropagation to save memory.
  - accum_grad (int) -- Number of gradient accumulations. The update method of the Solver is called after accum_grad forward and backward passes.
  - weight_decay (float) -- Decay constant. Default is None, meaning weight decay is not applied.
  - comm (nnabla.communicators.Communicator) -- Communicator for distributed training. Default is None.
  - grads (list of nnabla.NdArray) -- List of gradients to be exchanged for distributed training. Default is the empty list.
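The interaction of scale, scaling_factor, and N can be illustrated with a minimal, self-contained sketch of the dynamic loss scaling policy (a hypothetical helper written for illustration, not the nnabla class itself): shrink the scale when a gradient overflows and skip that update; grow it again after N consecutive overflow-free iterations.

```python
import math


class DynamicLossScaler:
    """Illustrative sketch of dynamic loss scaling (not the nnabla API).

    The loss is multiplied by `scale` before backward; gradients must be
    divided by `scale` before the solver update. If any gradient is
    inf/nan, the scale is divided by `scaling_factor` and the update is
    skipped; after N consecutive good steps, the scale is multiplied by
    `scaling_factor`.
    """

    def __init__(self, scale=8.0, scaling_factor=2.0, N=2000):
        self.scale = scale
        self.scaling_factor = scaling_factor
        self.N = N
        self._good_steps = 0

    def step(self, grads):
        """Inspect gradients; return True if the solver update may proceed."""
        if any(math.isinf(g) or math.isnan(g) for g in grads):
            # Overflow detected: shrink the scale and skip this update.
            self.scale /= self.scaling_factor
            self._good_steps = 0
            return False
        self._good_steps += 1
        if self._good_steps >= self.N:
            # Stable for N iterations: try a larger scale.
            self.scale *= self.scaling_factor
            self._good_steps = 0
        return True
```

With the defaults above (scale=8.0, scaling_factor=2.0, N=2000), an overflow halves the scale to 4.0 immediately, and 2000 clean iterations double it again.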
- solver
  Solver object, e.g., Momentum or Adam.
  - Type: nnabla.solvers.Solver
- loss
  Loss variable from which the forward and the backward passes are called.
  - Type: nnabla.Variable
- N
  Interval; the number of training iterations after which the loss scale is increased by scaling_factor.
  - Type: int
- clear_buffer
  Clears the no-longer-referenced variables during backpropagation to save memory.
  - Type: bool
- accum_grad
  Number of gradient accumulations. The update method of the Solver is called after accum_grad forward and backward passes.
  - Type: int
- comm
  Communicator for distributed training.
  - Type: nnabla.communicators.Communicator
- grads
  The list of gradients to be exchanged for distributed training.
  - Type: list of nnabla.NdArray
Example
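A usage sketch (the angle-bracket placeholders must be replaced with real objects from your training script; the updater is assumed to drive one training iteration per call to update()):

```python
solver = <Solver instance, e.g., Momentum or Adam>
loss = <loss Variable of the network>
data_feeder = <callable that feeds a minibatch into the input variables>

updater = DynamicLossScalingUpdater(solver, loss, data_feeder)

# Training loop: each update() performs zero_grad, data feeding,
# forward, backward with the current loss scale, and the solver update,
# adjusting the scale dynamically when gradients overflow.
for itr in range(max_iter):
    updater.update()
```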
Reference: