Quantization Aware Training¶
QATConfig¶
Configuration for quantization aware training.
- class nnabla.utils.qnn.QATConfig[source]¶
Bases: object
- class RecorderPosition(value)[source]¶
Bases: enum.Enum
Position at which a recorder is added relative to a function.
- BEFORE = 0¶
Add a recorder only before a function.
- BOTH = 1¶
Add recorders both before and after a function.
- class RoundingMethod(value)[source]¶
Bases: enum.Enum
Rounding method for the scale.
- CEIL = 'CEIL'¶
Round up, e.g. ceil(9.4) = 10.
- FLOOR = 'FLOOR'¶
Round down, e.g. floor(9.5) = 9.
- NOTROUND = 'NOTROUND'¶
Do not round.
- ROUND = 'ROUND'¶
Round to nearest, e.g. round(9.4) = 9, round(9.5) = 10.
- bn_folding = False¶
Enable Batch Normalization folding. Note that this can sometimes cause the training to become unstable.
- bn_self_folding = False¶
Enable Batch Normalization self-folding. Note that this can sometimes cause the training to become unstable.
- channel_last = False¶
Enable channel-last layout (only channel-first is currently supported).
- channel_wise = False¶
Enable channel-wise quantization (a separate scale per channel rather than per tensor).
- ext_name = 'cudnn'¶
Extension context: 'cpu', 'cuda' or 'cudnn'.
- learning_rate_scale = 0.1¶
QAT learning rate = non-QNN learning rate * learning_rate_scale. Setting it to 0.1 or 0.01 is recommended (see the configuration sketch after this attribute list).
- narrow_range = False¶
Narrow the lower bound of the quantization range (e.g., for int8, -128 -> -127).
- niter_to_recording = 0¶
Step at which recording starts.
- niter_to_training = -1¶
Step at which QAT training starts. The number of steps between recording and training should be greater than the number of steps in one training epoch.
- pow2 = 'ROUND'¶
Member of nnabla.utils.qnn.QATConfig.RoundingMethod. Round the scale to a power of 2. Enable this if you want to deploy the model with TensorRT.
- record_layers = []¶
List of nnabla function names specifying the layers to record. If empty, recorders are added to all layers; otherwise, recorders are only added to the functions listed in record_layers.
- recorder_activation¶
One of nnabla.utils.qnn.MinMaxMinMaxRecorderCallback, nnabla.utils.qnn.AbsMaxRecorderCallback, nnabla.utils.qnn.MinMaxMvaRecorderCallback, nnabla.utils.qnn.MaxMaxRecorderCallback, nnabla.utils.qnn.MaxMvaRecorderCallback.
Recorder of activations.
Alias of nnabla.utils.qnn.MaxMvaRecorderCallback.
- recorder_position = 0¶
Member of nnabla.utils.qnn.QATConfig.RecorderPosition. Recorder position.
- recorder_weight¶
One of nnabla.utils.qnn.MinMaxMinMaxRecorderCallback, nnabla.utils.qnn.AbsMaxRecorderCallback, nnabla.utils.qnn.MinMaxMvaRecorderCallback, nnabla.utils.qnn.MaxMaxRecorderCallback, nnabla.utils.qnn.MaxMvaRecorderCallback.
Recorder of weights.
Alias of nnabla.utils.qnn.MinMaxMinMaxRecorderCallback.
- round_mode = 'HALF_TO_EVEN'¶
Rounding mode of the quantize layer.
- skip_bias = False¶
Skip quantizing the bias of Affine and of the Convolution function family.
- skip_inputs_layers = ['Convolution', 'Deconvolution']¶
List of nnabla function names. Skip quantizing the input layers of the network.
- skip_outputs_layers = ['Affine']¶
List of nnabla function names. Skip quantizing the output layers of the network.
- zero_point = False¶
Use a zero-point (asymmetric quantization) or not (symmetric quantization).
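All of the attributes above are plain fields on QATConfig, so a custom configuration is built by instantiating the class and overriding the defaults before passing it to QATScheduler. Below is a minimal sketch; the concrete values are illustrative only, and whether pow2 stores the RoundingMethod member or its string value is an assumption here.

from nnabla.utils.qnn import QATConfig

config = QATConfig()

# Fold batch normalization into the preceding layers before quantization.
config.bn_folding = True
config.bn_self_folding = True

# Record statistics from step 0 and switch to QAT training at step 1000
# (illustrative values; keep the gap larger than one training epoch).
config.niter_to_recording = 0
config.niter_to_training = 1000

# Round scales to a power of 2 (recommended above for TensorRT deployment)
# and keep symmetric quantization.
config.pow2 = QATConfig.RoundingMethod.ROUND  # assumption: the string value 'ROUND' may also be accepted
config.zero_point = False

# Only add recorders to Convolution and Affine functions (an empty list means all layers).
config.record_layers = ['Convolution', 'Affine']

# Scale the learning rate used during QAT fine-tuning.
config.learning_rate_scale = 0.1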
QATTensorRTConfig¶
The default quantization aware training configuration that meets the requirements of TensorRT.
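QATTensorRTConfig is also the default value of the config argument of QATScheduler below, so the TensorRT-oriented settings can be used either implicitly or by passing an instance explicitly. A brief sketch, assuming solver is an existing nnabla solver and that QATTensorRTConfig takes no constructor arguments:

from nnabla.utils.qnn import QATScheduler, QATTensorRTConfig

# Explicitly pass the TensorRT-oriented defaults ...
qat_scheduler = QATScheduler(config=QATTensorRTConfig(), solver=solver)

# ... which should be equivalent to relying on the default config argument.
qat_scheduler = QATScheduler(solver=solver)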
QATScheduler¶
- class nnabla.utils.qnn.QATScheduler(config=<nnabla.utils.qnn.QATTensorRTConfig object>, solver=None)[source]¶
Bases: object
Scheduler for quantization aware training.
- Parameters
config (QATConfig) -- Quantization-Aware-Training configuration.
solver (nnabla.solver.Solver) -- Neural network solver.
Example
from nnabla.utils.qnn import QATScheduler, QATConfig, PrecisionMode

# Set configuration
config = QATConfig()
config.bn_folding = True
config.bn_self_folding = True
config.channel_last = False
config.precision_mode = PrecisionMode.SIM_QNN
config.niter_to_recording = 1
config.niter_to_training = 500

qat_scheduler = QATScheduler(config=config, solver=solver)

# Convert the graphs to enable quantization aware training.
qat_scheduler(pred)  # pred is the output variable of the training network
qat_scheduler(vpred, training=False)  # vpred is the output variable of the evaluation network

# Training loop
for i in range(training_step):
    qat_scheduler.step()
    # Your training code here

# Save the quantized nnp
qat_scheduler.save('qnn.np', vimage, deploy=False)  # vimage is the input variable of the network
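The "Your training code here" placeholder above is an ordinary nnabla training step; only qat_scheduler.step() is specific to QAT. A hedged sketch of what the loop body might look like, assuming hypothetical variables loss, image, label and a data iterator data that are not defined in this section, and applying the learning_rate_scale convention described above:

base_lr = 0.001  # hypothetical base learning rate of the non-QNN training
solver.set_learning_rate(base_lr * config.learning_rate_scale)

for i in range(training_step):
    qat_scheduler.step()  # switches from recording to training at the configured steps

    image.d, label.d = data.next()  # hypothetical data iterator
    loss.forward()
    solver.zero_grad()
    loss.backward()
    solver.update()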
- save(fname, inputs, batch_size=1, net_name='net', deploy=False)[source]¶
Save the QAT network model, by default to an NNP file.
- Parameters
fname (str) -- NNP file name.
inputs (nnabla.Variable or list of nnabla.Variable) -- Network input variables.
batch_size (int) -- Batch size.
net_name (str) -- Network name.
deploy (bool) -- Whether to apply QNN deployment conversion. deploy=True is not supported yet.
- Returns
None
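For example, the scheduler from the example above could be exported like this (vimage is the hypothetical input variable used there; deploy is kept False because deployment conversion is not supported yet):

qat_scheduler.save('qnn.nnp', vimage, batch_size=1, net_name='net', deploy=False)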