class nbla::INQAffine

template<typename T, typename T1>
class INQAffine : public nbla::BaseFunction<int, int, const vector<int>&, const string&, int>

This function implements an INQ affine layer.

During training, the weights are sequentially quantized to power-of-two values, which allows the training of a multiplierless network.

Using `inq_iterations`, one can specify after how many forward passes half of the learnable weights are fixed and quantized to powers-of-two. After reaching the last value in `inq_iterations`, all weights are fixed.

Please note that the weights are quantized in the forward pass. Therefore, to ensure that only power-of-two values remain, a final call to `forward` is required, as the weights may have been updated by the solver since the last quantization.

For more details, please refer to the reference.

Reference: Zhou A, Yao A, Guo Y, Xu L, Chen Y. Incremental network quantization: Towards lossless CNNs with low-precision weights. https://arxiv.org/abs/1702.03044
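The power-of-two quantization described above can be sketched as a stand-alone function. This is illustrative only, not nbla's actual implementation; the exponent bounds `n1`/`n2` (derived from `num_bits` in the paper) and the rounding rule are assumptions here. Each weight is mapped to the nearest value in \(\{0, \pm 2^{n_2}, ..., \pm 2^{n_1}\}\):

```cpp
#include <cassert>
#include <cmath>

// Hypothetical sketch of INQ's power-of-two quantization (not nbla's code).
// Rounds |w| to the nearest power of two with exponent in [n2, n1], keeping
// the sign; magnitudes below the smallest representable power are set to 0.
double quantize_pow2(double w, int n1, int n2) {
  if (w == 0.0) return 0.0;
  double sign = w < 0 ? -1.0 : 1.0;
  double a = std::fabs(w);
  // The boundary between 2^e and 2^(e+1) in linear distance is 1.5 * 2^e,
  // so floor(log2(4a/3)) gives the exponent of the nearest power of two.
  int e = static_cast<int>(std::floor(std::log2(a * 4.0 / 3.0)));
  if (e > n1) e = n1;       // clamp to the largest representable power
  if (e < n2) return 0.0;   // too small: quantize to zero
  return sign * std::ldexp(1.0, e);
}
```

For example, with exponents in [-4, 4], a weight of 1.4 quantizes to 1.0 (nearest power of two), while -3.1 quantizes to -4.0.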

Inputs ( \(B\) is base_axis):

  • Input N-D array with shape ( \(M_0 \times ... \times M_{B-1} \times D_B \times ... \times D_N\)). Dimensions before and after base_axis are flattened as if the array were a 2-D matrix.

  • Weight matrix with shape ( \((D_B \times ... \times D_N) \times L\))

  • Indicator matrix with shape ( \((D_B \times ... \times D_N) \times L\)) where `0` indicates learnable weights and `1` indicates fixed weights

  • (optional) Bias vector ( \(L\))

Outputs:

  • \((B + 1)\)-D array. ( \( M_0 \times ... \times M_{B-1} \times L \))
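The shape bookkeeping above can be sketched with two plain helper functions (illustrative, independent of the nbla API): the leading dimensions \(M_0 ... M_{B-1}\) are kept, while the trailing dimensions \(D_B ... D_N\) collapse into the row count of the weight matrix.

```cpp
#include <functional>
#include <numeric>
#include <vector>

// Sketch: derive INQAffine's output shape from the input shape, base_axis,
// and output size L (shape bookkeeping only, no nbla calls).
std::vector<int> affine_output_shape(const std::vector<int> &in_shape,
                                     int base_axis, int L) {
  std::vector<int> out(in_shape.begin(), in_shape.begin() + base_axis);
  out.push_back(L);  // trailing dims D_B .. D_N are replaced by L
  return out;
}

// Number of rows of the weight matrix: product of D_B .. D_N.
int weight_rows(const std::vector<int> &in_shape, int base_axis) {
  return std::accumulate(in_shape.begin() + base_axis, in_shape.end(), 1,
                         std::multiplies<int>());
}
```

For instance, with base_axis = 1, an input of shape (8, 3, 5) is treated as an 8 x 15 matrix, the weight matrix has shape 15 x L, and the output has shape (8, L).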

Template Parameters:

T – Data type for computation.

T1 – Data type of the indicator matrix (typically an integer type).

Param base_axis:

Base axis of the INQAffine operation. Dimensions up to base_axis are treated as sample dimensions.

Param num_bits:

Number of bits per weight. Needs to be >= 2.

Param inq_iterations:

Vector of integers specifying after how many forward passes half of the remaining learnable weights are fixed. For example, `{500, 1000, 1500}` fixes 50% of the weights after 500 passes, 75% after 1000, and all weights after 1500.

Param selection_algorithm:

Algorithm used to select the weights to fix: “largest_abs” fixes the weights with the largest absolute values; “random” fixes each learnable weight with a probability of 50%.

Param seed:

Random seed
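The “largest_abs” selection step can be sketched as follows. This is an illustrative stand-alone routine, not nbla's code; it assumes the convention from the Inputs section that indicator value `0` marks a learnable weight and `1` a fixed one.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <vector>

// Hypothetical sketch of "largest_abs" selection (not nbla's code): among the
// still-learnable weights (indicator == 0), fix the half with the largest
// absolute values by setting their indicator entries to 1.
void fix_largest_abs(const std::vector<float> &w, std::vector<int> &ind) {
  std::vector<size_t> learnable;
  for (size_t i = 0; i < w.size(); ++i)
    if (ind[i] == 0) learnable.push_back(i);
  size_t k = learnable.size() / 2;  // fix 50% of the remaining learnable weights
  std::partial_sort(learnable.begin(), learnable.begin() + k, learnable.end(),
                    [&](size_t a, size_t b) {
                      return std::fabs(w[a]) > std::fabs(w[b]);
                    });
  for (size_t j = 0; j < k; ++j) ind[learnable[j]] = 1;
}
```

Calling this once per entry in `inq_iterations` halves the learnable set each time, matching the 50% / 75% / 100% progression described for `inq_iterations`.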

Public Functions

inline virtual shared_ptr<Function> copy() const

Copy another instance of Function with the same context.

inline virtual vector<dtypes> in_types()

Get input dtypes.

The last in_type is used repeatedly if the size of in_types is smaller than the number of inputs.

inline virtual vector<dtypes> out_types()

Get output dtypes.

The last out_type is used repeatedly if the size of out_types is smaller than the number of outputs.

inline virtual int min_inputs()

Get minimum number of inputs.

This is meant to be used in the setup function together with in_types, which is used to get the maximum number of inputs.

inline virtual int min_outputs()

Get minimum number of outputs.

This is meant to be used in the setup function together with out_types, which is used to get the maximum number of outputs.

inline virtual string name()

Get function name in string.

inline virtual vector<string> allowed_array_classes()

Get array classes that are allowed to be specified by Context.

inline virtual bool grad_depends_output_data(int i, int o) const

Dependency flag for checking if in-grad depends on out-data.

Checks whether the i-th input's gradient computation requires the o-th output's data.

Note

If any input requires an output variable's data when computing its gradient, this function must be overridden to return the appropriate boolean value. Otherwise, backward computation will be incorrect.

Parameters:
  • i[in] Input variable index.

  • o[in] Output variable index.