Functions

All NNabla functions are derived from the nnabla.function.Function class.

Function

class nnabla.function.Function

Function interface class.

Instances of nnabla.function.Function are not created directly by users. They are created indirectly by the functions available in nnabla.functions. These functions return nnabla.Variable(s) that hold the created function instance as the parent property.

backward(self, inputs, outputs, accum=None)
forward(self, inputs, outputs)
grad_depends_output_data(self, int i, int o)
info

object

Type:info
inplace_data(self, int i)
inplace_data_with(self, int i)
inplace_grad(self, int i)
inplace_grad_with(self, int i)
min_outputs(self)
setup(self, inputs, outputs)
tags

Experimental

Get tags of the function.

List of Functions

The nnabla.functions module provides the various types of functions listed below. These functions take input nnabla.Variable(s) as their leading argument(s), followed by options specific to each function.

Note:
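The functions can also take NdArray(s) as inputs instead of Variable(s). In that case the operation is executed immediately and NdArray(s) holding the output values of the operation are returned as output(s). We call this “Imperative Mode” (NdArray + Functions).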

Neural Network Layers

nnabla.functions.affine(x, weight, bias=None, base_axis=1, n_outputs=-1, outputs=None)[source]

Affine layer, also known as the fully connected layer. It calculates:

\[{\mathbf y} = {\mathbf A} {\mathbf x} + {\mathbf b}.\]

where \({\mathbf x}\) is the input and \({\mathbf y}\) is the output.

Parameters:
  • x (Variable) – Input N-D array with shape (\(M_0 \times ... \times M_{B-1} \times D_B \times ... \times D_N\)). The dimensions before and after base_axis are each flattened, so that the input is treated as a matrix.
  • weight (Variable) – Weight matrix with shape (\((D_B \times ... \times D_N) \times L_{0} \times \ldots \times L_{I}\)) [parameter]
  • bias (Variable) – Bias vector (\(L_{0} \times \ldots \times L_{I}\)) [optional][parameter]
  • base_axis (int) – Base axis of the Affine operation. Dimensions up to base_axis are treated as sample dimensions. [default=``1``]
Returns:

\((B + 1)\)-D array. (\(M_0 \times ... \times M_{B-1} \times L_{0} \times \ldots \times L_{I}\))

Return type:

Variable
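
The shape rules above can be illustrated with a small sketch in plain Python. The helper name affine_output_shape is hypothetical and is not part of NNabla; it only mirrors the shape description, assuming the first weight dimension equals the flattened size of the input dimensions from base_axis onward.

```python
from math import prod


def affine_output_shape(x_shape, weight_shape, base_axis=1):
    """Hypothetical helper (not an NNabla API): output shape of an affine layer."""
    sample_dims = tuple(x_shape[:base_axis])   # M_0 x ... x M_{B-1}
    feature_size = prod(x_shape[base_axis:])   # D_B x ... x D_N, flattened
    if weight_shape[0] != feature_size:
        raise ValueError("weight_shape[0] must equal the flattened feature size")
    return sample_dims + tuple(weight_shape[1:])  # append L_0 x ... x L_I


print(affine_output_shape((8, 3, 4, 4), (48, 10)))  # (8, 10)
```

With base_axis=2, the first two dimensions would be kept as sample dimensions and only the remaining ones flattened.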

Note

All nnabla functions in nnabla.functions are decorated with the nnabla.function_bases.function_api decorator, which queries the current context and passes it into the first argument of the original function. The original function always takes a context as the first argument.

nnabla.functions.convolution(x, weight, bias=None, base_axis=1, pad=None, stride=None, dilation=None, group=1, channel_last=False, n_outputs=-1, outputs=None)[source]

N-D Convolution with bias.

See references for dilated convolution (a.k.a. atrous convolution).

References

Note

Convolution is a computationally intensive operation that should preferably be run with the cudnn backend. NNabla then uses CuDNN library functions to determine and cache the fastest algorithm for the given set of convolution parameters. This results in additional memory consumption, which may pose a problem for GPUs with insufficient memory. In that case, the NNABLA_CUDNN_WORKSPACE_LIMIT environment variable can be used to restrict the choice of algorithms to those that fit the given workspace memory limit, expressed in bytes. In some cases it may also be desirable to restrict the automatic search to algorithms that produce deterministic (reproducible) results. This can be requested by setting the environment variable NNABLA_CUDNN_DETERMINISTIC to a non-zero value.

Parameters:
  • x (Variable) – \((B + 1 + N)\)-D array (\(M_1 \times ... \times M_B \times C \times L_1 \times ... \times L_N\)).
  • weight (Variable) – \((2 + N)\)-D array (\(C' \times C \times K_1 \times ... \times K_N\)). [parameter]
  • bias (Variable) – Bias vector (\(C'\)). [optional][parameter]
  • base_axis (int) – base axis \(B\). [default=``1``]
  • pad (tuple of int) – Padding sizes for dimensions. [default=``(0,) * (len(x.shape) - (base_axis+1))``]
  • stride (tuple of int) – Stride sizes for dimensions. [default=``(1,) * (len(x.shape) - (base_axis+1))``]
  • dilation (tuple of int) – Dilation sizes for dimensions. [default=``(1,) * (len(x.shape) - (base_axis+1))``]
  • group (int) – Number of groups of channels. This makes the connection across channels sparser, by grouping connections along the mapping direction. [default=``1``]
  • channel_last (bool) – If True, the last dimension is considered as channel dimension, a.k.a NHWC order. [default=``False``]
Returns:

\((B + 1 + N)\)-D array (\(M_1 \times ... \times M_B \times C' \times L'_1 \times ... \times L'_N\)).

A spatial size of the output is calculated as

\[L'_i = \frac{L_i + 2 p_i - d_i (k_i - 1) - 1}{s_i} + 1,\]

where \(L_i\) is the spatial size, \(p_i\) is the padding, \(d_i\) is the dilation, \(k_i\) is the kernel size, and \(s_i\) is the stride for \(i\)-th spatial dimension. The same calculation can also be applied to the other spatial dimensions.

Return type:

Variable
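
The output-size formula above can be checked with a short plain-Python sketch. The helper names conv_output_size and conv_output_shape are hypothetical (not NNabla APIs), and the use of floor division for sizes that do not divide evenly by the stride is an assumption.

```python
def conv_output_size(size, kernel, pad=0, stride=1, dilation=1):
    """Spatial output size L'_i along one dimension (hypothetical helper)."""
    effective_kernel = dilation * (kernel - 1) + 1
    # Floor division is an assumption for sizes not divisible by the stride.
    return (size + 2 * pad - effective_kernel) // stride + 1


def conv_output_shape(x_shape, weight_shape, base_axis=1,
                      pad=None, stride=None, dilation=None):
    """Full output shape for channel-first input and (C', C, K_1, ..., K_N) weights."""
    n = len(x_shape) - (base_axis + 1)          # number of spatial dimensions
    pad = pad or (0,) * n
    stride = stride or (1,) * n
    dilation = dilation or (1,) * n
    spatial = tuple(
        conv_output_size(l, k, p, s, d)
        for l, k, p, s, d in zip(x_shape[base_axis + 1:], weight_shape[2:],
                                 pad, stride, dilation)
    )
    return tuple(x_shape[:base_axis]) + (weight_shape[0],) + spatial


print(conv_output_shape((8, 3, 32, 32), (16, 3, 3, 3), pad=(1, 1)))
# (8, 16, 32, 32)
print(conv_output_shape((8, 3, 32, 32), (16, 3, 3, 3), pad=(1, 1), stride=(2, 2)))
# (8, 16, 16, 16)
```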

nnabla.functions.depthwise_convolution(x, weight, bias=None, base_axis=1, pad=None, stride=None, dilation=None, multiplier=1, n_outputs=-1, outputs=None)[source]

N-D Depthwise Convolution with bias.

Parameters:
  • x (Variable) – \((B + 1 + N)\)-D array (\(M_1 \times ... \times M_B \times C \times L_1 \times ... \times L_N\)).
  • weight (Variable) – \((1 + N)\)-D array (\(C \times K_1 \times ... \times K_N\)). [parameter]
  • bias (Variable) – Bias vector (\(C\)). [optional][parameter]
  • base_axis (int) – base axis \(B\). [default=``1``]
  • pad (tuple of int) – Padding sizes for dimensions. [default=``(0,) * (len(x.shape) - (base_axis+1))``]
  • stride (tuple of int) – Stride sizes for dimensions. [default=``(1,) * (len(x.shape) - (base_axis+1))``]
  • dilation (tuple of int) – Dilation sizes for dimensions. [default=``(1,) * (len(x.shape) - (base_axis+1))``]
  • multiplier (int) – Number of output feature maps per input feature map. [default=``1``]
Returns:

\((B + 1 + N)\)-D array (\(M_1 \times ... \times M_B \times C' \times L'_1 \times ... \times L'_N\)).

The output map size \(C'\) is \(C\) multiplied by \(m\)

\[C' = m \times C,\]

where \(m\) is the multiplier.

A spatial size of the output is calculated as

\[L'_i = \frac{L_i + 2 p_i - d_i (k_i - 1) - 1}{s_i} + 1,\]

where \(L_i\) is the spatial size, \(p_i\) is the padding, \(d_i\) is the dilation, \(k_i\) is the kernel size, and \(s_i\) is the stride for \(i\)-th spatial dimension. The same calculation can also be applied to the other spatial dimensions.

Return type:

Variable

nnabla.functions.deconvolution(x, weight, bias=None, base_axis=1, pad=None, stride=None, dilation=None, group=1, n_outputs=-1, outputs=None)[source]

N-D deconvolution, also known as transposed convolution, with bias. It performs the backward operation of convolution (the derivative of the output w.r.t. the input) and adds a channel-wise learned bias.

The weights are specified in the same manner as convolution(), as if it were an ordinary convolution function. The forward operation of deconvolution() is then operationally equivalent to the backward pass of convolution(). Therefore, the number of input channels (which can be seen as the output channels of the forward convolution) is specified in the first dimension, and the number of output channels divided by the number of groups is specified in the second dimension.

Parameters:
  • x (Variable) – \((B + 1 + N)\)-D array (\(M_1 \times ... \times M_B \times C \times L_1 \times ... \times L_N\)).
  • weight (Variable) – \((2 + N)\)-D array (\(C \times C' \times K_1 \times ... \times K_N\)). [parameter]
  • bias (Variable) – Bias vector (\(C'\)). [optional][parameter]
  • base_axis (int) – base axis \(B\). [default=``1``]
  • pad (tuple of int) – Padding sizes for dimensions. [default=``(0,) * (len(x.shape) - (base_axis+1))``]
  • stride (tuple of int) – Stride sizes for dimensions. [default=``(1,) * (len(x.shape) - (base_axis+1))``]
  • dilation (tuple of int) – Dilation sizes for dimensions. [default=``(1,) * (len(x.shape) - (base_axis+1))``]
  • group (int) – Number of groups of channels. This makes the connection across channels sparser, by grouping connections along the mapping direction. [default=``1``]
Returns:

\((B + 1 + N)\)-D array (\(M_1 \times ... \times M_B \times C' \times L'_1 \times ... \times L'_N\)).

A spatial size of the output is calculated as

\[L'_i =s_i (L_i - 1) - 2 p_i + d_i (k_i - 1) + 1,\]

where \(s_i\) is the stride, \(L_i\) is the spatial size, \(p_i\) is the padding, \(d_i\) is the dilation, and \(k_i\) is the kernel size for \(i\)-th spatial dimension. The same calculation can also be applied to the other spatial dimensions.

Return type:

Variable
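
As a quick check of the formula above, the following plain-Python sketch computes the deconvolution output size and shows that it inverts the convolution output size for matching parameters. Both helper names are hypothetical (not NNabla APIs), and floor division in the convolution formula is an assumption.

```python
def deconv_output_size(size, kernel, pad=0, stride=1, dilation=1):
    """Spatial output size L'_i of a deconvolution (hypothetical helper)."""
    return stride * (size - 1) - 2 * pad + dilation * (kernel - 1) + 1


def conv_output_size(size, kernel, pad=0, stride=1, dilation=1):
    """Spatial output size of the corresponding forward convolution."""
    return (size + 2 * pad - dilation * (kernel - 1) - 1) // stride + 1


upsampled = deconv_output_size(16, kernel=4, pad=1, stride=2)
print(upsampled)                                              # 32
print(conv_output_size(upsampled, kernel=4, pad=1, stride=2))  # 16
```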

nnabla.functions.depthwise_deconvolution(x, weight, bias=None, base_axis=1, pad=None, stride=None, dilation=None, divisor=1, n_outputs=-1, outputs=None)[source]

Depthwise deconvolution computes the transposed depthwise convolution with bias for one-dimensional and two-dimensional input data.

Parameters:
  • x (Variable) – \((B + 1 + N)\)-D array (\(M_1 \times ... \times M_B \times C \times L_1 \times ... \times L_N\)).
  • weight (Variable) – \((1 + N)\)-D array (\(C \times K_1 \times ... \times K_N\)). [parameter]
  • bias (Variable) – Bias vector (\(C\)). [optional][parameter]
  • base_axis (int) – base axis \(B\). [default=``1``]
  • pad (tuple of int) – Padding sizes for dimensions. [default=``(0,) * (len(x.shape) - (base_axis+1))``]
  • stride (tuple of int) – Stride sizes for dimensions. [default=``(1,) * (len(x.shape) - (base_axis+1))``]
  • dilation (tuple of int) – Dilation sizes for dimensions. [default=``(1,) * (len(x.shape) - (base_axis+1))``]
  • divisor (int) – Number of input feature maps per output feature map. [default=``1``]
Returns:

\((B + 1 + N)\)-D array (\(M_1 \times ... \times M_B \times C' \times L'_1 \times ... \times L'_N\)).

The output map size \(C'\) is \(C\) divided by \(d\)

\[C' = \frac{C}{d},\]

where \(d\) is the divisor.

A spatial size of the output is calculated as

\[L'_i =s_i (L_i - 1) - 2 p_i + d_i (k_i - 1) + 1,\]

where \(s_i\) is the stride, \(L_i\) is the spatial size, \(p_i\) is the padding, \(d_i\) is the dilation, and \(k_i\) is the kernel size for \(i\)-th spatial dimension. The same calculation can also be applied to the other spatial dimensions.

Return type:

Variable

nnabla.functions.max_pooling(x, kernel, stride=None, ignore_border=True, pad=None, channel_last=False, n_outputs=-1, outputs=None)[source]

Max pooling. It pools the maximum values inside the scanning kernel:

\[y_{i_1, i_2} = \max_{k_1, k_2 \in K} (x_{i_1 + k_1, i_2 + k_2})\]

where \(x_{i_1 + k_1, i_2 + k_2}\) is the input and \(y_{i_1, i_2}\) is the output.

Parameters:
  • x (Variable) – Input variable.
  • kernel (tuple of int) – Kernel sizes for each spatial axis.
  • stride (tuple of int) – Subsampling factors for each spatial axis. [default=``kernel``]
  • ignore_border (bool) – If false, kernels covering borders are also considered for the output. [default=``True``]
  • pad (tuple of int) – Border padding values for each spatial axis. Padding will be added to both sides of the dimension. [default=``(0,) * len(kernel)``]
  • channel_last (bool) – If True, the last dimension is considered as channel dimension, a.k.a NHWC order. [default=``False``]
Returns:

Maximum values variable

Return type:

Variable
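
A minimal plain-Python sketch of the max-pooling formula, for a single 2-D map stored as nested lists. The function name max_pooling_2d is hypothetical; it assumes the default stride (equal to the kernel), no padding, and that partial windows at the border are dropped.

```python
def max_pooling_2d(x, kernel):
    """Max-pool a 2-D list of numbers with stride equal to the kernel size."""
    kh, kw = kernel
    out_h = len(x) // kh       # partial windows at the border are dropped
    out_w = len(x[0]) // kw
    return [
        [
            max(x[i * kh + di][j * kw + dj]
                for di in range(kh) for dj in range(kw))
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]


x = [[1, 2, 3, 4],
     [5, 6, 7, 8],
     [9, 10, 11, 12],
     [13, 14, 15, 16]]
print(max_pooling_2d(x, (2, 2)))  # [[6, 8], [14, 16]]
```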

nnabla.functions.average_pooling(x, kernel, stride=None, ignore_border=True, pad=None, channel_last=False, including_pad=True, n_outputs=-1, outputs=None)[source]

Average pooling. It pools the averaged values inside the scanning kernel:

\[y_{i_1, i_2} = \frac{1}{K_1 K_2} \sum_{k_1} \sum_{k_2} x_{i_1 + k_1, i_2 + k_2}\]

where \(x_{i_1 + k_1, i_2 + k_2}\) is the input and \(y_{i_1, i_2}\) is the output.

Parameters:
  • x (Variable) – Input variable.
  • kernel (tuple of int) – Kernel sizes for each spatial axis.
  • stride (tuple of int) – Subsampling factors for each spatial axis. [default=``kernel``]
  • ignore_border (bool) – If false, kernels covering borders are also considered for the output. [default=``True``]
  • pad (tuple of int) – Border padding values for each spatial axis. Padding will be added to both sides of the dimension. [default=``(0,) * len(kernel)``]
  • channel_last (bool) – If True, the last dimension is considered as channel dimension, a.k.a NHWC order. [default=``False``]
  • including_pad (bool) – If true, border padding values are considered for the output. [default=``True``]
Returns:

Average values variable

Return type:

Variable
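
The effect of including_pad can be sketched in one dimension with plain Python. The function average_pooling_1d is hypothetical and reflects one reading of the parameter description: with including_pad=True the padded positions count toward the divisor, otherwise only real input values do. It assumes stride equal to the kernel and pad smaller than the kernel.

```python
def average_pooling_1d(x, kernel, pad=0, including_pad=True):
    """Average-pool a 1-D list; None marks padded positions (value 0)."""
    padded = [None] * pad + list(x) + [None] * pad
    out = []
    for start in range(0, len(padded) - kernel + 1, kernel):  # stride == kernel
        window = padded[start:start + kernel]
        real = [v for v in window if v is not None]
        divisor = kernel if including_pad else len(real)
        out.append(sum(real) / divisor)
    return out


print(average_pooling_1d([1, 2, 3, 4], kernel=2, pad=1))
# [0.5, 2.5, 2.0]
print(average_pooling_1d([1, 2, 3, 4], kernel=2, pad=1, including_pad=False))
# [1.0, 2.5, 4.0]
```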

nnabla.functions.global_average_pooling(x, n_outputs=-1, outputs=None)[source]

Warning

This function is only experimentally supported, so please avoid relying on it.

Global average pooling. It pools an averaged value from the whole image.

Parameters: x (Variable) – Input variable.
Returns: Average values variable
Return type: Variable

nnabla.functions.sum_pooling(x, kernel, stride=None, ignore_border=True, pad=None, channel_last=False, n_outputs=-1, outputs=None)[source]

Sum pooling. It pools the summed values inside the scanning kernel:

\[y_{i_1, i_2} = \sum_{k_1} \sum_{k_2} x_{i_1 + k_1, i_2 + k_2}\]

where \(x_{i_1 + k_1, i_2 + k_2}\) is the input and \(y_{i_1, i_2}\) is the output.

Parameters:
  • x (Variable) – Input variable.
  • kernel (tuple of int) – Kernel sizes for each spatial axis.
  • stride (tuple of int) – Subsampling factors for each spatial axis. [default=``kernel``]
  • ignore_border (bool) – If false, kernels covering borders are also considered for the output. [default=``True``]
  • pad (tuple of int) – Border padding values for each spatial axis. Padding will be added to both sides of the dimension. [default=``(0,) * len(kernel)``]
  • channel_last (bool) – If True, the last dimension is considered as channel dimension, a.k.a NHWC order. [default=``False``]
Returns:

Summed values variable

Return type:

Variable

nnabla.functions.unpooling(x, kernel, n_outputs=-1, outputs=None)[source]

Inverse operation of pooling. It spreads the input values:

\[y_{k_1 i_1 + j_1, k_2 i_2 + j_2} = x_{i_1, i_2}\]

where \(x_{i_1, i_2}\) is the input and \(y_{k_1 i_1 + j_1, k_2 i_2 + j_2}\) is the output.

Parameters:
  • x (Variable) – Input variable.
  • kernel (tuple of int) – Kernel sizes for each spatial axis.
Returns:

Spread values variable

Return type:

Variable
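
The spreading rule can be sketched in plain Python for a single 2-D map. The function name unpooling_2d is hypothetical; each input value is copied to a block of kernel size in the output, following the index relation in the formula above.

```python
def unpooling_2d(x, kernel):
    """Spread each value of a 2-D list over a k1 x k2 block of the output."""
    k1, k2 = kernel
    return [
        [x[r // k1][c // k2] for c in range(len(x[0]) * k2)]
        for r in range(len(x) * k1)
    ]


for row in unpooling_2d([[1, 2], [3, 4]], (2, 2)):
    print(row)
# [1, 1, 2, 2]
# [1, 1, 2, 2]
# [3, 3, 4, 4]
# [3, 3, 4, 4]
```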

nnabla.functions.embed(x0, w, n_outputs=-1, outputs=None)[source]

Embed slices of a matrix/tensor with indexing array/tensor.

Parameters:
  • x0 (Variable) – Indices with shape \((I_0, ..., I_N)\)
  • w (Variable) – Weights with shape \((W_0, ..., W_M)\) [parameter]
Returns:

Output with shape \((I_0, ..., I_N, W_1, ..., W_M)\)

Return type:

Variable
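
A minimal sketch of the lookup in plain Python, using nested lists in place of arrays. The function below is a hypothetical stand-in that only illustrates the shape rule: every index is replaced by the slice of w it selects along the first dimension.

```python
def embed(indices, w):
    """Replace every index in a (possibly nested) list by the matching slice of w."""
    if isinstance(indices, int):
        return w[indices]
    return [embed(i, w) for i in indices]


w = [[0.0, 0.1],   # slice 0
     [1.0, 1.1],   # slice 1
     [2.0, 2.1]]   # slice 2
print(embed([[2, 0], [1, 1]], w))
# [[[2.0, 2.1], [0.0, 0.1]], [[1.0, 1.1], [1.0, 1.1]]]
```

Here the indices have shape (2, 2) and w has shape (3, 2), so the output has shape (2, 2, 2).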

nnabla.functions.rnn(x, h, weight_l0, weight=None, bias=None, num_layers=1, nonlinearity='tanh', dropout=None, bidirectional=False, training=True, n_outputs=-1, outputs=None)[source]

The RNN function implements an Elman RNN with a nonlinearity applied to the input sequence. It is defined as follows:

\[{\mathbf h_t} = {\mathbf \tanh}( {\mathbf w_{ih}} *{\mathbf x_t} + {\mathbf b_{ih}} + {\mathbf w_{hh}}* {\mathbf h_{(t-1)}} + {\mathbf b_{hh}}).\]

We use the following notations to describe the inputs and outputs below. \(T\): sequence length, \(B\): batch size, \(I\): input size, \(L\): number of layers, \(D\): number of directions, can be either 1 or 2, \(H\): hidden size.

Parameters:
  • x (Variable) – Input N-D array with shape \((T, B, I)\).
  • h (Variable) – Input N-D array with shape \((L, D, B, H)\).
  • weight_l0 (Variable) – Input N-D array with shape \((D, H, I + H)\). [parameter]
  • weight (Variable) – Input N-D array with shape \((L-1, D, H, D * H + H)\). [optional][parameter]
  • bias (Variable) – Input N-D array with shape \((L, D, H)\). [optional][parameter]
  • num_layers (int) – Number of layers in the network. If set to 1, only the weights for the first layer will be invoked. Default is 1. [default=``1``]
  • nonlinearity (string) – Type of nonlinearity applied to the input sequence. Must be either tanh or relu. Default is tanh. [default=``’tanh’``]
  • dropout (float) – Dropout ratio applied to parameters. Default is 0.0. [default=``0.0``]
  • bidirectional (bool) – If True, bidirectional computation will be performed in each layer. Default is False. [default=``False``]
  • training (bool) – Backpropagation will be performed only when it is true. Default is True. [default=``True``]
Returns:

  • Output \(y\) with shape \((T, B, D * H)\)
  • Output \(h_n\) with shape \((L, D, B, H)\)

Return type:

Variable
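
The recurrence above can be sketched for a single layer, a single direction and one sample in plain Python. All names are hypothetical, and the weights are passed as separate matrices w_ih and w_hh for readability, whereas the function documented here takes them packed into weight_l0.

```python
import math


def matvec(m, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(a * b for a, b in zip(row, v)) for row in m]


def rnn_step(x_t, h_prev, w_ih, w_hh, b_ih, b_hh):
    """One Elman step: h_t = tanh(w_ih x_t + b_ih + w_hh h_prev + b_hh)."""
    wx = matvec(w_ih, x_t)
    uh = matvec(w_hh, h_prev)
    return [math.tanh(a + bi + c + bh)
            for a, bi, c, bh in zip(wx, b_ih, uh, b_hh)]


def rnn_forward(xs, h0, w_ih, w_hh, b_ih, b_hh):
    """Run the step over a sequence; returns all hidden states and the last one."""
    h, ys = h0, []
    for x_t in xs:
        h = rnn_step(x_t, h, w_ih, w_hh, b_ih, b_hh)
        ys.append(h)
    return ys, h


# Input size I = 2, hidden size H = 1, sequence length T = 2.
ys, h_n = rnn_forward([[1.0, 2.0], [0.0, 0.0]], [0.0],
                      w_ih=[[1.0, 0.0]], w_hh=[[0.5]], b_ih=[0.0], b_hh=[0.0])
print(ys, h_n)
```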

nnabla.functions.lstm(x, h, c, weight_l0, weight=None, bias=None, num_layers=1, dropout=None, bidirectional=False, training=True, n_outputs=-1, outputs=None)[source]

N-Step LSTM layer.

\[\begin{split}{\mathbf f_t} = {\mathbf \sigma}( {\mathbf W_f} *{\mathbf x_t} + {\mathbf U_f}* {\mathbf h_{(t-1)}} + {\mathbf b_f})\\ {\mathbf i_t} = {\mathbf \sigma}( {\mathbf W_i} *{\mathbf x_t} + {\mathbf U_i}* {\mathbf h_{(t-1)}} + {\mathbf b_i})\\ {\mathbf o_t} = {\mathbf \sigma}( {\mathbf W_o} *{\mathbf x_t} + {\mathbf U_o}* {\mathbf h_{(t-1)}} + {\mathbf b_o})\\ {\mathbf c_t} = {\mathbf f_t}\odot {\mathbf c_{(t-1)}} + {\mathbf i_t}\odot {\mathbf \tanh}({\mathbf W_c}*{\mathbf x_t} + {\mathbf U_c} *{\mathbf h_{(t-1)}} + {\mathbf b_c})\\ {\mathbf h_t} = {\mathbf o_t} \odot {\mathbf \tanh}({\mathbf c_t}).\end{split}\]

We use the following notations to describe the inputs and outputs below. \(T\): sequence length, \(B\): batch size, \(I\): input size, \(L\): number of layers, \(D\): number of directions, can be either 1 or 2, \(H\): hidden size.

Parameters:
  • x (Variable) – Input N-D array with shape \((T, B, I)\).
  • h (Variable) – Input N-D array with shape \((L, D, B, H)\).
  • c (Variable) – Input N-D array with shape \((L, D, B, H)\).
  • weight_l0 (Variable) – weight parameters for the first layer. Shape is \((D, 4, H, I + H)\). [parameter]
  • weight (Variable) – weight parameters for the second layer and above. Shape is \((L-1, D, 4, H, D * H + H)\). [optional][parameter]
  • bias (Variable) – Bias vector (\(L\)). Shape is \((L, D, 4, H)\). [optional][parameter]
  • num_layers (int) – Number of layers in the network. If set to 1, only the weights for the first layer will be invoked. Default is 1. [default=``1``]
  • dropout (float) – Dropout ratio applied to parameters. Default is 0.0. [default=``0.0``]
  • bidirectional (bool) – If True, bidirectional computation will be performed in each layer. Default is False. [default=``False``]
  • training (bool) – Backpropagation will be performed only when it is True. Default is True. [default=``True``]
Returns:

  • Output \(y\) with shape \((T, B, D * H)\). Its memory layout can be reshaped as \((T, B, D, H)\).
  • Output \(h_n\) with shape \((L, D, B, H)\)
  • Output \(c_n\) with shape \((L, D, B, H)\)

Return type:

Variable
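
For illustration, one time step of the LSTM equations above can be written in plain Python with scalar input and scalar state (input size and hidden size both 1). The function and the dictionary layout of the weights are hypothetical and do not reflect how the packed weight arrays are organised.

```python
import math


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


def lstm_step(x_t, h_prev, c_prev, W, U, b):
    """One scalar LSTM step; W, U, b are dicts keyed by gate name f, i, o, c."""
    f = sigmoid(W["f"] * x_t + U["f"] * h_prev + b["f"])   # forget gate
    i = sigmoid(W["i"] * x_t + U["i"] * h_prev + b["i"])   # input gate
    o = sigmoid(W["o"] * x_t + U["o"] * h_prev + b["o"])   # output gate
    c = f * c_prev + i * math.tanh(W["c"] * x_t + U["c"] * h_prev + b["c"])
    h = o * math.tanh(c)
    return h, c


zeros = {"f": 0.0, "i": 0.0, "o": 0.0, "c": 0.0}
# With all-zero parameters every gate equals 0.5, so c = 0.5 * c_prev.
h, c = lstm_step(x_t=1.0, h_prev=0.0, c_prev=2.0, W=zeros, U=zeros, b=zeros)
print(h, c)
```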

nnabla.functions.gru(x, h, weight_l0, weight=None, bias=None, num_layers=1, dropout=None, bidirectional=False, training=True, n_outputs=-1, outputs=None)[source]

N-Step GRU layer.

\[\begin{split}{\mathbf r_t} = {\mathbf \sigma}( {\mathbf W_r} *{\mathbf x_t} + {\mathbf U_r}* {\mathbf h_{(t-1)}} + {\mathbf b_r})\\ {\mathbf z_t} = {\mathbf \sigma}( {\mathbf W_z} *{\mathbf x_t} + {\mathbf U_z}* {\mathbf h_{(t-1)}} + {\mathbf b_z})\\ {\mathbf n_t} = {\mathbf \tanh}( {\mathbf W_n}{\mathbf x_t}+ {\mathbf b_{in}}+ {\mathbf r_t}\odot( {\mathbf U_n}{\mathbf h_{t-1}}+ {\mathbf b_{hn}})) \\ {\mathbf h_t} = (1- {\mathbf z_t})\odot {\mathbf n_t} + {\mathbf z_t}\odot {\mathbf h_{t-1}}.\end{split}\]

We use the following notations to describe the inputs and outputs below. \(T\): sequence length, \(B\): batch size, \(I\): input size, \(L\): number of layers, \(D\): number of directions, can be either 1 or 2, \(H\): hidden size.

Parameters:
  • x (Variable) – Input N-D array with shape \((T, B, I)\).
  • h (Variable) – Input N-D array with shape \((L, D, B, H)\).
  • weight_l0 (Variable) – weight parameters for the first layer. Shape is \((D, 3, H, I + H)\). [parameter]
  • weight (Variable) – weight parameters for the second layer and above. Shape is \((L-1, D, 3, H, D * H + H)\). [optional][parameter]
  • bias (Variable) – Bias vector (\(L\)). Shape is \((L, D, 4, H)\). [optional][parameter]
  • num_layers (int) – Number of layers in the network. If set to 1, only the weights for the first layer will be invoked. Default is 1. [default=``1``]
  • dropout (float) – Dropout ratio applied to parameters. Default is 0.0. [default=``0.0``]
  • bidirectional (bool) – If True, bidirectional computation will be performed in each layer. Default is False. [default=``False``]
  • training (bool) – Backpropagation will be performed only when it is True. Default is True. [default=``True``]
Returns:

  • Output \(y\) with shape \((T, B, D * H)\). Its memory layout can be reshaped as \((T, B, D, H)\).
  • Output \(h_n\) with shape \((L, D, B, H)\)

Return type:

Variable
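
One time step of the GRU equations above, sketched in plain Python with scalar input and scalar state. The function and the dictionary layout are hypothetical; the reset gate computed in the first line is the one that scales the hidden-state term inside the tanh.

```python
import math


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


def gru_step(x_t, h_prev, W, U, b):
    """One scalar GRU step; b holds the four bias terms r, z, in and hn."""
    r = sigmoid(W["r"] * x_t + U["r"] * h_prev + b["r"])   # reset gate
    z = sigmoid(W["z"] * x_t + U["z"] * h_prev + b["z"])   # update gate
    n = math.tanh(W["n"] * x_t + b["in"] + r * (U["n"] * h_prev + b["hn"]))
    return (1.0 - z) * n + z * h_prev


W = {"r": 0.0, "z": 0.0, "n": 0.0}
U = {"r": 0.0, "z": 0.0, "n": 0.0}
b = {"r": 0.0, "z": 0.0, "in": 0.0, "hn": 0.0}
# With all-zero parameters z = 0.5 and n = 0, so h_t is half of h_prev.
print(gru_step(x_t=1.0, h_prev=2.0, W=W, U=U, b=b))  # 1.0
```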

nnabla.functions.multi_head_attention(query, key, value, num_heads, q_weight, k_weight, v_weight, out_weight, q_bias=None, k_bias=None, v_bias=None, out_bias=None, attn_bias_k=None, attn_bias_v=None, dropout=0.0, additive_mask=None, key_padding_mask=None)[source]

MultiHeadAttention.

Computes multi-headed attention with query, key, and value. We use the following notations to describe the inputs and outputs below. \(L_T\): target sequence length, \(L_S\): source sequence length, \(B\): batch size, \(E\): embedding dimension, \(H\): number of attention heads.

References

A. Vaswani et al. “Attention is All You Need.” NIPS. 2017. <https://papers.nips.cc/paper/7181-attention-is-all-you-need.pdf>

Parameters:
  • query (Variable) – Input N-D array with shape \((L_T, B, E)\).
  • key (Variable) – Input N-D array with shape \((L_S, B, E_k)\).
  • value (Variable) – Input N-D array with shape \((L_S, B, E_v)\).
  • num_heads (int) – Number of attention heads. Note that the embedding dimension E must be divisible by the number of heads. Default is 12, which is conventional.
  • q_weight (Variable) – Input N-D array with shape \((E, E)\).
  • k_weight (Variable) – Input N-D array with shape \((E_k, E)\).
  • v_weight (Variable) – Input N-D array with shape \((E_v, E)\).
  • out_weight (Variable) – Input N-D array with shape \((E, E)\).
  • q_bias (Variable, optional) – Input N-D array with shape \((E, )\).
  • k_bias (Variable, optional) – Input N-D array with shape \((E, )\).
  • v_bias (Variable, optional) – Input N-D array with shape \((E, )\).
  • out_bias (Variable, optional) – Input N-D array with shape \((E, )\).
  • attn_bias_k (Variable, optional) – Input N-D array with shape \((E, )\).
  • attn_bias_v (Variable, optional) – Input N-D array with shape \((E, )\).
  • dropout (float, optional) – Dropout ratio applied to parameters. Default is 0.
  • additive_mask (Variable, optional) – Input N-D array with shape \((L_T, L_S)\). Values will be added to the attention layer to prevent attention to certain positions.
  • key_padding_mask (Variable, optional) – Input N-D array with shape \((B, L_S)\). Specified padding elements will be ignored by the attention layer. Values must be either 1 or 0.
Returns:

  • Output \(y\) with shape \((L_T, B, E)\)
  • Output \(h_n\) with shape \((B, L_T, L_S)\)

Return type:

Variable

Neural Network Activation

nnabla.functions.sigmoid(x, n_outputs=-1, outputs=None)[source]

Element-wise sigmoid function.

\[f(x) = \frac{1}{1 + \exp(-x)},\]
Parameters: x (Variable) – Input
Returns: Output
Return type: Variable

nnabla.functions.swish(x, n_outputs=-1, outputs=None)[source]

Element-wise swish function, by Ramachandran et al. (2017).

\[y_i = \frac{x_i}{1 + \exp(-x_i)},\]

Parameters: x (Variable) – Input
Returns: Output
Return type: Variable
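
As a scalar sketch in plain Python (these are not the NNabla functions themselves), swish can be seen to equal the input multiplied by its sigmoid. The direct use of math.exp is a simplification and would overflow for very large negative inputs.

```python
import math


def sigmoid(x):
    """Scalar sigmoid: 1 / (1 + exp(-x))."""
    return 1.0 / (1.0 + math.exp(-x))


def swish(x):
    """Scalar swish: x / (1 + exp(-x))."""
    return x / (1.0 + math.exp(-x))


print(sigmoid(0.0))  # 0.5
print(swish(0.0))    # 0.0
print(abs(swish(2.0) - 2.0 * sigmoid(2.0)) < 1e-12)  # True
```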

nnabla.functions.tanh(x, n_outputs=-1, outputs=None)[source]

Element-wise hyperbolic tangent (tanh) function.

\[y_i = \tanh (x_i)\]
Parameters: x (Variable) – N-D array
Returns: N-D array with the same shape as x
Return type: Variable

nnabla.functions.relu(x, inplace=False, n_outputs=-1, outputs=None)[source]

Element-wise Rectified Linear Unit (ReLU) function.

\[y_i = \max (0, x_i)\]
Parameters:
  • x (Variable) – N-D array
  • inplace (bool) – The output array is shared with the input array if True. [default=``False``]
Returns:

N-D array with the same shape as x

Return type:

Variable

nnabla.functions.softmax(x, axis=None, n_outputs=-1, outputs=None)[source]

Softmax normalization. Calculates

\[y_i = \frac{\exp(x_i)}{\sum_j \exp(x_j)}\]

along the dimension specified by axis, where \(x_i\) is the input and \(y_i\) is the output.

Parameters:
  • x (Variable) – N-D array. Typically indicates a score.
  • axis (int) – Axis along which normalization is taken. [default=``len(x.shape) - 1``]
Returns:

N-D array with the same shape as x

Return type:

Variable
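
The softmax formula for a single vector can be sketched in plain Python as follows. Subtracting the maximum before exponentiating is a common numerical-stability technique added here as an assumption; it does not change the result and is not a statement about how NNabla implements the function.

```python
import math


def softmax(xs):
    """Softmax over a 1-D list of scores."""
    m = max(xs)                           # shift to avoid overflow in exp
    exps = [math.exp(v - m) for v in xs]
    total = sum(exps)
    return [e / total for e in exps]


print(softmax([0.0, 0.0]))        # [0.5, 0.5]
print(softmax([1000.0, 1000.0]))  # [0.5, 0.5]
```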

nnabla.functions.elu(x, alpha=1.0, n_outputs=-1, outputs=None)[source]

Element-wise Exponential Linear Unit (ELU) function.

\[\begin{split}y_i= \left\{ \begin{array}{ll} x_i & (x_i > 0)\\ \alpha (\exp(x_i) - 1) & (x_i \leq 0) \end{array} \right..\end{split}\]

Parameters:
  • x (Variable) – N-D array
  • alpha (float) – Coefficient for negative outputs. \(\alpha\) in definition [default=``1.0``]
Returns:

N-D array with the same shape as x

Return type:

Variable

nnabla.functions.selu(x, scale=1.05070098735548, alpha=1.673263242354377, n_outputs=-1, outputs=None)[source]

Element-wise Scaled Exponential Linear Unit (SELU) function by Klambauer et al. (2017).

\[\begin{split}y_i= \lambda \left\{ \begin{array}{ll} x_i & (x_i > 0)\\ \alpha (\exp(x_i) - 1) & (x_i \leq 0) \end{array} \right..\end{split}\]

The coefficients \(\lambda\) and \(\alpha\) default to the following values \(\lambda_{01}\) and \(\alpha_{01}\), respectively, provided by Klambauer et al. (2017):

\[\begin{split}\begin{array}{lll} \lambda_{01} &=& \left( 1 - \operatorname{erfc}\left( \frac{1}{\sqrt{2}} \right) \sqrt{e} \right) \sqrt{2 \pi} \\ && \left( 2 \operatorname{erfc} \left( \sqrt{2} \right) e^2 + \pi \operatorname{erfc}\left( \frac{1}{\sqrt{2}} \right)^2 e \right. \\ && \left. - 2(2 + \pi) \operatorname{erfc} \left( \frac{1}{\sqrt{2}} \right) \sqrt{e} + \pi + 2 \right)^{-1/2} \\ &\approx& 1.0507 \\ \alpha_{01} &=& - \frac {\sqrt {\frac {2}{\pi}}} {\operatorname{erfc} \left( \frac{1}{\sqrt{2}} \right) \exp \left(\frac {1} {2} \right) - 1} \\ &\approx& 1.67326 \end{array}\end{split}\]

Parameters:
  • x (Variable) – N-D array
  • scale (float) – The coefficient \(\lambda\) in the definition. [default=``1.05070098735548``]
  • alpha (float) – The coefficient \(\alpha\) in the definition. [default=``1.673263242354377``]
Returns:

N-D array with the same shape as x

Return type:

Variable
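
The closed forms for \(\lambda_{01}\) and \(\alpha_{01}\) given above can be checked numerically. The sketch below evaluates them with Python's math module and uses them in a scalar SELU. The names lambda_01, alpha_01 and selu_reference are hypothetical, and this is not how nnabla computes the function.

```python
import math

# erfc(1/sqrt(2)) appears in both closed forms, so compute it once.
c = math.erfc(1.0 / math.sqrt(2.0))

alpha_01 = -math.sqrt(2.0 / math.pi) / (c * math.exp(0.5) - 1.0)

lambda_01 = (
    (1.0 - c * math.sqrt(math.e)) * math.sqrt(2.0 * math.pi)
    * (
        2.0 * math.erfc(math.sqrt(2.0)) * math.e ** 2
        + math.pi * c ** 2 * math.e
        - 2.0 * (2.0 + math.pi) * c * math.sqrt(math.e)
        + math.pi + 2.0
    ) ** -0.5
)

def selu_reference(x, scale=lambda_01, alpha=alpha_01):
    """Scalar SELU following the definition above (illustration only)."""
    return scale * (x if x > 0 else alpha * (math.exp(x) - 1.0))
```

The computed values should agree with the documented defaults for scale and alpha up to floating-point rounding.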

nnabla.functions.crelu(x, axis=1, n_outputs=-1, outputs=None)[source]

Element-wise Concatenated Rectified Linear Unit (CReLU) function. This function calculates the ReLU of \(x\) and \(-x\) , then concatenates the results together at a specified axis, and returns the resulting array.

Parameters:
  • x (Variable) – N-D array.
  • axis (int) – The ReLU activations of positive inputs and negative inputs are concatenated at axis. [default=``1``]
Returns:

N-D array where axis dimension is doubled by concatenating.

Return type:

Variable
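
To make the concatenation concrete, here is a minimal sketch for a 2-D input of shape [batch, channels] with axis=1, written with nested lists. The helper crelu_rows is hypothetical and only mirrors the description above; it is not the nnabla implementation.

```python
def crelu_rows(x):
    """CReLU along axis 1 for a nested list of shape [batch, channels]."""
    out = []
    for row in x:
        pos = [max(0.0, v) for v in row]   # ReLU(x)
        neg = [max(0.0, -v) for v in row]  # ReLU(-x)
        out.append(pos + neg)              # concatenate along the channel axis
    return out

y = crelu_rows([[1.0, -2.0], [-3.0, 4.0]])
```

The input has 2 channels per row and the output has 4, which matches the doubling of the axis dimension described under Returns.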

nnabla.functions.celu(x, alpha=1.0, axis=1, n_outputs=-1, outputs=None)[source]

Element-wise Concatenated Exponential Linear Unit (CELU) function. Concatenates ELU outputs of positive and negative inputs together at specified axis.

Parameters:
  • x (Variable) – N-D array.
  • alpha (float) – Coefficient for negative outputs. \(\alpha\) in definition. [default=``1.0``]
  • axis (int) – The ELU activations of positive inputs and negative inputs are concatenated at axis. [default=``1``]
Returns:

N-D array where axis dimension is doubled by concatenating.

Return type:

Variable

nnabla.functions.gelu(x, n_outputs=-1, outputs=None)[source]

Gaussian Error Linear Unit (GELU) function.

\[GELU(x) = xP(X \leq x) = x \Phi (x)\]

which is approximated by

\[GELU(x) \approx 0.5x (1 + \tanh ( \sqrt{2/\pi}(x + 0.044715x^3) ))\]

Parameters:x (Variable) – N-D array
Returns:N-D array with the same shape as x
Return type:Variable
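
The two expressions above can be compared numerically. The following sketch writes \(\Phi\) in terms of the error function and measures the gap between the exact form and the tanh approximation on a small grid. The function names are hypothetical, and which of the two forms nnabla evaluates internally is not stated here.

```python
import math

def gelu_exact(x):
    # x * Phi(x), with the standard normal CDF expressed through erf.
    return x * 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def gelu_tanh(x):
    # The tanh-based approximation from the formula above.
    inner = math.sqrt(2.0 / math.pi) * (x + 0.044715 * x ** 3)
    return 0.5 * x * (1.0 + math.tanh(inner))

# Largest absolute difference on a grid over [-5, 5] with step 0.1.
max_gap = max(abs(gelu_exact(v / 10.0) - gelu_tanh(v / 10.0))
              for v in range(-50, 51))
```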

nnabla.functions.prelu(x0, x1, base_axis=1, n_outputs=-1, outputs=None)[source]

Element-wise Parametrized Rectified Linear Unit function. Calculates:

\[y_i = \max(0, x_i) + w_i \min(0, x_i)\]

where negative slope \(w\) is learned and can vary across channels (an axis specified with base_axis).

Parameters:
  • x0 (Variable) – (N-D array) Input
  • x1 (Variable) – (N-D array) Weights
  • base_axis (int) – Dimensions up to base_axis are treated as sample dimensions. [default=``1``]
Returns:

N-D array.

Return type:

Variable
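
A small sketch of the formula above for an input of shape [batch, channels], assuming one learned slope per channel. The helper prelu_reference and the choice of one weight per channel are assumptions made for illustration; this is not the nnabla implementation.

```python
def prelu_reference(x, w):
    """x: nested list [batch][channels]; w: one negative slope per channel."""
    return [
        [max(0.0, v) + wc * min(0.0, v) for v, wc in zip(row, w)]
        for row in x
    ]

y = prelu_reference([[1.0, -2.0], [-4.0, 3.0]], w=[0.5, 0.25])
```

Positive entries are unchanged, while each negative entry is scaled by the slope of its own channel.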

nnabla.functions.leaky_relu(x, alpha=0.1, inplace=False, n_outputs=-1, outputs=None)[source]

Element-wise Leaky Rectified Linear Unit (ReLU) function.

It is defined as:

\[y_i = \alpha * \min(0, x_i) + \max (0, x_i)\]
Parameters:
  • x (Variable) – N-D array
  • alpha (float) – The slope value multiplied to negative numbers. \(\alpha\) in the definition. [default=``0.1``]
  • inplace (bool) – The output array is shared with the input array if True. [default=``False``]
Returns:

N-D array with the same shape as x

Return type:

Variable

nnabla.functions.relu6(x, n_outputs=-1, outputs=None)[source]

Element-wise ReLU6 function. Capping ReLU activation to 6 is often observed to learn sparse features earlier.

\[ReLU6(x) = \min(\max(0, x), 6)\]
Parameters:x (Variable) – N-D array
Returns:N-D array with the same shape as x
Return type:Variable

nnabla.functions.hard_sigmoid(x, n_outputs=-1, outputs=None)[source]

Piecewise linear approximation of sigmoid. Preferable when speed of computation is more important than precision. Returns \(0\) if \(x < -2.5\). Returns \(1\) if \(x > 2.5\). Returns \(0.2x + 0.5\) if \(-2.5 \leq x \leq 2.5\).

Parameters:x (Variable) – N-D array
Returns:N-D array with the same shape as x
Return type:Variable
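
The three cases above are equivalent to clamping the line \(0.2x + 0.5\) to the interval [0, 1]. The sketch below uses that form and compares it with the exact sigmoid on a grid. Both helper names are hypothetical and the code is only an illustration of the definition.

```python
import math

def hard_sigmoid_reference(x):
    # Clamp the line 0.2 * x + 0.5 to the interval [0, 1].
    return min(1.0, max(0.0, 0.2 * x + 0.5))

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

# Largest gap between the two curves on a grid over [-6, 6] with step 0.1.
max_gap = max(abs(hard_sigmoid_reference(v / 10.0) - sigmoid(v / 10.0))
              for v in range(-60, 61))
```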

nnabla.functions.hard_tanh(x, n_outputs=-1, outputs=None)[source]

Element-wise HardTanh function. Computationally cheaper than Tanh function. Returns \(1\) if \(x > 1\). Returns \(-1\) if \(x < -1\). Returns \(x\) otherwise.

Parameters:x (Variable) – N-D array
Returns:N-D array with the same shape as x
Return type:Variable

nnabla.functions.log_sigmoid(x, n_outputs=-1, outputs=None)[source]

Element-wise LogSigmoid function.

\[LogSigmoid(x) = \log(1/(1+\exp(-x)))\]
Parameters:x (Variable) – N-D array
Returns:N-D array with the same shape as x
Return type:Variable

nnabla.functions.softplus(x, n_outputs=-1, outputs=None)[source]

Element-wise SoftPlus function. Unlike Sigmoid and Tanh, which are bounded both above and below, SoftPlus is only bounded below, by 0.

\[SoftPlus(x) = \log(1+\exp(x))\]
Parameters:x (Variable) – N-D array
Returns:N-D array with the same shape as x
Return type:Variable
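
Evaluating \(\log(1+\exp(x))\) literally overflows for large \(x\). A common rearrangement, sketched below under the assumption of scalar float inputs, avoids this; it also shows that LogSigmoid can be written as \(-SoftPlus(-x)\). The helper names are hypothetical, and whether nnabla uses this rearrangement internally is not stated here.

```python
import math

def softplus_reference(x):
    # Same value as log(1 + exp(x)), rearranged so exp() never overflows.
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))

def log_sigmoid_reference(x):
    # log(1 / (1 + exp(-x))) = -log(1 + exp(-x)) = -softplus(-x)
    return -softplus_reference(-x)
```

For large positive inputs the result approaches \(x\) itself, and for large negative inputs it approaches the lower bound 0.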

nnabla.functions.softsign(x, n_outputs=-1, outputs=None)[source]

Element-wise SoftSign. Can be used in place of Tanh function. While Tanh converges exponentially, SoftSign converges polynomially.

\[SoftSign(x) = x/(1+|x|)\]
Parameters:x (Variable) – N-D array
Returns:N-D array with the same shape as x
Return type:Variable

nnabla.functions.tanh_shrink(x, n_outputs=-1, outputs=None)[source]

Element-wise TanhShrink function.

\[TanhShrink(x) = x - \tanh(x)\]
Parameters:x (Variable) – N-D array
Returns:N-D array with the same shape as x
Return type:Variable

nnabla.functions.sinc(x, n_outputs=-1, outputs=None)[source]

Element-wise Sinc function. Unlike other popular activation functions, it is not monotonic: it rises and falls repeatedly. Returns \(1\) if \(x = 0\). Returns \(\sin(x)/x\) otherwise.

Parameters:x (Variable) – N-D array
Returns:N-D array with the same shape as x
Return type:Variable

Normalization

nnabla.functions.batch_normalization(x, beta, gamma, mean, variance, axes=[1], decay_rate=0.9, eps=1e-05, batch_stat=True, output_stat=False, n_outputs=None)[source]

Batch normalization.

\[\begin{split}\begin{eqnarray} \mu &=& \frac{1}{M} \sum x_i \\ \sigma^2 &=& \frac{1}{M} \sum \left(x_i - \mu\right)^2 \\ \hat{x}_i &=& \frac{x_i - \mu}{\sqrt{\sigma^2 + \epsilon}} \\ y_i &=& \hat{x}_i \gamma + \beta. \end{eqnarray}\end{split}\]

At testing time, the mean and variance values used are those that were computed during training by moving average.

Parameters:
  • x (Variable) – N-D array of input.
  • beta (Variable or None) – N-D array of beta which is learned. If None, the bias term is omitted.
  • gamma (Variable or None) – N-D array of gamma which is learned. If None, the scale term is omitted.
  • mean (Variable or None) – N-D array of running mean (modified during forward execution). If None, dummy variable is created and running mean is not updated. mean=None with batch_stat=False is prohibited.
  • variance (Variable or None) – N-D array of running variance (modified during forward execution). If None, dummy variable is created and running variance is not updated. variance=None with batch_stat=False is prohibited.
  • axes (list of int or int) – Mean and variance are calculated along these axes.
  • decay_rate (float) – Decay rate of running mean and variance.
  • eps (float) – Tiny value to avoid zero division by std.
  • batch_stat (bool) – Use mini-batch statistics rather than running ones. If False, mean and variance must be nnabla.Variable. (None is prohibited.)
  • output_stat (bool) – If true, the batch statistics of mean and variance will be returned as Variables. They are also differentiable.
Returns:

Returns batch normalization output as Variable. If output_stat=True, it also returns the mean and variance of the mini-batch

See also

nnabla.function_bases.batch_normalization.
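
The training-time equations above can be traced by hand for a single channel. The sketch below takes the M values of one channel across a mini-batch as a flat list and applies the four equations in order. The helper batch_norm_channel is hypothetical; it ignores running statistics and is not the nnabla implementation.

```python
import math

def batch_norm_channel(x, gamma=1.0, beta=0.0, eps=1e-5):
    """Normalize the M values of one channel with mini-batch statistics."""
    m = len(x)
    mu = sum(x) / m                              # batch mean
    var = sum((v - mu) ** 2 for v in x) / m      # biased batch variance
    y = [(v - mu) / math.sqrt(var + eps) * gamma + beta for v in x]
    return y, mu, var

y, mu, var = batch_norm_channel([1.0, 2.0, 3.0, 4.0])
```

With gamma=1 and beta=0 the output has approximately zero mean and unit variance; eps keeps the division well defined when the variance is zero.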

nnabla.functions.sync_batch_normalization(x, beta, gamma, mean, variance, comm, group='world', axes=[1], decay_rate=0.9, eps=1e-05, batch_stat=True, output_stat=False, n_outputs=None)[source]

Synchronized batch normalization.

For some tasks (e.g., semantic segmentation), the batch size will be too small and the BatchNormalization layer might not work well. The SyncBatchNormalization layer solves this problem by synchronizing batch statistics (mean and variance) between multiple processes.

\[\begin{split}\begin{eqnarray} \mu &=& \frac{1}{M} \sum x_i \\ \sigma^2 &=& \frac{1}{M} \sum \left(x_i - \mu\right)^2 \\ \hat{x}_i &=& \frac{x_i - \mu}{\sqrt{\sigma^2 + \epsilon}} \\ y_i &=& \hat{x}_i \gamma + \beta. \end{eqnarray}\end{split}\]

Parameters:
  • x (Variable) – N-D array of input.
  • beta (Variable or None) – N-D array of beta which is learned. If None, the bias term is omitted.
  • gamma (Variable or None) – N-D array of gamma which is learned. If None, the scale term is omitted.
  • mean (Variable or None) – N-D array of running mean (modified during forward execution). If None, dummy variable is created and running mean is never updated. mean=None with batch_stat=False is prohibited.
  • variance (Variable or None) – N-D array of running variance (modified during forward execution). If None, dummy variable is created and running variance is never updated. variance=None with batch_stat=False is prohibited.
  • comm (Communicator) – The communicator
  • group (string) – The name of the communicator group
  • axes (list of int or int) – Mean and variance are calculated along these axes.
  • decay_rate (float) – Decay rate of running mean and variance.
  • eps (float) – Tiny value to avoid zero division by std.
  • batch_stat (bool) – Use mini-batch statistics rather than running ones. If False, mean and variance must be nnabla.Variable. (None is prohibited.)
  • output_stat (bool) – If true, the batch statistics of mean and variance will be returned as Variables. They are also differentiable.
Returns:

Returns batch normalization output as Variable. If output_stat=True, it also returns the mean and variance of the mini-batch

See also

nnabla.function_bases.batch_normalization.

nnabla.functions.mean_subtraction(x, mean, t, base_axis=1, update_running_mean=True)[source]

Subtracts the mean of the elements of the input array so that the output has zero mean. Preprocessing arrays with this function can improve accuracy in various tasks such as image classification.

At training time, this function is defined as

\[\begin{split}\begin{eqnarray} \mu &=& \frac{1}{M} \sum x_i \\ y_i &=& x_i - \mu \end{eqnarray}\end{split}\]

At testing time, the mean values used are those that were computed during training by moving average.

Note

The backward performs an approximated differentiation that takes into account only the latest mini-batch.

Parameters:
  • x (Variable) – N-D array of input.
  • mean (Variable) – N-D array of running mean (modified during forward execution).
  • t (Variable) – Scalar holding the number of iterations of the running mean (modified during forward execution).
  • base_axis (int) – Base axis of Mean Subtraction operation. Dimensions up to base_axis are treated as sample dimensions. [default=``1``]
  • update_running_mean (bool) – Update running mean during forward execution. [default=``True``]
Returns:

N-D array.

Return type:

Variable

See also

nnabla.function_bases.mean_subtraction.

nnabla.functions.clip_by_value(x, min, max)[source]

Clip inputs by values.

\[\begin{split}y = \begin{cases} max & (x > max) \\ x & (otherwise) \\ min & (x < min) \end{cases}.\end{split}\]
Parameters:
  • x (Variable) – An input variable.
  • min (Variable) – A min variable by which x is clipped. Note that the shape of min must be the same as that of x.
  • max (Variable) – A max variable by which x is clipped. Note that the shape of max must be the same as that of x.
Returns:

N-D array.

Return type:

Variable
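
Because min and max are arrays of the same shape as x, each element can have its own bounds. The following sketch shows this on flat lists. The helper clip_by_value_reference and its argument names lo and hi are hypothetical and used only for illustration.

```python
def clip_by_value_reference(x, lo, hi):
    """Clip each element of x to its own [lo, hi] interval (flat lists)."""
    return [min(max(v, l), h) for v, l, h in zip(x, lo, hi)]

y = clip_by_value_reference(
    [-2.0, 0.5, 3.0],
    lo=[-1.0, -1.0, 0.0],
    hi=[1.0, 1.0, 2.0],
)
```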

nnabla.functions.clip_grad_by_value(x, min, max, n_outputs=-1, outputs=None)[source]

In forward pass, the function behaves as the identity.

In backward pass,

\[\begin{split}g_x = \begin{cases} max & (g_y > max) \\ g_y & (otherwise) \\ min & (g_y < min) \end{cases}.\end{split}\]

A typical case for use is to prevent the gradient explosion through a whole computational graph. For example, if you want to clip gradient values for each feature map,

import numpy as np
import nnabla as nn
import nnabla.functions as F
import nnabla.parametric_functions as PF

x = nn.Variable([16, 3, 32, 32])
min_val = F.broadcast(nn.Variable.from_numpy_array(np.asarray([-1.0]).reshape((1, 1, 1, 1))), (16, 3, 32, 32))
max_val = F.broadcast(nn.Variable.from_numpy_array(np.asarray([1.0]).reshape((1, 1, 1, 1))), (16, 3, 32, 32))
c = F.clip_grad_by_value(x, min=min_val, max=max_val)
h = PF.convolution(c, 64, (3, 3), pad=(1, 1))
Parameters:
  • x (Variable) – N-D array of input.
  • min (Variable) – N-D array of minimum values by which the gradients of y are clipped. The shape of min must be the same as that of x, and no backward computation is performed for min.
  • max (Variable) – N-D array of maximum values by which the gradients of y are clipped. The shape of max must be the same as that of x, and no backward computation is performed for max.
Returns:

N-D array.

Return type:

Variable

nnabla.functions.clip_by_norm(x, clip_norm, axis=None)[source]

Clip inputs by its L2 norm when the L2 norm is larger than the threshold value (defined by clip_norm). If it is less than the threshold, inputs are not modified. If it is applied, the operation is represented as

\[y = N \times \frac{x}{\|x\|_2}.\]

where \(x\) is the input, \(y\) is the output, and \(N\) is clip_norm. This is the case when axis is not set. When axis is set, the norm is computed over axis.

Parameters:
  • x (Variable) – An input variable.
  • clip_norm (Variable or float) – An input scalar variable or float value. Must be positive.
  • axis (None, int or tuple of ints) – Axis or axes along which the reduction is performed. Passing the default value None will reduce all dimensions.
Returns:

N-D array.

Return type:

Variable
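
A minimal sketch of the rule above for the case where axis is not set, so the norm is taken over all elements of a flat list. The helper clip_by_norm_reference is hypothetical and is not the nnabla implementation.

```python
import math

def clip_by_norm_reference(x, clip_norm):
    """Rescale x to L2 norm clip_norm only if its norm exceeds clip_norm."""
    norm = math.sqrt(sum(v * v for v in x))
    if norm <= clip_norm:
        return list(x)                        # below the threshold: unchanged
    return [clip_norm * v / norm for v in x]  # y = N * x / ||x||_2

y = clip_by_norm_reference([3.0, 4.0], clip_norm=1.0)
```

The input [3, 4] has norm 5, so it is scaled down to norm 1 while keeping its direction. An input whose norm is already below clip_norm is returned as is.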

nnabla.functions.clip_grad_by_norm(x, clip_norm=None, axes=None, n_outputs=-1, outputs=None)[source]

In the forward pass, the function behaves like the identity.

In the backward pass,

\[g_x = N \times \frac{g_y}{\|g_y\|_2}.\]

where \(g_x\) is the gradient w.r.t. the input, \(g_y\) is the gradient w.r.t. the output, and \(N\) is clip_norm, the value that the norm of \(g_y\) becomes after clipping. This is the case when axes is not set. When axes is set, the norm is computed over axes.

A typical case for use is to prevent the gradient explosion through a whole computational graph. For example, if you want to normalize gradient values over feature axis,

import nnabla as nn
import nnabla.functions as F
import nnabla.parametric_functions as PF
x = nn.Variable([16, 3, 32, 32])
c = F.clip_grad_by_norm(x, axes=(1, ))
h = PF.convolution(c, 64, (3, 3), pad=(1, 1))
Parameters:
  • x (Variable) – N-D array of input.
  • clip_norm (float) – Value to which the norm of the gradient is clipped in the backward pass. [default=``1.0``]
  • axes (repeated int64) – Axes to be reduced. If empty list is given, all dimensions are reduced to scalar. This is used in the forward pass. [default=``range(x.ndim)``]
Returns:

N-D array.

Return type:

Variable

nnabla.functions.layer_normalization(x, beta, gamma, batch_axis=0, eps=1e-05, output_stat=False)[source]

Applies Layer Normalization over an input tensor, which is defined as:

\[\begin{split}\begin{eqnarray} \mu^l &=& \frac{1}{H} \sum_{i=1}^{H} x_i^l \\ \sigma^l &=& \sqrt{\frac{1}{H} \sum_{i=1}^{H} \left(x_i^l - \mu^l\right)^2} \\ y &=& \frac{x - \mu^l}{\sigma^l + \epsilon} \gamma + \beta \end{eqnarray}\end{split}\]

where \(x\) and \(y\) are input and output variable, \(\mu^l\) and \(\sigma^l\) are the mean and std of each layer which is separately calculated for each batch, and \(\beta\) and \(\gamma\) are adaptive biases and gains.

If the input shape is [B, C, H, W] (with batch_axis=0), the shapes of the calculated mean and std are [B, 1, 1, 1].

Parameters:
  • x (Variable) – An input variable.
  • beta (Variable or None) – Adaptive biases. If None, the bias term is omitted.
  • gamma (Variable or None) – Adaptive gains. If None, the scale term is omitted.
  • batch_axis (int or repeated int) – Batch axis or axes. Mean and variance are calculated separately for each index along these axes.
  • eps (float) – Tiny value to avoid zero division by std.
  • output_stat (bool) – If true, calculated mean and variance are also returned.
Returns:

  • Variable: Output variable, normalized by its statistics and rescaled by gamma and beta.
  • Variable: Mean (if output_stat=True).
  • Variable: Std (if output_stat=True).
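
The layer normalization equations can be traced with a small sketch in which each sample is a flat list of its H non-batch elements and the statistics are computed per sample. Note that, following the formula as written in this section, \(\epsilon\) is added to the std rather than to the variance. The helper layer_norm_sample is hypothetical and is not the nnabla implementation.

```python
import math

def layer_norm_sample(x, gamma=1.0, beta=0.0, eps=1e-5):
    """Normalize one sample (flat list of its H elements)."""
    h = len(x)
    mu = sum(x) / h
    sigma = math.sqrt(sum((v - mu) ** 2 for v in x) / h)
    y = [(v - mu) / (sigma + eps) * gamma + beta for v in x]
    return y, mu, sigma

batch = [[1.0, 2.0, 3.0], [10.0, 20.0, 30.0]]
results = [layer_norm_sample(sample) for sample in batch]
```

Each sample gets its own mean and std, so two samples that differ only by a scale factor produce nearly identical outputs (up to the effect of eps).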

nnabla.functions.instance_normalization(x, beta, gamma, channel_axis=1, batch_axis=0, eps=1e-05, output_stat=False)[source]

Applies Instance Normalization over an input tensor, which is defined as:

\[\begin{split}\begin{eqnarray} \mu^i &=& \frac{1}{H} \sum_{j=1}^{H} x_j^i \\ \sigma^i &=& \sqrt{\frac{1}{H} \sum_{j=1}^{H} \left(x_j^i - \mu^i\right)^2} \\ y &=& \frac{x - \mu^i}{\sigma^i + \epsilon} \gamma + \beta \end{eqnarray}\end{split}\]

where \(x\) and \(y\) are input and output variable, \(\mu^i\) and \(\sigma^i\) are the mean and std of each instance which is separately calculated for each batch and channel, and \(\gamma\) and \(\beta\) are adaptive gains and biases.

If the input shape is [B, C, H, W] (with channel_axis=1 and batch_axis=0), the shapes of the calculated mean and std are [B, C, 1, 1].

Parameters:
  • x (Variable) – An input variable.
  • beta (Variable) – Adaptive biases.
  • gamma (Variable) – Adaptive gains.
  • channel_axis (int) – Channel axis.
  • batch_axis (int or repeated int) – Batch axes.
  • eps (float) – Tiny value to avoid zero division by std.
  • output_stat (bool) – If true, the calculated mean and std are also returned.
Returns:

  • Variable: Normalized output variable.
  • Variable: Mean (if output_stat=True).
  • Variable: Std (if output_stat=True).

nnabla.functions.group_normalization(x, beta, gamma, num_groups, channel_axis=1, batch_axis=0, eps=1e-05, output_stat=False)[source]

Applies Group Normalization over an input tensor, which is defined as:

\[\begin{split}\begin{eqnarray} \mu^g &=& \frac{1}{H} \sum_{i=1}^{H} x_i^g \\ \sigma^g &=& \sqrt{\frac{1}{H} \sum_{i=1}^{H} \left(x_i^g - \mu^g\right)^2} \\ y &=& \frac{x - \mu^g}{\sigma^g + \epsilon} \gamma + \beta \end{eqnarray}\end{split}\]

where \(x\) and \(y\) are input and output variable, \(\mu^g\) and \(\sigma^g\) are the mean and std of each group which contains num_channels / num_groups channels, and \(\gamma\) and \(\beta\) are adaptive gains and biases.

The input channels, specified by channel_axis, are separated into num_groups groups, and the mean and std are calculated over each group. For example, if the input shape is [B, C, H, W] (with channel_axis=1 and batch_axis=0), the input variable is first reshaped to [B, num_groups, C / num_groups, H, W] and standardized by its mean and std, whose shapes are [B, num_groups, 1, 1, 1]. Finally, the output variable is reshaped back to the original input shape ([B, C, H, W] in the case above).

Parameters:
  • x (Variable) – An input variable.
  • beta (Variable or None) – Adaptive biases. If None, the bias term is omitted.
  • gamma (Variable or None) – Adaptive gains. If None, the scale term is omitted.
  • num_groups (int) – Number of groups. The channel dimension of x must be an integer multiple of num_groups.
  • channel_axis (int) – Channel axis.
  • batch_axis (int or repeated int) – Batch axes.
  • eps (float) – Tiny value to avoid zero division by std.
  • output_stat (bool) – If true, the calculated mean and std are also returned.
Returns:

  • Variable: Normalized output variable.
  • Variable: Mean (if output_stat=True).
  • Variable: Std (if output_stat=True).
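
The grouping step can be sketched for a single sample stored as a nested list of shape [C][S], where S stands for the flattened spatial size H * W. The helper group_norm_sample is hypothetical; it omits gamma and beta for brevity and is not the nnabla implementation.

```python
import math

def group_norm_sample(x, num_groups, eps=1e-5):
    """Standardize one sample of shape [C][S], group by group."""
    c = len(x)
    assert c % num_groups == 0, "channels must be a multiple of num_groups"
    per_group = c // num_groups
    out = []
    for g in range(num_groups):
        channels = x[g * per_group:(g + 1) * per_group]
        values = [v for ch in channels for v in ch]   # all elements of the group
        mu = sum(values) / len(values)
        sigma = math.sqrt(sum((v - mu) ** 2 for v in values) / len(values))
        # Write the channels back in their original order and shape.
        out.extend([[(v - mu) / (sigma + eps) for v in ch] for ch in channels])
    return out

x = [[1.0, 3.0], [5.0, 7.0], [10.0, 10.0], [20.0, 20.0]]
y = group_norm_sample(x, num_groups=2)
```

Here 4 channels are split into 2 groups of 2 channels each. Every group is standardized with its own mean and std, and the output keeps the [C][S] layout of the input.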

nnabla.functions.weight_standardization(w, channel_axis=0, eps=1e-05, output_stat=False)[source]

Applies Weight Standardization over an input weight, which is defined as:

\[\begin{split}\begin{eqnarray} \mu_{W_i} &=& \frac{1}{I} \sum_{j=1}^{I} W_{ij} \\ \sigma_{W_i} &=& \sqrt{\frac{1}{I} \sum_{j=1}^{I} \left(W_{ij} - \mu_{W_{i}}\right)^2} \\ \hat{W_{ij}} &=& \frac{W_{ij} - \mu_{W_i}}{\sigma_{W_i} + \epsilon} \\ y &=& \hat{W} \ast x \end{eqnarray}\end{split}\]

Parameters:
  • w (Variable) – A weight variable.
  • channel_axis (int) – An axis for output channel. Default value is 0 which assumes the weights of convolution.
  • eps (float) – Tiny value to avoid zero division by std.
  • output_stat (bool) – If true, the calculated mean and std are also returned.
Returns:

  • Variable: Standardized output weight.
  • Variable: Mean (if output_stat=True).
  • Variable: Std (if output_stat=True).

Reduction

nnabla.functions.sum(x, axis=None, keepdims=False)[source]

Reduction along axes with sum operation.

Parameters:
  • x (Variable) – An input variable.
  • axis (None, int or tuple of ints) – Axis or axes along which the sum is calculated. Passing the default value None will reduce all dimensions.
  • keepdims (bool) – Flag whether the reduced axes are kept as a dimension with 1 element.
Returns:

N-D array.

Return type:

Variable

nnabla.functions.mean(x, axis=None, keepdims=False)[source]

Reduction along axes with mean operation.

Parameters:
  • x (Variable) – An input variable.
  • axis (None, int or tuple of ints) – Axis or axes along which mean is calculated. Passing the default value None will reduce all dimensions.
  • keepdims (bool) – Flag whether the reduced axes are kept as a dimension with 1 element.
Returns:

N-D array.

Return type:

Variable

nnabla.functions.max(x, axis=None, keepdims=False, with_index=False, only_index=False)[source]

Reduce the input N-D array x along the given axis using the max operation. The axis argument may be a single integer to reduce over one axis, a tuple of integers to reduce over multiple axes, or None to reduce over all axes. If keepdims is True, the output will keep all reduced dimensions with size 1. If with_index is True, the result is a tuple (values, indices), or only indices if only_index is True. Setting only_index to True implies that with_index is also True.

import numpy as np
import nnabla as nn
import nnabla.functions as F

nn.set_auto_forward(True)
x = nn.Variable.from_numpy_array(np.random.rand(2, 3, 4))

maxval = F.max(x, axis=1)
assert np.allclose(maxval.d, np.max(x.d, axis=1))

maxval, indices = F.max(x, axis=1, with_index=True)
assert np.allclose(maxval.d, np.max(x.d, axis=1))
assert np.all(indices.d == np.argmax(x.d, axis=1))

indices = F.max(x, axis=1, only_index=True)
assert np.all(indices.d == np.argmax(x.d, axis=1))
Parameters:
  • x (Variable) – An input variable.
  • axis (None, int or tuple of ints) – Axis or axes along which max is calculated. The default value None will reduce all dimensions.
  • keepdims (bool) – Keep reduced axes as dimension with 1 element.
  • with_index (bool) – Return tuple of max values and index.
  • only_index (bool) – Return only the index of max values.
Returns:

N-D array.

Return type:

Variable

nnabla.functions.min(x, axis=None, keepdims=False, with_index=False, only_index=False)[source]

Reduce the input N-D array x along the given axis using the min operation. The axis argument may be a single integer to reduce over one axis, a tuple of integers to reduce over multiple axes, or None to reduce over all axes. If keepdims is True, the output will keep all reduced dimensions with size 1. If with_index is True, the result is a tuple (values, indices), or only indices if only_index is True. Setting only_index to True implies that with_index is also True.

import numpy as np
import nnabla as nn
import nnabla.functions as F

nn.set_auto_forward(True)
x = nn.Variable.from_numpy_array(np.random.rand(2, 3, 4))

minval = F.min(x, axis=1)
assert np.allclose(minval.d, np.min(x.d, axis=1))

minval, indices = F.min(x, axis=1, with_index=True)
assert np.allclose(minval.d, np.min(x.d, axis=1))
assert np.all(indices.d == np.argmin(x.d, axis=1))

indices = F.min(x, axis=1, only_index=True)
assert np.all(indices.d == np.argmin(x.d, axis=1))
Parameters:
  • x (Variable) – An input variable.
  • axis (None, int or tuple of ints) – Axis or axes along which min is calculated. The default value None will reduce all dimensions.
  • keepdims (bool) – Keep reduced axes as dimension with 1 element.
  • with_index (bool) – Return tuple of min values and index.
  • only_index (bool) – Return only the index of min values.
Returns:

N-D array.

Return type:

Variable

nnabla.functions.prod(x, axis=None, keepdims=False)[source]

Reduction along axes with product operation.

Parameters:
  • x (Variable) – An input variable.
  • axis (None, int or tuple of ints) – Axis or axes along which product is calculated. Passing the default value None will reduce all dimensions.
  • keepdims (bool) – Flag whether the reduced axes are kept as a dimension with 1 element.
Returns:

N-D array.

Return type:

Variable

Note

Backward computation is not accurate when the input contains zero values.

nnabla.functions.reduce_sum(x, n_outputs=-1, outputs=None)[source]

Reduction along an axis with sum operation.

Note

This is deprecated. Use sum instead.

Parameters:x (Variable) – N-D array.
Returns:N-D array
Return type:Variable

nnabla.functions.reduce_mean(x, n_outputs=-1, outputs=None)[source]

Reduction by mean along an axis.

Note

This is deprecated. Use mean instead.

Parameters:x (Variable) – N-D array
Returns:N-D array
Return type:Variable

Arithmetic

nnabla.functions.add2(x0, x1, inplace=False, n_outputs=-1, outputs=None)[source]

Element-wise addition.

\[y_i = x^{(0)}_i + x^{(1)}_i\]
Parameters:
  • x0 (Variable) – N-D array
  • x1 (Variable) – N-D array
  • inplace (bool) – The output array is shared with the 1st input array if True. [default=``False``]
Returns:

N-D array

Return type:

Variable

nnabla.functions.sub2(x0, x1, n_outputs=-1, outputs=None)[source]

Element-wise subtraction.

\[y_i = x^{(0)}_i - x^{(1)}_i\]
Parameters:
  • x0 (Variable) – N-D array
  • x1 (Variable) – N-D array
Returns:

N-D array

Return type:

Variable

nnabla.functions.mul2(x0, x1, n_outputs=-1, outputs=None)[source]

Element-wise multiplication.

\[y_i = x^{(0)}_i x^{(1)}_i\]
Parameters:
  • x0 (Variable) – N-D array
  • x1 (Variable) – N-D array
Returns:

N-D array

Return type:

Variable

nnabla.functions.div2(x0, x1, n_outputs=-1, outputs=None)[source]

Element-wise division.

\[y_i = \frac{x^{(0)}_i} {x^{(1)}_i}\]
Parameters:
  • x0 (Variable) – N-D array
  • x1 (Variable) – N-D array
Returns:

N-D array

Return type:

Variable

nnabla.functions.pow2(x0, x1, n_outputs=-1, outputs=None)[source]

Element-wise power function.

\[y_i = {(x^{(0)}_i)} ^ {x^{(1)}_i}\]
Parameters:
  • x0 (Variable) – N-D array
  • x1 (Variable) – N-D array
Returns:

N-D array

Return type:

Variable

nnabla.functions.add_scalar(x, val=1, n_outputs=-1, outputs=None)[source]

Element-wise scalar addition.

\[y_i = x_i + v\]
Parameters:
  • x (Variable) – Input variable
  • val (float) – Value of the scalar [default=``1``]
Returns:

N-D array with the same shape as x

Return type:

Variable

nnabla.functions.mul_scalar(x, val=1, n_outputs=-1, outputs=None)[source]

Element-wise scalar multiplication.

\[y_i = v x_i\]
Parameters:
  • x (Variable) – Input variable
  • val (float) – Value of the scalar [default=``1``]
Returns:

N-D array with the same shape as x

Return type:

Variable

nnabla.functions.pow_scalar(x, val=1, n_outputs=-1, outputs=None)[source]

Element-wise scalar power function.

\[y_i = (x_i) ^ v\]
Parameters:
  • x (Variable) – Input variable
  • val (float) – Value of the scalar [default=``1``]
Returns:

N-D array with the same shape as x

Return type:

Variable

nnabla.functions.r_sub_scalar(x, val=1, n_outputs=-1, outputs=None)[source]

Element-wise reversed scalar subtraction (the input is subtracted from the scalar).

\[y_i = v - x_i\]
Parameters:
  • x (Variable) – Input variable
  • val (float) – Value of the scalar [default=``1``]
Returns:

N-D array with the same shape as x

Return type:

Variable

nnabla.functions.r_div_scalar(x, val=1, n_outputs=-1, outputs=None)[source]

Element-wise reversed scalar division (the scalar is divided by the input).

\[y_i = \frac{v}{x_i}\]
Parameters:
  • x (Variable) – Input variable
  • val (float) – Value of the scalar [default=``1``]
Returns:

N-D array with the same shape as x

Return type:

Variable

nnabla.functions.r_pow_scalar(x, val=1, n_outputs=-1, outputs=None)[source]

Element-wise reversed scalar power function (the scalar is raised to the power of the input).

\[y_i = v ^ {x_i}\]
Parameters:
  • x (Variable) – Input variable
  • val (float) – Value of the scalar [default=``1``]
Returns:

N-D array with the same shape as x

Return type:

Variable
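
The `r_` variants differ from their plain counterparts only in operand order: the scalar `val` is the left operand. A minimal plain-Python sketch of the formulas (not NNabla code; the `*_ref` names are hypothetical):

```python
x = [1.0, 2.0, 4.0]
v = 2.0

pow_scalar_ref = [xi ** v for xi in x]    # y_i = x_i ^ v
r_sub_scalar_ref = [v - xi for xi in x]   # y_i = v - x_i
r_div_scalar_ref = [v / xi for xi in x]   # y_i = v / x_i
r_pow_scalar_ref = [v ** xi for xi in x]  # y_i = v ^ x_i
```

Comparing `pow_scalar_ref` with `r_pow_scalar_ref` on the last element shows that the two orders agree only by coincidence (here 4 ^ 2 and 2 ^ 4 are both 16), while the first element differs.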

Logical

nnabla.functions.equal(x0, x1, n_outputs=-1, outputs=None)[source]

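Element-wise ‘equal’.

\[\begin{split}f(x^{(0)}_i,x^{(1)}_i) = \begin{cases} 1 & (x^{(0)}_i = x^{(1)}_i) \\ 0 & \text{otherwise} \end{cases}.\end{split}\]
Parameters:
  • x0 (Variable) – N-D array
  • x1 (Variable) – N-D array
Returns:

N-D array holding 1 where the condition is true and 0 elsewhere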

Return type:

Variable

nnabla.functions.equal_scalar(x0, val=1, n_outputs=-1, outputs=None)[source]

Element-wise ‘equal’ with a scalar.

\[\begin{split}f(x_i,v) = \begin{cases} 1 & (x_i = v) \\ 0 & \text{otherwise} \end{cases}.\end{split}\]
Parameters:
  • x0 (Variable) – Input variable
  • val (float) – Value of the scalar [default=``1``]
Returns:

N-D array with the same shape as x0

Return type:

Variable

nnabla.functions.greater(x0, x1, n_outputs=-1, outputs=None)[source]

Element-wise comparison. The \(i^{th}\) element of the output is:

\[\begin{split}f(x^{(0)}_i,x^{(1)}_i) = \begin{cases} 1 & (x^{(0)}_i > x^{(1)}_i) \\ 0 & (x^{(0)}_i \leq x^{(1)}_i) \end{cases}.\end{split}\]
Parameters:
  • x0 (Variable) – N-D array
  • x1 (Variable) – N-D array
Returns:

N-D array holding 1 where the condition is true and 0 elsewhere

Return type:

Variable

nnabla.functions.greater_equal(x0, x1, n_outputs=-1, outputs=None)[source]

Element-wise comparison. The \(i^{th}\) element of the output is:

\[\begin{split}f(x^{(0)}_i,x^{(1)}_i) = \begin{cases} 1 & (x^{(0)}_i \geq x^{(1)}_i) \\ 0 & (x^{(0)}_i < x^{(1)}_i) \end{cases}.\end{split}\]
Parameters:
  • x0 (Variable) – N-D array
  • x1 (Variable) – N-D array
Returns:

N-D array holding 1 where the condition is true and 0 elsewhere

Return type:

Variable

nnabla.functions.greater_equal_scalar(x0, val=1, n_outputs=-1, outputs=None)[source]

Element-wise comparison with a scalar. The \(i^{th}\) element of the output is:

\[\begin{split}f(x^{(0)}_i,v) = \begin{cases} 1 & (x^{(0)}_i \geq v) \\ 0 & (x^{(0)}_i < v) \end{cases}.\end{split}\]
Parameters:
  • x0 (Variable) – Input variable
  • val (float) – Value of the scalar [default=``1``]
Returns:

N-D array with the same shape as x0

Return type:

Variable

nnabla.functions.greater_scalar(x0, val=1, n_outputs=-1, outputs=None)[source]

Element-wise comparison with a scalar. The \(i^{th}\) element of the output is:

\[\begin{split}f(x^{(0)}_i,v) = \begin{cases} 1 & (x^{(0)}_i > v) \\ 0 & (x^{(0)}_i \leq v) \end{cases}.\end{split}\]
Parameters:
  • x0 (Variable) – Input variable
  • val (float) – Value of the scalar [default=``1``]
Returns:

N-D array with the same shape as x0

Return type:

Variable

nnabla.functions.less(x0, x1, n_outputs=-1, outputs=None)[source]

Element-wise comparison. The \(i^{th}\) element of the output is:

\[\begin{split}f(x^{(0)}_i,x^{(1)}_i) = \begin{cases} 1 & (x^{(0)}_i < x^{(1)}_i) \\ 0 & (x^{(0)}_i \geq x^{(1)}_i) \end{cases}.\end{split}\]
Parameters:
  • x0 (Variable) – N-D array
  • x1 (Variable) – N-D array
Returns:

N-D array holding 1 where the condition is true and 0 elsewhere

Return type:

Variable

nnabla.functions.less_equal(x0, x1, n_outputs=-1, outputs=None)[source]

Element-wise comparison. The \(i^{th}\) element of the output is:

\[\begin{split}f(x^{(0)}_i,x^{(1)}_i) = \begin{cases} 1 & (x^{(0)}_i \leq x^{(1)}_i) \\ 0 & (x^{(0)}_i > x^{(1)}_i) \end{cases}.\end{split}\]
Parameters:
  • x0 (Variable) – N-D array
  • x1 (Variable) – N-D array
Returns:

N-D array holding 1 where the condition is true and 0 elsewhere

Return type:

Variable

nnabla.functions.less_equal_scalar(x0, val=1, n_outputs=-1, outputs=None)[source]

Element-wise comparison with a scalar. The \(i^{th}\) element of the output is:

\[\begin{split}f(x^{(0)}_i,v) = \begin{cases} 1 & (x^{(0)}_i \leq v) \\ 0 & (x^{(0)}_i > v) \end{cases}.\end{split}\]
Parameters:
  • x0 (Variable) – Input variable
  • val (float) – Value of the scalar [default=``1``]
Returns:

N-D array with the same shape as x0

Return type:

Variable

nnabla.functions.less_scalar(x0, val=1, n_outputs=-1, outputs=None)[source]

Element-wise comparison with a scalar. The \(i^{th}\) element of the output is:

\[\begin{split}f(x^{(0)}_i,v) = \begin{cases} 1 & (x^{(0)}_i < v) \\ 0 & (x^{(0)}_i \geq v) \end{cases}.\end{split}\]
Parameters:
  • x0 (Variable) – Input variable
  • val (float) – Value of the scalar [default=``1``]
Returns:

N-D array with the same shape as x0

Return type:

Variable
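
The comparison functions all produce a mask of ones and zeros. The sketch below mirrors the case formulas in plain Python on flat lists. It is not NNabla code, the helper names `compare` and `compare_scalar` are hypothetical, and integers are used for the mask values only for readability.

```python
import operator


def compare(pred, x0, x1):
    """Return 1 where pred(x0_i, x1_i) holds and 0 elsewhere."""
    return [1 if pred(a, b) else 0 for a, b in zip(x0, x1)]


def compare_scalar(pred, x0, val=1):
    """Same as compare, but every element is compared with one scalar."""
    return [1 if pred(a, val) else 0 for a in x0]


x0 = [1, 2, 3]
x1 = [2, 2, 2]

greater_ref = compare(operator.gt, x0, x1)
greater_equal_ref = compare(operator.ge, x0, x1)
less_ref = compare(operator.lt, x0, x1)
less_equal_ref = compare(operator.le, x0, x1)
less_scalar_ref = compare_scalar(operator.lt, x0, val=2)
```

Each strict comparison is the complement of the opposite non-strict one, which is why `greater_ref` and `less_equal_ref` never agree at any position.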

nnabla.functions.logical_and(x0, x1, n_outputs=-1, outputs=None)[source]

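Element-wise logical AND.

\[\begin{split}f(x^{(0)}_i,x^{(1)}_i) = \begin{cases} 1 & (x^{(0)}_i \neq 0 \;\&\; x^{(1)}_i \neq 0) \\ 0 & \text{otherwise} \end{cases}.\end{split}\]
Parameters:
  • x0 (Variable) – N-D array
  • x1 (Variable) – N-D array
Returns:

N-D array holding 1 where the condition is true and 0 elsewhere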

Return type:

Variable

nnabla.functions.logical_and_scalar(x0, val, n_outputs=-1, outputs=None)[source]

Element-wise logical AND with a scalar.

\[\begin{split}f(x_i,v) = \begin{cases} 1 & (x_i \neq 0 \;\&\; v \neq 0) \\ 0 & \text{otherwise} \end{cases}.\end{split}\]
Parameters:
  • x0 (Variable) – Input variable
  • val (bool) – Value of the scalar
Returns:

N-D array with the same shape as x0

Return type:

Variable

nnabla.functions.logical_not(x0, n_outputs=-1, outputs=None)[source]

Element-wise logical NOT operation.

\[\begin{split}f(x_i) = \begin{cases} 1 & (x_i = 0) \\ 0 & \text{otherwise} \end{cases}.\end{split}\]
Parameters:x0 (Variable) – Input variable
Returns:N-D array with the same shape as x0
Return type:Variable

nnabla.functions.logical_or(x0, x1, n_outputs=-1, outputs=None)[source]

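Element-wise logical OR.

\[\begin{split}f(x^{(0)}_i,x^{(1)}_i) = \begin{cases} 0 & (x^{(0)}_i = 0 \;\&\; x^{(1)}_i = 0) \\ 1 & \text{otherwise} \end{cases}.\end{split}\]
Parameters:
  • x0 (Variable) – N-D array
  • x1 (Variable) – N-D array
Returns:

N-D array holding 1 where the condition is true and 0 elsewhere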

Return type:

Variable

nnabla.functions.logical_or_scalar(x0, val, n_outputs=-1, outputs=None)[source]

Element-wise logical OR with a scalar.

\[\begin{split}f(x_i,v) = \begin{cases} 0 & (x_i = 0 \;\&\; v = 0) \\ 1 & \text{otherwise} \end{cases}.\end{split}\]
Parameters:
  • x0 (Variable) – Input variable
  • val (bool) – Value of the scalar
Returns:

N-D array with the same shape as x0

Return type:

Variable

nnabla.functions.logical_xor(x0, x1, n_outputs=-1, outputs=None)[source]

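Element-wise logical XOR. The output is 1 when exactly one of the two inputs is non-zero.

\[\begin{split}f(x^{(0)}_i,x^{(1)}_i) = \begin{cases} 1 & (x^{(0)}_i = 0 \;\&\; x^{(1)}_i \neq 0) \\ 1 & (x^{(0)}_i \neq 0 \;\&\; x^{(1)}_i = 0) \\ 0 & \text{otherwise} \end{cases}.\end{split}\]
Parameters:
  • x0 (Variable) – N-D array
  • x1 (Variable) – N-D array
Returns:

N-D array holding 1 where the condition is true and 0 elsewhere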

Return type:

Variable

nnabla.functions.logical_xor_scalar(x0, val, n_outputs=-1, outputs=None)[source]

Element-wise logical XOR with a scalar. The output is 1 when exactly one of the input element and the scalar is non-zero.

\[\begin{split}f(x_i,v) = \begin{cases} 1 & (x_i = 0 \;\&\; v \neq 0) \\ 1 & (x_i \neq 0 \;\&\; v = 0) \\ 0 & \text{otherwise} \end{cases}.\end{split}\]
Parameters:
  • x0 (Variable) – Input variable
  • val (bool) – Value of the scalar
Returns:

N-D array with the same shape as x0

Return type:

Variable
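
The logical functions treat any non-zero value as true. The sketch below tabulates AND, OR, XOR, and NOT for a few input pairs in plain Python, assuming XOR has its usual meaning (exactly one operand is non-zero). It is not NNabla code and the `*_ref` helper names are hypothetical.

```python
def truthy(value):
    """Any non-zero value counts as true."""
    return value != 0


def logical_and_ref(a, b):
    return int(truthy(a) and truthy(b))


def logical_or_ref(a, b):
    return int(truthy(a) or truthy(b))


def logical_xor_ref(a, b):
    # True when exactly one of the operands is non-zero.
    return int(truthy(a) != truthy(b))


def logical_not_ref(a):
    return int(not truthy(a))


pairs = [(0, 0), (0, 3), (-1, 0), (2, 5)]
and_column = [logical_and_ref(a, b) for a, b in pairs]
or_column = [logical_or_ref(a, b) for a, b in pairs]
xor_column = [logical_xor_ref(a, b) for a, b in pairs]
not_column = [logical_not_ref(a) for a, _ in pairs]
```

The scalar variants follow the same table with `val` taking the place of the second operand for every element.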

nnabla.functions.not_equal(x0, x1, n_outputs=-1, outputs=None)[source]

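Element-wise ‘not equal’.

\[\begin{split}f(x^{(0)}_i,x^{(1)}_i) = \begin{cases} 0 & (x^{(0)}_i = x^{(1)}_i) \\ 1 & \text{otherwise} \end{cases}.\end{split}\]
Parameters:
  • x0 (Variable) – N-D array
  • x1 (Variable) – N-D array
Returns:

N-D array holding 1 where the condition is true and 0 elsewhere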

Return type:

Variable

nnabla.functions.not_equal_scalar(x0, val=1, n_outputs=-1, outputs=None)[source]

Element-wise ‘not equal’ with a scalar.

\[\begin{split}f(x_i,v) = \begin{cases} 0 & (x_i = v) \\ 1 & \text{otherwise} \end{cases}.\end{split}\]
Parameters:
  • x0 (Variable) – Input variable
  • val (float) – Value of the scalar [default=``1``]
Returns:

N-D array with the same shape as x0

Return type:

Variable

nnabla.functions.sign(x, alpha=1.0, n_outputs=-1, outputs=None)[source]

Element-wise sign function.

In the forward pass, it is defined as

\[\begin{split}f(x) = \begin{cases} 1 & (x > 0) \\ -1 & (x < 0) \\ \alpha & (x = 0) \end{cases}.\end{split}\]

In the backward pass, it is defined as

\[\frac{\partial f(x)}{\partial x} = 1,\]

or in other words, it behaves as the identity function for the gradient in the backward pass.

Parameters:
  • x (Variable) – Input
  • alpha (float) – Value in case of \(x = 0\). [default=``1.0``]
Returns:

N-D array with the same shape as x

Return type:

Variable
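
The forward and backward definitions of sign can be sketched in plain Python as two separate functions: the forward pass applies the three-way case split, and the backward pass returns the incoming gradient unchanged. This is an illustration of the formulas, not NNabla code; `sign_forward` and `sign_backward` are hypothetical names.

```python
def sign_forward(x, alpha=1.0):
    """1 for positive, -1 for negative, alpha for exactly zero."""
    return [1.0 if xi > 0 else -1.0 if xi < 0 else alpha for xi in x]


def sign_backward(grad_output):
    """Identity for the gradient: df/dx is defined as 1."""
    return list(grad_output)


x = [-2.0, 0.0, 3.0]
y_default = sign_forward(x)             # alpha defaults to 1.0
y_zero = sign_forward(x, alpha=0.0)     # map zero to zero instead
grad_x = sign_backward([0.1, 0.2, 0.3])
```

Passing the gradient through unchanged keeps a network trainable even though the true derivative of the sign function is zero almost everywhere.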

nnabla.functions.minimum2(x0, x1, n_outputs=-1, outputs=None)[source]

Element-wise minimum.

\[y_i = \min(x^{(0)}_i, x^{(1)}_i)\]
Parameters:
  • x0 (Variable) – N-D array
  • x1 (Variable) – N-D array
Returns:

N-D array of min value

Return type:

Variable

nnabla.functions.maximum2(x0, x1, n_outputs=-1, outputs=None)[source]

Element-wise maximum.

\[y_i = \max(x^{(0)}_i, x^{(1)}_i)\]
Parameters:
  • x0 (Variable) – N-D array
  • x1 (Variable) – N-D array
Returns:

N-D array of max value

Return type:

Variable

nnabla.functions.minimum_scalar(x, val=1.0, n_outputs=-1, outputs=None)[source]

Element-wise scalar minimum.

\[y_i = \min(x_i, v)\]
Parameters:
  • x (Variable) – Input variable
  • val (float) – Value of the scalar [default=``1.0``]
Returns:

N-D array with the same shape as x

Return type:

Variable

nnabla.functions.maximum_scalar(x, val=1.0, n_outputs=-1, outputs=None)[source]

Element-wise scalar maximum.

\[y_i = \max (x_i, v)\]
Parameters:
  • x (Variable) – Input variable
  • val (float) – Value of the scalar [default=``1.0``]
Returns:

N-D array with the same shape as x

Return type:

Variable
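
One common use of the scalar minimum and maximum is to clamp values into a range by chaining the two. The following plain-Python sketch mirrors the formulas; it is not NNabla code and the `*_ref` names are hypothetical.

```python
def minimum_scalar_ref(x, val=1.0):
    """y_i = min(x_i, v)"""
    return [min(xi, val) for xi in x]


def maximum_scalar_ref(x, val=1.0):
    """y_i = max(x_i, v)"""
    return [max(xi, val) for xi in x]


x = [-2.0, 0.5, 3.0]

# Clamp into [0, 1]: raise everything below 0, then cap everything above 1.
clamped = minimum_scalar_ref(maximum_scalar_ref(x, val=0.0), val=1.0)
```

The order of the two calls does not matter here as long as the lower bound is not greater than the upper bound.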

Math

nnabla.functions.constant(val=0, shape=[], n_outputs=-1, outputs=None)[source]

Generate a constant-valued array.

Parameters:
  • val (float) – Constant value. [default=``0``]
  • shape (tuple of int) – Shape of the output array. [default=``[]``]
Returns:

N-D array where all values are the specified constant.

Return type:

Variable

nnabla.functions.arange(start, stop, step=1, n_outputs=-1, outputs=None)[source]

Generate a range of values within the half-open interval [start, stop) (the interval including start but excluding stop) with step increments.

Parameters:
  • start (float) – Start value.
  • stop (float) – End value.
  • step (float) – Step value. [default=``1``]
Returns:

1-D array with the generated values.

Return type:

Variable
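
The half-open interval means that `stop` itself is never produced. The plain-Python sketch below illustrates this together with constant-valued array generation, using nested lists in place of arrays. It is not NNabla code, the `*_ref` names are hypothetical, and the sketch handles positive steps only, which is an assumption made for brevity.

```python
def arange_ref(start, stop, step=1):
    """Values start, start + step, ... that are strictly less than stop."""
    if step <= 0:
        raise ValueError("this sketch only handles positive steps")
    values = []
    i = 0
    while start + i * step < stop:
        values.append(start + i * step)
        i += 1
    return values


def constant_ref(val=0, shape=()):
    """Nested list of the given shape filled with val; an empty shape gives a scalar."""
    if len(shape) == 0:
        return val
    return [constant_ref(val, shape[1:]) for _ in range(shape[0])]


evens = arange_ref(0, 6, 2)          # 6 is excluded
quarters = arange_ref(0, 1, 0.25)    # 1 is excluded
sevens = constant_ref(7, (2, 3))
```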

nnabla.functions.abs(x, n_outputs=-1, outputs=None)[source]

Element-wise absolute value function.

\[y_i = |x_i|\]
Parameters:x (Variable) – Input variable
Returns:Element-wise absolute variable
Return type:Variable

nnabla.functions.exp(x, n_outputs=-1, outputs=None)[source]

Element-wise natural exponential function.

\[y_i = \exp(x_i).\]
Parameters:x (Variable) – Input variable
Returns:Element-wise exp variable
Return type:Variable

nnabla.functions.log(x, n_outputs=-1, outputs=None)[source]

Element-wise natural logarithm function.

\[y_i = \ln(x_i).\]
Parameters:x (Variable) – Input variable
Returns:Element-wise log variable
Return type:Variable

nnabla.functions.round(x, n_outputs=-1, outputs=None)[source]

Element-wise round function.

In the forward pass, this function rounds each element to the nearest integer value.

\[y_i = round(x_i).\]

In the backward pass, the simple Straight-Through Estimator (STE) is applied,

\[\frac{\partial y_i}{\partial x_i} = 1.\]
Parameters:x (Variable) – Input variable
Returns:N-D array with the same shape as x
Return type:Variable

nnabla.functions.ceil(x, n_outputs=-1, outputs=None)[source]

Element-wise ceil function.

In the forward pass, this function simply returns the smallest integer which is not less than the input.

\[y_i = ceil(x_i).\]

In the backward pass, the simple Straight-Through Estimator (STE) is applied,

\[\frac{\partial y_i}{\partial x_i} = 1.\]
Parameters:x (Variable) – Input variable
Returns:N-D array with the same shape as x
Return type:Variable

nnabla.functions.floor(x, n_outputs=-1, outputs=None)[source]

Element-wise floor function.

In the forward pass, this function simply returns the largest integer which is not greater than the input.

\[y_i = floor(x_i).\]

In the backward pass, the simple Straight-Through Estimator (STE) is applied,

\[\frac{\partial y_i}{\partial x_i} = 1.\]
Parameters:x (Variable) – Input variable
Returns:N-D array with the same shape as x
Return type:Variable
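
The Straight-Through Estimator used by round, ceil, and floor can be sketched in plain Python: the forward pass quantizes, while the backward pass hands the upstream gradient to the input as if the function were the identity. This is an illustration only, not NNabla code; the function names are hypothetical, and round is left out because the text does not say how ties are broken.

```python
import math


def floor_forward(x):
    """Largest integer not greater than each element."""
    return [float(math.floor(xi)) for xi in x]


def ceil_forward(x):
    """Smallest integer not less than each element."""
    return [float(math.ceil(xi)) for xi in x]


def ste_backward(grad_output):
    """dy_i/dx_i is taken to be 1, so the gradient passes through unchanged."""
    return list(grad_output)


x = [-1.5, 0.2, 2.7]
floored = floor_forward(x)
ceiled = ceil_forward(x)

# For loss = sum(w_i * floor(x_i)), the upstream gradient is w,
# and under the STE the gradient with respect to x is also w.
w = [0.5, -1.0, 2.0]
grad_x = ste_backward(w)
```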

nnabla.functions.identity(x, n_outputs=-1, outputs=None)[source]

Identity function.

\[y = x\]
Parameters:x (Variable) – N-D array.
Returns:N-D array
Return type:Variable

nnabla.functions.matrix_diag(x, n_outputs=-1, outputs=None)[source]

Returns an array whose last two dimensions form diagonal matrices built from the values along the last dimension of the input.

Parameters:x (Variable) – N-D array with shape (\(M_0 \times \ldots \times M_N\)).
Returns:N-D array with shape (\(M_0 \times \ldots \times M_N \times M_N\)).
Return type:Variable

nnabla.functions.matrix_diag_part(x, n_outputs=-1, outputs=None)[source]

Returns an array in which the values of the last dimension consist of the diagonal elements of the last two dimensions of an input array.

Parameters:x (Variable) – N-D array with shape (\(M_0 \times \ldots \times M_N \times M_N\)).
Returns:N-D array with shape (\(M_0 \times \ldots \times M_N\)).
Return type:Variable
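
The shape relationship between matrix_diag and matrix_diag_part can be illustrated for a 2-D input of shape (M_0, M_1) with nested lists. This is a plain-Python sketch, not NNabla code; the `*_ref` names are hypothetical and only the 2-D case is handled.

```python
def matrix_diag_ref(x):
    """Shape (M0, M1) -> (M0, M1, M1): each row becomes a diagonal matrix."""
    return [
        [[row[i] if i == j else 0.0 for j in range(len(row))]
         for i in range(len(row))]
        for row in x
    ]


def matrix_diag_part_ref(x):
    """Shape (M0, M1, M1) -> (M0, M1): keep only the diagonal of each matrix."""
    return [[m[i][i] for i in range(len(m))] for m in x]


x = [[1.0, 2.0], [3.0, 4.0]]
diag = matrix_diag_ref(x)
recovered = matrix_diag_part_ref(diag)
```

Applying matrix_diag_part to the output of matrix_diag recovers the original array, so the two behave as inverses on diagonal inputs.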

nnabla.functions.batch_matmul(a, b, transpose_a=False, transpose_b=False, n_outputs=-1, outputs=None)[source]

Batch matrix multiplication.

Two batches of matrices are multiplied sample by sample. A batch of matrices has shape […, P, Q], where the last two dimensions are the matrix dimensions and the leading dimensions, up to the third-to-last, index the batch samples.

Parameters:
  • a (Variable) – N-D array with >= 2-dim. The last two dimensions will be treated as a matrix.
  • b (Variable) – N-D array with >= 2-dim. The last two dimensions will be treated as a matrix. The product of the sizes of dimension 0 through the third-to-last dimension must be the same as that of the input a.
  • transpose_a (bool) – Transpose the last two axes of a in matrix multiplication. [default=``False``]
  • transpose_b (bool) – Transpose the last two axes of b in matrix multiplication. [default=``False``]
Returns:

Output of sample-wise matrix multiplication in a batch. When a has shape [N, P, Q], b has shape [N, Q, R], and both transpose options are False, the output has shape [N, P, R].

Return type:

Variable
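
The shape rule [N, P, Q] x [N, Q, R] -> [N, P, R] and the effect of the transpose flags can be checked with a small plain-Python sketch on nested lists. It is not NNabla code, the helper names are hypothetical, and only a single leading batch dimension is handled.

```python
def transpose(m):
    """Swap the two axes of a matrix given as a list of rows."""
    return [list(col) for col in zip(*m)]


def matmul(a, b):
    """Plain (P, Q) x (Q, R) matrix product."""
    if len(a[0]) != len(b):
        raise ValueError("inner dimensions do not match")
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


def batch_matmul_ref(a, b, transpose_a=False, transpose_b=False):
    """Multiply the n-th matrix of a with the n-th matrix of b."""
    if len(a) != len(b):
        raise ValueError("batch sizes do not match")
    out = []
    for ma, mb in zip(a, b):
        if transpose_a:
            ma = transpose(ma)
        if transpose_b:
            mb = transpose(mb)
        out.append(matmul(ma, mb))
    return out


a = [[[1, 2]], [[3, 4]]]          # shape [2, 1, 2]
b = [[[5], [6]], [[7], [8]]]      # shape [2, 2, 1]
product = batch_matmul_ref(a, b)  # shape [2, 1, 1]

b_t = [[[5, 6]], [[7, 8]]]        # shape [2, 1, 2], each matrix of b transposed
product_t = batch_matmul_ref(a, b_t, transpose_b=True)
```

Storing b in transposed layout and setting `transpose_b` gives the same product as the untransposed call.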

nnabla.functions.sin(x, n_outputs=-1, outputs=None)[source]

Element-wise sine (sin) function.

\[y_i = \sin (x_i)\]
Parameters:x (Variable) – N-D array
Returns:N-D array with the same shape as x
Return type:Variable

nnabla.functions.cos(x, n_outputs=-1, outputs=None)[source]

Element-wise cosine (cos) function.

\[y_i = \cos (x_i)\]
Parameters:x (Variable) – N-D array
Returns:N-D array with the same shape as x
Return type:Variable

nnabla.functions.tan(x, n_outputs=-1, outputs=None)[source]

Element-wise tangent (tan) function.

\[y_i = \tan (x_i)\]
Parameters:x (Variable) – N-D array
Returns:N-D array with the same shape as x
Return type:Variable

nnabla.functions.sinh(x, n_outputs=-1, outputs=None)[source]

Element-wise hyperbolic sine (sinh) function.

\[y_i = \sinh (x_i)\]
Parameters:x (Variable) – N-D array
Returns:N-D array with the same shape as x
Return type:Variable

nnabla.functions.cosh(x, n_outputs=-1, outputs=None)[source]

Element-wise hyperbolic cosine (cosh) function.

\[y_i = \cosh (x_i)\]
Parameters:x (Variable) – N-D array
Returns:N-D array with the same shape as x
Return type:Variable

nnabla.functions.tanh(x, n_outputs=-1, outputs=None)[source]

Element-wise hyperbolic tangent (tanh) function.

\[y_i = \tanh (x_i)\]
Parameters:x (Variable) – N-D array
Returns:N-D array with the same shape as x
Return type:Variable

nnabla.functions.asin(x, n_outputs=-1, outputs=None)[source]

Element-wise arcsine (asin) function.

\[y_i = \arcsin (x_i)\]
Parameters:x (Variable) – N-D array
Returns:N-D array with the same shape as x
Return type:Variable

nnabla.functions.acos(x, n_outputs=-1, outputs=None)[source]

Element-wise arccosine (acos) function.

\[y_i = \arccos (x_i)\]
Parameters:x (Variable) – N-D array
Returns:N-D array with the same shape as x
Return type:Variable

nnabla.functions.atan(x, n_outputs=-1, outputs=None)[source]

Element-wise arctangent (atan) function.

\[y_i = \arctan (x_i)\]
Parameters:x (Variable) – N-D array
Returns:N-D array with the same shape as x
Return type:Variable

nnabla.functions.atan2(x0, x1, n_outputs=-1, outputs=None)[source]

Element-wise two-argument arctangent (atan2) function.

\[y_i = \arctan2 (x^{(0)}_i, x^{(1)}_i)\]
Parameters:
  • x0 (Variable) – N-D array
  • x1 (Variable) – N-D array
Returns:

N-D array with the same shape as input variables

Return type:

Variable
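
Unlike the one-argument arctangent of a ratio, the two-argument form uses the signs of both inputs to pick the quadrant. The plain-Python sketch below shows the difference, assuming the argument order matches the conventional `atan2(y, x)` of C and Python, so that x0 plays the role of y and x1 the role of x. That ordering is an assumption, not something the text above states. The name `atan2_ref` is hypothetical and this is not NNabla code.

```python
import math


def atan2_ref(x0, x1):
    """Element-wise math.atan2 with x0 as the first (y) argument."""
    return [math.atan2(a, b) for a, b in zip(x0, x1)]


x0 = [1.0, 1.0]
x1 = [1.0, -1.0]

angles = atan2_ref(x0, x1)
ratio_angles = [math.atan(a / b) for a, b in zip(x0, x1)]
```

For the second element, the ratio form loses the sign of x1 and lands in the fourth quadrant, while the two-argument form returns an angle in the second quadrant.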

nnabla.functions.asinh(x, n_outputs=-1, outputs=None)[source]

Element-wise hyperbolic arcsine (asinh) function.

\[y_i = \text{arcsinh} (x_i)\]
Parameters:x (Variable) – N-D array
Returns:N-D array with the same shape as x
Return type:Variable

Note

All nnabla functions in nnabla.functions are decorated with the nnabla.function_bases.function_api decorator, which queries the current context and passes it into the first argument of the original function. The original function always takes a context as the first argument.

nnabla.functions.acosh(x, n_outputs=-1, outputs=None)[source]

Element-wise hyperbolic arccosine (acosh) function.

\[y_i = \text{arccosh} (x_i)\]
Parameters:x (Variable) – N-D array
Returns:N-D array with the same shape as x
Return type:Variable

Note

All nnabla functions in nnabla.functions are decorated with the nnabla.function_bases.function_api decorator, which queries the current context and passes it into the first argument of the original function. The original function always takes a context as the first argument.

nnabla.functions.atanh(x, n_outputs=-1, outputs=None)[source]

Element-wise hyperbolic arctangent (atanh) function.

\[y_i = \text{arctanh} (x_i)\]
Parameters:x (Variable) – N-D array
Returns:N-D array with the same shape as x
Return type:Variable

Note

All nnabla functions in nnabla.functions are decorated with the nnabla.function_bases.function_api decorator, which queries the current context and passes it into the first argument of the original function. The original function always takes a context as the first argument.

Array Manipulation

nnabla.functions.concatenate(*x, **kw)[source]

Concatenate a variable number of input arrays along the specified axis.

Parameters:
  • *x (Variable) – N-D arrays. [variadic]
  • axis (int) – Axis [default=``len(x[0].shape) - 1``]
Returns:

Concatenate variable

Return type:

Variable

Note

All nnabla functions in nnabla.functions are decorated with the nnabla.function_bases.function_api decorator, which queries the current context and passes it into the first argument of the original function. The original function always takes a context as the first argument.

nnabla.functions.split(x, axis=0)[source]

Split arrays at the specified axis.

It returns as many Variables as the size of the given axis (i.e., x.shape[axis]).

Parameters:
  • x (Variable) – N-D array
  • axis (int) – Axis

Returns: A tuple of Variables

See also

nnabla.function_bases.split().

nnabla.functions.stack(*x, **kw)[source]

Joins two or more arrays on a new axis.

Note

Unlike nnabla.functions.concatenate() , which joins arrays on an existing axis, Stack joins arrays on a new axis.

Parameters:
  • *x (Variable) – N-D arrays. The sizes of all the arrays to be stacked must be the same. [variadic]
  • axis (int) – The axis on which to concatenate arrays. Axis indices take on values 0, 1, 2, and so on from the left. For example, to stack four (3,28,28) inputs on the second axis, specify 1. In this case, the output size will be (3,4,28,28). [default=``0``]
Returns:

Output

Return type:

Variable

Note

All nnabla functions in nnabla.functions are decorated with the nnabla.function_bases.function_api decorator, which queries the current context and passes it into the first argument of the original function. The original function always takes a context as the first argument.
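The difference between joining on an existing axis and joining on a new axis is easiest to see in the output shapes. The following is a plain-Python sketch of the shape arithmetic only; concatenate_shape and stack_shape are hypothetical helper names, not part of nnabla.

```python
def concatenate_shape(shapes, axis):
    """Output shape when joining arrays along an existing axis."""
    first = shapes[0]
    for s in shapes[1:]:
        # All dimensions except `axis` must agree.
        assert s[:axis] + s[axis + 1:] == first[:axis] + first[axis + 1:]
    total = sum(s[axis] for s in shapes)
    return first[:axis] + (total,) + first[axis + 1:]


def stack_shape(shapes, axis):
    """Output shape when joining equally shaped arrays along a new axis."""
    assert all(s == shapes[0] for s in shapes)
    return shapes[0][:axis] + (len(shapes),) + shapes[0][axis:]


inputs = [(3, 28, 28)] * 4
stacked = stack_shape(inputs, axis=1)             # new axis of size 4
concatenated = concatenate_shape(inputs, axis=1)  # axis 1 grows to 4 * 28
```

Stacking the four (3,28,28) inputs on axis 1 reproduces the (3,4,28,28) size quoted in the stack parameter description.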

nnabla.functions.slice(x, start=None, stop=None, step=None, n_outputs=-1, outputs=None)[source]

Slice arrays along specified axis. This function complies with Python slice semantics, where slice(None, None, -1) and slice(-1, None, -1) are special cases that flip the input array, so the output array runs from the end to the beginning of the input array along the corresponding dimension.

Parameters:
  • x (Variable) – N-D array
  • start (repeated int64) – Start indices for each axis [default=``(0,) * len(x.shape)``]
  • stop (repeated int64) – Stop indices for each axis [default=``tuple(x.shape)``]
  • step (repeated int64) – Step indices for each axis [default=``(1,) * len(x.shape)``]
Returns:

Sliced N-D array

Return type:

Variable

Note

All nnabla functions in nnabla.functions are decorated with the nnabla.function_bases.function_api decorator, which queries the current context and passes it into the first argument of the original function. The original function always takes a context as the first argument.
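Since the slice function is described as following Python slice semantics, the two special cases can be checked on an ordinary Python list. This sketch shows only the built-in behaviour for a single axis and does not call nnabla.

```python
row = [0, 1, 2, 3, 4]

# Both special cases walk the axis from its last element to its first.
flipped_a = row[slice(None, None, -1)]
flipped_b = row[slice(-1, None, -1)]

# An ordinary strided slice: start=1, stop=5, step=2.
strided = row[slice(1, 5, 2)]
```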

nnabla.functions.gather_nd(data, indices)[source]

Gather elements or slices from data according to indices, which must be at least two-dimensional with the first dimension \(M\) being less than or equal to the \(N\) dimensions of data. Given data with shape \((X_0, X_1, ..., X_{N-1})\) and indices with shape \((M, Y_0, ..., Y_{K-1})\) output has shape \((Y_0, ..., Y_{K-1}, X_M, ..., X_{N-1})\). If \(M == N\), output shape is simply \((Y_0, ..., Y_{K-1})\).

The forward of gather_nd() is equivalent to:

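def gather_nd(data, index):
    import numpy as np
    # Flatten the trailing index dimensions: one column per gathered item.
    tmp_index = index.reshape(index.shape[0], -1)
    # Each column becomes a coordinate tuple; Ellipsis keeps trailing axes.
    tmp_index = (idx + (Ellipsis,) for idx in zip(*tmp_index))
    out_shape = index.shape[1:] + data.shape[index.shape[0]:]
    return np.vstack([data[idx] for idx in tmp_index]).reshape(*out_shape)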

Examples:

>>> import numpy as np, nnabla as nn, nnabla.functions as F
>>> nn.set_auto_forward(True)
>>> data = F.arange(1, 11).reshape([2, 5])
>>> print(data.d)
[[ 1.  2.  3.  4.  5.]
 [ 6.  7.  8.  9. 10.]]
>>> F.gather_nd(data, [[1, 1, 0]]).shape
(3, 5)
>>> F.gather_nd(data, [[1, 1, 0], [0, 1, 0]]).shape
(3,)
>>> print(F.gather_nd(data, [[1, 1, 0], [0, 1, 0]]).d)
[6. 7. 1.]
>>> print(F.gather_nd(data, [[1, 1, 0]]).d)
[[ 6.  7.  8.  9. 10.]
 [ 6.  7.  8.  9. 10.]
 [ 1.  2.  3.  4.  5.]]

When indices is provided as a Variable it will be possible to change the actual index values after function creation. It is important to note that out-of-bound indices raise errors when running on CPU but are ignored when using an accelerated computation context.

>>> indices = nn.Variable((2, 1))
>>> indices.d = [[0], [0]]
>>> y = F.gather_nd(data, indices)
>>> print(y.d)
[1.]
>>> indices.d = [[1], [4]]
>>> y.forward()
>>> print(y.d)
[10.]
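Parameters:
  • data (Variable or NdArray) – Input data.
  • indices (list, numpy.ndarray, Variable or NdArray) – Gather indices.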

Returns: Variable or NdArray of gathered elements.

nnabla.functions.scatter_nd(data, indices, shape=None, out=None)[source]

Scatter data according to indices into a new array of given shape or an existing array provided as out. Exactly one of the shape or out argument must be given. Given output shape, or shape of out array, \((X_0,X_1,\ldots,X_{N-1})\) and indices shape \((M,Y_0,\ldots,Y_{K-1})\) the input data shape is \((Y_0,\ldots,Y_{K-1},X_M,\ldots,X_{N-1})\), where \(M<=N\). If \(M==N\) the data shape is simply \((Y_0,\ldots,Y_{K-1})\). Note that indices are treated as integers and potentially converted.

The forward of scatter_nd() is equivalent to:

def scatter_nd(data, indices, shape=None, out=None):
    import numpy as np
    # Exactly one of shape or out must be given.
    assert (shape is None) != (out is None)
    result = out if out is not None else np.zeros(shape)
    # Row m of indices holds the coordinates along output axis m.
    result[tuple(np.asarray(indices))] = data
    return result

Examples:

>>> import numpy as np, nnabla as nn, nnabla.functions as F
>>> nn.set_auto_forward(True)
>>> data = nn.Variable.from_numpy_array(np.array([9, 10, 11, 12]))
>>> indices = nn.Variable.from_numpy_array(np.array([[4, 3, 1, 7]]))
>>> scattered = F.scatter_nd(data, indices, shape=(8,))
>>> print(scattered.d)
[ 0. 11.  0. 10.  9.  0.  0. 12.]
>>> print(F.gather_nd(scattered, indices).d)
[ 9. 10. 11. 12.]
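Parameters:
  • data (Variable or NdArray) – Input data.
  • indices (list, numpy.ndarray, Variable or NdArray) – Scatter indices.
  • shape (tuple or list) – Shape of the new output array.
  • out (Variable or NdArray) – Existing output array.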

Returns: Variable or NdArray of given shape.

nnabla.functions.pad(x, pad_width, mode='constant', constant_value=0, n_outputs=-1, outputs=None)[source]

Pad the input N-D array x over the number of dimensions given by half the length of the pad_width iterable, where every two values in pad_width determine the before and after pad size of an axis. The pad_width iterable must hold an even number of non-negative values which may cover all or fewer dimensions of the input variable x. If pad_width covers fewer dimensions then it applies to the innermost dimensions of x.

x = nn.Variable.from_numpy_array(np.ones((2, 3, 4)))
assert F.pad(x, (1, 1, 2, 2)).shape == (2, 5, 8)

Padding is performed according to the requested mode:

constant

Pads with a value given by the keyword argument constant_value.

x = nn.Variable.from_numpy_array(np.array([1, 2, 3, 4], dtype=int))
y = F.pad(x, (3, 3), 'constant', constant_value = -1)
y.forward()
assert np.all(y.d == np.array([-1, -1, -1, 1, 2, 3, 4, -1, -1, -1]))
reflect

Pads with the reflection of the vector mirrored on the first and last values of the vector along each axis.

x = nn.Variable.from_numpy_array(np.array([1, 2, 3, 4], dtype=int))
y = F.pad(x, (3, 3), 'reflect')
y.forward()
assert np.all(y.d == np.array([4, 3, 2, 1, 2, 3, 4, 3, 2, 1]))
Parameters:
  • x (Variable) – N-D array
  • pad_width (repeated int64) – Iterable of before and after pad values.
  • mode (string) – Padding mode string. [default=``'constant'``]
  • constant_value (float) – Fill value if mode is constant. [default=``0``]
Returns:

Padded N-D array with the same number of dimensions as the input.

x = nn.Variable((3, 3, 4, 2))  # a shape like (B, C, H, W)
# 1-D padding: last dim by 1 left and 2 on the right side
assert F.pad(x, (1, 2)).shape == (3, 3, 4, 5)
# 2-D padding: last dim by (1, 1) and 2nd to last by (2, 2)
assert F.pad(x, (2, 2, 1, 1)).shape == (3, 3, 8, 4)
# 3-D padding: dims C by (0, 1), H by (2, 1), and W by (3, 3)
assert F.pad(x, (0, 1, 2, 1, 3, 3)).shape == (3, 4, 7, 8)

Return type:

Variable

Note

All nnabla functions in nnabla.functions are decorated with the nnabla.function_bases.function_api decorator, which queries the current context and passes it into the first argument of the original function. The original function always takes a context as the first argument.
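The pad_width rules for pad can be summarised as shape arithmetic plus, for reflect mode, a mirroring step. Below is a rough plain-Python sketch of both; padded_shape and reflect_pad_1d are hypothetical helpers written for illustration and are not nnabla functions.

```python
def padded_shape(shape, pad_width):
    """Shape after padding the innermost len(pad_width) // 2 axes."""
    assert len(pad_width) % 2 == 0
    n_padded = len(pad_width) // 2
    assert n_padded <= len(shape)
    out = list(shape)
    first_axis = len(shape) - n_padded
    for k in range(n_padded):
        # Each pair is (before, after) for one axis, outermost pair first.
        before, after = pad_width[2 * k], pad_width[2 * k + 1]
        out[first_axis + k] += before + after
    return tuple(out)


def reflect_pad_1d(values, before, after):
    """Mirror on the first and last element without repeating the edge."""
    n = len(values)
    assert before < n and after < n
    left = [values[i] for i in range(before, 0, -1)]
    right = [values[n - 2 - i] for i in range(after)]
    return left + list(values) + right
```

With these helpers, padded_shape((3, 3, 4, 2), (0, 1, 2, 1, 3, 3)) gives (3, 4, 7, 8) and reflect_pad_1d([1, 2, 3, 4], 3, 3) gives the same sequence as the reflect example in the pad description.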

nnabla.functions.transpose(x, axes, n_outputs=-1, outputs=None)[source]

Transposes tensor dimensions.

Parameters:
  • x (Variable) – N-D array
  • axes (repeated int64) – Source axis indices for each axis.
Returns:

Transposed N-D array.

Return type:

Variable

Note

All nnabla functions in nnabla.functions are decorated with the nnabla.function_bases.function_api decorator, which queries the current context and passes it into the first argument of the original function. The original function always takes a context as the first argument.
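The phrase "source axis indices for each axis" in the transpose parameters can be read as: output axis i takes its size from input axis axes[i]. The sketch below encodes that reading for shapes only; it assumes the same convention as numpy.transpose, and transposed_shape is a hypothetical helper name.

```python
def transposed_shape(shape, axes):
    """Output shape, assuming output axis i comes from input axis axes[i]."""
    # axes must be a permutation of 0 .. len(shape) - 1.
    assert sorted(axes) == list(range(len(shape)))
    return tuple(shape[a] for a in axes)


nchw = (8, 3, 32, 64)
nhwc = transposed_shape(nchw, (0, 2, 3, 1))  # move channels to the last axis
```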

nnabla.functions.broadcast(x, shape, n_outputs=-1, outputs=None)[source]

Broadcasts an N-D array to the specified shape.

Parameters:
  • x (Variable) – N-D array
  • shape (tuple of int) – Target shape to broadcast to. Its size must match the size of x on every axis where the size of x is not 1.
Returns:

Broadcasted N-D array

Return type:

Variable

Note

All nnabla functions in nnabla.functions are decorated with the nnabla.function_bases.function_api decorator, which queries the current context and passes it into the first argument of the original function. The original function always takes a context as the first argument.
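The shape constraint on broadcast can be expressed as a small check. This is only a sketch of the rule as worded in the parameter description; it assumes that x and the target shape have the same number of dimensions, which the reference does not confirm, and can_broadcast is a hypothetical helper.

```python
def can_broadcast(src_shape, dst_shape):
    """True if every axis of src either matches dst or has size 1."""
    # Assumption: both shapes have the same number of dimensions.
    if len(src_shape) != len(dst_shape):
        return False
    return all(s == d or s == 1 for s, d in zip(src_shape, dst_shape))
```

For example, a (2, 1, 4) input can be broadcast to (2, 3, 4), whereas a (2, 3, 4) input cannot be broadcast to (2, 5, 4).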

nnabla.functions.broadcast_to(x, y, axis=None, n_outputs=-1, outputs=None)[source]

Warning

Support for this function is experimental; avoid relying on it.

Broadcasts an N-D array to the specified buffer.

Parameters:
  • x (Variable) – N-D array
  • y (Variable) – N-D array
  • axis (int) – Target axis to start broadcasting. If this is not set, broadcast will try to fit y to x starting from the last dimension [default=``-1``]
Returns:

Broadcasted N-D array

Return type:

Variable

Note

All nnabla functions in nnabla.functions are decorated with the nnabla.function_bases.function_api decorator, which queries the current context and passes it into the first argument of the original function. The original function always takes a context as the first argument.

nnabla.functions.tile(x, reps)[source]

Forward x repeated the number of times given by reps. If reps is a sequence, the output has dimension of d = max(len(reps), x.ndim) and either x is promoted to be d-dimensional by prepending new axes or reps is promoted to x.ndim by prepending 1’s.

Parameters:
  • x (Variable) – Input N-D array.
  • reps (int or sequence of int) – Repetitions of x along each axis.
Returns:

N-D array.

Return type:

Variable

>>> import numpy as np, nnabla as nn, nnabla.functions as F
>>> F.tile(nn.Variable([2, 3]), 3).shape    # reps is promoted to [1, 3]
(2, 9)
>>> F.tile(nn.Variable([3]), [2, 3]).shape  # x is promoted to shape (1, 3)
(2, 9)
>>> nn.set_auto_forward(True)
>>> x = nn.Variable.from_numpy_array(np.array([1, 2, 3]))
>>> print(F.tile(x, 3).d)
[1. 2. 3. 1. 2. 3. 1. 2. 3.]
>>> print(F.tile(x, [2, 3]).d)
[[1. 2. 3. 1. 2. 3. 1. 2. 3.]
 [1. 2. 3. 1. 2. 3. 1. 2. 3.]]
>>> x = nn.Variable.from_numpy_array(np.array([[1, 3], [2, 4]]))
>>> print(F.tile(x, 3).d)
[[1. 3. 1. 3. 1. 3.]
 [2. 4. 2. 4. 2. 4.]]
>>> print(F.tile(x, [2, 3]).d)
[[1. 3. 1. 3. 1. 3.]
 [2. 4. 2. 4. 2. 4.]
 [1. 3. 1. 3. 1. 3.]
 [2. 4. 2. 4. 2. 4.]]
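
The promotion rule for reps and x in tile can be written as shape arithmetic. The following plain-Python sketch computes only the output shape; tiled_shape is a hypothetical helper, not an nnabla function.

```python
def tiled_shape(shape, reps):
    """Output shape of tiling after promoting shape and reps to equal length."""
    reps = [reps] if isinstance(reps, int) else list(reps)
    d = max(len(reps), len(shape))
    # Prepend 1s to whichever of the two is shorter.
    shape = [1] * (d - len(shape)) + list(shape)
    reps = [1] * (d - len(reps)) + reps
    return tuple(s * r for s, r in zip(shape, reps))
```

Both tiled_shape((2, 3), 3) and tiled_shape((3,), [2, 3]) evaluate to (2, 9), matching the shapes in the tile examples.
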
nnabla.functions.flip(x, axes=None, n_outputs=-1, outputs=None)[source]

Reverses the order of elements of the specified dimension of an array.

Parameters:
  • x (Variable) – N-D array
  • axes (repeated int64) – The indices of the dimensions whose element order is reversed. Axis indices take on values 0, 1, 2, and so on from the left. For example, to flip 100 RGB images of 32 (W) by 24 (H) pixels, with shape (100,3,24,32), vertically and horizontally, specify (2,3). [default=``[len(x.shape) - 1]``]
Returns:

N-D array

Return type:

Variable

Note

All nnabla functions in nnabla.functions are decorated with the nnabla.function_bases.function_api decorator, which queries the current context and passes it into the first argument of the original function. The original function always takes a context as the first argument.

nnabla.functions.shift(x, shifts=None, border_mode='nearest', n_outputs=-1, outputs=None)[source]

Shifts the array elements by the specified amount.

Parameters:
  • x (Variable) – N-D array.
  • shifts (repeated int64) – The amount to shift elements. For example, to shift image data to the right by 2 pixels and up 3 pixels, specify (-3,2). [default=``(0,) * len(x.shape)``]
  • border_mode (string) – Specify how to process the ends of arrays whose values will be undetermined as a result of shifting. nearest: The data at the ends of the original array is copied and used. reflect: Original data reflected at the ends of the original array is used. [default=``'nearest'``]
Returns:

N-D array.

Return type:

Variable

Note

All nnabla functions in nnabla.functions are decorated with the nnabla.function_bases.function_api decorator, which queries the current context and passes it into the first argument of the original function. The original function always takes a context as the first argument.

nnabla.functions.sort(x, axis=-1, reverse=False, with_index=False, only_index=False)[source]

Sorts the elements of x along a given axis in ascending order by value. A negative axis counts from the last dimension of x, so the default of -1 sorts along the last dimension. If reverse is True, then the elements are sorted in descending order.

If with_index is True, result is a tuple (sorted, indices) or only indices if only_index is True. Setting only_index to True implies that with_index is also True.

import numpy as np
import nnabla as nn
import nnabla.functions as F

nn.set_auto_forward(True)
x = nn.Variable.from_numpy_array(np.random.rand(2, 3, 4))

sorted = F.sort(x)
assert np.allclose(sorted.d, np.sort(x.d))

sorted, indices = F.sort(x, with_index=True)
assert np.allclose(sorted.d, np.sort(x.d))
assert np.all(indices.d == np.argsort(x.d))

indices = F.sort(x, only_index=True)
assert np.all(indices.d == np.argsort(x.d))
Parameters:
  • x (Variable) – N-D array
  • axis (int) – Axis along which to sort.
  • reverse (bool) – Sort in descending order.
  • with_index (bool) – Return sorted values and index.
  • only_index (bool) – Return only the sort index.

Returns: Variable sorted, or Variable indices, or a tuple (Variable sorted, Variable indices)
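
The three return forms of sort (values only, values with indices, indices only) can be mimicked for a single axis in plain Python. This is a sketch of the flag logic only; sort_1d is a hypothetical helper operating on a list, not on a Variable.

```python
def sort_1d(values, reverse=False, with_index=False, only_index=False):
    """Return sorted values and/or the indices that sort them."""
    indices = sorted(range(len(values)), key=lambda i: values[i], reverse=reverse)
    ordered = [values[i] for i in indices]
    if only_index:          # only_index implies with_index
        return indices
    if with_index:
        return ordered, indices
    return ordered
```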

nnabla.functions.reshape(x, shape, inplace=True, n_outputs=-1, outputs=None)[source]

Reshapes the input variable in-place. It does not create a copy of the variable. The output variable (y) has a new shape but points to the same data as the input variable (x). This means that if the data in the output variable (y) is modified, the data in the input variable (x) also gets modified since the reshape was done in-place.

Note

This function has the same behavior as the nnabla.Variable.reshape() method.

Parameters:
  • x (Variable) – N-D array.
  • shape (tuple of int) – Dimensions for each axis. -1 can be specified only in one shape dimension. The value is calculated from the size of the array and remaining dimensions.
  • inplace (bool) – The output array is shared with the input array if True. [default=``True``]
Returns:

Reshaped N-D array

Return type:

Variable

Note

All nnabla functions in nnabla.functions are decorated with the nnabla.function_bases.function_api decorator, which queries the current context and passes it into the first argument of the original function. The original function always takes a context as the first argument.
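How a single -1 entry in the reshape shape argument is resolved from the array size and the remaining dimensions can be sketched as follows. resolve_shape is a hypothetical helper for illustration; it is not how nnabla necessarily implements the calculation.

```python
from math import prod


def resolve_shape(total_size, shape):
    """Replace a single -1 in shape so that the sizes multiply to total_size."""
    unknown = [i for i, s in enumerate(shape) if s == -1]
    assert len(unknown) <= 1, "-1 may appear in only one dimension"
    if not unknown:
        assert prod(shape) == total_size
        return tuple(shape)
    known = prod(s for s in shape if s != -1)
    assert known > 0 and total_size % known == 0
    resolved = list(shape)
    resolved[unknown[0]] = total_size // known
    return tuple(resolved)
```

For an array of 24 elements, resolve_shape(24, (2, -1, 4)) yields (2, 3, 4).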

nnabla.functions.one_hot(x, shape, n_outputs=-1, outputs=None)[source]

Creates one-hot vectors from input indices.

Example:

import nnabla as nn
import nnabla.functions as F
import numpy as np

labels = nn.Variable.from_numpy_array(np.array([[9], [4], [5], [1], [0]]))
print(labels.shape)  # (5, 1)

num_class = 10

y_train = F.one_hot(labels, shape=(num_class, ))
y_train.forward()

print(y_train.shape)  # (5, 10)
print(y_train.d)

# [[0. 0. 0. 0. 0. 0. 0. 0. 0. 1.]
#  [0. 0. 0. 0. 1. 0. 0. 0. 0. 0.]
#  [0. 0. 0. 0. 0. 1. 0. 0. 0. 0.]
#  [0. 1. 0. 0. 0. 0. 0. 0. 0. 0.]
#  [1. 0. 0. 0. 0. 0. 0. 0. 0. 0.]]

# Labels with more than one index per sample are also supported.

labels = nn.Variable.from_numpy_array(np.array([[1, 7], [4, 7], [8, 6], [5, 0], [2, 6]]))
print(labels.shape)  # (5, 2)

num_class_1, num_class_2  = 10, 8

y_train = F.one_hot(labels, shape=(num_class_1, num_class_2))
y_train.forward()

print(y_train.shape)  # (5, 10, 8)
print(y_train.d)

# [[[0. 0. 0. 0. 0. 0. 0. 0.]          [[0. 0. 0. 0. 0. 0. 0. 0.]
#   [0. 0. 0. 0. 0. 0. 0. 1.]           [0. 0. 0. 0. 0. 0. 0. 0.]
#   [0. 0. 0. 0. 0. 0. 0. 0.]           [0. 0. 0. 0. 0. 0. 1. 0.]
#   [0. 0. 0. 0. 0. 0. 0. 0.]           [0. 0. 0. 0. 0. 0. 0. 0.]
#   [0. 0. 0. 0. 0. 0. 0. 0.]           [0. 0. 0. 0. 0. 0. 0. 0.]
#   [0. 0. 0. 0. 0. 0. 0. 0.]    ...    [0. 0. 0. 0. 0. 0. 0. 0.]
#   [0. 0. 0. 0. 0. 0. 0. 0.]           [0. 0. 0. 0. 0. 0. 0. 0.]
#   [0. 0. 0. 0. 0. 0. 0. 0.]           [0. 0. 0. 0. 0. 0. 0. 0.]
#   [0. 0. 0. 0. 0. 0. 0. 0.]           [0. 0. 0. 0. 0. 0. 0. 0.]
#   [0. 0. 0. 0. 0. 0. 0. 0.]],         [0. 0. 0. 0. 0. 0. 0. 0.]]]
Parameters:
  • x (Variable) – N-D array of label indices.
  • shape (tuple of int) – Number of classes. Note that it must be exactly the same as the number of classes included in label data. Passing incorrect numbers might cause an unexpected error and currently this function doesn’t check if the input is valid or not. Also, when nd-labels are given, dimensions must match. See the example above.
Returns:

N-D array one-hot vector/tensor.

Return type:

Variable

Note

All nnabla functions in nnabla.functions are decorated with the nnabla.function_bases.function_api decorator, which queries the current context and passes it into the first argument of the original function. The original function always takes a context as the first argument.
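For the single-index case, the one_hot encoding amounts to setting one position per row. The plain-Python sketch below illustrates this; one_hot_rows is a hypothetical helper, and unlike the function described above it asserts that every label is within range.

```python
def one_hot_rows(labels, num_class):
    """Build one row per label with a single 1.0 at the label position."""
    rows = []
    for label in labels:
        # Extra safety check added in this sketch only.
        assert 0 <= label < num_class
        row = [0.0] * num_class
        row[label] = 1.0
        rows.append(row)
    return rows


encoded = one_hot_rows([9, 4, 5, 1, 0], 10)
```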

nnabla.functions.assign(dst, src, n_outputs=-1, outputs=None)[source]

Assign source array to destination array just like tf.assign. This is useful to synchronize or manually update parameters.

dst = nn.Variable((2, 3, 4))
src = nn.Variable((2, 3, 4))
assign = F.assign(dst, src)

assign.forward()
assert np.allclose(dst.d, src.d) # dst and src have identical values.
assert np.allclose(assign.d, dst.d) # returned Variable is also identical to dst.

Unlike TensorFlow, the returned Variable has a backward path to dst:

\[g_{dst} = g_{y}\]
Parameters:
  • dst (Variable) – A destination N-D array
  • src (Variable) – A source N-D array
Returns:

An assigned array

Return type:

Variable

Note

All nnabla functions in nnabla.functions are decorated with the nnabla.function_bases.function_api decorator, which queries the current context and passes it into the first argument of the original function. The original function always takes a context as the first argument.

Stochasticity

nnabla.functions.rand(low=0, high=1, shape=[], seed=-1, n_outputs=-1, outputs=None)[source]

Samples numbers from a uniform distribution \(x \sim U(low, high)\) given lowest value \(low\), upper bound \(high\), and shape of the returned Variable.

Parameters:
  • low (float) – \(low\) in definition. [default=``0``]
  • high (float) – \(high\) in definition. [default=``1``]
  • shape (tuple of int) – Shape of returned variable. [default=``[]``]
  • seed (int) – Random seed. When -1, seed is sampled from global random number generator. [default=``-1``]
Returns:

Variable with the shape specified in the argument.

Return type:

Variable

Note

All nnabla functions in nnabla.functions are decorated with the nnabla.function_bases.function_api decorator, which queries the current context and passes it into the first argument of the original function. The original function always takes a context as the first argument.

nnabla.functions.randint(low=0, high=1, shape=[], seed=-1, n_outputs=-1, outputs=None)[source]

Samples integer numbers from a uniform distribution \(x \sim U(low, high)\) given lowest value \(low\), upper bound \(high\), and shape of the returned Variable.

Parameters:
  • low (int) – \(low\) in definition. [default=``0``]
  • high (int) – \(high\) in definition. [default=``1``]
  • shape (tuple of int) – Shape of returned variable. [default=``[]``]
  • seed (int) – Random seed. When -1, seed is sampled from global random number generator. [default=``-1``]
Returns:

Variable with the shape specified in the argument. The dtype is int32.

Return type:

Variable

Note

All nnabla functions in nnabla.functions are decorated with the nnabla.function_bases.function_api decorator, which queries the current context and passes it into the first argument of the original function. The original function always takes a context as the first argument.

nnabla.functions.randn(mu=0, sigma=1, shape=[], seed=-1, n_outputs=-1, outputs=None)[source]

Samples numbers from a normal distribution \(x \sim N(\mu, \sigma)\) given mean \(\mu\), standard deviation \(\sigma\), and shape of the returned Variable.

Parameters:
  • mu (float) – \(\mu\) in definition. [default=``0``]
  • sigma (float) – \(\sigma\) in definition. [default=``1``]
  • shape (tuple of int) – Shape of returned variable. [default=``[]``]
  • seed (int) – Random seed. When -1, seed is sampled from global random number generator. [default=``-1``]
Returns:

Variable with the shape specified in the argument.

Return type:

Variable

Note

All nnabla functions in nnabla.functions are decorated with the nnabla.function_bases.function_api decorator, which queries the current context and passes it into the first argument of the original function. The original function always takes a context as the first argument.

nnabla.functions.dropout(x, p=0.5, seed=-1, n_outputs=-1, outputs=None)[source]

Dropout. Samples a number \(u\) from a uniform distribution in \([0, 1]\) , and ignores the input if \(u \leq p\).

\[\begin{split}y = \left\{ \begin{array}{ll} \frac{x}{1 - p} & (u > p) \\ 0 & ({\rm otherwise}) \end{array} \right.\end{split}\]

Note

Dropout is usually applied only during training, as below (except for Bayesian dropout).

h = PF.affine(x, num_hidden)
if train:
    h = F.dropout(h, 0.5)
Parameters:
  • x (Variable) – N-D array
  • p (float) – \(p\) in definition. [default=``0.5``]
  • seed (int) – Random seed. When -1, seed is sampled from global random number generator. [default=``-1``]
Returns:

N-D array with the same shape as x

Return type:

Variable

Note

All nnabla functions in nnabla.functions are decorated with the nnabla.function_bases.function_api decorator, which queries the current context and passes it into the first argument of the original function. The original function always takes a context as the first argument.
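The dropout formula can be traced element by element with the standard library. This is a sketch of the equation only, using Python's random module instead of nnabla's generator; dropout_1d and its fixed default seed are illustrative assumptions.

```python
import random


def dropout_1d(values, p=0.5, seed=0):
    """Keep each value with probability 1 - p and rescale it by 1 / (1 - p)."""
    rng = random.Random(seed)
    scale = 1.0 / (1.0 - p)
    out = []
    for v in values:
        u = rng.random()  # uniform sample in [0, 1)
        out.append(v * scale if u > p else 0.0)
    return out
```

Because kept values are scaled by 1 / (1 - p), the expected value of each output equals its input, which is why no rescaling is needed when dropout is switched off at inference time.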

nnabla.functions.top_k_data(x, k, abs=False, reduce=True, base_axis=1, n_outputs=-1, outputs=None)[source]

Select the k largest values from each sample in x to propagate unmodified and set all other values to 0. If abs is True, the k largest values are selected by magnitude. If reduce is True (the default), all feature dimensions are reduced to a single dimension of size k that propagates only the k largest values. Otherwise, if reduce is False, input and output dimensions are identical. Dimensions before base_axis are treated as number of sample dimensions and k values get selected from all elements of a sample (dimensions from base_axis) regardless of shape.

>>> import nnabla as nn, nnabla.functions as F
>>> x = nn.Variable((4, 5, 6))
>>> F.top_k_data(x, 3, reduce=False).shape
(4, 5, 6)
>>> F.top_k_data(x, 3, reduce=True).shape
(4, 3)
>>> F.top_k_data(x, 3, reduce=True, base_axis=2).shape
(4, 5, 3)
Parameters:
  • x (Variable) – N-D array
  • k (int) – Number of largest data values to propagate.
  • abs (bool) – Determine largest data values by magnitude. [default=``False``]
  • reduce (bool) – Reduce feature size to one dimension of size k. [default=``True``]
  • base_axis (int) – First dimension of the sample shape. [default=``1``]
Returns:

N-D array.

Return type:

Variable

Note

All nnabla functions in nnabla.functions are decorated with the nnabla.function_bases.function_api decorator, which queries the current context and passes it into the first argument of the original function. The original function always takes a context as the first argument.
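The effect of the reduce, abs and base_axis options of top_k_data can be sketched in plain Python. Both helpers below are hypothetical; top_k_sample works on one flattened sample, and the order of the reduced values (largest first) is an assumption not stated in the reference.

```python
def top_k_shape(shape, k, reduce=True, base_axis=1):
    """Output shape: sample dimensions are kept, features collapse to k."""
    if not reduce:
        return tuple(shape)
    return tuple(shape[:base_axis]) + (k,)


def top_k_sample(values, k, use_abs=False, reduce=True):
    """Select the k largest values of one flattened sample."""
    key = (lambda i: abs(values[i])) if use_abs else (lambda i: values[i])
    top = sorted(range(len(values)), key=key, reverse=True)[:k]
    if reduce:
        return [values[i] for i in top]  # assumed order: largest first
    keep = set(top)
    return [v if i in keep else 0.0 for i, v in enumerate(values)]
```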

nnabla.functions.top_k_grad(x, k, abs=False, base_axis=1, n_outputs=-1, outputs=None)[source]

Select the k largest gradients for each sample in x to back-propagate unmodified and set all other gradients to 0. If abs is True, the k largest gradients are selected by magnitude. Dimensions before base_axis are treated as number of sample dimensions and k gradients get selected from all gradients of a sample (dimensions from base_axis) regardless of shape.

Parameters:
  • x (Variable) – N-D array
  • k (int) – Number of largest gradients to propagate.
  • abs (bool) – Determine largest gradients by magnitude. [default=``False``]
  • base_axis (int) – First dimension of the sample shape. [default=``1``]
Returns:

N-D array with same shape and data as x.

Return type:

Variable

Note

All nnabla functions in nnabla.functions are decorated with the nnabla.function_bases.function_api decorator, which queries the current context and passes it into the first argument of the original function. The original function always takes a context as the first argument.

nnabla.functions.random_choice(x, w, shape=[], replace=True, seed=-1, n_outputs=-1, outputs=None)[source]

Generate random samples from population x with selection probabilities determined by the relative weights w. The number of samples to draw is given by the product of the dimensions of shape, and the samples are returned with the given shape. By default, samples are drawn with replacement, i.e. selection of a specific population member is solely determined by its associated weight. Sampling without replacement, where any population member may be drawn only once, is used if replace is set to False.

For both x and w, the innermost dimension holds an individual population and its weights. Samples are drawn from each population and returned with the requested shape appended to the outer dimensions of the input.

import nnabla as nn
import nnabla.functions as F
import numpy as np
nn.set_auto_forward(True)

# x holds two populations
x = nn.Variable.from_numpy_array(np.array([[11, 22, 33], [110, 220, 330]]))
# w holds the weights for each population
w = nn.Variable.from_numpy_array(np.array([[10, 20, 70], [70, 20, 10]]))

# draw one sample from each population
y = F.random_choice(x, w)  # y.shape => (2, 1)

# draw 12 samples with shape (3, 4) from each population
y = F.random_choice(x, w, shape=(3, 4))  # y.shape => (2, 3, 4)

Note that weights must not be less than zero and for each population the sum of weights must be greater than zero. Additionally, sampling without replacement requires that the number of non-zero weights is not less than the number of samples to be drawn. These conditions are verified in “cpu” computation context but not when using “cuda” or “cudnn” acceleration (this would require additional device synchronization steps penalizing performance).

Random sampling from an implicit array of index values (like categorical or multinomial) can be realized with input x constructed as indices.

w = nn.Variable.from_numpy_array(np.array([1, 2, 3, 2, 1]))
y = F.random_choice(F.arange(0, 5), w)
Parameters:
  • x (Variable) – N-D array from which a random sample is generated.
  • w (Variable) – N-D array of associated weights of elements in x.
  • shape (tuple of int) – Number and shape of generated samples. [default=``[]``]
  • replace (bool) – Whether sampling is with or without replacement. [default=``True``]
  • seed (int) – Random seed. [default=``-1``]
Returns:

N-D array

Return type:

Variable

Note

All nnabla functions in nnabla.functions are decorated with the nnabla.function_bases.function_api decorator, which queries the current context and passes it into the first argument of the original function. The original function always takes a context as the first argument.
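The difference between sampling with and without replacement in random_choice, and the condition on non-zero weights, can be illustrated with the standard library. This sketch uses random.choices rather than nnabla; choice_without_replacement is a hypothetical helper for a single population.

```python
import random

rng = random.Random(0)
population = [11, 22, 33]
weights = [10, 20, 70]

# With replacement: every draw depends only on the relative weights.
with_replacement = rng.choices(population, weights=weights, k=12)


def choice_without_replacement(population, weights, k, rng):
    """Draw k distinct members, removing each one after it is picked."""
    items, w = list(population), list(weights)
    # There must be at least k members with non-zero weight.
    assert sum(1 for x in w if x > 0) >= k
    picked = []
    for _ in range(k):
        i = rng.choices(range(len(items)), weights=w, k=1)[0]
        picked.append(items.pop(i))
        w.pop(i)
    return picked
```

Drawing three samples without replacement from this population always returns all three members in some order, whereas the twelve samples drawn with replacement may repeat values.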

nnabla.functions.random_crop(x, shape=None, base_axis=1, seed=-1, n_outputs=-1, outputs=None)[source]

RandomCrop randomly extracts a portion of an array.

Parameters:
  • x (Variable) – N-D array
  • shape (tuple of int) – The data size to extract. For example, to randomly extract a (3,48,48) portion from a (3,64,64) image, specify (3,48,48). [default=``x.shape``]
  • base_axis (int) – No Description [default=``1``]
  • seed (int) – Random seed. When -1, seed is sampled from global random number generator. [default=``-1``]
Returns:

N-D array

Return type:

Variable

Note

All nnabla functions in nnabla.functions are decorated with the nnabla.function_bases.function_api decorator, which queries the current context and passes it into the first argument of the original function. The original function always takes a context as the first argument.

nnabla.functions.random_flip(x, axes=None, base_axis=1, seed=-1, n_outputs=-1, outputs=None)[source]

Reverses the order of elements of the specified dimension of an array at 50% probability.

Parameters:
  • x (Variable) – N-D array
  • axes (repeated int64) – The indices of the axes whose element order is reversed. Axis indices take on values 0, 1, 2, and so on from the left. For example, to flip 100 RGB images of 32 (W) by 24 (H) pixels, with shape (100,3,24,32), vertically and horizontally at random, specify (2,3). [default=``[len(x.shape) - 1]``]
  • base_axis (int) – No Description [default=``1``]
  • seed (int) – Random seed. When -1, seed is sampled from global random number generator. [default=``-1``]
Returns:

N-D array

Return type:

Variable

Note

All nnabla functions in nnabla.functions are decorated with the nnabla.function_bases.function_api decorator, which queries the current context and passes it into the first argument of the original function. The original function always takes a context as the first argument.

nnabla.functions.random_shift(x, shifts=None, border_mode='nearest', base_axis=1, seed=-1, n_outputs=-1, outputs=None)[source]

Randomly shifts the array elements within the specified range.

Parameters:
  • x (Variable) – N-D array.
  • shifts (repeated int64) – Max absolute amount to shift elements. For example, to shift image data horizontally by \(\pm 2\) pixels and vertically by \(\pm 3\) pixels, specify (3,2). [default=``(0,) * len(x.shape)``]
  • border_mode (string) – Specify how to process the ends of arrays whose values will be undetermined as a result of shifting. nearest: The data at the ends of the original array is copied and used. reflect: Original data reflected at the ends of the original array is used. [default=``’nearest’``]
  • base_axis (int) – No Description [default=``1``]
  • seed (int) – Random seed. When -1, seed is sampled from global random number generator. [default=``-1``]
Returns:

N-D array.

Return type:

Variable

Note

All nnabla functions in nnabla.functions are decorated with the nnabla.function_bases.function_api decorator, which queries the current context and passes it into the first argument of the original function. The original function always takes a context as the first argument.

nnabla.functions.image_augmentation(x, shape=None, pad=(0, 0), min_scale=1.0, max_scale=1.0, angle=0.0, aspect_ratio=1.0, distortion=0.0, flip_lr=False, flip_ud=False, brightness=0.0, brightness_each=False, contrast=1.0, contrast_center=0.0, contrast_each=False, noise=0.0, seed=-1, n_outputs=-1, outputs=None)[source]

ImageAugmentation randomly alters the input image.

Parameters:
  • x (Variable) – N-D array.
  • shape (tuple of int) – The output image data size. [default=``x.shape``]
  • pad (tuple of int) – Border padding values for each spatial axis. Padding will be added to both sides of the dimension. [default=``(0, 0)``]
  • min_scale (float) – The minimum scale ratio when randomly scaling the image. For example, to scale down to 0.8 times the size of the original image, specify “0.8”. To not apply random scaling, set both min_scale and max_scale to “1.0”. [default=``1.0``]
  • max_scale (float) – The maximum scale ratio when randomly scaling the image. For example, to scale up to 2 times the size of the original image, specify “2.0”. [default=``1.0``]
  • angle (float) – The rotation angle range in radians when randomly rotating the image. The image is randomly rotated in the -Angle to +Angle range. For example, to rotate in a +-15 degree range, specify “0.26” (15 degrees/360 degrees * 2PI). To not apply random rotation, specify “0.0”. [default=``0.0``]
  • aspect_ratio (float) – The aspect ratio range when randomly deforming the image. For example, to deform aspect ratio of image from 1:1.3 to 1.3:1, specify “1.3”. To not apply random deforming, specify “1.0”. [default=``1.0``]
  • distortion (float) – The distortion range when randomly distorting the image. To not apply distortion, specify “0.0”. [default=``0.0``]
  • flip_lr (bool) – Whether to randomly flip the image horizontally at 50% probability. [default=``False``]
  • flip_ud (bool) – Whether to randomly flip the image vertically at 50% probability. [default=``False``]
  • brightness (float) – The absolute range of values to randomly add to the brightness. A random value in the -Brightness to +Brightness range is added to the brightness. For example, to vary the brightness in the -0.05 to +0.05 range, specify “0.05”. To not apply random addition to brightness, specify “0.0”. [default=``0.0``]
  • brightness_each (bool) – Whether to apply the random addition to brightness (as specified by brightness) to each color channel. True: brightness is added based on a different random number for each channel. False: brightness is added based on a random number common to all channels. [default=``False``]
  • contrast (float) – The range in which to randomly vary the image contrast. The contrast is varied in the 1/Contrast times to Contrast times range. The output brightness is equal to (input - contrast_center) * contrast + contrast_center. For example, to vary the contrast in the 0.91 times to 1.1 times range, specify “1.1”. To not apply random contrast variation, specify “1.0”. [default=``1.0``]
  • contrast_center (float) – Intensity center used for applying contrast. [default=``0.0``]
  • contrast_each (bool) – Whether to apply the random contrast variation (as specified by contrast) to each color channel. True: contrast is varied based on a different random number for each channel. False: contrast is varied based on a random number common to all channels. [default=``False``]
  • noise (float) – Sigma of normal random number to be added. [default=``0.0``]
  • seed (int) – Random seed. When -1, seed is sampled from global random number generator. [default=``-1``]
Returns:

N-D array.

Return type:

Variable

Note

All nnabla functions in nnabla.functions are decorated with the nnabla.function_bases.function_api decorator, which queries the current context and passes it into the first argument of the original function. The original function always takes a context as the first argument.
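
The angle and contrast descriptions above each contain a small calculation. The following is a minimal sketch of that arithmetic in plain Python; the helper names are hypothetical and are not part of NNabla.

```python
import math

def angle_range_radians(degrees):
    # Convert a +/- rotation range given in degrees into the radians
    # expected by the `angle` parameter (degrees / 360 * 2 * pi).
    return degrees / 360.0 * 2.0 * math.pi

def apply_contrast(value, contrast, contrast_center=0.0):
    # Contrast formula quoted in the `contrast` parameter description.
    return (value - contrast_center) * contrast + contrast_center

print(round(angle_range_radians(15), 2))  # 0.26
print(round(1 / 1.1, 2))                  # 0.91, the lower end of the range for contrast=1.1
print(apply_contrast(0.75, 1.1, contrast_center=0.5))
```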

Loss Functions

nnabla.functions.sigmoid_cross_entropy(x, target, n_outputs=-1, outputs=None)[source]

Element-wise cross entropy between x and the target variables, passed to a sigmoid function.

\[y_i = - \left(x^{(1)}_i \ln \left(\sigma \left(x^{(0)}_i \right)\right) + \ \left(1 - x^{(1)}_i\right) \ln \left(1 - \sigma \left(x^{(0)}_i \ \right)\right)\right)\]

where \(\sigma(s)=\frac{1}{1+\exp(-s)}\).

Note

SigmoidCrossEntropy is equivalent to Sigmoid+BinaryCrossEntropy, but computing them at once has the effect of reducing computational error.

Parameters:
  • x (Variable) – N-D array. Typically indicates a score. The value lies in \([-\infty, \infty]\) [parameter]
  • target (Variable) – N-D array of labels. Only 0 or 1 value is allowed. [parameter]
Returns:

N-D array of element-wise losses.

Return type:

Variable

Note

All nnabla functions in nnabla.functions are decorated with the nnabla.function_bases.function_api decorator, which queries the current context and passes it into the first argument of the original function. The original function always takes a context as the first argument.
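
To illustrate why computing Sigmoid and BinaryCrossEntropy at once reduces computational error, here is a minimal scalar sketch in plain Python. The fused rearrangement shown is a standard algebraic identity and is an assumption of this sketch; it is not necessarily how NNabla implements the function, and all helper names are hypothetical.

```python
import math

def sigmoid(s):
    return 1.0 / (1.0 + math.exp(-s))

def binary_cross_entropy(p, t):
    # Cross entropy between a probability p in (0, 1) and a label t.
    return -(t * math.log(p) + (1.0 - t) * math.log(1.0 - p))

def sigmoid_ce_two_step(x, t):
    # Sigmoid followed by BinaryCrossEntropy, exactly as in the formula.
    return binary_cross_entropy(sigmoid(x), t)

def sigmoid_ce_fused(x, t):
    # Algebraically identical rearrangement that never evaluates log(0).
    return max(x, 0.0) - x * t + math.log1p(math.exp(-abs(x)))

# Both versions agree for moderate scores.
for x, t in [(-2.0, 0.0), (0.5, 1.0), (3.0, 0.0)]:
    assert abs(sigmoid_ce_two_step(x, t) - sigmoid_ce_fused(x, t)) < 1e-12

# For a large score the two-step version breaks down: sigmoid(40.0) rounds
# to exactly 1.0, so log(1 - p) becomes log(0) and math.log raises ValueError.
print(sigmoid_ce_fused(40.0, 0.0))
```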

nnabla.functions.binary_cross_entropy(x, target, n_outputs=-1, outputs=None)[source]

Element-wise cross entropy between x and the target variables.

\[y_i = - \left(x^{(1)}_i * \ln \left(x^{(0)}_i\right) + \left(1 - \ x^{(1)}_i\right) * \ln \left(1 - x^{(0)}_i\right)\right).\]
Parameters:
  • x (Variable) – N-D array of probabilities. Values are expected to lie in \((0, 1)\), since the loss takes \(\ln(x)\) and \(\ln(1 - x)\).
  • target (Variable) – N-D array of labels. Usually set to 0 or 1, but, unlike SigmoidCrossEntropy, probabilities (0 to 1) are also accepted, and backpropagation can still be performed.
Returns:

N-D array of element-wise losses.

Return type:

Variable

Note

All nnabla functions in nnabla.functions are decorated with the nnabla.function_bases.function_api decorator, which queries the current context and passes it into the first argument of the original function. The original function always takes a context as the first argument.

nnabla.functions.softmax_cross_entropy(x, target, axis=None, n_outputs=-1, outputs=None)[source]

Element-wise cross entropy between x and the target labels, where the labels are given as category indices and x is normalized with Softmax.

\[y_{j} = -\ln \left(\frac{\exp(x_{j,t_j})}{\sum_{i'} \exp(x_{j,i'})}\right)\]

along the dimension specified by axis (\(i\) indexes the axis along which normalization is performed).

Note

SoftmaxCrossEntropy is equivalent to Softmax+CategoricalCrossEntropy, but computing them at once has the effect of reducing computational error.

Parameters:
  • x (Variable) – N-D array. Typically indicates a score. \((D_1 \times ... \times D_i \times ... \times D_N)\) [parameter]
  • target (Variable) – N-D array of labels. \((D_1 \times ... \times 1 \times ... \times D_N)\) [parameter]
  • axis (int) – Axis along which normalization is taken. [default=``len(x.shape) - 1``]
Returns:

N-D array of element-wise losses. \((D_1 \times ... \times 1 \times ... \times D_N)\)

Return type:

Variable

Note

All nnabla functions in nnabla.functions are decorated with the nnabla.function_bases.function_api decorator, which queries the current context and passes it into the first argument of the original function. The original function always takes a context as the first argument.
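
The relation between SoftmaxCrossEntropy and Softmax followed by CategoricalCrossEntropy can be sketched for a single sample in plain Python. This is an illustration under the assumption that a log-sum-exp formulation is used for stability; it does not claim to mirror NNabla's implementation, and the helper names are hypothetical.

```python
import math

def log_softmax(scores):
    # Subtracting the maximum keeps exp() from overflowing.
    m = max(scores)
    lse = m + math.log(sum(math.exp(s - m) for s in scores))
    return [s - lse for s in scores]

def softmax_cross_entropy(scores, label):
    # y = -ln(exp(x_t) / sum_i exp(x_i)), computed in one step.
    return -log_softmax(scores)[label]

def categorical_cross_entropy(probs, label):
    # y = -ln(x_t), where probs is already normalized.
    return -math.log(probs[label])

scores = [2.0, 1.0, 0.1]
probs = [math.exp(v) for v in log_softmax(scores)]

fused = softmax_cross_entropy(scores, 0)
two_step = categorical_cross_entropy(probs, 0)
print(abs(fused - two_step) < 1e-12)  # True

# The fused form stays finite even when exp(1000.0) would overflow.
print(softmax_cross_entropy([1000.0, 0.0], 1))  # 1000.0
```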

nnabla.functions.categorical_cross_entropy(x, target, axis=None, n_outputs=-1, outputs=None)[source]

Element-wise cross entropy between x and the target t where targets are given by a category index.

\[y_{j} = -\ln \left( x_{j, t_j} \right)\]

along the dimension specified by axis (\(i\) indexes the axis along which normalization is performed).

Parameters:
  • x (Variable) – N-D array. Typically indicates a score. \((D_1 \times ... \times D_i \times ... \times D_N)\) [parameter]
  • target (Variable) – N-D array of labels. \((D_1 \times ... \times 1 \times ... \times D_N)\) [parameter]
  • axis (int) – Axis along which normalization is taken. [default=``len(x.shape) - 1``]
Returns:

N-D array of element-wise losses. \((D_1 \times ... \times 1 \times ... \times D_N)\)

Return type:

Variable

Note

All nnabla functions in nnabla.functions are decorated with the nnabla.function_bases.function_api decorator, which queries the current context and passes it into the first argument of the original function. The original function always takes a context as the first argument.

nnabla.functions.squared_error(x0, x1, n_outputs=-1, outputs=None)[source]

Element-wise squared error

\[y_i = \left(x^{(0)}_i - x^{(1)}_i\right)^2.\]
Parameters:
  • x0 (Variable) – N-D array.
  • x1 (Variable) – N-D array.
Returns:

N-D array.

Return type:

Variable

Note

All nnabla functions in nnabla.functions are decorated with the nnabla.function_bases.function_api decorator, which queries the current context and passes it into the first argument of the original function. The original function always takes a context as the first argument.

nnabla.functions.absolute_error(x0, x1, n_outputs=-1, outputs=None)[source]

Element-wise absolute error

\[y_i = | x^{(0)}_i - x^{(1)}_i |.\]
Parameters:
  • x0 (Variable) – N-D array.
  • x1 (Variable) – N-D array.
Returns:

N-D array.

Return type:

Variable

Note

All nnabla functions in nnabla.functions are decorated with the nnabla.function_bases.function_api decorator, which queries the current context and passes it into the first argument of the original function. The original function always takes a context as the first argument.

nnabla.functions.huber_loss(x0, x1, delta=1.0, n_outputs=-1, outputs=None)[source]

Element-wise Huber loss

\[\begin{split}y_i= \left\{ \begin{array}{ll} d^2 & (|d| < \delta)\\ \delta (2 |d| - \delta) & ({\rm otherwise}) \end{array} \right.\end{split}\]

where \(d = x^{(0)}_i - x^{(1)}_i\)

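Parameters:
  • x0 (Variable) – N-D array.
  • x1 (Variable) – N-D array.
  • delta (float) – Threshold \(\delta\) in the formula above. [default=``1.0``]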
Returns:

N-D array of element-wise losses.

Return type:

Variable

Note

All nnabla functions in nnabla.functions are decorated with the nnabla.function_bases.function_api decorator, which queries the current context and passes it into the first argument of the original function. The original function always takes a context as the first argument.
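
A scalar sketch of the piecewise formula above, in plain Python. Note that, as written in this entry, the quadratic branch is \(d^2\) rather than the \(d^2 / 2\) used in some other definitions of the Huber loss. The function below only mirrors the formula and is not NNabla code.

```python
def huber_loss(x0, x1, delta=1.0):
    d = x0 - x1
    if abs(d) < delta:
        return d * d                        # quadratic near zero
    return delta * (2.0 * abs(d) - delta)   # linear in the tails

print(huber_loss(0.5, 0.0))  # 0.25
print(huber_loss(3.0, 0.0))  # 5.0
print(huber_loss(1.0, 0.0))  # 1.0, both branches meet at |d| = delta
```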

nnabla.functions.epsilon_insensitive_loss(x0, x1, epsilon, n_outputs=-1, outputs=None)[source]

Element-wise Epsilon Insensitive Loss

\[\begin{split}y_i= \left\{ \begin{array}{ll} | x^{(0)}_i - x^{(1)}_i | - \epsilon & if \ \ | x^{(0)}_i - x^{(1)}_i | > \epsilon \\ 0 & otherwise \end{array} \right.\end{split}\]
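Parameters:
  • x0 (Variable) – N-D array.
  • x1 (Variable) – N-D array.
  • epsilon (float) – Insensitivity threshold \(\epsilon\) in the formula above.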
Returns:

N-D array of element-wise losses.

Return type:

Variable

Note

All nnabla functions in nnabla.functions are decorated with the nnabla.function_bases.function_api decorator, which queries the current context and passes it into the first argument of the original function. The original function always takes a context as the first argument.
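
The formula above can be sketched for scalars in plain Python; the function below mirrors the formula only and is not NNabla code.

```python
def epsilon_insensitive_loss(x0, x1, epsilon):
    diff = abs(x0 - x1)
    # Differences within the epsilon band cost nothing.
    return diff - epsilon if diff > epsilon else 0.0

print(epsilon_insensitive_loss(1.0, 0.0, 0.25))    # 0.75
print(epsilon_insensitive_loss(0.125, 0.0, 0.25))  # 0.0
```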

nnabla.functions.kl_multinomial(p, q, base_axis=1, n_outputs=-1, outputs=None)[source]

The Kullback Leibler Divergence for multinomial distributions.

\[D = \sum_i p_i \log \left( \frac{p_i}{q_i} \right)\]
Parameters:
  • p (Variable) – N-D array of the source categorical probabilities
  • q (Variable) – N-D array of the target categorical probabilities
  • base_axis (int) – Dimensions up to base_axis are treated as sample dimensions. [default=``1``]
Returns:

Kullback Leibler divergence \(KL(p \parallel q)\).

Return type:

Variable

Note

All nnabla functions in nnabla.functions are decorated with the nnabla.function_bases.function_api decorator, which queries the current context and passes it into the first argument of the original function. The original function always takes a context as the first argument.
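
A minimal sketch of the divergence for one pair of distributions, in plain Python. Skipping terms with \(p_i = 0\) (the usual convention \(0 \log 0 = 0\)) is an assumption of this sketch and is not stated in the entry above.

```python
import math

def kl_multinomial(p, q):
    # D = sum_i p_i * log(p_i / q_i); terms with p_i == 0 contribute 0.
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0.0)

p = [0.5, 0.5]
q = [0.25, 0.75]
print(kl_multinomial(p, p))  # 0.0
print(kl_multinomial(p, q) != kl_multinomial(q, p))  # True, the divergence is not symmetric
```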

Signal Processing

nnabla.functions.interpolate(x, scale=None, output_size=None, mode='linear', align_corners=None)[source]

Resize an ND array with interpolation.

Scaling factors for spatial dimensions are determined by either scale or output_size.

nd = len(scale) or nd = len(output_size) determines the number of spatial dimensions, and the last nd dimensions of the input x are considered as the spatial dimensions to be resized.

If scale is given, the output_size is calculated by output_size[i] = floor(scale[i] * x.shape[i - len(scale)]).

Example:

import numpy as np
import nnabla as nn
import nnabla.functions as F

x_data = np.random.rand(64, 3, 224, 224)
x = nn.Variable.from_numpy_array(x_data)

# Resize by scales
y = F.interpolate(x, scale=(2, 2), mode='linear')
print(y.shape)  # (64, 3, 448, 448)
y.forward()
print(y.d)  # Print output

# Resize to a size
y2 = F.interpolate(x, output_size=(320, 257), mode='linear')
print(y2.shape)  # (64, 3, 320, 257)
y2.forward()
print(y2.d)  # Print output
Parameters:
  • x (Variable) – N-D array with an arbitrary number of dimensions.
  • scale (tuple of ints) – Scale factors along axes. The default is None, and if this is omitted, output_size must be specified.
  • output_size (tuple of ints) – The output sizes for axes. If this is given, the scale factors are determined by the output sizes and the input sizes. The default is None, and if this is omitted, scale must be specified.
  • mode (str) – Interpolation mode chosen from ('linear'|'nearest'). The default is 'linear'.
  • align_corners (bool) – If true, the corner pixels of input and output arrays are aligned, such that the output corner pixels have the same values with the input corner pixels. The default is None, and it becomes True if mode is ‘linear’, otherwise False.
Returns:

N-D array.

Return type:

Variable
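
The output-shape rule described above can be checked without running NNabla. The helper below is a hypothetical sketch in plain Python that applies output_size[i] = floor(scale[i] * x.shape[i - len(scale)]) and keeps the leading dimensions unchanged.

```python
import math

def interpolate_output_shape(in_shape, scale=None, output_size=None):
    if output_size is None:
        if scale is None:
            raise ValueError("either scale or output_size must be specified")
        nd = len(scale)
        # The last nd dimensions of the input are the spatial dimensions.
        output_size = tuple(
            int(math.floor(s * in_shape[i - nd])) for i, s in enumerate(scale)
        )
    nd = len(output_size)
    return tuple(in_shape[:-nd]) + tuple(output_size)

print(interpolate_output_shape((64, 3, 224, 224), scale=(2, 2)))
# (64, 3, 448, 448)
print(interpolate_output_shape((64, 3, 224, 224), output_size=(320, 257)))
# (64, 3, 320, 257)
```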

nnabla.functions.fft(x, signal_ndim, normalized=False, n_outputs=-1, outputs=None)[source]

Complex-to-complex Discrete Fourier Transform,

\[X_{k_1, \ldots, k_d} = \sum_{n_1=0}^{N_1-1} \dots \sum_{n_d=0}^{N_d-1} x_{n_1, \ldots, n_d} \exp\left(-2 \pi j \left( \sum_{i=1}^{d} \frac{k_i n_i}{N_i} \right) \right),\]

where

\[k_i = 0, \ldots, N_i - 1.\]

This function now supports 1-D, 2-D, and 3-D DFT with or without the leading batch dimension(s).

The input is expected to be complex-valued with at least signal_ndim + 1 dimensions. The last dimension has a shape of two where x[..., 0] is the real part and x[..., 1] the imaginary part.

Example:

import numpy as np
import nnabla as nn
import nnabla.functions as F
from nnabla.ext_utils import get_extension_context

ctx = get_extension_context("cudnn")
nn.set_default_context(ctx)

# Example for a batched 2D-FFT and 2D-IFFT (batch-size: 2, data-size: 4x3)
x_data = np.random.rand(2, 4, 3) + 1j * np.random.rand(2, 4, 3)
x = nn.Variable.from_numpy_array(np.stack([np.real(x_data), np.imag(x_data)], axis=3))
y = F.fft(x, signal_ndim=2, normalized=True)
z = F.ifft(y, signal_ndim=2, normalized=True)
z.forward()

np.allclose(z.d[..., 0] + 1j*z.d[...,1], x_data)
Parameters:
  • x (Variable) – Input.
  • signal_ndim (int) – The number of dimensions for each signal. It must be 1, 2, or 3.
  • normalized (bool) – Use unitary normalization. If True, the normalization constant \(\sqrt{\frac{1}{\prod_{i=1}^{d} N_i}}\) is multiplied. [default=``False``]
Returns:

FFT transformed signal.

Return type:

Variable

Note

All nnabla functions in nnabla.functions are decorated with the nnabla.function_bases.function_api decorator, which queries the current context and passes it into the first argument of the original function. The original function always takes a context as the first argument.

nnabla.functions.ifft(x, signal_ndim, normalized=False, n_outputs=-1, outputs=None)[source]

Complex-to-complex inverse Discrete Fourier Transform,

\[X_{k_1, \ldots, k_d} = \frac{1}{\prod_{i=1}^{d} N_i} \sum_{n_1=0}^{N_1-1} \dots \sum_{n_d=0}^{N_d-1} x_{n_1, \ldots, n_d} \exp\left(2 \pi j \left( \sum_{i=1}^{d} \frac{k_i n_i}{N_i} \right) \right),\]

where

\[k_i = 0, \ldots, N_i - 1.\]

This function now supports 1-D, 2-D, and 3-D DFT with or without the leading batch dimension(s).

The input is expected to be complex-valued with at least signal_ndim + 1 dimensions. The last dimension has a shape of two where x[..., 0] is the real part and x[..., 1] the imaginary part.

Parameters:
  • x (Variable) – Input.
  • signal_ndim (int) – The number of dimensions for each signal. It must be 1, 2, or 3.
  • normalized (bool) – Use unitary normalization. If True, the normalization constant \(\frac{1}{\prod_{i=1}^{d} N_i}\) becomes \(\sqrt{\frac{1}{\prod_{i=1}^{d} N_i}}\). [default=``False``]
Returns:

IFFT transformed signal.

Return type:

Variable

Note

All nnabla functions in nnabla.functions are decorated with the nnabla.function_bases.function_api decorator, which queries the current context and passes it into the first argument of the original function. The original function always takes a context as the first argument.
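
To make the DFT definitions and the effect of normalized concrete, the following is a naive 1-D reference in plain Python that uses the same [real, imaginary] layout in the last dimension. It is an O(N^2) sketch for illustration only; the function name dft is hypothetical and this is not how NNabla computes the transform.

```python
import cmath
import math

def dft(signal, inverse=False, normalized=False):
    # signal: list of [real, imag] pairs, mirroring x[..., 0] and x[..., 1].
    n = len(signal)
    sign = 1.0 if inverse else -1.0
    if normalized:
        factor = 1.0 / math.sqrt(n)          # unitary: same constant both ways
    else:
        factor = 1.0 / n if inverse else 1.0  # 1/N only on the inverse
    out = []
    for k in range(n):
        acc = 0j
        for t, (re, im) in enumerate(signal):
            acc += complex(re, im) * cmath.exp(sign * 2j * math.pi * k * t / n)
        acc *= factor
        out.append([acc.real, acc.imag])
    return out

x = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0], [0.5, -0.5]]

# Forward then inverse recovers the input, with or without normalization.
roundtrip = dft(dft(x, normalized=True), inverse=True, normalized=True)
print(all(abs(a - b) < 1e-9 for r, o in zip(roundtrip, x) for a, b in zip(r, o)))  # True
```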

nnabla.functions.stft(x, window_size, stride, fft_size, window_type='hanning', center=True, pad_mode='reflect')[source]

Computes the short-time Fourier transform

Parameters:
  • x (Variable) – Time domain sequence of size batch_size x sample_size.
  • window_size (int) – Size of STFT analysis window.
  • stride (int) – Number of samples that we shift the window, also called hop size.
  • fft_size (int) – Size of the FFT, the output will have fft_size // 2 + 1 frequency bins.
  • window_type (str) – Analysis window, can be either hanning, hamming or rectangular. For convenience, also window_type=None is supported which is equivalent to window_type='rectangular'.
  • center (bool) – If True, then the signal x is padded by half the FFT size using reflection padding.
  • pad_mode (str) – Padding mode, which can be 'constant' or 'reflect'. 'constant' pads with 0.
Returns:

Returns real and imaginary parts of STFT result.

  • Variable: Real part of STFT of size batch_size x fft_size//2 + 1 x frame_size.
  • Variable: Imaginary part of STFT of size batch_size x fft_size//2 + 1 x frame_size.

nnabla.functions.istft(y_r, y_i, window_size, stride, fft_size, window_type='hanning', center=True)[source]

Computes the inverse short-time Fourier transform

Note: We use a constant square inverse window for the reconstruction of the time-domain signal. Therefore, the first and last window_size - stride samples are not perfectly reconstructed.

Parameters:
  • y_r (Variable) – Real part of STFT of size batch_size x fft_size//2 + 1 x frame_size.
  • y_i (Variable) – Imaginary part of STFT of size batch_size x fft_size//2 + 1 x frame_size.
  • window_size (int) – Size of STFT analysis window.
  • stride (int) – Number of samples that we shift the window, also called hop size.
  • fft_size (int) – Size of the FFT, (STFT has fft_size // 2 + 1 frequency bins).
  • window_type (str) – Analysis window, can be either hanning, hamming or rectangular. For convenience, also window_type=None is supported which is equivalent to window_type='rectangular'.
  • center (bool) – If True, then it is assumed that the time-domain signal has centered frames.
Returns:

Time domain sequence of size batch_size x sample_size.

Return type:

Variable

Quantized Neural Network Layers

nnabla.functions.binary_sigmoid(x, n_outputs=-1, outputs=None)[source]

Element-wise binary sigmoid function. In the forward pass, it computes

\[\begin{split}f(x) = \begin{cases} 1 & (x > 0) \\ 0 & ({\rm otherwise})\end{cases},\end{split}\]

but in the backward pass, a straight-through approximation of the gradient is used, i.e.,

\[\begin{split}\frac{\partial f(x)}{\partial x} = \begin{cases} 0 & (|x| \geq 1) \\ \frac{1}{2} & ({\rm otherwise}) \end{cases}.\end{split}\]

Parameters:
  • x (Variable) – Input.
Returns:

Output.

Return type:

Variable

Note

All nnabla functions in nnabla.functions are decorated with the nnabla.function_bases.function_api decorator, which queries the current context and passes it into the first argument of the original function. The original function always takes a context as the first argument.

nnabla.functions.binary_tanh(x, n_outputs=-1, outputs=None)[source]

Element-wise binary tanh function. In the forward pass, it computes

\[\begin{split}f(x) = \begin{cases} 1 & (x > 0) \\ -1 & ({\rm otherwise}) \end{cases},\end{split}\]

but in the backward pass, a straight-through approximation of the gradient is used, i.e.,

\[\begin{split}\frac{\partial f(x)}{\partial x} = \begin{cases} 0 & (|x| \geq 1) \\ 1 & ({\rm otherwise}) \end{cases}.\end{split}\]

Parameters:
  • x (Variable) – Input.
Returns:

Output.

Return type:

Variable

Note

All nnabla functions in nnabla.functions are decorated with the nnabla.function_bases.function_api decorator, which queries the current context and passes it into the first argument of the original function. The original function always takes a context as the first argument.
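
The forward and backward rules of binary_sigmoid and binary_tanh can be written out for scalars as follows. This is a plain-Python sketch of the formulas above with hypothetical helper names; it is not NNabla code.

```python
def binary_sigmoid(x):
    return 1.0 if x > 0 else 0.0

def binary_sigmoid_grad(x):
    # Straight-through approximation: 1/2 inside (-1, 1), 0 elsewhere.
    return 0.0 if abs(x) >= 1 else 0.5

def binary_tanh(x):
    return 1.0 if x > 0 else -1.0

def binary_tanh_grad(x):
    # Straight-through approximation: 1 inside (-1, 1), 0 elsewhere.
    return 0.0 if abs(x) >= 1 else 1.0

for x in (-1.5, -0.25, 0.0, 0.25, 1.5):
    print(x, binary_sigmoid(x), binary_sigmoid_grad(x),
          binary_tanh(x), binary_tanh_grad(x))
```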

nnabla.functions.binary_connect_affine(x, weight, binary_weight, bias=None, base_axis=1, quantize_zero_to=1.0, n_outputs=-1, outputs=None)[source]

This function provides a BinaryConnect affine layer. It computes in the forward pass

\[y_j = \sum_{i} sign(w_{j,i}) x_i,\]

i.e., the weights \(w_{j,i}\) are binarized to \(sign(w_{j,i})\) and, hence, each weight is in \(\{-1,\,1\}\). By this weight binarization, the inner product computations do not require any multiplications anymore as they turn into additions/subtractions.

This function should be used together with batch_normalization().

Note

1) If you would like to share the binary weights with other layers, please use the standard, floating value weights (weight) and not the binary weights (binary_weight).

2) The weights and the binary weights become in sync only after a call to forward(), and not after a call to backward(). If you wish to store the parameters of the network, remember to call forward() once before doing so; otherwise the weights and the binary weights will not be in sync.

3) CPU and GPU implementations now use floating values for binary_weight, since this function is for simulation purposes.

Parameters:
  • x (Variable) – Input.
  • weight (Variable) – Weight. [parameter]
  • binary_weight (Variable) – Binarized weight. [parameter]
  • bias (Variable) – Bias. [optional][parameter]
  • base_axis (int) – Dimensions up to base_axis are treated as sample dimensions. [default=``1``]
  • quantize_zero_to (float) – Input value at zero is quantized to this value. [default=``1.0``]
Returns:

Output.

Return type:

Variable

Note

All nnabla functions in nnabla.functions are decorated with the nnabla.function_bases.function_api decorator, which queries the current context and passes it into the first argument of the original function. The original function always takes a context as the first argument.
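
The forward formula above can be sketched in plain Python to show how weight binarization turns the inner product into additions and subtractions. The row-per-output weight layout, the helper names, and the omission of the bias are simplifications of this sketch, which also assumes quantize_zero_to is either 1.0 or -1.0.

```python
def binarize(w, quantize_zero_to=1.0):
    # sign(w), with w == 0 mapped to quantize_zero_to.
    if w > 0:
        return 1.0
    if w < 0:
        return -1.0
    return quantize_zero_to

def binary_connect_affine(x, weight, quantize_zero_to=1.0):
    # x: list of D inputs; weight: one row of D floats per output j
    # (a layout chosen here to match w_{j,i} in the formula).
    outputs = []
    for row in weight:
        acc = 0.0
        for w, x_i in zip(row, x):
            # With weights in {-1, 1}, each product is just +x_i or -x_i.
            if binarize(w, quantize_zero_to) > 0:
                acc += x_i
            else:
                acc -= x_i
        outputs.append(acc)
    return outputs

x = [1.0, 2.0, 3.0]
weight = [[0.3, -0.7, 0.0], [-1.2, 0.4, 0.9]]
print(binary_connect_affine(x, weight))  # [2.0, 4.0]
```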

nnabla.functions.binary_connect_convolution(x, weight, binary_weight, bias=None, base_axis=1, pad=None, stride=None, dilation=None, group=1, quantize_zero_to=1.0, n_outputs=-1, outputs=None)[source]

This function provides a BinaryConnect convolution layer. It computes in the forward pass

\[y_{n, a, b} = \sum_{m} \sum_{i} \sum_{j} sign(w_{n, m, i, j}) x_{m, a + i, b + j},\]

i.e., the weights \(w_{n, m, i, j}\) are binarized to \(sign(w_{n, m, i, j})\) and, hence, each weight is in \(\{-1,\,1\}\). By this weight binarization, the inner product computations do not require any multiplications anymore as they turn into additions/subtractions.

This function should be used together with batch_normalization().

Note

1) If you would like to share the binary weights with other layers, please use the standard, floating value weights (weight) and not the binary weights (binary_weight).

2) The weights and the binary weights become in sync only after a call to forward(), and not after a call to backward(). If you wish to store the parameters of the network, remember to call forward() once before doing so; otherwise the weights and the binary weights will not be in sync.

3) CPU and GPU implementations now use floating values for binary_weight, since this function is for simulation purposes.

Parameters:
  • x (Variable) – Input.
  • weight (Variable) – Weight. [parameter]
  • binary_weight (Variable) – Binarized weight. [parameter]
  • bias (Variable) – Bias. [optional][parameter]
  • base_axis (int) – Dimensions up to base_axis are treated as sample dimensions. [default=``1``]
  • pad (tuple of int) – Padding sizes for dimensions. [default=``(0,) * (len(x.shape) - (base_axis+1))``]
  • stride (tuple of int) – Stride sizes for dimensions. [default=``(1,) * (len(x.shape) - (base_axis+1))``]
  • dilation (tuple of int) – Dilation sizes for dimensions. [default=``(1,) * (len(x.shape) - (base_axis+1))``]
  • group (int) – Number of groups of channels. This makes the connection across channels sparser, by grouping connections along the mapping direction. [default=``1``]
  • quantize_zero_to (float) – Input value at zero is quantized to this value. [default=``1.0``]
Returns:

Output

Return type:

Variable

Note

All nnabla functions in nnabla.functions are decorated with the nnabla.function_bases.function_api decorator, which queries the current context and passes it into the first argument of the original function. The original function always takes a context as the first argument.

nnabla.functions.binary_weight_affine(x, weight, binary_weight, alpha, bias=None, base_axis=1, quantize_zero_to=1.0, n_outputs=-1, outputs=None)[source]

This function provides a Binary Weight Network affine layer. It computes in the forward pass

\[y_j = \frac{1}{\|\mathbf{w}_j\|_{\ell_1}} \sum_{i} sign(w_{j,i}) x_i\]

i.e., the weights \(w_{j,i}\) are binarized to \(sign(w_{j,i})\) and, hence, each weight is in \(\{-1,\,1\}\). By this weight binarization, the inner product computations turn into additions/subtractions which are followed by multiplication with the scaling factor \(\alpha_j = \frac{1}{\|\mathbf{w}_j\|_{\ell_1}}\).

Note

1) If you would like to share the binary weights with other layers, please use the standard, floating value weights (weight) and not the binary weights (binary_weight).

2) The weights and the binary weights become in sync only after a call to forward(), and not after a call to backward(). If you wish to store the parameters of the network, remember to call forward() once before doing so; otherwise the weights and the binary weights will not be in sync.

3) CPU and GPU implementations now use floating values for binary_weight, since this function is for simulation purposes.

Parameters:
  • x (Variable) – Input.
  • weight (Variable) – Weight. [parameter]
  • binary_weight (Variable) – Binarized weight. [parameter]
  • alpha (Variable) – Alpha. [parameter]
  • bias (Variable) – Bias. [optional][parameter]
  • base_axis (int) – Dimensions up to base_axis are treated as sample dimensions. [default=``1``]
  • quantize_zero_to (float) – Input value at zero is quantized to this value. [default=``1.0``]
Returns:

Output.

Return type:

Variable

Note

All nnabla functions in nnabla.functions are decorated with the nnabla.function_bases.function_api decorator, which queries the current context and passes it into the first argument of the original function. The original function always takes a context as the first argument.

nnabla.functions.binary_weight_convolution(x, weight, binary_weight, alpha, bias=None, base_axis=1, pad=None, stride=None, dilation=None, group=1, quantize_zero_to=1.0, n_outputs=-1, outputs=None)[source]

This function provides a Binary Weight Network convolution layer. It computes in the forward pass

\[y_{n, a, b} = \frac{1}{\|\mathbf{w}_n\|_{\ell_1}} \sum_{m} \sum_{i} \sum_{j} sign(w_{n, m, i, j}) x_{m, a + i, b + j}.\]

i.e., the weights \(w_{n, m, i, j}\) are binarized to \(sign(w_{n, m, i, j})\) and, hence, each weight is in \(\{-1,\,1\}\). By this weight binarization, the inner product computations turn into additions/subtractions which are followed by multiplication with the scaling factor \(\alpha_n = \frac{1}{\|\mathbf{w}_n\|_{\ell_1}}\).

Note

1) If you would like to share the binary weights with other standard layers, please use the standard, floating value weights (weight) and not the binary weights (binary_weight).

2) The weights and the binary weights become in sync only after a call to forward(), and not after a call to backward(). If you wish to store the parameters of the network, remember to call forward() once before doing so; otherwise the weights and the binary weights will not be in sync.

3) CPU and GPU implementations now use floating values for binary_weight, since this function is for simulation purposes.

Parameters:
  • x (Variable) – Input.
  • weight (Variable) – Weight. [parameter]
  • binary_weight (Variable) – Binarized weight. [parameter]
  • alpha (Variable) – Alpha. [parameter]
  • bias (Variable) – Bias. [optional][parameter]
  • base_axis (int) – Dimensions up to base_axis are treated as sample dimensions. [default=``1``]
  • pad (tuple of int) – Padding sizes for dimensions. [default=``(0,) * (len(x.shape) - (base_axis+1))``]
  • stride (tuple of int) – Stride sizes for dimensions. [default=``(1,) * (len(x.shape) - (base_axis+1))``]
  • dilation (tuple of int) – Dilation sizes for dimensions. [default=``(1,) * (len(x.shape) - (base_axis+1))``]
  • group (int) – Number of groups of channels. This makes the connection across channels sparser, by grouping connections along the mapping direction. [default=``1``]
  • quantize_zero_to (float) – Input value at zero is quantized to this value. [default=``1.0``]
Returns:

Output

Return type:

Variable

Note

All nnabla functions in nnabla.functions are decorated with the nnabla.function_bases.function_api decorator, which queries the current context and passes it into the first argument of the original function. The original function always takes a context as the first argument.

nnabla.functions.fixed_point_quantize(x, sign=True, n=8, delta=0.0625, quantize=True, ste_fine_grained=True, outputs=None)[source]

Fixed Point Quantize

Parameters:
  • x (Variable) – An input variable.
  • sign (bool) – Whether to use a signed (True) or unsigned (False) number representation. Default is true.
  • n (int) – Bit width used. Note that sign consumes one bit. \(n-1\) is used for number representation in signed case.
  • delta (float) – Step size.
  • quantize (bool) – If true, quantize input, otherwise not.
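  • ste_fine_grained (bool) – If true, the straight-through estimator (STE) gradient is 1 only for inputs within \([min, max]\) and 0 outside that range; if false, the gradient is always 1.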
Returns:

N-D array.

Return type:

Variable

See also

nnabla.function_bases.fixed_point_quantize.

In the forward pass,

\[\begin{split}q_i= \left\{ \begin{array}{ll} max & if \ \ x_i > max \\ sign(x_i) \times floor(|x_i| \delta^{-1} + 2^{-1}) \times \delta & if \ \ min \le x_i \le max \\ min & if \ \ x_i < min \\ \end{array} \right.,\end{split}\]

where \(\delta\) is the step size, \((min, max) :=(- (2^{n-1} - 1)\delta, (2^{n-1} - 1)\delta)\) if \(sign\) is true, \((min, max) := (0, (2^n - 1) \delta)\) otherwise, and \(n\) is the total bit-width used.

In the backward pass when using ste_fine_grained as false,

\[\frac{\partial q_i}{\partial x_i} = 1.\]

In the backward pass when using ste_fine_grained as true,

\[\begin{split}\frac{\partial q_i}{\partial x_i}= \left\{ \begin{array}{ll} 0 & if \ \ x_i > max \\ 1 & if \ \ min \le x_i \le max \\ 0 & if \ \ x_i < min \\ \end{array} \right..\end{split}\]

Note

Quantized values are stored as floating point numbers, since this function is for simulation purposes.
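
The forward and backward rules of fixed_point_quantize can be sketched for a single scalar in plain Python. The helper names are hypothetical, and the code only follows the formulas in this entry; it does not claim to reproduce NNabla's implementation.

```python
import math

def fixed_point_range(sign=True, n=8, delta=0.0625):
    # (min, max) as defined in the forward-pass description.
    if sign:
        bound = (2 ** (n - 1) - 1) * delta
        return -bound, bound
    return 0.0, (2 ** n - 1) * delta

def fixed_point_quantize(x, sign=True, n=8, delta=0.0625):
    lo, hi = fixed_point_range(sign, n, delta)
    if x > hi:
        return hi
    if x < lo:
        return lo
    # sign(x) * floor(|x| / delta + 1/2) * delta
    return math.copysign(1.0, x) * math.floor(abs(x) / delta + 0.5) * delta

def ste_grad(x, sign=True, n=8, delta=0.0625, ste_fine_grained=True):
    if not ste_fine_grained:
        return 1.0
    lo, hi = fixed_point_range(sign, n, delta)
    return 1.0 if lo <= x <= hi else 0.0

print(fixed_point_range())          # (-7.9375, 7.9375)
print(fixed_point_quantize(0.1))    # 0.125
print(fixed_point_quantize(10.0))   # 7.9375
print(ste_grad(10.0), ste_grad(10.0, ste_fine_grained=False))  # 0.0 1.0
```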

nnabla.functions.min_max_quantize(x, qr_min, qr_max, ql_min, ql_max, decay=0.999, x_min_max=False, ema=False, ste_fine_grained=True, eps=0.01, quantize=True, outputs=None)[source]

Min-max quantization.

This function uniformly quantizes values in the range of min and max quantization levels.

Min-max quantization is defined by the following equation:

\[y = round \left(\frac{\min(\max(x, m), M) - m}{scale} \right) \times scale + m,\]

where the \(scale\) is defined as

\[scale = \frac{M - m}{M_q - m_q},\]

and

\[\begin{split}m_q = ql_{min}, \\ M_q = ql_{max}, \\ m = qr_{min}, \\ M = qr_{max}.\end{split}\]
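The definition can be checked with a short reference function. This is a pure-Python sketch of the equations above for a list of floats, without the range updates or the zero-point adjustment; min_max_quantize_ref is a hypothetical name, and Python's round() (which rounds exact halves to the nearest even integer) stands in for the unspecified rounding mode.

```python
def min_max_quantize_ref(x, qr_min, qr_max, ql_min=0.0, ql_max=255.0):
    """y = round((clip(x, m, M) - m) / scale) * scale + m."""
    scale = (qr_max - qr_min) / (ql_max - ql_min)
    out = []
    for v in x:
        clipped = min(max(v, qr_min), qr_max)
        level = round((clipped - qr_min) / scale)  # integer quantization level
        out.append(level * scale + qr_min)
    return out


# Range [-1, 1] mapped onto five levels 0..4, so scale = 0.5.
print(min_max_quantize_ref([-3.0, -0.3, 0.3, 3.0], -1.0, 1.0, 0.0, 4.0))
```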

In the backward pass when using ste_fine_grained as false,

\[\frac{\partial q_i}{\partial x_i} = 1.\]

In the backward pass when using ste_fine_grained as true,

\[\begin{split} \frac{\partial q_i}{\partial x_i}= \left\{ \begin{array}{ll} 0 & if \ \ \ x_i > M \\ 1 & if \ \ m \le x_i \le M \\ 0 & if \ \ x_i < m \\ \end{array} \right..\end{split}\]

\(qr_{min}\) and \(qr_{max}\) are treated as follows.

  • x_min_max is True and ema is True: Exponential moving averages of \(min(x)\) and \(max(x)\) are computed and stored in \(qr_{min}\) and \(qr_{max}\).
  • x_min_max is True and ema is False: \(min(x)\) and \(max(x)\) are computed and stored in \(qr_{min}\) and \(qr_{max}\).
  • x_min_max is False and ema is True: The exponential moving averages stored in \(qr_{min}\) and \(qr_{max}\) are used.
  • x_min_max is False and ema is False: Gradients of \(qr_{min}\) and \(qr_{max}\) are computed in the backward pass.

More precisely, inference with min-max quantization has to account for the zero-point (zp), the integer quantization level that corresponds to the real value 0. The zero-point is defined as

\[\begin{split} && zp_f = ql_{min} -\frac{qr_{min}}{scale}, \\ && zp = \left\{ \begin{array}{ll} ql_{max} & if \ \ \ zp_f >= ql_{max} \\ round(zp_f) & if \ \ otherwise \\ ql_{min} & if \ \ zp_f <= ql_{min} \\ \end{array} \right..\end{split}\]

Accordingly, in order to simulate quantization effect of zero-point, during both forward and backward pass, \(qr_{min}\) and \(qr_{max}\) are adjusted as follows,

\[\begin{split}qr_{min}^{adj} = (ql_{min} - zp) * scale, \\ qr_{max}^{adj} = (ql_{max} - zp) * scale.\end{split}\]

This adjustment is often called nudging.

Finally, in the formulas of the min-max quantization, \(m\) and \(M\) are replaced by \(qr_{min}^{adj}\) and \(qr_{max}^{adj}\) respectively.
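The nudging steps can be sketched as follows. The sketch reads the range adjustment as (level − zp) × scale, which is an assumption: that grouping keeps the adjusted bounds in real-value units and maps the zero-point level to exactly 0. The name nudge_ref is hypothetical, and Python's round() stands in for the rounding of \(zp_f\).

```python
def nudge_ref(qr_min, qr_max, ql_min, ql_max):
    """Return (adjusted qr_min, adjusted qr_max, zero-point)."""
    scale = (qr_max - qr_min) / (ql_max - ql_min)
    zp_f = ql_min - qr_min / scale
    # Clamp the zero-point to the level range, otherwise round to an integer.
    if zp_f <= ql_min:
        zp = ql_min
    elif zp_f >= ql_max:
        zp = ql_max
    else:
        zp = round(zp_f)
    # Assumed grouping: shift the levels by zp, then convert to real values.
    return (ql_min - zp) * scale, (ql_max - zp) * scale, zp


# Range [-0.7, 1.3] on levels 0..4: zp_f = 1.4 is rounded to level 1,
# so the range shifts until real 0 sits exactly on that level.
print(nudge_ref(-0.7, 1.3, 0, 4))
```

The width of the range is unchanged by the adjustment; only its position moves so that 0 is exactly representable.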

Parameters:
  • x (Variable) – Input N-D array.
  • qr_min (Variable) – Minimum quantization range (modified during forward execution).
  • qr_max (Variable) – Maximum quantization range (modified during forward execution).
  • ql_min (Variable) – Minimum quantization level, typically 0.
  • ql_max (Variable) – Maximum quantization level, typically 255.
  • decay (float) – The decay rate for the exponential moving average.
  • x_min_max (bool) – Use the min and max of x to compute quantization ranges. Default is False.
  • ema (bool) – Use the exponential moving average for the min and max quantization ranges. Default is False.
  • ste_fine_grained (bool) – If True, STE is not 1, the {0, 1}-mask computed from the min-max is applied to the gradient in the backward; otherwise, STE is 1.
  • eps (float) – Epsilon, a small value used to ensure that \(qr_{max} - qr_{min}\) is greater than epsilon.
  • quantize (bool) – Apply quantization or not.

References

Benoit Jacob, Skirmantas Kligys, Bo Chen, Menglong Zhu, Matthew Tang, Andrew Howard, Hartwig Adam, and Dmitry Kalenichenko, “Quantization and Training of Neural Networks for Efficient Integer-Arithmetic-Only Inference”, https://arxiv.org/abs/1712.05877

nnabla.functions.pow2_quantize(x, sign=True, with_zero=True, n=8, m=1, quantize=True, ste_fine_grained=True, outputs=None)[source]

Pow2 Quantize

Parameters:
  • x (Variable) – An input variable.
  • sign (bool) – Whether to use signed (true) or unsigned (false) numbers. Default is true.
  • with_zero (bool) – Indicate using zero as a quantized value. Default is true. Note that zero consumes one bit.
  • n (int) – Bit width used. Note that sign consumes one bit. \(n-1\) is used for number representation in signed case. Default is 8.
  • m (int) – \(2^m\) is the upper bound of the dynamic range and \(-2^m\) is the lower bound, \(m \in \mathbb{Z}\). Default is 1.
  • quantize (bool) – If true, quantize input, otherwise not.
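  • ste_fine_grained (bool) – If true, the straight-through estimator (STE) gradient is 0 where \(\overline{q_i}\) falls outside \([max_{-}, max_{+}]\) and 1 elsewhere; if false, it is 1 everywhere. Default is true.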
Returns:

N-D array.

Return type:

Variable

See also

nnabla.function_bases.pow2_quantize.

In the forward pass of signed case,

\[\begin{split}q_i= \left\{ \begin{array}{ll} max_{+} & if \ \ \overline{q_i} > max_{+} \\ \overline{q_i} & if \ \ min_{+} \le \overline{q_i} \le max_{+} \\ min_{+} & if \ \ 0 \le \overline{q_i} < min_{+} \\ min_{-} & if \ \ min_{-} < \overline{q_i} < 0 \\ \overline{q_i} & if \ \ max_{-} \le \overline{q_i} \le min_{-}\\ max_{-} & if \ \ \overline{q_i} < max_{-} \\ \end{array} \right.,\end{split}\]

where

\[\begin{split}&& max_{+} = 2^{m}, min_{+} = 2^{m - (2^{n-1} - 1)},\\ && max_{-} = -2^{m}, min_{-} = -2^{m - (2^{n-1} - 1)},\\ && \overline{q_i} = sign(x_i) \times 2^{round(\log_2 |x_i|)}.\end{split}\]

This quantization uses the geometric mean between two power-of-two numbers as quantization threshold.
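Rounding \(\log_2 |x_i|\) means the switch point between \(2^k\) and \(2^{k+1}\) is \(2^{k+1/2} = \sqrt{2^k \cdot 2^{k+1}}\), their geometric mean. The following pure-Python sketch of the signed case shows this for a single nonzero float; it omits the zero level, pow2_quantize_signed_ref is a hypothetical name, and the small n in the usage lines is chosen only to make the clipping visible.

```python
import math


def pow2_quantize_signed_ref(v, n=8, m=1):
    """Signed forward formula for one nonzero float, without the zero level."""
    q_max = 2.0 ** m                         # max_+
    q_min = 2.0 ** (m - (2 ** (n - 1) - 1))  # min_+
    # Round in the log domain: nearest exponent, not nearest value.
    mag = 2.0 ** round(math.log2(abs(v)))
    # Clip the magnitude to [min_+, max_+] and restore the sign.
    mag = min(max(mag, q_min), q_max)
    return math.copysign(mag, v)


# 1.45 lies below the arithmetic midpoint 1.5 of 1 and 2, but above
# their geometric mean sqrt(2) ~ 1.414, so it is rounded up to 2.
print(pow2_quantize_signed_ref(1.45, n=3))
print(pow2_quantize_signed_ref(1.40, n=3))
```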

In the forward pass of unsigned case,

\[\begin{split}q_i= \left\{ \begin{array}{ll} max & if \ \ \overline{q_i} > max \\ \overline{q_i} & if \ \ min \le \overline{q_i} \le max \\ min & if \ \ 0 < \overline{q_i} < min \\ \end{array} \right.,\end{split}\]

where

\[\begin{split}&& max = 2^{m}, min = 2^{m - (2^{n} - 1)},\\ && \overline{q_i} = 2^{int(\log_2 |x_i|)}.\end{split}\]

When using with_zero as true, a pruning threshold is used to round an input to 0 or \(min\). The pruning threshold is defined in this function as the following,

\[pruning\ threshold = min \times 2^{-\frac{1}{2}}.\]

If the absolute value of the input is less than this value, the input is rounded to 0; otherwise it is rounded to \(min\).
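The pruning threshold \(min \times 2^{-1/2}\) equals \(\sqrt{min \cdot min/2}\), the geometric mean of \(min\) and the next lower power of two. A minimal sketch of the rule for a magnitude below \(min\); round_small_magnitude_ref is a hypothetical name and the sketch ignores the sign of the input.

```python
def round_small_magnitude_ref(v, q_min):
    """Round |v| < q_min to either 0 or q_min using the pruning threshold."""
    threshold = q_min * 2.0 ** -0.5
    return 0.0 if abs(v) < threshold else q_min


# With q_min = 0.25 the threshold is about 0.177.
print(round_small_magnitude_ref(0.10, 0.25))  # below the threshold
print(round_small_magnitude_ref(0.20, 0.25))  # above the threshold
```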

In the backward pass when using ste_fine_grained as false,

\[\frac{\partial q_i}{\partial x_i} = 1.\]

In the backward pass when using ste_fine_grained as true,

\[\begin{split}\frac{\partial q_i}{\partial x_i}= \left\{ \begin{array}{ll} 0 & if \ \ \overline{q_i} > max_{+} \\ 1 & if \ \ otherwise \\ 0 & if \ \ \overline{q_i} < max_{-} \\ \end{array} \right..\end{split}\]

nnabla.functions.prune(x, rate=0.9, n_outputs=-1, outputs=None)[source]

Prune the input according to the following equation:

\[\begin{split}q_i = \left \{ \begin{array}{ll} 0 & abs(x_i) < threshold \\ x_i & otherwise \end{array} \right.\end{split}\]

where \(threshold\) is determined by threshold = np.sort(np.abs(x))[int((x.size - 1) * rate)].
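The threshold rule can be followed on a small flat list. This is a pure-Python sketch of the equation and threshold expression above, with sorted() in place of np.sort; prune_ref is a hypothetical name and the input is assumed to be already flattened.

```python
def prune_ref(x, rate=0.9):
    """Zero every element whose magnitude is below the rate-based threshold."""
    mags = sorted(abs(v) for v in x)
    threshold = mags[int((len(x) - 1) * rate)]
    return [0.0 if abs(v) < threshold else v for v in x]


x = [0.1, -0.5, 0.3, 2.0, -1.0]
print(prune_ref(x, rate=0.5))  # threshold is the 3rd smallest magnitude, 0.5
print(prune_ref(x, rate=0.9))  # threshold is the 4th smallest magnitude, 1.0
```

Because the comparison is strict, elements whose magnitude equals the threshold are kept.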

Parameters:
  • x (Variable) – N-D array
  • rate (float) – Sparse rate, or pruning rate. [default=``0.9``]
Returns:

N-D array with the same shape as x

Return type:

Variable

Note

All nnabla functions in nnabla.functions are decorated with the nnabla.function_bases.function_api decorator, which queries the current context and passes it into the first argument of the original function. The original function always takes a context as the first argument.

Unsupported, Special Use

nnabla.functions.vat_noise(x, w, base_axis=1, eps=1.0, n_outputs=-1, outputs=None)[source]

Noise for virtual adversarial training.

This layer is a special layer for GUI network designing, specialized for getting the noise of virtual adversarial training.

In the backward process, the weight parameter will be replaced with the gradient.

Forward

\[y_i = \frac{\epsilon x_i}{\sqrt{\sum_k x_k^2 + c}}\]

Backward

\[\delta x_i = 0\]
\[w_i = \epsilon \delta y_i\]
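The forward formula rescales the noise to an L2 norm of roughly \(\epsilon\). A pure-Python sketch for a single sample follows; the constant c is not defined in the text above, so treating it as a small stabilizing constant with default 1e-8 is an assumption, as is the name vat_noise_forward_ref.

```python
import math


def vat_noise_forward_ref(x, eps=1.0, c=1e-8):
    """y_i = eps * x_i / sqrt(sum_k x_k^2 + c) for one sample."""
    norm = math.sqrt(sum(v * v for v in x) + c)
    return [eps * v / norm for v in x]


y = vat_noise_forward_ref([3.0, 4.0], eps=2.0, c=0.0)
print(y)                                  # direction of x, rescaled
print(math.sqrt(sum(v * v for v in y)))   # L2 norm of the output
```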

Note

This layer is a special layer for GUI network designing.

Parameters:
  • x (Variable) – N-D array of noise input. The noise is standard Gaussian noise initially; in the next step it is the fed-back gradient variable.
  • w (Variable) – N-D array for keeping gradient values.
  • base_axis (int) – Dimensions up to base_axis are treated as sample dimensions. [default=``1``]
  • eps (float) – Noise norm (l2) factor. [default=``1.0``]
Returns:

N-D array

Return type:

Variable

Note

All nnabla functions in nnabla.functions are decorated with the nnabla.function_bases.function_api decorator, which queries the current context and passes it into the first argument of the original function. The original function always takes a context as the first argument.

This function behaves as an identity function on the forward pass, and deletes the gradient on the backward pass.

This layer is a special layer for GUI network designing. Adding it to a network yields a zero backward operation at that point.

Forward

\[y_i = x_i\]

Backward

\[\delta x_i = 0\]

Note

This layer is a special layer for GUI network designing.

Parameters:
  • x (Variable) – N-D array.
Returns:

N-D array.

Return type:

Variable

Note

All nnabla functions in nnabla.functions are decorated with the nnabla.function_bases.function_api decorator, which queries the current context and passes it into the first argument of the original function. The original function always takes a context as the first argument.

nnabla.functions.sink(*x, **kw)[source]

Creates a dummy variable used to call forward or backward function of multiple variables at one place.

This takes any number of input variables with any shape, and creates a single 0-shape output. The forward pass does nothing. The backward pass sets the input grads to one if one_input_grad is true.

Note

sink can only be called at the very end of the graph, and the grads of the input variables are cleared when y.backward(clear_buffer=True) is called.

Parameters:
  • *x (Variable) – Any number of inputs with any shape. [variadic]
  • one_input_grad (bool) – Set the grads of the inputs to one during backward. Set this to false if you want to set external gradients on the input variables. [default=``True``]
Returns:

Dummy variable.

Return type:

Variable

Note

All nnabla functions in nnabla.functions are decorated with the nnabla.function_bases.function_api decorator, which queries the current context and passes it into the first argument of the original function. The original function always takes a context as the first argument.

Image Object Detection

nnabla.functions.nms_detection2d(x, thresh=None, nms=None, nms_per_class=None, n_outputs=-1, outputs=None)[source]

Non-Maximum Suppression (NMS) to 2D Object detector output. The input is a 3-dimensional tensor with shape (B, N, 5 + C), where B denotes the batch size, N denotes the number of detection box candidates, and C denotes the number of object classes. The 5 + C values consist of the box coordinates x, y, w, h in normalized coordinates (the size along each of x and y is 1.0), objectness (learned to predict the IoU value to the ground truth box), and the class probabilities of the C classes.

It outputs a tensor with the same dimensions as the input, where all values are copied from the input to the output, except that the class probabilities are multiplied by objectness, and possibly suppressed to 0 by NMS. During NMS, all pairs of bounding boxes are compared. For each pair, the bounding box with the lower detection score (described below) is suppressed if the overlap ratio (the IoU) is greater than the value of nms.

There are two suppression modes for NMS.

1. Suppress by class probability (nms_per_class is True): For each bounding box, the detection score is calculated by objectness * probability[class_id] for each class. The suppression is done for each class independently.

2. Suppress by objectness (nms_per_class is False): The suppression is done for each bounding box using objectness as a detection score. All class probabilities become 0 for every suppressed box.
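The second mode can be sketched in pure Python for a single image. This follows the literal description above (every pair is compared, the lower-scoring box of an overlapping pair is suppressed) and makes several assumptions that the text does not settle: (x, y) is taken to be the box center, the thresh parameter is ignored, and ties suppress the later box. The names iou_ref and nms_by_objectness_ref are hypothetical.

```python
def iou_ref(a, b):
    """IoU of two (x, y, w, h) boxes, assuming (x, y) is the box center."""
    ax0, ax1 = a[0] - a[2] / 2, a[0] + a[2] / 2
    ay0, ay1 = a[1] - a[3] / 2, a[1] + a[3] / 2
    bx0, bx1 = b[0] - b[2] / 2, b[0] + b[2] / 2
    by0, by1 = b[1] - b[3] / 2, b[1] + b[3] / 2
    iw = max(0.0, min(ax1, bx1) - max(ax0, bx0))
    ih = max(0.0, min(ay1, by1) - max(ay0, by0))
    inter = iw * ih
    union = a[2] * a[3] + b[2] * b[3] - inter
    return inter / union if union > 0 else 0.0


def nms_by_objectness_ref(dets, nms=0.45):
    """dets: list of [x, y, w, h, objectness, p_0, ..., p_{C-1}] lists."""
    # Box and objectness are copied; class probabilities are scaled by objectness.
    out = [d[:5] + [d[4] * p for p in d[5:]] for d in dets]
    suppressed = set()
    for i in range(len(dets)):
        for j in range(i + 1, len(dets)):
            if iou_ref(dets[i], dets[j]) > nms:
                # The box with the lower objectness loses the pair.
                suppressed.add(i if dets[i][4] < dets[j][4] else j)
    for k in suppressed:
        out[k][5:] = [0.0] * (len(out[k]) - 5)
    return out


dets = [
    [0.5, 0.5, 0.2, 0.2, 0.9, 0.5, 0.5],  # kept
    [0.5, 0.5, 0.2, 0.2, 0.6, 1.0, 0.0],  # same box, lower objectness
    [0.1, 0.1, 0.1, 0.1, 0.3, 1.0, 0.0],  # no overlap, kept
]
for row in nms_by_objectness_ref(dets):
    print(row)
```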

Parameters:
  • x (Variable) – A 3-dimensional array.
  • thresh (float) – Detection score threshold. [default=``0.5``]
  • nms (float) – IoU threshold for Non-maximum suppression (NMS). [default=``0.45``]
  • nms_per_class (bool) – If true, NMS is applied for each class. [default=``True``]
Returns:

A 3-dim array with the same dimensions as the input.

Return type:

Variable

Note

All nnabla functions in nnabla.functions are decorated with the nnabla.function_bases.function_api decorator, which queries the current context and passes it into the first argument of the original function. The original function always takes a context as the first argument.

Validation

nnabla.functions.top_n_error(x, target, axis=None, n=1, n_outputs=-1, outputs=None)[source]

Top N error along the dimension specified by the axis. Each element of the output is

\[\begin{split}y_i = \left \{ \begin{array}{l} 1 \ (x_i \ is \ not \ within \ N-th \ place) \\ 0 \ (x_i \ is \ within \ N-th \ place) \end{array} \right.\end{split}\]
Parameters:
  • x (Variable) – Probabilities N-D array. \(D_1 \times ... \times D_i \times ... \times D_N\)
  • target (Variable) – N-D array of labels. \(D_1 \times ... \times 1 \times ... \times D_N\)
  • axis (int) – Axis on which the top N error is calculated. [default=``len(x.shape) - 1``]
  • n (int) – top N [default=``1``]
Returns:

Element-wise error N-D array. (\(D_1 \times ... \times 1 \times ... \times D_N\))

Return type:

Variable
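The definition can be illustrated for the common 2-D case (one row of class probabilities per sample, class axis last). This is a pure-Python sketch, not the library's implementation; top_n_error_ref is a hypothetical name, and ranking a label by the number of strictly higher scores is an assumption about how ties are handled.

```python
def top_n_error_ref(probs, target, n=1):
    """Return 1 per sample if the label is outside the top n scores, else 0."""
    errors = []
    for row, label in zip(probs, target):
        # Number of classes scoring strictly higher than the labelled class.
        rank = sum(1 for p in row if p > row[label])
        errors.append(0 if rank < n else 1)
    return errors


probs = [[0.1, 0.7, 0.2], [0.5, 0.3, 0.2]]
target = [2, 0]
print(top_n_error_ref(probs, target, n=1))  # first label is only 2nd best
print(top_n_error_ref(probs, target, n=2))  # both labels are within the top 2
```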

Note

All nnabla functions in nnabla.functions are decorated with the nnabla.function_bases.function_api decorator, which queries the current context and passes it into the first argument of the original function. The original function always takes a context as the first argument.