Experimental

DynamicLossScalingUpdater

class nnabla.experimental.mixed_precision_training.DynamicLossScalingUpdater(solver, loss, data_feeder=<function <lambda>>, scale=8.0, scaling_factor=2.0, N=2000, clear_buffer=True, accum_grad=1, weight_decay=None, comm=None, grads=[])[source]

Dynamic loss scaling updater for mixed precision training.

Parameters:
  • solver (nnabla.solvers.Solver) – Solver object, e.g., Momentum or Adam.
  • loss (nnabla.Variable) – Loss variable on which the forward and the backward are called.
  • data_feeder (callable object, function, or lambda) – Data feeder.
  • scale (float) – Initial loss scale constant. It changes dynamically during training.
  • scaling_factor (float) – Scaling factor for the dynamic loss scaling.
  • N (int) – Interval; the number of training iterations after which the loss scale is increased by scaling_factor.
  • clear_buffer (bool) – Clears variables that are no longer referenced during backpropagation to save memory.
  • accum_grad (int) – Number of gradient accumulation steps. The solver's update method is called after the forward and the backward have been called accum_grad times.
  • weight_decay (float) – Decay constant. Default is None, in which case weight decay is not applied.
  • comm (nnabla.communicators.Communicator) – Communicator used for distributed training. Default is None.
  • grads (list of nnabla._nd_array.NdArray) – List of gradients to be exchanged in distributed training. Default is the empty list.
solver

nnabla.solvers.Solver – Solver object, e.g., Momentum or Adam.

loss

nnabla.Variable – Loss variable on which the forward and the backward are called.

data_feeder

callable object, function, or lambda – Data feeder.

scale

float – Loss scale constant. It changes dynamically during training.

scaling_factor

float – Scaling factor for the dynamic loss scaling.

N

int – Interval; the number of training iterations after which the loss scale is increased by scaling_factor.

clear_buffer

bool – Clears variables that are no longer referenced during backpropagation to save memory.

accum_grad

int – Number of gradient accumulation steps. The solver's update method is called after the forward and the backward have been called accum_grad times.

weight_decay

float – Decay constant. Default is None, in which case weight decay is not applied.

comm

nnabla.communicators.Communicator – Communicator used for distributed training.

grads

list of nnabla._nd_array.NdArray – List of gradients to be exchanged in distributed training.

Example
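
A minimal usage sketch (not taken from the original documentation; the angle-bracket placeholders follow the convention of the Updater example later in this section, and the constructor arguments are those listed above):

from nnabla.experimental.mixed_precision_training import DynamicLossScalingUpdater

solver = <Solver>
loss = <Loss Variable of Network>

def data_feeder():
    ...

updater = DynamicLossScalingUpdater(solver, loss,
                                    data_feeder=data_feeder,
                                    scale=8.0, scaling_factor=2.0, N=2000)

# Training iteration
for itr in range(<max_iter>):
    updater.update()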

update()[source]

Monolithic update method.

This method performs the following steps, applying dynamic loss scaling:

  1. solver.zerograd
  2. feed data
  3. loss.forward
  4. loss.backward
  5. comm.all_reduce (if it is specified)
  6. solver.update
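
The scaling decision follows the standard dynamic loss scaling scheme: the loss is scaled by scale before the backward pass, the gradients are unscaled afterwards, and an overflow check decides whether to update the weights and how to adjust scale. Below is a minimal sketch of one such step (an illustration of the scheme, not code extracted from this class; it assumes the Solver methods scale_grad and check_inf_or_nan_grad, which NNabla solvers provide for mixed precision training):

# One dynamic-loss-scaling step (illustrative sketch).
def scaled_step(solver, loss, scale, scaling_factor, N, counter):
    solver.zero_grad()
    loss.forward()
    loss.backward(grad=scale)            # backpropagate the scaled loss
    solver.scale_grad(1.0 / scale)       # unscale the gradients before checking
    if solver.check_inf_or_nan_grad():   # overflow: shrink the scale, skip update
        scale /= scaling_factor
        counter = 0
    else:
        solver.update()
        if counter >= N:                 # stable for N iterations: grow the scale
            scale *= scaling_factor
            counter = 0
        counter += 1
    return scale, counter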

SimpleGraph

class nnabla.experimental.viewers.SimpleGraph(format='png', verbose=False, fname_color_map=None, vname_color_map=None)[source]

A simple graph viewer using GraphViz.

Example:

import nnabla as nn
import nnabla.functions as F
import nnabla.parametric_functions as PF

import nnabla.experimental.viewers as V

# Model definition
def network(image, test=False):
    h = image
    h /= 255.0
    h = PF.convolution(h, 16, kernel=(3, 3), pad=(1, 1), name="conv")
    h = PF.batch_normalization(h, name="bn", batch_stat=not test)
    h = F.relu(h)
    pred = PF.affine(h, 10, name='fc')
    return pred

# Model
image = nn.Variable([4, 3, 32, 32])
pred = network(image, test=False)

# Graph Viewer
graph = V.SimpleGraph(verbose=False)
graph.view(pred)
graph.save(pred, "sample_graph")
create_graphviz_digraph(vleaf, format=None)[source]

Create a graphviz.Digraph object given the leaf variable of a computation graph.

One nice thing about getting the Digraph directly is that the drawn graph can be displayed inline in a Jupyter notebook, as described in the Graphviz documentation.

Parameters:
  • vleaf (nnabla.Variable) – End variable. All variables and functions that can be traversed from this variable are shown in the result.
  • format (str) – Output format ('pdf', 'png', ...). Overrides the format given at construction.

Returns: graphviz.Digraph
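
For example, in a Jupyter notebook (a short sketch building on the SimpleGraph example above):

digraph = graph.create_graphviz_digraph(pred)
digraph  # the last expression of a cell is rendered inline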

save(vleaf, fpath, cleanup=False, format=None)[source]

Save the graph to a given file path.

Parameters:
  • vleaf (nnabla.Variable) – End variable. All variables and functions that can be traversed from this variable are shown in the result.
  • fpath (str) – The file path used to save.
  • cleanup (bool) – Clean up the source file after rendering. Default is False.
  • format (str) – Output format ('pdf', 'png', ...). Overrides the format given at construction.
view(vleaf, fpath=None, cleanup=True, format=None)[source]

View the graph.

Parameters:
  • vleaf (nnabla.Variable) – End variable. All variables and functions that can be traversed from this variable are shown in the result.
  • fpath (str) – The file path used to save.
  • cleanup (bool) – Clean up the source file after rendering. Default is True.
  • format (str) – Output format ('pdf', 'png', ...). Overrides the format given at construction.

GraphConverter

class nnabla.experimental.graph_converters.identity.IdentityConverter(black_list=[], params=None, name='identity')[source]

Every function in the graph is replaced with an identical new function (an identity conversion).

Parameters:
  • black_list (list) – List of function names to exclude from conversion.
  • params (OrderedDict) – Result of nn.get_parameters().
  • name (str) – Prefix of the parameter scope.
convert(vroot, entry_variables)[source]

All functions are replaced with the same new function.

Parameters:
  • vroot (Variable) – NNabla Variable
  • entry_variables (Variable) – Entry variable from which the conversion starts.
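
A minimal usage sketch (assuming pred and image from the SimpleGraph example above; entry_variables is passed here as a list of the graph inputs, which is an assumption about typical usage since the docstring types it as a Variable):

from nnabla.experimental.graph_converters.identity import IdentityConverter

converter = IdentityConverter()
new_pred = converter.convert(pred, [image])
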
class nnabla.experimental.graph_converters.batch_normalization_linear.BatchNormalizationLinearConverter(black_list=[], params=None, name='bn-linear')[source]

Batch normalization is replaced by a simple scale and bias computed from its parameters.

Parameters:
  • black_list (list) – List of function names to exclude from conversion.
  • params (OrderedDict) – Result of nn.get_parameters().
  • name (str) – Prefix of the parameter scope.
convert(vroot, entry_variables)[source]

Apply the conversion to the graph rooted at vroot, starting from the entry variables.

Parameters:
  • vroot (Variable) – NNabla Variable
  • entry_variables (Variable) – Entry variable from which the conversion starts.
class nnabla.experimental.graph_converters.batch_normalization_folded.BatchNormalizationFoldedConverter(black_list=[], params=None, inner_prod_functions=None, name='bn-folded')[source]

A single Convolution -> BatchNormalization sequence is folded into one Convolution.

If there is a Convolution -> BatchNormalization sequence, fold the batch normalization parameters into the kernel and bias (if it exists) of the preceding convolution, then skip the batch normalization following the convolution.
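
Concretely, with running mean \(\mu\), running variance \(\sigma^2\), batch normalization parameters \(\gamma, \beta\), and the small constant \(\epsilon\), the folded kernel and bias follow the standard folding identity (stated here for reference; it is not part of the original documentation):

\[W' = \frac{\gamma}{\sqrt{\sigma^2 + \epsilon}} W, \qquad b' = \frac{\gamma (b - \mu)}{\sqrt{\sigma^2 + \epsilon}} + \beta.\]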

Parameters:
  • black_list (list) – List of function names to exclude from conversion.
  • params (OrderedDict) – Result of nn.get_parameters().
  • name (str) – Prefix of the parameter scope.
convert(vroot, entry_variables)[source]

Apply the conversion to the graph rooted at vroot, starting from the entry variables.

Parameters:
  • vroot (Variable) – NNabla Variable
  • entry_variables (Variable) – Entry variable from which the conversion starts.
class nnabla.experimental.graph_converters.fixed_point_weight.FixedPointWeightConverter(black_list=[], params=None, inner_prod_functions=None, call_forward=True, floor=False, args_fpq={'delta_b': 0.0002, 'delta_w': 0.0002, 'n_b': 8, 'n_w': 8, 'quantize_b': True, 'quantize_w': True, 'sign_b': True, 'sign_w': True}, name='fixed-point-weight-graph')[source]

All functions specified by inner_prod_functions are replaced with their fixed-point counterparts. The other functions are replaced with identical new functions.

Parameters:
  • black_list (list) – List of function names to exclude from conversion.
  • params (OrderedDict) – Result of nn.get_parameters().
  • inner_prod_functions (list of function names) – Function names to be replaced. Default is ["Affine", "Convolution", "Deconvolution"].
  • call_forward (bool) – Call the forward function to obtain the quantized weights W_q. Default is True, so one does not need to call the forward function explicitly to sync the quantized weights. Note that if the network contains batch normalization, or another normalization that computes running statistics (e.g., a running mean and variance), those statistics are unavoidably updated by this forward call. To avoid that, set the batch_stat argument of the batch normalization layer to False when using call_forward=True.
  • floor (bool) – When computing the step size, it is coerced to a power of two, using either \(2^{\lceil \log_2(\max(|W|) / (2^n - 1)) \rceil}\) or \(2^{\lfloor \log_2(\max(|W|) / (2^n - 1)) \rfloor}\). Default is False.
  • args_fpq (dict) – Arguments passed to F.quantize. Default is {"sign_w": True, "n_w": 8, "delta_w": 2e-4, "quantize_w": True, "sign_b": True, "n_b": 8, "delta_b": 2e-4, "quantize_b": True}.
  • name (str) – Prefix of the parameter scope.
convert(vroot, entry_variables)[source]

Apply the conversion to the graph rooted at vroot, starting from the entry variables.

Parameters:
  • vroot (Variable) – NNabla Variable
  • entry_variables (Variable) – Entry variable from which the conversion starts.
class nnabla.experimental.graph_converters.fixed_point_activation.FixedPointActivationConverter(black_list=[], params=None, activation_functions=None, args_fpq={'delta': 0.0002, 'n': 8, 'quantize': True, 'sign': False}, name='fixed-point-activation-graph')[source]

All functions specified by activation_functions are replaced with the fixed-point quantization function. The other functions are replaced with identical new functions.

Parameters:
  • black_list (list) – List of function names to exclude from conversion.
  • params (OrderedDict) – Result of nn.get_parameters().
  • activation_functions (list of function names) – Function names to be replaced. Default is ["ReLU"].
  • args_fpq (dict) – Arguments passed to F.quantize. Default is {"sign": False, "n": 8, "delta": 2e-4, "quantize": True}.
  • name (str) – Prefix of the parameter scope.
convert(vroot, entry_variables)[source]

Apply the conversion to the graph rooted at vroot, starting from the entry variables.

Parameters:
  • vroot (Variable) – NNabla Variable
  • entry_variables (Variable) – Entry variable from which the conversion starts.
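
A minimal sketch chaining the weight and activation converters (same assumptions as the IdentityConverter sketch above, including the list-valued entry_variables):

from nnabla.experimental.graph_converters.fixed_point_weight import FixedPointWeightConverter
from nnabla.experimental.graph_converters.fixed_point_activation import FixedPointActivationConverter

w_conv = FixedPointWeightConverter()
a_conv = FixedPointActivationConverter()
fpq_pred = a_conv.convert(w_conv.convert(pred, [image]), [image])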

Trainer

class nnabla.experimental.trainers.Trainer(updater=None, evaluator=None, model_save_path=None, max_epoch=1, iter_per_epoch=None, callback_on_start=<function <lambda>>, callback_on_finish=<function <lambda>>, update_callback_on_start=<function <lambda>>, update_callback_on_finish=<function <lambda>>)[source]

Trainer API

The Trainer class is the basic class for training a neural network. You can compose this class into your own trainer class and delegate to its train method.

Parameters:
  • updater (Updater or list of Updater) – Updater object.
  • evaluator (Evaluator or list of Evaluator) – Evaluator object.
  • model_save_path (str) – Model save path.
  • max_epoch (int) – Max epoch to train.
  • iter_per_epoch (int, optional) – Iterations per epoch.
  • callback_on_start (callable object, function, lambda, or list of these, optional) – Callback called before the trainer.train.
  • callback_on_finish (callable object, function, lambda, or list of these, optional) – Callback called after the trainer.train.
  • update_callback_on_start (callable object, function, lambda, or list of these, optional) – Callback called before the updater.update.
  • update_callback_on_finish (callable object, function, lambda, or list of these, optional) – Callback called after the updater.update.

The following example is a complete snippet showing how to use this base trainer.

Example

import nnabla as nn
import nnabla.functions as F
import nnabla.parametric_functions as PF
import nnabla.solvers as S

from nnabla.monitor import Monitor, MonitorSeries, MonitorTimeElapsed

import numpy as np

from nnabla.experimental.trainers import Trainer, Updater, Evaluator

# Batch, channel, height, width
b, c, h, w = 32, 1, 128, 128

# Train Input
tinput = nn.Variable([b, c, h, w])
tlabel = nn.Variable([b, c, h, w])

# Train Model and Loss
tpred = <training model>.apply(persistent=True)
tloss = F.mean(F.softmax_cross_entropy(tpred, tlabel))

# Test Input
vinput = nn.Variable([b, c, h, w])
vlabel = nn.Variable([b, c, h, w])

# Test Model and Error
vpred = <evaluation model>.apply(persistent=True)
vloss = F.mean(F.softmax_cross_entropy(vpred, vlabel))
verror = F.mean(F.top_n_error(vpred.get_unlinked_variable(), vlabel))

# Solver
solver = S.Adam()
solver.set_parameters(nn.get_parameters())

# DataIterator
tdata = <training_data_iterator>
vdata = <validation_data_iterator>

# Monitor
monitor = Monitor(<monitor_path>)
monitor_loss = MonitorSeries("Training loss", monitor, interval=10)
monitor_err = MonitorSeries("Training error", monitor, interval=10)
monitor_time = MonitorTimeElapsed("Training time", monitor, interval=100)
monitor_verr = MonitorSeries("Valid error", monitor, interval=10)

# Updater
def tdata_feeder():
    tinput.d, tlabel.d = tdata.next()
def update_callback_on_finish(i):
    monitor_loss.add(i, tloss.d)
    monitor_time.add(i)
updater = Updater(solver, tloss,
                  data_feeder=tdata_feeder,
                  update_callback_on_finish=update_callback_on_finish)

# Evaluator
def vdata_feeder():
    vinput.d, vlabel.d = vdata.next()
def eval_callback_on_finish(i, ve):
    monitor_verr.add(i, ve)
evaluator = Evaluator(verror,
                      data_feeder=vdata_feeder,
                      val_iter=vdata.size // b,
                      callback_on_finish=eval_callback_on_finish)

# Trainer
trainer = Trainer(updater, evaluator, <model_save_path>,
                  max_epoch=<max_epoch>, iter_per_epoch=tdata.size // b)
trainer.train()
class nnabla.experimental.trainers.NaiveClassificationTrainer(solver, tinput=None, tlabel=None, tpred=None, tdata=None, vinput=None, vlabel=None, vpred=None, vdata=None, monitor_path=None, model_save_path=None, max_epoch=1, iter_per_epoch=None, val_iter=None)[source]

Naive Classification Trainer

Parameters:
  • solver (Solver) – Solver object.
  • tinput (Variable) – Input variable for input feature in training.
  • tlabel (Variable) – Label variable for the label in training.
  • tpred (Variable) – Root variable for prediction in the training graph.
  • tdata (nnabla.utils.data_iterator.DataIterator) – DataIterator for training.
  • vinput (Variable) – Input variable for input feature in evaluation.
  • vlabel (Variable) – Label variable for label in evaluation.
  • vpred (Variable) – Root variable for prediction in the evaluation graph.
  • vdata (DataIterator) – DataIterator for evaluation.
  • monitor_path (str) – Monitor path.
  • model_save_path (str) – Model save path.
  • max_epoch (int) – Max epoch to train.
  • iter_per_epoch (int, optional) – Iterations per epoch. If not set, this value is determined by tdata.size // tdata.batch_size.
  • val_iter (int, optional) – Iterations for evaluation. If not set, this value is determined by vdata.size // vdata.batch_size.

The following example is a complete snippet showing how to use this trainer.

Example

import nnabla as nn
import nnabla.functions as F
import nnabla.parametric_functions as PF
import nnabla.solvers as S

import numpy as np

from nnabla.experimental.trainers import NaiveClassificationTrainer

# Batch, channel, height, width
b, c, h, w = 32, 1, 128, 128

# Train Input
tinput = nn.Variable([b, c, h, w])
tlabel = nn.Variable([b, c, h, w])

# Train Model and Loss
tpred = <training model>

# Test Input
vinput = nn.Variable([b, c, h, w])
vlabel = nn.Variable([b, c, h, w])

# Test Model
vpred = <evaluation model>

# Solver
solver = S.Adam()
solver.set_parameters(nn.get_parameters())

# DataIterator
tdata = <training_data_iterator>
vdata = <validation_data_iterator>

# Trainer
trainer = NaiveClassificationTrainer(solver,
                                     tinput, tlabel, tpred, tdata,
                                     vinput, vlabel, vpred, vdata,
                                     <monitor_path>,
                                     <model_save_path>,
                                     max_epoch=<max_epoch>)
trainer.train()
class nnabla.experimental.trainers.NaiveRegressionTrainer(solver, tinput=None, tlabel=None, tpred=None, tdata=None, vinput=None, vlabel=None, vpred=None, vdata=None, monitor_path=None, model_save_path=None, max_epoch=1, iter_per_epoch=None, val_iter=None)[source]

Naive Regression Trainer

Parameters:
  • solver (Solver) – Solver object.
  • tinput (Variable) – Input variable for input feature in training.
  • tlabel (Variable) – Label variable for the label in training.
  • tpred (Variable) – Root variable for prediction in the training graph.
  • tdata (nnabla.utils.data_iterator.DataIterator) – DataIterator for training.
  • vinput (Variable) – Input variable for input feature in evaluation.
  • vlabel (Variable) – Label variable for label in evaluation.
  • vpred (Variable) – Root variable for prediction in the evaluation graph.
  • vdata (DataIterator) – DataIterator for evaluation.
  • monitor_path (str) – Monitor path.
  • model_save_path (str) – Model save path.
  • max_epoch (int) – Max epoch to train.
  • iter_per_epoch (int, optional) – Iterations per epoch. If not set, this value is determined by tdata.size // tdata.batch_size.
  • val_iter (int, optional) – Iterations for evaluation. If not set, this value is determined by vdata.size // vdata.batch_size.

Example

import nnabla as nn
import nnabla.functions as F
import nnabla.parametric_functions as PF
import nnabla.solvers as S

import numpy as np

from nnabla.experimental.trainers import NaiveRegressionTrainer

# Batch, channel, height, width
b, c, h, w = 32, 1, 128, 128

# Train Input
tinput = nn.Variable([b, c, h, w])
tlabel = nn.Variable([b, c, h, w])

# Train Model and Loss
tpred = <training model>

# Test Input
vinput = nn.Variable([b, c, h, w])
vlabel = nn.Variable([b, c, h, w])

# Test Model
vpred = <evaluation model>

# Solver
solver = S.Adam()
solver.set_parameters(nn.get_parameters())

# DataIterator
tdata = <training_data_iterator>
vdata = <validation_data_iterator>

# Trainer
trainer = NaiveRegressionTrainer(solver,
                                 tinput, tlabel, tpred, tdata,
                                 vinput, vlabel, vpred, vdata,
                                 <monitor_path>,
                                 <model_save_path>,
                                 max_epoch=<max_epoch>)
trainer.train()
class nnabla.experimental.trainers.Updater(solver=None, loss=None, data_feeder=<function <lambda>>, forward_callback_on_start=<function <lambda>>, forward_callback_on_finish=<function <lambda>>, backward_callback_on_start=<function <lambda>>, backward_callback_on_finish=<function <lambda>>, comm_callback_on_start=<function <lambda>>, comm_callback_on_finish=<function <lambda>>, update_callback_on_start=<function <lambda>>, update_callback_on_finish=<function <lambda>>, clear_buffer=True, accum_grad=1, comm=None, grads=[])[source]
Parameters:
  • solver (nnabla.solvers.Solver) – Solver object. E.g., Momentum or Adam.
  • loss (nnabla.Variable) – Loss variable on which the forward and the backward are called.
  • data_feeder (callable object, function, or lambda) – Data feeder.
  • forward_callback_on_start (callable object, function, lambda, or list of these, optional) – Callback called before forward function.
  • forward_callback_on_finish (callable object, function, lambda, or list of these, optional) – Callback called after forward function.
  • backward_callback_on_start (callable object, function, lambda, or list of these, optional) – Callback called before backward function.
  • backward_callback_on_finish (callable object, function, lambda, or list of these, optional) – Callback called after backward function.
  • comm_callback_on_start (callable object, function, lambda, or list of these, optional) – Callback called before comm.all_reduce.
  • comm_callback_on_finish (callable object, function, lambda, or list of these, optional) – Callback called after comm.all_reduce.
  • update_callback_on_start (callable object, function, lambda, or list of these, optional) – Callback called before update function.
  • update_callback_on_finish (callable object, function, lambda, or list of these, optional) – Callback called after update function.
  • clear_buffer (bool, optional) – Clears variables that are no longer referenced during backpropagation to save memory.
  • accum_grad (int, optional) – Number of gradient accumulation steps. The solver's update method is called after the forward and the backward have been called accum_grad times. Default is 1.
  • comm (nnabla.communicators.Communicator, optional) – Communicator used for distributed training. Default is None.
  • grads (list of nnabla._nd_array.NdArray, optional) – List of gradients to be exchanged in distributed training. Default is the empty list.

Example

from nnabla.experimental.trainers import Updater

solver = <Solver>
loss = <Loss Variable of Network>

def tdata_feeder():
    ...
def update_callback_on_finish(i):
    ...
updater = Updater(solver, loss,
                  data_feeder=tdata_feeder,
                  update_callback_on_finish=update_callback_on_finish)

# Training iteration
for itr in range(<max_iter>):
    updater.update()
update(i)[source]

Monolithic update method.

This method calls the following methods in order:

  1. solver.zerograd
  2. feed data
  3. loss.forward
  4. loss.backward
  5. comm.all_reduce (if it is specified)
  6. solver.update
class nnabla.experimental.trainers.Evaluator(vroot=None, data_feeder=None, val_iter=None, callback_on_start=<function <lambda>>, callback_on_finish=<function <lambda>>, clear_buffer=True, comm=None)[source]
Parameters:
  • vroot (Variable) – Root variable of the evaluation graph.
  • data_feeder (callable object, function, or lambda) – Data feeder.
  • val_iter (int, optional) – Iterations for evaluation.
  • callback_on_start (callable object, function, lambda, or list of these, optional) – Callback called before evaluator.evaluate.
  • callback_on_finish (callable object, function, lambda, or list of these, optional) – Callback called after evaluator.evaluate.
  • clear_buffer (bool, optional) – Clears variables that are no longer referenced during the forward propagation to save memory.
  • comm (nnabla.communicators.Communicator, optional) – Communicator when to do distributed training. Default is None.

Example

from nnabla.experimental.trainers import Evaluator

verror = <Error Variable of Network>

# Evaluator
def vdata_feeder():
    ...
def eval_callback_on_finish(i, ve):
    ...
evaluator = Evaluator(verror,
                      data_feeder=vdata_feeder,
                      val_iter=<val_iter>,
                      callback_on_finish=eval_callback_on_finish)

Parametric Function Classes

class nnabla.experimental.parametric_function_class.affine.Affine(n_inmaps, n_outmaps, base_axis=1, w_init=None, b_init=None, fix_parameters=False, rng=None, with_bias=True)[source]

The affine layer, also known as the fully connected layer. Computes

\[{\mathbf y} = {\mathbf A} {\mathbf x} + {\mathbf b}.\]

where \({\mathbf x}, {\mathbf y}\) are the input and output respectively, and \({\mathbf A}, {\mathbf b}\) are constants (the weight and the bias).

Parameters:
  • n_inmaps (int) – Number of input features.
  • n_outmaps (int or tuple of int) – Number of output neurons per data.
  • base_axis (int) – Dimensions up to base_axis are treated as the sample dimensions.
  • w_init (nnabla.initializer.BaseInitializer or numpy.ndarray) – Initializer for weight.
  • b_init (nnabla.initializer.BaseInitializer or numpy.ndarray) – Initializer for bias.
  • fix_parameters (bool) – When set to True, the weights and biases will not be updated.
  • rng (numpy.random.RandomState) – Random generator for Initializer.
  • with_bias (bool) – Specify whether to include the bias term.
Returns:

\((B + 1)\)-D array. (\(M_0 \times \ldots \times M_{B-1} \times L\))

Return type:

Variable

nnabla.experimental.parametric_function_class.affine.Linear

alias of nnabla.experimental.parametric_function_class.affine.Affine
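
A minimal usage sketch of the class form (it assumes instances are callable on a Variable, which is the design of these parametric function classes):

import nnabla as nn
from nnabla.experimental.parametric_function_class.affine import Affine

x = nn.Variable([32, 128])   # (batch, input features)
fc = Affine(128, 10)         # n_inmaps=128, n_outmaps=10
y = fc(x)                    # y.shape == (32, 10)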

class nnabla.experimental.parametric_function_class.convolution.Convolution(inmaps, outmaps, kernel, pad=None, stride=None, dilation=None, group=1, w_init=None, b_init=None, base_axis=1, fix_parameters=False, rng=None, with_bias=True)[source]

N-D Convolution with a bias term.

Note

Convolution is a computationally intensive operation that should preferably be run with the cudnn backend. NNabla then uses CuDNN library functions to determine and cache the fastest algorithm for the given set of convolution parameters, which results in additional memory consumption which may pose a problem for GPUs with insufficient memory size. In that case, the NNABLA_CUDNN_WORKSPACE_LIMIT environment variable can be used to restrict the choice of algorithms to those that fit the given workspace memory limit, expressed in bytes. In some cases it may also be desired to restrict the automatic search to algorithms that produce deterministic (reproducible) results. This can be requested by setting the environment variable NNABLA_CUDNN_DETERMINISTIC to a non-zero value.

Parameters:
  • inmaps (int) – Number of input channels (i.e., the size of the input along the channel axis).
  • outmaps (int) – Number of convolution kernels (which is equal to the number of output channels). For example, to apply convolution on an input with 16 types of filters, specify 16.
  • kernel (tuple of int) – Convolution kernel size. For example, to apply convolution on an image with a 3 (height) by 5 (width) two-dimensional kernel, specify (3,5).
  • pad (tuple of int) – Padding sizes for dimensions.
  • stride (tuple of int) – Stride sizes for dimensions.
  • dilation (tuple of int) – Dilation sizes for dimensions.
  • group (int) – Number of groups of channels. This makes connections across channels more sparse by grouping connections along map direction.
  • w_init (nnabla.initializer.BaseInitializer or numpy.ndarray) – Initializer for weight. By default, it is initialized with nnabla.initializer.UniformInitializer within the range determined by nnabla.initializer.calc_uniform_lim_glorot.
  • b_init (nnabla.initializer.BaseInitializer or numpy.ndarray) – Initializer for bias. By default, it is initialized with zeros if with_bias is True.
  • base_axis (int) – Dimensions up to base_axis are treated as the sample dimensions.
  • fix_parameters (bool) – When set to True, the weights and biases will not be updated.
  • rng (numpy.random.RandomState) – Random generator for Initializer.
  • with_bias (bool) – Specify whether to include the bias term.
Returns:

N-D array. See convolution for the output shape.

Return type:

Variable

nnabla.experimental.parametric_function_class.convolution.Conv1d

alias of nnabla.experimental.parametric_function_class.convolution.Convolution

nnabla.experimental.parametric_function_class.convolution.Conv2d

alias of nnabla.experimental.parametric_function_class.convolution.Convolution

nnabla.experimental.parametric_function_class.convolution.Conv3d

alias of nnabla.experimental.parametric_function_class.convolution.Convolution

nnabla.experimental.parametric_function_class.convolution.ConvNd

alias of nnabla.experimental.parametric_function_class.convolution.Convolution
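
A minimal usage sketch (same callable-instance assumption as the Affine sketch above):

import nnabla as nn
from nnabla.experimental.parametric_function_class.convolution import Conv2d

x = nn.Variable([4, 3, 32, 32])                  # (batch, channels, height, width)
conv = Conv2d(3, 16, kernel=(3, 3), pad=(1, 1))  # inmaps=3, outmaps=16
h = conv(x)                                      # h.shape == (4, 16, 32, 32)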

class nnabla.experimental.parametric_function_class.deconvolution.Deconvolution(inmaps, outmaps, kernel, pad=None, stride=None, dilation=None, group=1, w_init=None, b_init=None, base_axis=1, fix_parameters=False, rng=None, with_bias=True)[source]

Deconvolution layer.

Parameters:
  • inmaps (int) – Number of input channels (i.e., the size of the input along the channel axis).
  • outmaps (int) – Number of deconvolution kernels (which is equal to the number of output channels). For example, to apply deconvolution on an input with 16 types of filters, specify 16.
  • kernel (tuple of int) – Convolution kernel size. For example, to apply deconvolution on an image with a 3 (height) by 5 (width) two-dimensional kernel, specify (3,5).
  • pad (tuple of int) – Padding sizes for dimensions.
  • stride (tuple of int) – Stride sizes for dimensions.
  • dilation (tuple of int) – Dilation sizes for dimensions.
  • group (int) – Number of groups of channels. This makes connections across channels sparser by grouping connections along map direction.
  • w_init (nnabla.initializer.BaseInitializer or numpy.ndarray) – Initializer for weight. By default, it is initialized with nnabla.initializer.UniformInitializer within the range determined by nnabla.initializer.calc_uniform_lim_glorot.
  • b_init (nnabla.initializer.BaseInitializer or numpy.ndarray) – Initializer for bias. By default, it is initialized with zeros if with_bias is True.
  • base_axis (int) – Dimensions up to base_axis are treated as the sample dimensions.
  • fix_parameters (bool) – When set to True, the weights and biases will not be updated.
  • rng (numpy.random.RandomState) – Random generator for Initializer.
  • with_bias (bool) – Specify whether to include the bias term.
Returns:

N-D array. See deconvolution for the output shape.

Return type:

Variable

nnabla.experimental.parametric_function_class.deconvolution.Deconv1d

alias of nnabla.experimental.parametric_function_class.deconvolution.Deconvolution

nnabla.experimental.parametric_function_class.deconvolution.Deconv2d

alias of nnabla.experimental.parametric_function_class.deconvolution.Deconvolution

nnabla.experimental.parametric_function_class.deconvolution.Deconv3d

alias of nnabla.experimental.parametric_function_class.deconvolution.Deconvolution

nnabla.experimental.parametric_function_class.deconvolution.DeconvNd

alias of nnabla.experimental.parametric_function_class.deconvolution.Deconvolution

class nnabla.experimental.parametric_function_class.batch_normalization.BatchNormalization(n_features, n_dims, axes=[1], decay_rate=0.9, eps=1e-05, batch_stat=True, output_stat=False, fix_parameters=False, param_init=None)[source]

Batch normalization layer.

\[\begin{split}\begin{array}{lcl} \mu &=& \frac{1}{M} \sum x_i\\ \sigma^2 &=& \frac{1}{M} \left(\sum x_i - \mu\right)^2\\ \hat{x}_i &=& \frac{x_i - \mu}{\sqrt{\sigma^2 + \epsilon }}\\ y_i &= & \hat{x}_i \gamma + \beta. \end{array}\end{split}\]

where \(x_i, y_i\) are the input and output, respectively. At test time, the mean and variance computed by the moving average during training are used.

Parameters:
  • n_features (int) – Number of features (channels) to normalize.
  • n_dims (int) – Number of dimensions of the input array, including the batch dimension.
  • axes (tuple of int) – Mean and variance for each element in axes are calculated using elements on the rest axes. For example, if an input is 4 dimensions, and axes is [1], batch mean is calculated as np.mean(inp.d, axis=(0, 2, 3), keepdims=True) (using numpy expression as an example).
  • decay_rate (float) – Decay rate of running mean and variance.
  • eps (float) – Tiny value to avoid zero division by std.
  • batch_stat (bool) – Use mini-batch statistics rather than running ones.
  • output_stat (bool) – Output batch mean and variance.
  • fix_parameters (bool) – When set to True, the beta and gamma will not be updated.
  • param_init (dict) – Parameter initializers can be set with a dict. A key of the dict must be 'beta', 'gamma', 'mean' or 'var'. A value of the dict must be an Initializer or a numpy.ndarray. E.g. {'beta': ConstantIntializer(0), 'gamma': np.ones(gamma_shape) * 2}.
Returns:

N-D array.

Return type:

Variable

Note

The parameters have the same number of dimensions as the input data; their sizes along axes match the input sizes, while the rest are 1. If an input is 4-dim and axes=[1], the parameter shape will be param_shape = np.mean(inp.d, axis=(0, 2, 3), keepdims=True).shape (using numpy expression as an example).

class nnabla.experimental.parametric_function_class.batch_normalization.BatchNorm1d(n_features, axes=[1], decay_rate=0.9, eps=1e-05, batch_stat=True, output_stat=False, fix_parameters=False, param_init=None)[source]

Batch normalization layer for 3d-Array or 3d-Variable. This is typically used together with Conv1d.

\[\begin{split}\begin{array}{lcl} \mu &=& \frac{1}{M} \sum x_i\\ \sigma^2 &=& \frac{1}{M} \left(\sum x_i - \mu\right)^2\\ \hat{x}_i &=& \frac{x_i - \mu}{\sqrt{\sigma^2 + \epsilon }}\\ y_i &= & \hat{x}_i \gamma + \beta. \end{array}\end{split}\]

where \(x_i, y_i\) are the input and output, respectively. At test time, the mean and variance computed by the moving average during training are used.

Parameters:
  • n_features (int) – Number of features (channels) to normalize.
  • axes (tuple of int) – Mean and variance for each element in axes are calculated using elements on the rest axes. For example, if an input is 4 dimensions, and axes is [1], batch mean is calculated as np.mean(inp.d, axis=(0, 2, 3), keepdims=True) (using numpy expression as an example).
  • decay_rate (float) – Decay rate of running mean and variance.
  • eps (float) – Tiny value to avoid zero division by std.
  • batch_stat (bool) – Use mini-batch statistics rather than running ones.
  • output_stat (bool) – Output batch mean and variance.
  • fix_parameters (bool) – When set to True, the beta and gamma will not be updated.
  • param_init (dict) – Parameter initializers can be set with a dict. A key of the dict must be 'beta', 'gamma', 'mean' or 'var'. A value of the dict must be an Initializer or a numpy.ndarray. E.g. {'beta': ConstantIntializer(0), 'gamma': np.ones(gamma_shape) * 2}.
Returns:

N-D array.

Return type:

Variable

Note

The parameters have the same number of dimensions as the input data; their sizes along axes match the input sizes, while the rest are 1. If an input is 4-dim and axes=[1], the parameter shape will be param_shape = np.mean(inp.d, axis=(0, 2, 3), keepdims=True).shape (using numpy expression as an example).

class nnabla.experimental.parametric_function_class.batch_normalization.BatchNorm2d(n_features, axes=[1], decay_rate=0.9, eps=1e-05, batch_stat=True, output_stat=False, fix_parameters=False, param_init=None)[source]

Batch normalization layer for 4d-Array or 4d-Variable. This is typically used together with Conv2d.

\[\begin{split}\begin{array}{lcl} \mu &=& \frac{1}{M} \sum x_i\\ \sigma^2 &=& \frac{1}{M} \left(\sum x_i - \mu\right)^2\\ \hat{x}_i &=& \frac{x_i - \mu}{\sqrt{\sigma^2 + \epsilon }}\\ y_i &= & \hat{x}_i \gamma + \beta. \end{array}\end{split}\]

where \(x_i, y_i\) are the input and output, respectively. At test time, the mean and variance computed by the moving average during training are used.

Parameters:
  • n_features (int) – Number of features (channels) to normalize.
  • axes (tuple of int) – Mean and variance for each element in axes are calculated using elements on the rest axes. For example, if an input is 4 dimensions, and axes is [1], batch mean is calculated as np.mean(inp.d, axis=(0, 2, 3), keepdims=True) (using numpy expression as an example).
  • decay_rate (float) – Decay rate of running mean and variance.
  • eps (float) – Tiny value to avoid zero division by std.
  • batch_stat (bool) – Use mini-batch statistics rather than running ones.
  • output_stat (bool) – Output batch mean and variance.
  • fix_parameters (bool) – When set to True, the beta and gamma will not be updated.
  • param_init (dict) – Parameter initializers can be set with a dict. A key of the dict must be 'beta', 'gamma', 'mean' or 'var'. A value of the dict must be an Initializer or a numpy.ndarray. E.g. {'beta': ConstantIntializer(0), 'gamma': np.ones(gamma_shape) * 2}.
Returns:

N-D array.

Return type:

Variable

Note

The parameters have the same number of dimensions as the input data; their sizes along axes match the input sizes, while the rest are 1. If an input is 4-dim and axes=[1], the parameter shape will be param_shape = np.mean(inp.d, axis=(0, 2, 3), keepdims=True).shape (using numpy expression as an example).

class nnabla.experimental.parametric_function_class.batch_normalization.BatchNorm3d(n_features, axes=[1], decay_rate=0.9, eps=1e-05, batch_stat=True, output_stat=False, fix_parameters=False, param_init=None)[source]

Batch normalization layer for 5d-Array or 5d-Variable. This is typically used together with Conv3d.

\[\begin{split}\begin{array}{lcl} \mu &=& \frac{1}{M} \sum x_i\\ \sigma^2 &=& \frac{1}{M} \left(\sum x_i - \mu\right)^2\\ \hat{x}_i &=& \frac{x_i - \mu}{\sqrt{\sigma^2 + \epsilon }}\\ y_i &= & \hat{x}_i \gamma + \beta. \end{array}\end{split}\]

where \(x_i, y_i\) are the input and output, respectively. At test time, the mean and variance computed by the moving average during training are used.

Parameters:
  • n_features (int) – Number of features (channels) to normalize.
  • axes (tuple of int) – Mean and variance for each element in axes are calculated using elements on the rest axes. For example, if an input is 4 dimensions, and axes is [1], batch mean is calculated as np.mean(inp.d, axis=(0, 2, 3), keepdims=True) (using numpy expression as an example).
  • decay_rate (float) – Decay rate of running mean and variance.
  • eps (float) – Tiny value to avoid zero division by std.
  • batch_stat (bool) – Use mini-batch statistics rather than running ones.
  • output_stat (bool) – Output batch mean and variance.
  • fix_parameters (bool) – When set to True, the beta and gamma will not be updated.
  • param_init (dict) – Parameter initializers can be set with a dict. A key of the dict must be 'beta', 'gamma', 'mean' or 'var'. A value of the dict must be an Initializer or a numpy.ndarray. E.g. {'beta': ConstantIntializer(0), 'gamma': np.ones(gamma_shape) * 2}.
Returns:

N-D array.

Return type:

Variable

Note

The parameters have the same number of dimensions as the input data; their sizes along axes match the input sizes, while the rest are 1. If an input is 4-dim and axes=[1], the parameter shape will be param_shape = np.mean(inp.d, axis=(0, 2, 3), keepdims=True).shape (using numpy expression as an example).
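
A minimal usage sketch combining Conv2d and BatchNorm2d (same callable-instance assumption as the sketches above):

import nnabla as nn
import nnabla.functions as F
from nnabla.experimental.parametric_function_class.convolution import Conv2d
from nnabla.experimental.parametric_function_class.batch_normalization import BatchNorm2d

x = nn.Variable([4, 3, 32, 32])
conv = Conv2d(3, 16, kernel=(3, 3), pad=(1, 1))
bn = BatchNorm2d(16)                    # n_features matches the conv output maps
h = F.relu(bn(conv(x)))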

class nnabla.experimental.parametric_function_class.embed.Embed(n_inputs, n_features, w_init=None, fix_parameters=False)[source]

Embed.

Embed slices a matrix/tensor with an indexing array/tensor. Weights are initialized with nnabla.initializer.UniformInitializer within the range of \(-\sqrt{3}\) and \(\sqrt{3}\).

Parameters:
  • x (Variable) – [Integer] Indices with shape \((I_0, ..., I_N)\)
  • n_inputs – Number of possible inputs, i.e., the number of words or the vocabulary size.
  • n_features – Number of embedding features.
  • fix_parameters (bool) – When set to True, the embedding weight matrix will not be updated.
Returns:

Output with shape \((I_0, ..., I_N, W_1, ..., W_M)\)

Return type:

Variable
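
A minimal usage sketch (the index Variable holds integer ids; same callable-instance assumption as above):

import nnabla as nn
from nnabla.experimental.parametric_function_class.embed import Embed

ids = nn.Variable([32, 20])   # (batch, sequence length), integer indices
emb = Embed(10000, 128)       # n_inputs=10000 (vocabulary), n_features=128
h = emb(ids)                  # h.shape == (32, 20, 128)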