Debug Utils

Graph Profiler

class nnabla.utils.profiler.GraphProfiler(graph, device_id, ext_name, solver=None, n_run=100, max_measure_execution_time=1, time_scale='m', backward_accum=False)

Class for measuring the calculation time of each function that composes an nnabla computation graph.

Use it to check the performance of your nnabla network. It can measure the calculation times of:

  • function-wise forward

  • function-wise backward

  • whole graph forward

  • whole graph backward

  • training (forward + backward + update) (if solver is not None)


import nnabla as nn
import nnabla.functions as F
import nnabla.solvers as S
from nnabla.ext_utils import get_extension_context
from nnabla.utils.profiler import GraphProfiler

# Set up nnabla context
device = "cpu"  # you can also use GPU ("cudnn")
ctx = get_extension_context(device)
nn.set_default_context(ctx)

# Network building
x = nn.Variable(shape=...)
t = nn.Variable(shape=...)
y = CNN(x) # you can build not only CNN but any networks
loss = F.mean(F.softmax_cross_entropy(y, t)) # any loss functions or variables can be used

# solver setting
solver = S.Sgd()
solver.set_parameters(nn.get_parameters())  # register the network parameters with the solver

# SOME CODE (data loading or so on)

B = GraphProfiler(loss, solver=solver, device_id=0, ext_name=device, n_run=1000)
B.run()

  • graph (nnabla.Variable) – Instance of the nnabla.Variable class. GraphProfiler finds all functions that compose the network graph from the root nnabla.Variable to this nnabla.Variable.

  • device_id (str) – GPU device ID.

  • ext_name (str) – Extension name, e.g. ‘cpu’, ‘cuda’, ‘cudnn’.

  • solver (nnabla.solvers.Solver) – Instance of nnabla.solvers.Solver for optimizing the parameters of the computation graph. If None, the training process is not measured. Default value is None.

  • n_run (int) – Number of times the execution time of each function is measured. Default value is 100.

  • max_measure_execution_time (float) – Maximum time spent measuring each function. This argument has higher priority than n_run: when the measurement time for a function exceeds this value, the profiler stops measuring and moves on to the next function, even if fewer than n_run measurements have been taken. Default value is 1 [sec].

  • time_scale (str) – Time scale to display. One of [‘m’, ‘u’, ‘n’] (which stand for ‘milli’, ‘micro’ and ‘nano’).

  • backward_accum (bool) – Accumulation flag passed to each backward function. All accumulation flags are filled with the value of backward_accum. This flag is only valid for the time measurement of each function. For whole-graph computation, the NNabla graph engine sets the appropriate accumulation flags for the functions. Pay attention to the inplace flags in your graph, because accumulation and inplace flags cannot be set at the same time. If even one inplace flag is true in your graph, backward_accum must be false. Default value is False.
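
The interplay between n_run and max_measure_execution_time can be pictured with a small stand-alone sketch. It is not GraphProfiler's actual code: the helper name measure_mean_time and the use of time.perf_counter are assumptions made for illustration, and the sketch assumes that measurement ends at whichever limit is reached first.

```python
import time


def measure_mean_time(func, n_run=100, max_measure_execution_time=1.0):
    """Hypothetical helper: mean wall-clock time of func() in seconds.

    Stops after n_run calls, or earlier once the total measuring time
    exceeds max_measure_execution_time (the time limit takes priority).
    Returns (mean_time, number_of_measurements).
    """
    total = 0.0
    count = 0
    loop_start = time.perf_counter()
    for _ in range(n_run):
        start = time.perf_counter()
        func()
        stop = time.perf_counter()
        total += stop - start
        count += 1
        if stop - loop_start > max_measure_execution_time:
            break  # time budget exhausted before reaching n_run
    return total / count, count


# A slow function hits the time limit long before n_run is reached.
mean_slow, runs_slow = measure_mean_time(
    lambda: time.sleep(0.02), n_run=1000, max_measure_execution_time=0.1)

# A fast function completes all n_run measurements.
mean_fast, runs_fast = measure_mean_time(
    lambda: None, n_run=50, max_measure_execution_time=10.0)

print(runs_slow, runs_fast)
```

With a large n_run, max_measure_execution_time is what bounds the total profiling time for slow functions, so raising n_run alone does not necessarily yield more samples.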


run()

Execute profiling.

This executes the following five types of measurement:

  • function-wise forward

  • function-wise backward

  • whole graph forward

  • whole graph backward

  • training (forward + backward + update) (if solver is not None)

class nnabla.utils.profiler.GraphProfilerCsvWriter(gb, file=sys.stdout)

CSV writer for the GraphProfiler class.


from nnabla.utils.profiler import GraphProfiler, GraphProfilerCsvWriter

# Network building comes above

B = GraphProfiler(variable, solver=solver, device_id=0, ext_name=device, n_run=1000)
B.run()

with open("./profile.csv", "w") as f:
    writer = GraphProfilerCsvWriter(B, file=f)
    writer.write()

  • gb (GraphProfiler) – Instance of the GraphProfiler class, which is the main executor of profiling.

  • file (Python file object) – Output file object. Profile results will be written to the file which is specified by this argument.


write()

Write the result to the file. The output file is specified by file.

Time Profiler

class nnabla.utils.inspection.profile.TimeProfiler(ext_name, device_id)

A utility API to create function_hook callbacks to profile the execution time of each function. By passing ext_name and device_id, you can choose which device's time you want to profile. If ext_name is “cuda” or “cudnn”, cudaEvent is used to measure the execution time. For more information about cudaEvent, see the CUDA documentation. If ext_name is “cpu”, wall-clock time on the host is used.


import nnabla as nn
from nnabla.ext_utils import get_extension_context

ext_name = "cpu"
device_id = "0"

ctx = get_extension_context(ext_name, device_id=device_id)
nn.set_default_context(ctx)

y = model(...)

from nnabla.utils.inspection import TimeProfiler
tp = TimeProfiler(ext_name=ext_name, device_id=device_id)

for i in range(max_iter):
    # All results of executions under "forward" scope are registered as "forward" execution.
    with tp.scope("forward"):
        y.forward(function_pre_hook=tp.pre_hook, function_post_hook=tp.post_hook)

    # All results of executions under "backward" scope are registered as "backward" execution.
    with tp.scope("backward"):
        y.backward(function_pre_hook=tp.pre_hook, function_post_hook=tp.post_hook)

    # All results are evaluated by passing scopes to .calc_elapsed_time().
    # Be sure to call calc_elapsed_time at each iteration, otherwise nothing is measured.
    tp.calc_elapsed_time(["forward", "backward", "summary"])

# To output results on stdout, call instance as a function.
tp()

# To write out as csv file, call .to_csv().
tp.to_csv()

calc_elapsed_time(names)

Evaluate all elapsed times. Note that elapsed time is not recorded until calc_elapsed_time is called.


names (str or list of str) – Scope name(s) to evaluate elapsed time.

property post_hook

Get a callback for function_post_hook. This function can be used like the example below:

tp = TimeProfiler(..)
with tp.scope("forward"):
    y.forward(function_post_hook=tp.post_hook)

with tp.scope("backward"):
    y.backward(function_post_hook=tp.post_hook)

property pre_hook

Get a callback for function_pre_hook. This function can be used like the example below:

tp = TimeProfiler(..)
with tp.scope("forward"):
    y.forward(function_pre_hook=tp.pre_hook)

with tp.scope("backward"):
    y.backward(function_pre_hook=tp.pre_hook)

scope(scope_name)

Change the scope used to aggregate results. This function is used as a context manager (with the with statement), and all results under the context are labeled with scope_name. In addition to the execution time of each function, the elapsed time between entering and exiting each context is also recorded, and these times are aggregated under the “summary” scope.


scope_name (str) – Scope name.
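
The scope and hook pattern described above can be illustrated with a pure-Python stand-in. This is only a sketch of the idea, not TimeProfiler's implementation: MiniTimeProfiler and run_functions are hypothetical names, the hooks here are plain methods that receive a function name (in nnabla, pre_hook and post_hook are properties whose callbacks receive a function object), and time.sleep stands in for real computation.

```python
import time
from collections import defaultdict
from contextlib import contextmanager


class MiniTimeProfiler:
    """Hypothetical stand-in that mimics the scope/hook pattern."""

    def __init__(self):
        self._scope = None
        self._starts = {}
        # results[scope_name][function_name] -> list of elapsed seconds
        self.results = defaultdict(lambda: defaultdict(list))

    @contextmanager
    def scope(self, scope_name):
        prev, self._scope = self._scope, scope_name
        start = time.perf_counter()
        try:
            yield self
        finally:
            # Time between entering and exiting goes to the "summary" scope.
            elapsed = time.perf_counter() - start
            self.results["summary"][scope_name].append(elapsed)
            self._scope = prev

    def pre_hook(self, name):
        # Called just before a function runs: remember its start time.
        self._starts[name] = time.perf_counter()

    def post_hook(self, name):
        # Called just after a function runs: file the time under the current scope.
        elapsed = time.perf_counter() - self._starts.pop(name)
        self.results[self._scope][name].append(elapsed)


def run_functions(names, pre_hook, post_hook):
    """Toy replacement for forward()/backward(): calls hooks around each step."""
    for name in names:
        pre_hook(name)
        time.sleep(0.001)  # pretend to compute
        post_hook(name)


tp = MiniTimeProfiler()
with tp.scope("forward"):
    run_functions(["Affine", "ReLU"], tp.pre_hook, tp.post_hook)
with tp.scope("backward"):
    run_functions(["ReLU", "Affine"], tp.pre_hook, tp.post_hook)

for scope_name, funcs in tp.results.items():
    for func_name, times in funcs.items():
        print(f"{scope_name:8s} {func_name:8s} {sum(times) * 1e3:.3f} ms")
```

In this sketch the “summary” entry for a scope is always at least as large as the sum of the per-function times recorded inside it, because it also covers the overhead between function calls.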

to_csv(out_dir='./', ignore_init=True)

Writes the results out to a CSV file. The output directory can be specified by out_dir. By default, the elapsed times of the first iteration are omitted. To include the first iteration as well, pass False to ignore_init.

  • out_dir (str) – Output directory.

  • ignore_init (bool) – Ignore the result of the first iteration or not.

Nan/Inf Tracer

class nnabla.utils.inspection.value_trace.NanInfTracer(trace_nan=True, trace_inf=True, need_details=True)

A utility API to create function_hook callbacks to check whether the outputs of any layer contain NaN or inf values. When passed as function_hook during forward and backward execution, this API raises ValueError if at least one layer output contains NaN or inf. Otherwise, all tensors are passed to the next layer or function as is.


pred = model(...)

from nnabla.utils.inspection import NanInfTracer
nit = NanInfTracer(trace_inf=True, trace_nan=True, need_details=True)

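with nit.trace():
    pred.forward(function_post_hook=nit.forward_post_hook)
    pred.backward(function_post_hook=nit.backward_post_hook)

property backward_post_hook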

Create callback function object which can be used as a function_post_hook argument of backward().


check()

Checks for NaN/inf in all outputs of all layers and raises ValueError only if any exist.

property forward_post_hook

Create callback function object which can be used as a function_post_hook argument of forward().


trace()

Create a context manager to check for NaN/inf by using the with statement. With this context manager, the NaN/inf check is performed automatically just before exiting the with scope. If you do not use this context manager, be sure to call .check() explicitly to check for NaN/inf.


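nit = NanInfTracer()
with nit.trace():
    pred.forward(function_post_hook=nit.forward_post_hook)
    pred.backward(function_post_hook=nit.backward_post_hook)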
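
The relationship between the post hooks, check() and trace() can be sketched in pure Python. This is an illustration only, not NanInfTracer's implementation: MiniNanInfTracer is a hypothetical class, its hook takes a function name and a plain list of floats instead of an nnabla function object, and it assumes that the hooks merely record offenders while check() is what raises.

```python
import math
from contextlib import contextmanager


class MiniNanInfTracer:
    """Hypothetical stand-in that mimics the hook/check/trace pattern."""

    def __init__(self, trace_nan=True, trace_inf=True):
        self.trace_nan = trace_nan
        self.trace_inf = trace_inf
        self._bad = []  # names of functions whose outputs were flagged

    def post_hook(self, name, outputs):
        """Record the function name if any output value is NaN or inf."""
        has_nan = self.trace_nan and any(math.isnan(v) for v in outputs)
        has_inf = self.trace_inf and any(math.isinf(v) for v in outputs)
        if has_nan or has_inf:
            self._bad.append(name)

    def check(self):
        """Raise ValueError if anything was flagged since the last check."""
        bad, self._bad = self._bad, []
        if bad:
            raise ValueError("NaN or inf found in outputs of: " + ", ".join(bad))

    @contextmanager
    def trace(self):
        yield self
        self.check()  # runs just before leaving the with block


nit = MiniNanInfTracer()

# Clean outputs: leaving the with block raises nothing.
with nit.trace():
    nit.post_hook("Affine", [0.5, -1.0])

# An inf output: the error surfaces when the with block exits.
try:
    with nit.trace():
        nit.post_hook("Log", [float("-inf"), 0.0])
        nit.post_hook("ReLU", [0.0, 1.0])
    caught = None
except ValueError as err:
    caught = str(err)

print(caught)
```

Without the context manager, the same result is obtained by calling check() explicitly after the hooks have run.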

Pretty Printer

class nnabla.utils.inspection.pretty_print.PrettyPrinter(summary=False, hidden=False)

Pretty printer to print the graph structure used with the visit method of a Variable.

functions

List of functions, each element of which is a dictionary. The (key, value) pairs are (name, function name), (inputs, list of input variables), and (outputs, list of output variables) of a function.

Type: list of dict

nnabla.utils.inspection.pretty_print.pprint(v, forward=False, backward=False, summary=False, hidden=False, printer=False)

Pretty print information of a graph from a root variable v.

Note that, in order to print the summary statistics, this function keeps the intermediate buffers of the computation graph instead of reusing them, which increases memory usage if either forward or backward is True.

  • v (nnabla.Variable) – Root variable.

  • forward (bool) – Call the forward method of a variable v.

  • backward (bool) – Call the backward method of a variable v.

  • summary (bool) – Print statistics of the intermediate variables.

  • hidden (bool) – Store the intermediate input and output variables if True.

  • printer (bool) – Return the printer object if True.


pred = Model(...)

from nnabla.utils.inspection import pprint

pprint(pred, summary=True, forward=True, backward=True)