class nnabla.utils.profiler.GraphProfiler(graph, device_id, ext_name, solver=None, n_run=100, max_measure_execution_time=1, time_scale='m')[source]

Class for measuring the calculation time of each function that composes an nnabla computation graph.

Use it to check the performance of your nnabla network. It can measure the calculation time of:

  • function-wise forward
  • function-wise backward
  • whole graph forward
  • whole graph backward
  • training (forward + backward + update) (if solver is not None)


Example:

import nnabla as nn
import nnabla.functions as F
import nnabla.solvers as S
from nnabla.ext_utils import get_extension_context
from nnabla.utils.profiler import GraphProfiler

# Set up nnabla context
device = "cpu"  # you can also use GPU ("cudnn")
ctx = get_extension_context(device)
nn.set_default_context(ctx)

# Network building
x = nn.Variable(shape=...)
t = nn.Variable(shape=...)
y = CNN(x)  # CNN is a user-defined network function; any network can be profiled
loss = F.mean(F.softmax_cross_entropy(y, t))  # any loss function or variable can be used

# Solver setting
solver = S.Sgd()
solver.set_parameters(nn.get_parameters())

# SOME CODE (data loading and so on)

B = GraphProfiler(loss, solver=solver, device_id=0, ext_name=device, n_run=1000)

# Execute the measurements
B.run()

Parameters:

  • graph (nnabla.Variable) – Instance of the nnabla.Variable class. GraphProfiler finds all functions that compose the network graph from the root nnabla.Variable to this nnabla.Variable.
  • device_id (str) – GPU device ID.
  • ext_name (str) – Extension name, e.g. ‘cpu’, ‘cuda’, ‘cudnn’.
  • solver (nnabla.solvers.Solver) – Instance of nnabla.solvers.Solver used to optimize the parameters of the computation graph. If None, the training process is not measured. Default value is None.
  • n_run (int) – Number of times the execution time of each function is measured. Default value is 100.
  • max_measure_execution_time (float) – Maximum time spent measuring each function. This argument has higher priority than n_run: when the measurement time for a function exceeds this value, the profiler stops measuring it and moves on to the next function, even if fewer than n_run measurements have been taken. Default value is 1 [sec].
  • time_scale (str) – Time scale used to display results. One of ‘m’, ‘u’, or ‘n’ (which stand for ‘milli’, ‘micro’, and ‘nano’).
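
The interplay between n_run, max_measure_execution_time, and time_scale can be pictured as a timing loop with a time budget. The following is a minimal sketch under that reading of the parameter descriptions, not GraphProfiler's actual implementation; the helper name measure_mean_time and the TIME_SCALES table are hypothetical and exist only for this illustration.

```python
import time

# Hypothetical lookup table: multipliers that convert seconds into the
# documented time_scale units ('m' = milli, 'u' = micro, 'n' = nano).
TIME_SCALES = {"m": 1e3, "u": 1e6, "n": 1e9}


def measure_mean_time(func, n_run=100, max_measure_execution_time=1.0,
                      time_scale="m"):
    """Illustrative stand-in, NOT part of nnabla.

    Calls func up to n_run times, but stops early once the accumulated
    time exceeds max_measure_execution_time (in seconds).
    Returns (mean time in the requested scale, number of runs performed).
    """
    total = 0.0
    count = 0
    for _ in range(n_run):
        start = time.perf_counter()
        func()
        total += time.perf_counter() - start
        count += 1
        # The time budget takes priority over n_run.
        if total > max_measure_execution_time:
            break
    return (total / count) * TIME_SCALES[time_scale], count


# A fast callable: all n_run repetitions fit inside the default 1 s budget.
mean_ms, runs = measure_mean_time(lambda: sum(range(100)), n_run=50)

# A slow callable: the 0.05 s budget is exhausted long before 1000 runs.
mean_ms_slow, runs_slow = measure_mean_time(
    lambda: time.sleep(0.02), n_run=1000, max_measure_execution_time=0.05)
```

In this sketch a large n_run is harmless for slow functions, because the time budget bounds the total measurement cost per function.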

run()

Execute profiling.

This executes the following five types of measurement:

  • function-wise forward
  • function-wise backward
  • whole graph forward
  • whole graph backward
  • training (forward + backward + update) (only if solver is not None)

class nnabla.utils.profiler.GraphProfilerCsvWriter(gb, file=sys.stdout)[source]

CSV writer for the GraphProfiler class.


Example:

from nnabla.utils.profiler import GraphProfiler, GraphProfilerCsvWriter

# Network building comes above; loss is the output nnabla.Variable of the graph

B = GraphProfiler(loss, solver=solver, device_id=0, ext_name=device, n_run=1000)
B.run()  # run the measurements before writing the results

with open("./profile.csv", "w") as f:
    writer = GraphProfilerCsvWriter(B, file=f)
    writer.write()

Parameters:

  • gb (GraphProfiler) – Instance of the GraphProfiler class, which is the main executor of profiling.
  • file (Python file object) – Output file object. Profile results are written to the file specified by this argument.

write()

Write the result to the file. The output file is the one specified by the file argument.
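
Because the output target is an ordinary Python file object, the same pattern should work with standard output, a file on disk, or an in-memory buffer. The sketch below illustrates this with a hypothetical stand-in class, SketchCsvWriter, which only mimics the constructor shape described above; it is not part of nnabla, and the placeholder rows do not reflect the real column layout of the profiler output.

```python
import csv
import io
import sys


class SketchCsvWriter:
    """Hypothetical stand-in, NOT nnabla's GraphProfilerCsvWriter.

    It only mirrors the documented pattern: take a data source and a
    file object, then write CSV rows to that file object on request.
    """

    def __init__(self, rows, file=sys.stdout):
        self.rows = rows
        self.file = file

    def write(self):
        writer = csv.writer(self.file, lineterminator="\n")
        for row in self.rows:
            writer.writerow(row)


# Placeholder rows invented for this illustration only.
rows = [["name", "value"], ["example", 1]]

# An in-memory buffer is a file object too, so the output can be captured
# as a string instead of being written to disk.
buffer = io.StringIO()
SketchCsvWriter(rows, file=buffer).write()
captured = buffer.getvalue()
```

Capturing the output in a buffer like this can be convenient when the results are post-processed in the same script rather than saved.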