Computation Graph
- nnabla.forward_all(variables, bool clear_buffer=False, bool clear_no_need_grad=False, function_pre_hook=None, function_post_hook=None)
Performs a forward propagation up to the variables specified as the 1st argument. See also forward.
- Parameters:
clear_buffer (bool) – Clear the no longer referenced variables during forward propagation to save memory. This is usually set to True in an inference or a validation phase. Default is False. Note that the starting variables and the destination variables of the input graph will not be cleared, regardless of their persistent flag. All intermediate variables will be cleared unless set explicitly as persistent=True. For example, forward_all([h_i, y], clear_buffer=True) will clear all intermediate variables between h_i and y unless set explicitly as persistent=True, but h_i and y will not be cleared regardless of their persistent flag.
clear_no_need_grad (bool) – Clear the unreferenced variables with need_grad=False during forward propagation. True is usually used when calling this during training. This is ignored when clear_buffer=True.
function_pre_hook (callable) – This callable object is called immediately before each function is executed. It must take Function as an input. The default is None.
function_post_hook (callable) – This callable object is called immediately after each function is executed. It must take Function as an input. The default is None.
Example
import numpy as np
import nnabla as nn
import nnabla.parametric_functions as PF

# Create a graph which has two outputs
x = nn.Variable.from_numpy_array(np.array([[1, 2], [3, 4]]))
y = PF.affine(x, 4, name="y")
z = PF.affine(x, 8, name="z")

# Execute a forward propagation recursively up to y and z
nn.forward_all([y, z], clear_buffer=True)
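For instance, here is a minimal sketch (assuming the same API as the example above; the hook body is purely illustrative) of how persistent=True and function_pre_hook interact with clear_buffer=True:

import numpy as np
import nnabla as nn
import nnabla.parametric_functions as PF

x = nn.Variable.from_numpy_array(np.array([[1., 2.], [3., 4.]]))
h = PF.affine(x, 4, name="h")  # intermediate variable
h.persistent = True            # keep its buffer even when clear_buffer=True
y = PF.affine(h, 2, name="y")
z = PF.affine(h, 8, name="z")

def pre_hook(f):
    # Called immediately before each function executes; f is a Function object.
    print("about to execute:", f)

nn.forward_all([y, z], clear_buffer=True, function_pre_hook=pre_hook)
print(h.d)  # still readable because h was marked persistent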
- nnabla.no_grad(no_grad_=True)[source]
No gradients for the whole network.
Within this context, the network is created without requiring gradients, so that when the forward pass is executed, all intermediate buffers except for the leaves of the network are freed immediately, which saves memory.
This is useful, for example, when the output of a pre-trained network is used as an input to another network: the pre-trained network does not need to be fine-tuned, while the other network is optimized.
- Parameters:
no_grad_ (bool) – No gradient flag. Default is True.
Example:
with nn.no_grad():
    output0 = <Network0>(<input0>)

output1 = <Network1>(<input1>, output0)
loss = <Loss>(output1, <ground_truth>)
loss.forward(clear_no_need_grad=True)
This context also works in the dynamic mode.
with nn.auto_forward(), nn.no_grad():
    output0 = <Network0>(<input0>)
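As a concrete illustration, the following is a minimal runnable sketch (the small affine stack below is only a stand-in for a pre-trained network):

import numpy as np
import nnabla as nn
import nnabla.functions as F
import nnabla.parametric_functions as PF

x = nn.Variable.from_numpy_array(np.random.randn(8, 16).astype(np.float32))

with nn.auto_forward(), nn.no_grad():
    # Stand-in for a frozen, pre-trained feature extractor.
    h = F.relu(PF.affine(x, 32, name="fc1"))
    feat = PF.affine(h, 10, name="fc2")

print(feat.d.shape)  # (8, 10); computed eagerly, with no gradient bookkeeping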
Note
When working with a static network, the need_grad property of the input (e.g., the input image) must be False, and do not forget to call <root>.forward(clear_no_need_grad=True); otherwise, the intermediate buffers will not be cleared as expected.
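A minimal sketch of the static-graph case might look as follows (the affine stack again stands in for the pre-trained network, and the head and loss below are hypothetical placeholders):

import numpy as np
import nnabla as nn
import nnabla.functions as F
import nnabla.parametric_functions as PF

# Input variable created with need_grad=False, as required in the static case.
x = nn.Variable((8, 16), need_grad=False)
x.d = np.random.randn(8, 16)

with nn.no_grad():
    # Stand-in for a frozen, pre-trained network.
    feat = PF.affine(F.relu(PF.affine(x, 32, name="fc1")), 10, name="fc2")

# Trainable head and loss on top of the frozen features.
pred = PF.affine(feat, 1, name="head")
target = nn.Variable.from_numpy_array(np.random.randn(8, 1).astype(np.float32))
loss = F.mean(F.squared_error(pred, target))

loss.forward(clear_no_need_grad=True)
print(loss.d)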