# Computation Graph

```
nnabla.forward_all(variables, bool clear_buffer=False, bool clear_no_need_grad=False, function_pre_hook=None, function_post_hook=None)
```

Performs forward propagation up to the variables specified as the first argument. See also `forward`.

Parameters:
• clear_buffer (bool) –

Clear the no longer referenced variables during forward propagation to save memory. This is usually set as True in an inference or a validation phase. Default is False. Note that starting variable and destination variable of the input graph will not be cleared, regardless of their `persistent` flag. All intermediate variables will be cleared unless set explicitly as `persistent=True`. For example,

```
forward_all([h_i, y], clear_buffer=True)
```

will clear all intermediate variables between `h_i` and `y` unless set explicitly as `persistent=True`, but `h_i` and `y` will not be cleared regardless of their `persistent` flag.

• clear_no_need_grad (bool) – Clear the unreferenced variables with need_grad=False during forward propagation. True is usually used when calling this during training. This is ignored when clear_buffer=True.

• function_pre_hook (callable) – This callable object is called immediately before each function is executed. It must take `Function` as an input. The default is None.

• function_post_hook (callable) – This callable object is called immediately after each function is executed. It must take `Function` as an input. The default is None.

Example

```
import numpy as np
import nnabla as nn
import nnabla.parametric_functions as PF

# Create a graph which has two outputs
x = nn.Variable.from_numpy_array(np.array([[1, 2], [3, 4]]))
y = PF.affine(x, 4, name="y")
z = PF.affine(x, 8, name="z")

# Execute a forward propagation recursively up to y and z
nn.forward_all([y, z], clear_buffer=True)
```

```
nnabla.no_grad()
```

No gradients for the whole network.

When a network is created under this context, no gradients are required, so when the forward pass is executed, all intermediate buffers except for the leaves of the network are freed immediately, which saves memory.

This is useful, for example, when the output of a pre-trained network is used as the input to another network: the pre-trained network does not need to be fine-tuned, while the other network is optimized.

Example:

```
with nn.no_grad():
    output0 = <Network0>(<input0>)

output1 = <Network1>(<input1>, output0)
loss = <Loss>(output1, <ground_truth>)
```

This also works in dynamic execution mode:

```
with nn.auto_forward(), nn.no_grad():
    output0 = <Network0>(<input0>)
```
When working with a static network, the `need_grad` property of the input (e.g., the input image) must be False, and do not forget to call `<root>.forward(clear_no_need_grad=True)`; otherwise, the intermediate buffers are not freed as expected.