Grad

nnabla.grad.grad(outputs, inputs, grad_outputs=None, persistent_outputs=[], bind_grad_output=False)[source]

Gradient function for the outputs with respect to the inputs.

The grad function computes the sum of gradients of the outputs with respect to the inputs.

\[g_i = \sum_{j} {\frac{\partial y_j}{\partial x_i}},\]

where \(y_j\) is each output, \(x_i\) is each input, and \(g_i\) is the sum over \(j\) of the gradients of \(y_j\) with respect to \(x_i\).
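
For instance, with two scalar outputs the returned gradient is the sum of the per-output gradients. The following is a minimal sketch (toy shapes chosen for illustration only):

import numpy as np
import nnabla as nn
import nnabla.functions as F

x = nn.Variable.from_numpy_array(np.random.randn(2, 3)).apply(need_grad=True)
y1 = F.sum(x ** 2)   # dy1/dx = 2 * x
y2 = F.sum(3 * x)    # dy2/dx = 3
gx, = nn.grad([y1, y2], [x])
gx.forward()
print(np.allclose(gx.d, 2 * x.d + 3))  # True: sum of both gradients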

Parameters:
  • outputs (list of Variable or Variable) -- Outputs of the differentiable function.

  • inputs (list of Variable or Variable) -- Inputs with respect to which the gradients of the outputs are computed.

  • grad_outputs (None, scalar, numpy.ndarray, nnabla.NdArray, or list of scalar, numpy.ndarray, or nnabla.NdArray) -- Gradient outputs corresponding to outputs. This is the same as the grad argument of backward(). Default is None, so 1 is used as the incoming gradient at the very beginning of the gradient graph (see the sketch after this parameter list).

  • persistent_outputs (list of bool) -- Persistent flags for the outputs. If not specified, all outputs are set persistent.

  • bind_grad_output (bool) -- Bind data to the grad of the input variable. This is useful when one wants to use the gradient graph for training a neural network with first-order gradients only. Default is False.
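
A non-default grad_outputs can be used to scale the incoming gradient of each output. The following is a minimal sketch (toy shapes and values chosen for illustration only):

import numpy as np
import nnabla as nn
import nnabla.functions as F

# Scale the incoming gradient of y by 2 instead of the default 1.
x = nn.Variable.from_numpy_array(np.random.randn(2, 3)).apply(need_grad=True)
y = F.sum(x ** 2)                            # dy/dx = 2 * x
gx, = nn.grad([y], [x], grad_outputs=[2.0])
gx.forward()
print(np.allclose(gx.d, 4.0 * x.d))          # 2 (grad_outputs) * 2 * x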

Returns

List of Variable.

If backpropagation does not reach an input, the corresponding returned gradient is zero (i.e., the gradient w.r.t. that input is zero) and is not connected as part of the gradient graph.
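
A minimal sketch of this behavior (toy shapes chosen for illustration only): z below is not used to compute y, so its returned gradient is zero.

import numpy as np
import nnabla as nn
import nnabla.functions as F

x = nn.Variable.from_numpy_array(np.random.randn(2, 2)).apply(need_grad=True)
z = nn.Variable.from_numpy_array(np.random.randn(2, 2)).apply(need_grad=True)
y = F.sum(x ** 2)
gx, gz = nn.grad([y], [x, z])  # backpropagation never reaches z
gx.forward()
print(np.allclose(gz.d, 0.0))  # True: the gradient w.r.t. z is zero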

Example (Gradient Penalty):

import nnabla as nn
import nnabla.functions as F
import nnabla.parametric_functions as PF
import numpy as np
from nnabla.ext_utils import get_extension_context

# Context
extension_module = "cudnn"
ctx = get_extension_context(extension_module)
nn.set_default_context(ctx)

# Input and label
x = nn.Variable.from_numpy_array(np.random.randn(4, 3, 32, 32))
y = nn.Variable.from_numpy_array(np.random.randint(0, 10, 4).reshape(4, 1))

# Network
h = PF.convolution(x, 8, (3, 3), (1, 1), name="conv1")
h = F.relu(h)
h = F.max_pooling(h, (2, 2))
h = PF.convolution(h, 16, (3, 3), (1, 1), name="conv2")
h = F.relu(h)
h = F.max_pooling(h, (2, 2))
p = PF.affine(h, 10, name="pred")
loss = F.mean(F.softmax_cross_entropy(p, y))

# Grad
outputs = [loss]
inputs = nn.get_parameters().values()
grads = nn.grad(outputs, inputs)  # gradients of the parameters

# Backward of the outputs w.r.t. the parameters by constraining the gradient norms.
t = 0 # or 1
gp = sum([(F.sum(g ** 2) ** 0.5 - t) ** 2 for g in grads])
loss += gp
loss.forward()
loss.backward()

Example (Higher-order Gradients):

import nnabla as nn
import nnabla.functions as F
import numpy as np

x = nn.Variable.from_numpy_array(np.random.randn(2, 2)).apply(need_grad=True)
x.grad.zero()
y = F.sin(x)
def grad(y, x, n=1):
    dx = [y]
    for _ in range(n):
        dx = nn.grad([dx[0]], [x])
    return dx[0]
dnx = grad(y, x, n=10)
dnx.forward()
print(np.allclose(-np.sin(x.d), dnx.d))
dnx.backward()
print(np.allclose(-np.cos(x.d), x.g))

# Show the supported status for each function
from nnabla.backward_functions import show_registry
show_registry()
nnabla.backward_functions.register(func_name, func)[source]

Register a backward function for a function; a hypothetical sketch of a registration call is shown after the parameter list below.

Parameters:
  • func_name (str) -- The function class name, for example, Affine.

  • func (function) -- The function to be called as the backward function of the function func_name. Arguments of func must be (ctx: nn.Context, inputs: list of nn.Variable, **kwargs). The inputs are the inputs to the function named func_name, and kwargs are the arguments of that function. For example, if func_name is Affine and func is affine_backward, the inputs are the data, the weights, and the bias if present, and kwargs = dict(base_axis=base_axis).
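
A hypothetical sketch of a registration call, assuming only the calling convention described above; MyFunc and my_func_backward are made-up names, and the body is left as a stub because the expected return value is not specified here.

from nnabla.backward_functions import register, show_registry

def my_func_backward(ctx, inputs, **kwargs):
    # ctx: nn.Context, inputs: list of nn.Variable (the inputs to "MyFunc"),
    # kwargs: the arguments of "MyFunc". The gradient computation itself is
    # omitted in this sketch.
    raise NotImplementedError("Implement the backward computation of MyFunc here.")

register("MyFunc", my_func_backward)
show_registry()  # "MyFunc" should now appear in the registry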

nnabla.backward_functions.show_registry()[source]

Show the registry of all backward functions.