Variable

class nnabla.Variable

Bases: object

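nnabla.Variable is used to construct computation graphs (neural networks) together with the functions in the List of Functions and the List of Parametric Functions. It also provides methods to execute forward and backward propagation of the network. The nnabla.Variable class holds: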

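A reference to the parent function in the computation graph. This reference makes every connection in the computation graph traceable.
Both data and error signal (gradient) containers, as nnabla.NdArray objects.
Some additional information about the computation graph.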

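Variable overrides the arithmetic operators (+, -, *, /, **). Each operand can be a scalar number, an NdArray, or a Variable. If an NdArray is given as either the left or the right operand, the arithmetic operation returns an NdArray that stores the output of the computation, which is invoked immediately. Otherwise, it returns a Variable that holds the graph connection. In that case the computation is invoked immediately only when nnabla.auto_forward is active or nnabla.set_auto_forward(True) has been called.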

Note

The relational operators == and != between two Variables compare the addresses of the underlying C++ instances (nbla::Variable). The hash() function, which is often used for keys in set and dict, is also based on that address.
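The identity-based comparison described in this note can be modelled without NNabla. The following is a minimal pure-Python sketch, not NNabla's implementation: `Core` and `Handle` are hypothetical stand-ins for the underlying C++ instance and its Python-side wrapper.

```python
class Core:
    """Hypothetical stand-in for the underlying C++ instance."""


class Handle:
    """Hypothetical wrapper whose equality and hash follow the identity of its core."""

    def __init__(self, core):
        self.core = core

    def __eq__(self, other):
        # Equal only when both wrappers point at the very same core object.
        return isinstance(other, Handle) and self.core is other.core

    def __hash__(self):
        # Hash on the core's identity so that equal wrappers hash alike.
        return id(self.core)


shared = Core()
a = Handle(shared)
b = Handle(shared)      # a different Python object wrapping the same core
c = Handle(Core())      # wraps a different core

lookup = {a: "first"}
print(a == b)           # True: same underlying instance
print(a != c)           # True: different underlying instances
print(lookup[b])        # first: b hashes and compares like a
```

Under this model, two wrappers of one instance collapse to a single key in a set or dict, which is the behavior to keep in mind when Variables are used as keys.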

See also

Python API Tutorial.

Parameters:

shape (Iterable of int) -- Shape of the variable.
need_grad (bool) -- Flag indicating whether backpropagation to this variable is required.

apply(self, **kwargs): Helper for setting properties. Returns the variable itself.

backward(self, grad=1, bool clear_buffer=False, communicator_callbacks=None, function_pre_hook=None, function_post_hook=None)

Performs backpropagation starting from this variable until the root variable(s) of the function graph are reached. The propagation stops at any variable with need_grad=False.

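Parameters:

grad (scalar, numpy.ndarray, nnabla.NdArray, or None) -- The gradient value(s) of this variable. The default value 1 is used in ordinary neural network training. This option is useful when the gradient is computed outside NNabla and you want to use that result as the gradient. Note that this option does not modify the gradient values of this variable; the received values are only assigned temporarily as its gradient. Also, if the Variable on which you call nnabla._variable.Variable.backward is unlinked from another variable and the corresponding Variable holds pre-computed gradient values, you must specify grad=None. Otherwise, the pre-computed gradient values are ignored in that backward pass (the propagation from the unlinked Variable).
clear_buffer (bool) -- Clears variables that are no longer referenced during backpropagation to save memory. Note that all unnecessary intermediate variables are cleared unless they are explicitly set as persistent=True.
communicator_callbacks (nnabla.CommunicatorBackwardCallback or list of nnabla.CommunicatorBackwardCallback) -- The callback functions are invoked 1) when the backward computation of each function finishes and 2) when all backward computation finishes.
function_pre_hook (callable) -- This callable object is called immediately before each function is executed. It must take a Function as its argument. The default is None.
function_post_hook (callable) -- This callable object is called immediately after each function is executed. It must take a Function as its argument. The default is None.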

Example

First, a simple usage of backward.

import nnabla as nn
import nnabla.functions as F
import nnabla.parametric_functions as PF
import numpy as np
import nnabla.initializer as I

# Note: np.random.seed() seeds NumPy's global RNG and returns None, so rng is None here.
rng = np.random.seed(217)
initializer = I.UniformInitializer((-0.1, 0.1), rng=rng)

x = nn.Variable((8, 3, 32, 32))
x.d = np.random.random(x.shape)  # random input, just for example.

y0 = PF.convolution(x, outmaps=64, kernel=(3, 3), pad=(1, 1), stride=(2, 2), w_init=initializer, name="conv1", with_bias=False)
y1 = F.relu(y0)
y2 = PF.convolution(y1, outmaps=128, kernel=(3, 3), pad=(1, 1), stride=(2, 2), w_init=initializer, name="conv2", with_bias=False)
y3 = F.relu(y2)
y4 = F.average_pooling(y3, kernel=y3.shape[2:])
y5 = PF.affine(y4, 1, w_init=initializer)
loss = F.mean(F.abs(y5 - 1.))
loss.forward()  # Execute forward

# We can check the current gradient of parameter.
print(nn.get_parameters()["conv1/conv/W"].g)

Output:

[[[[0. 0. 0.]
   [0. 0. 0.]
   [0. 0. 0.]]
      ...

Initially, all the gradient values are zero. Now let's see what happens after calling backward.

loss.backward()
print(nn.get_parameters()["conv1/conv/W"].g)

Output:

[[[[ 0.00539637  0.00770839  0.0090611 ]
   [ 0.0078223   0.00978992  0.00720569]
   [ 0.00879023  0.00578172  0.00790895]]
                     ...

This shows that calling backward computes the gradients and registers them in g. Note that calling backward repeatedly accumulates the results: executing backward again doubles the values.

loss.backward()  # execute again.
print(nn.get_parameters()["conv1/conv/W"].g)

The output shows that the gradients have accumulated:

[[[[ 0.01079273  0.01541678  0.0181222 ]
   [ 0.01564459  0.01957984  0.01441139]
   [ 0.01758046  0.01156345  0.0158179 ]]
                     ...

Next is an advanced usage with an unlinked variable (see get_unlinked_variable). It uses the same network as above, but the network is split into two graphs by the unlinked variable.

import nnabla as nn
import nnabla.functions as F
import nnabla.parametric_functions as PF
import numpy as np
import nnabla.initializer as I

# Use the same random seed. Note: np.random.seed() returns None, so rng is None here.
rng = np.random.seed(217)
initializer = I.UniformInitializer((-0.1, 0.1), rng=rng)

x = nn.Variable((8, 3, 32, 32))
x.d = np.random.random(x.shape)  # random input, just for example.

y0 = PF.convolution(x, outmaps=64, kernel=(3, 3), pad=(1, 1), stride=(2, 2), w_init=initializer, name="conv1", with_bias=False)
y1 = F.relu(y0)
y2 = PF.convolution(y1, outmaps=128, kernel=(3, 3), pad=(1, 1), stride=(2, 2), w_init=initializer, name="conv2", with_bias=False)
y3 = F.relu(y2)
y3_unlinked = y3.get_unlinked_variable()  # the computation graph is cut apart here.
y4 = F.average_pooling(y3_unlinked, kernel=y3_unlinked.shape[2:])
y5 = PF.affine(y4, 1, w_init=initializer)
loss = F.mean(F.abs(y5 - 1.))

# Execute forward.
y3.forward()  # you need to execute forward at the unlinked variable first.
loss.forward()  # Then execute forward at the leaf variable.

# Execute backward.
loss.backward()  # works, but backpropagation stops at y3_unlinked.
print(nn.get_parameters()["conv1/conv/W"].g)  # no gradient registered yet.

Output:

[[[[0. 0. 0.]
   [0. 0. 0.]
   [0. 0. 0.]]
      ...

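We can confirm that backpropagation stops at y3_unlinked. Next, let's see how to run backpropagation all the way to the root variable (x). Because this is somewhat involved, a common pitfall is shown first. Note that the following only demonstrates how backward behaves when used the wrong way.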

y3.backward()  # this works, but computed gradient values are not correct.
print(nn.get_parameters()["conv1/conv/W"].g)

Output:

[[[[ 17.795254    23.960905    25.51168   ]
   [ 20.661646    28.484127    19.406212  ]
   [ 26.91042     22.239697    23.395714  ]]
                     ...

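Note that this result is wrong. The gradient held by y3_unlinked has been completely ignored. As described above, simply calling backward treats the gradient (of the leaf variable on which backward is called) as 1.

To run backpropagation correctly over the two separate graphs, specify grad=None as shown below. The computation then uses the gradient currently held by that variable. (y3.backward(grad=y3_unlinked.g) does the same thing.)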

# Reset all the gradient values, for the parameters and the intermediate variables.
for v in nn.get_parameters().values():
    v.g = 0.
for v in [y0, y1, y2, y3, y4, y5]:
    v.g = 0.

loss.backward()  # backpropagation starts from the leaf variable again.
y3.backward(grad=None)  # By this, it can take over the gradient held by y3_unlinked.
print(nn.get_parameters()["conv1/conv/W"].g)  # correct result.

This time the result matches the one obtained with the single connected graph.

[[[[ 0.00539637  0.00770839  0.0090611 ]
   [ 0.0078223   0.00978992  0.00720569]
   [ 0.00879023  0.00578172  0.00790895]]
                     ...

bool_fill_(self, mask, value)

Returns a new nnabla.Variable, computed in place, that is filled with value where mask is non-zero.

Parameters:

mask (nnabla.NdArray) -- Mask with which to fill. Non-zero/zero elements are supposed to be a binary mask as 1/0. No gradients are computed with respect to mask.
value (float) -- The value to fill.

Returns:

nnabla.Variable

clear_all_graph_links(self)

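Clears all intermediate functions and variables.

This method clears all intermediate functions and variables up to this variable in the forward pass, which makes it useful for truncated backpropagation through time (truncated BPTT) in a dynamic graph.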

d

Returns the values held by this variable, as a numpy.ndarray. Note that the values are referenced (not copied). Therefore, the modification of the returned ndarray will affect the data of the NNabla array. This method can be called as a setter to set the value held by this variable. Refer to the documentation of the setter nnabla.NdArray.data for detailed behaviors of the setter.

Parameters: value (numpy.ndarray) (optional) --
Returns: numpy.ndarray

data

Returns the data held by this variable, as an NdArray. This can also be used as a setter.

Parameters: ndarray (NdArray) -- An NdArray object. Its size must be the same as that of this Variable.
Returns: NdArray

forward(self, bool clear_buffer=False, bool clear_no_need_grad=False, function_pre_hook=None, function_post_hook=None)

Performs a forward propagation from the root node to this variable. The forward propagation is performed on a subset of variables determined by the dependency of this variable. The subset is recursively constructed by tracking variables that the variables in the subset depend on, starting from this variable, until it reaches the root variable(s) in the function graph. See also forward_all, which performs forward computations for all variables within the input graph.

Parameters:

clear_buffer (bool) -- Clear the no longer referenced variables during forward propagation to save memory. This is usually set as True in an inference or a validation phase. Default is False. Note that all unnecessary intermediate variables will be cleared unless set explicitly as persistent=True.
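clear_no_need_grad (bool) -- Clears the data of unreferenced variables with need_grad=False during forward propagation. True is usually used when calling this method during training. This is ignored when clear_buffer=True.
function_pre_hook (callable) -- This callable object is called immediately before each function is executed. It must take a Function as its argument. The default is None.
function_post_hook (callable) -- This callable object is called immediately after each function is executed. It must take a Function as its argument. The default is None.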

static from_numpy_array(data, grad=None, need_grad=None)

Creates a Variable object from a NumPy array (ndarray).

The data is initialized with the given NumPy array, and so is the grad if one is given.

The shape is also determined by the given ndarray.

Parameters:

data (ndarray) -- Values copied to the data of the created Variable.
grad (ndarray) -- Values copied to the grad of the created Variable.
need_grad (bool) -- Flag indicating whether backpropagation to this variable is required.

Returns:

Variable

function_references

Returns a list of the functions that take this Variable as an input. This method can be called only as a getter.

Returns: list of nnabla.function.Function

g

Returns the gradient values held by this variable, as a numpy.ndarray. Note that the values are referenced (not copied). Therefore, the modification of the returned ndarray will affect the data of the NNabla array. This method can be called as a setter to set the gradient held by this variable. Refer to the documentation of the setter nnabla.NdArray.data for detailed behaviors of the setter.

Parameters: value (numpy.ndarray) --
Returns: numpy.ndarray

get_number_of_references

Gets the number of references to the same memory object.

Returns: int

get_unlinked_variable(self, need_grad=None)

Gets an unlinked Variable (one that has forgotten its parent) that shares the buffer instances of this Variable.

Parameters: need_grad (bool, optional) -- By default, the unlinked Variable has the same need_grad flag as this Variable instance. Specifying a bool value sets a new need_grad flag on the unlinked variable. Specifying this option explicitly is recommended to avoid unintended behavior.

Returns: Variable

Note

The unlinked Variable behaves the same as the original Variable under the comparison operators and the hash function, regardless of whether need_grad is changed. See the note in the Variable class documentation. For running backward with unlinked variables, see backward and its example.

Example

import numpy as np
import nnabla as nn
import nnabla.parametric_functions as PF

x = nn.Variable.from_numpy_array(np.array([[1, 2], [3, 4]]))
y = PF.affine(x, 4, name="y")

# Create a new variable of which graph connection is unlinked.
# Specifying the need_grad option explicitly is recommended.
z = y.get_unlinked_variable(need_grad=False)

print(y.parent)
# Affine
print(z.parent)  # z is unlinked from the parent x but shares the buffers of y.
# None
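The two properties shown above, a missing parent and shared buffers, can be illustrated with a small model. This is a hedged pure-Python sketch rather than NNabla's implementation: `Buffers` and `ToyVariable` are hypothetical classes, and the parent is represented by a plain string.

```python
class Buffers:
    """Hypothetical holder for the data and grad storage of a variable."""

    def __init__(self, data):
        self.data = data
        self.grad = [0.0] * len(data)


class ToyVariable:
    """Hypothetical variable: a reference to buffers plus an optional parent."""

    def __init__(self, buffers, parent=None):
        self.buffers = buffers
        self.parent = parent

    def get_unlinked_variable(self):
        # Reuse the same buffers object, but drop the parent function.
        return ToyVariable(self.buffers, parent=None)


y = ToyVariable(Buffers([1.0, 2.0]), parent="Affine")
z = y.get_unlinked_variable()

z.buffers.grad[0] = 5.0     # write through the unlinked variable
print(z.parent)             # None
print(y.buffers.grad)       # [5.0, 0.0]
```

Because both objects refer to one set of buffers in this model, a gradient written through the unlinked variable is visible from the original, which is what lets a later backward call on the original pick it up.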

grad

Returns the gradient held by this variable, as an NdArray. This can also be used as a setter.

Parameters: ndarray (NdArray) -- An NdArray object. Its size must be the same as that of this Variable.
Returns: NdArray

info

object

Information of the variable.

Type: info

masked_fill_()

Variable.bool_fill_(self, mask, value)

Returns a new nnabla.Variable, computed in place, that is filled with value where mask is non-zero.

Parameters:

mask (nnabla.NdArray) -- Mask with which to fill. Non-zero/zero elements are supposed to be a binary mask as 1/0. No gradients are computed with respect to mask.
value (float) -- The value to fill.

Returns:

nnabla.Variable

ndim

Gets the number of dimensions of this Variable.

Returns: int

need_grad

Gets or sets a boolean indicating whether backpropagation is performed at this Variable.

Parameters: b (bool) -- Whether backpropagation is performed at this variable.
Returns: Whether this variable requires a gradient.
Return type: bool

no_grad(self)

No gradients for the whole network.

This method is like nnabla.no_grad but can be used for the static network only, and useful for the case where the network is loaded from NNP format.

Example

x = nn.Variable.from_numpy_array([2, 3])
y = <Network>(x).no_grad()

parent

Returns the parent function of this Variable. This method can also be called as a setter.

Parameters: func (nnabla.function.Function) --
Returns: nnabla.function.Function

persistent

Returns the persistent flag of this Variable. If True, the data and grad of the Variable are not cleared even when the clear options of nnabla._variable.Variable.forward() and nnabla._variable.Variable.backward() are enabled. This is useful for debugging or logging the values of a Variable. This method can also be called as a setter.

Parameters: b (bool) --
Returns: bool

recompute

Gets or sets a boolean indicating whether its data is cleared during forward propagation and recomputation is performed during backward propagation.

Parameters: b (bool) -- Whether recomputation is performed during backward propagation.
Returns: Whether this variable is recomputed during backward propagation.
Return type: bool

reset_shape(self, shape, force=False)

Changes the shape of the Variable to the specified shape.

Parameters:

shape (Iterable of int) -- Target shape.
force (bool) -- Flag to force the shape change.

Note

This method destructively changes the shape of the target variable. For safety, reshape() should be used instead.

Returns: None

reshape(self, shape, unlink=False)

Returns a new Variable whose shape is changed to the specified shape.

Parameters:

shape (Iterable of int) -- Target shape.
unlink (bool) -- Whether to unlink the graph connection. By default the graph connection is kept, that is, the gradient is backpropagated to the original Variable.

Returns:

Variable

rewire_on(self, var)

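Rewires the successor graph of this Variable on top of var, rebuilding the graph.

Parameters: var (nnabla.Variable) -- The array elements and the parent function of var are copied to self as references. Note that the parent function of var is removed.

Example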

import nnabla as nn
import nnabla.functions as F
import nnabla.parametric_functions as PF

# A. Create a graph A.
xa = nn.Variable((2, 8), need_grad=True)
ya = F.tanh(PF.affine(xa, 10, name='a'))

# B. Create a graph B.
xb = nn.Variable((2, 16), need_grad=True)
yb = F.tanh(PF.affine(
    F.tanh(PF.affine(xb, 8, name='b1')),
    8, name='b2'))

# C. Rewire the graph A on top of B such that
#    `xb->B->(yb->)xa->A->ya`. Note `yb` is gone.
xa.rewire_on(yb)

# D. Execute the rewired graph.
xb.d = 1
ya.forward()
ya.backward()

shape

Gets the shape of the Variable.

Returns: tuple of int

size

Gets the size of the variable.

Returns: int

size_from_axis(self, axis=-1)

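Gets the size from the given axis onward, that is, the product of the dimensions of the shape starting at axis, as the example shows.

Example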

import nnabla

a = nnabla.Variable([10, 9])
a.size_from_axis()
# ==> 90
a.size_from_axis(0)
# ==> 90
a.size_from_axis(1)
# ==> 9
a.size_from_axis(2)
# ==> 1

Parameters: axis (int, optional) -- The default is -1.
Returns: int
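The four results in the example are consistent with taking the product of the shape entries from the given axis onward. The following pure-Python sketch reproduces them; it is an illustration of that reading, not NNabla's implementation, and treating any negative axis as "the whole shape" is an assumption made here to match the default of -1.

```python
from functools import reduce
from operator import mul


def size_from_axis(shape, axis=-1):
    """Return the product of shape[axis:]; a negative axis covers the whole shape."""
    start = 0 if axis < 0 else axis
    # An empty slice yields 1, the empty product.
    return reduce(mul, shape[start:], 1)


shape = (10, 9)
print(size_from_axis(shape))      # 90
print(size_from_axis(shape, 0))   # 90
print(size_from_axis(shape, 1))   # 9
print(size_from_axis(shape, 2))   # 1
```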

unlinked(self, need_grad=None): This function is deprecated. Use get_unlinked_variable instead.

visit(self, f)

Visits the functions recursively in forward order.

Parameters: f (function) -- A function object that takes an nnabla._function.Function object as its argument.
Returns: None

Example

import nnabla as nn
import nnabla.functions as F
import nnabla.parametric_functions as PF

# Define a simple network-graph
def network_graph(x, maps=16, test=False):
    h = x
    h = PF.convolution(h, maps, kernel=(3, 3), pad=(1, 1), name="first-conv", with_bias=False)
    h = F.average_pooling(h, h.shape[2:])
    pred = PF.affine(h, 10, name="pred")
    return pred

# You can modify this PrintFunc to get the other information like inputs(nnabla_func.inputs), outputs and arguments(nnabla_func.info.args) of nnabla functions.
class PrintFunc(object):
    def __call__(self, nnabla_func):
        print(nnabla_func.info.type_name)

x = nn.Variable([1, 3, 16, 16])
output = network_graph(x)
output.visit(PrintFunc())

Output:

Convolution
AveragePooling
Affine

visit_check(self, f)

Visits the functions recursively in forward order.

Note

If any call of the function object returns True, the traversal stops immediately and True is returned.

Parameters: f (function) -- A function object that takes an nnabla._function.Function object as its argument.
Returns: bool. True if any call of the function object returns True.

Example

Define a simple network graph to which an AveragePooling function can be added explicitly, as follows.

def network_graph(x, add_avg_pool=False, maps=16, test=False):
    h = x
    h = PF.convolution(h, maps, kernel=(3, 3), pad=(1, 1), name="first-conv", with_bias=False)
    if add_avg_pool:
        h = F.average_pooling(h, h.shape[2:])
    else:
        h = F.relu(h)
    pred = PF.affine(h, 10, name="pred")
    return pred

# Define 'PrintFunc()' to check whether "AveragePooling" function exists in the network-graph
class PrintFunc(object):
    def __call__(self, nnabla_func):
        if nnabla_func.info.type_name == "AveragePooling":
            print("{} exists in the graph".format(nnabla_func.info.type_name))
            return True
        else:
            return False

Create a network graph that contains an AveragePooling function and call the visit_check() method.

x = nn.Variable([1, 3, 16, 16])
output = network_graph(x, add_avg_pool=True)  #Adding AveragePooling function to the graph
print("The return value of visit_check() method is : {}".format(output.visit_check(PrintFunc())))

Output:

AveragePooling exists in the graph
The return value of visit_check() method is : True

Create a network graph that does not contain an AveragePooling function and call the visit_check() method.

nn.clear_parameters()                         # call this in case you want to run the following code again
output = network_graph(x, add_avg_pool=False) # Exclusion of AveragePooling function in the graph
print("The return value of visit_check() method is : {}".format(output.visit_check(PrintFunc())))

Output:

The return value of visit_check() method is : False
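The early stop can be made visible by recording which functions the callback actually sees. The sketch below is a simplified pure-Python model under stated assumptions: the graph is flattened to a list of function names already in forward order, and this `visit_check` and `make_probe` are hypothetical helpers, not NNabla APIs.

```python
def visit_check(functions, f):
    """Call f on each function name in order; stop at the first True."""
    for func in functions:
        if f(func):
            return True
    return False


def make_probe(target, calls):
    """Build a callback that records each call and reports whether it saw target."""
    def probe(name):
        calls.append(name)
        return name == target
    return probe


# Graph with AveragePooling: the walk stops before reaching Affine.
calls = []
found = visit_check(["Convolution", "AveragePooling", "Affine"],
                    make_probe("AveragePooling", calls))
print(found, calls)   # True ['Convolution', 'AveragePooling']

# Graph without AveragePooling: every function is visited.
calls_without = []
missing = visit_check(["Convolution", "ReLU", "Affine"],
                      make_probe("AveragePooling", calls_without))
print(missing, calls_without)   # False ['Convolution', 'ReLU', 'Affine']
```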