Solvers

The nnabla.solvers.Solver class represents a stochastic gradient descent based optimizer for optimizing the parameters of a computation graph. NNabla provides the various solvers listed below.

Solver

class nnabla.solvers.Solver

Solver interface class.

Various kinds of solvers can be implemented using the same API provided by this class.

Example:

# Network building comes above
import nnabla as nn
import nnabla.solvers as S

solver = S.Sgd(lr=1e-3)
solver.set_parameters(nn.get_parameters())

for itr in range(num_itr):
    x.d = ...  # set data
    t.d = ...  # set label
    loss.forward()
    solver.zero_grad()  # reset all gradient buffers to 0
    loss.backward()
    solver.weight_decay(decay_rate)  # apply weight decay
    solver.clip_grad_by_norm(clip_norm)  # clip gradients by norm
    solver.update()  # update parameters

Note

All solvers provided by NNabla are subclasses of Solver. A solver is never instantiated from this class itself.

check_inf_grad(self, pre_hook=None, post_hook=None): Checks whether the registered gradients contain inf.

check_inf_or_nan_grad(self, pre_hook=None, post_hook=None): Checks whether the registered gradients contain inf or nan.

check_nan_grad(self, pre_hook=None, post_hook=None): Checks whether the registered gradients contain nan.
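
The three checks above can be pictured on a plain list of floats. The following is a minimal pure-Python sketch, not NNabla's implementation; the helper names has_inf, has_nan, and has_inf_or_nan are hypothetical and exist only for this illustration.

```python
import math


def has_inf(grad):
    """Return True if any gradient element is +inf or -inf."""
    return any(math.isinf(g) for g in grad)


def has_nan(grad):
    """Return True if any gradient element is nan."""
    return any(math.isnan(g) for g in grad)


def has_inf_or_nan(grad):
    """Return True if any gradient element is inf or nan."""
    return has_inf(grad) or has_nan(grad)


finite = [0.1, -2.0, 3.5]
overflowed = [0.1, float("inf"), 3.5]
invalid = [float("nan"), 0.0]
```

A check of this kind is typically used to skip update() for an iteration whose gradients are not finite.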

clear_parameters(self): Clears all registered parameters and states.

clip_grad_by_norm(self, float clip_norm, pre_hook=None, post_hook=None)

Clips gradients by norm. When this function is called, the gradients are clipped to the given norm.

Parameters: clip_norm (float) -- The value of the clipping norm.
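
As an illustration of clipping by norm, the sketch below rescales a single gradient vector so that its L2 norm does not exceed clip_norm. It assumes the common definition of norm clipping; the function name clip_by_norm is hypothetical, and this page does not state whether NNabla applies the clipping per parameter or over all parameters at once.

```python
import math


def clip_by_norm(grad, clip_norm):
    """Rescale grad so that its L2 norm does not exceed clip_norm."""
    norm = math.sqrt(sum(g * g for g in grad))
    if norm <= clip_norm or norm == 0.0:
        return list(grad)  # already within the limit, keep as is
    scale = clip_norm / norm
    return [g * scale for g in grad]


clipped = clip_by_norm([3.0, 4.0], clip_norm=1.0)    # norm 5.0, rescaled
unchanged = clip_by_norm([0.3, 0.4], clip_norm=1.0)  # norm 0.5, kept
```

The direction of the gradient is preserved; only its length changes.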

get_parameters(self): Gets all registered parameters.

get_states(self): Gets all states.

info

object

Type: info

learning_rate(self): Gets the learning rate.

load_states(self, path)

Loads the solver states.

Parameters: path -- Path to the state file to load.

name: Gets the name of the solver.

remove_parameters(self, vector[string] keys): Removes the registered parameters specified by a vector of keys.

save_states(self, path)

Saves the solver states.

Parameters: path -- Path or file object.

scale_grad(self, scale, pre_hook=None, post_hook=None): Rescales the gradients by a constant factor.

set_learning_rate(self, learning_rate): Sets the learning rate.

set_parameters(self, param_dict, bool reset=True, bool retain_state=False)

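Sets parameters from a dictionary of keys and parameter Variables.

Parameters:

param_dict (dict) -- Key: string, value: Variable.
reset (bool) -- If true, all parameters are cleared before the new parameters are set. If false, parameters are overwritten or (if new) added.
retain_state (bool) -- Considered only when reset is false. If true and a key already exists (overwriting), the state associated with the key (such as momentum) is retained, provided that the shape of the existing parameter matches the shape of the new one.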
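
The interaction between reset and retain_state can be sketched with a toy model that follows the parameter descriptions above. ToySolver is a hypothetical stand-in written for this illustration, not an NNabla class; it tracks only parameter shapes and a placeholder state string per key.

```python
class ToySolver:
    """Hypothetical stand-in that tracks parameters and per-key state."""

    def __init__(self):
        self.params = {}  # key -> parameter shape (a tuple)
        self.states = {}  # key -> optimizer state, e.g. momentum

    def set_parameters(self, param_dict, reset=True, retain_state=False):
        if reset:
            # Drop everything that was registered before.
            self.params.clear()
            self.states.clear()
        for key, shape in param_dict.items():
            same_shape = self.params.get(key) == shape
            keep = (not reset) and retain_state and key in self.states and same_shape
            if not keep:
                self.states[key] = "fresh"  # state is (re)initialized
            self.params[key] = shape


solver = ToySolver()
solver.set_parameters({"w": (2, 3), "b": (3,)})
solver.states["w"] = "trained"
solver.states["b"] = "trained"

# Overwrite "w" with the same shape, "b" with a new shape, and add "c".
solver.set_parameters({"w": (2, 3), "b": (4,), "c": (1,)},
                      reset=False, retain_state=True)
```

In this sketch only "w" keeps its state, because it is the only key that already existed with a matching shape.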

set_states(self, states): Sets the states. Call set_parameters first to initialize the solver states; calling this method without doing so raises a value error.

set_states_from_protobuf(self, optimizer_proto)

Sets the states of the solver from a protobuf file.

Helper method used internally.

set_states_to_protobuf(self, optimizer)

Saves the states from the Optimizer to a protobuf file.

Helper method used internally.

setup(self, params): Deprecated. Call set_parameters with param_dict instead.

update(self, update_pre_hook=None, update_post_hook=None)

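When this function is called, the parameter values are updated using the gradients accumulated by backpropagation and stored in the grad field of the parameter Variables. The update rules are implemented in the C++ core, in the classes derived from Solver. The updated parameter values are stored in the data field of the parameter Variables.

Parameters: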

update_pre_hook (callable) -- This callable object is called immediately before each update of parameters. The default is None.
update_post_hook (callable) -- This callable object is called immediately after each update of parameters. The default is None.

weight_decay(self, float decay_rate, pre_hook=None, post_hook=None)

Apply weight decay to gradients.

When called, the gradients are decayed at a rate proportional to the current parameter values.

Parameters: decay_rate (float) -- The coefficient of weight decay.

Note

In solvers for which weight_decay_is_fused() returns true, the weight decay is not performed immediately when this method is called. Instead, the specified decay_rate is stored in the solver instance and lazily evaluated when update() is called. The stored decay rate expires after update() and reverts to 0 or to a default value specified when the solver was initialized (if one exists, e.g. SgdW). The definition of the weight decay operation depends on the solver class; refer to the documentation of each solver class.
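
The lazy evaluation described in the note can be sketched with a toy class. ToyFusedSgd is hypothetical and is not an NNabla solver. The decay rule used here (adding decay_rate times the parameter value to the gradient) is an assumption made for illustration, since the actual definition depends on the solver class.

```python
class ToyFusedSgd:
    """Hypothetical solver whose weight decay is fused into update()."""

    def __init__(self, lr, default_decay=0.0):
        self.lr = lr
        self.default_decay = default_decay
        self.pending_decay = default_decay

    def weight_decay(self, decay_rate):
        # Nothing is applied yet; the rate is only remembered.
        self.pending_decay = decay_rate

    def update(self, w, grad):
        # Assumed decay rule: add decay_rate * w to the gradient.
        decayed = [g + self.pending_decay * p for g, p in zip(grad, w)]
        new_w = [p - self.lr * g for p, g in zip(w, decayed)]
        # The stored rate expires after a single update.
        self.pending_decay = self.default_decay
        return new_w


solver = ToyFusedSgd(lr=0.5)
solver.weight_decay(0.5)
w1 = solver.update([1.0, -2.0], [0.0, 0.0])  # decay is applied here
w2 = solver.update(w1, [0.0, 0.0])           # rate has reverted to 0.0
```

Because the stored rate expires, weight_decay() has to be called before every update() for the decay to stay in effect, as in the training loop at the top of this page.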

weight_decay_is_fused(self)

Returns a boolean which represents whether weight decay is fused into update(), hence lazily evaluated.

See weight_decay() for more details.

zero_grad(self): Initializes the gradients of all registered parameters to zero.

List of solvers

nnabla.solvers.Sgd(lr=0.001)

Stochastic gradient descent (SGD) optimizer.

\[w_{t+1} \leftarrow w_t - \eta \Delta w_t\]

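Parameters: lr (float) -- Learning rate (\(\eta\)).
Returns: An instance of the Solver class. See the Solver API guide for details.
Return type: Solver

Note

By specifying a context, you can instantiate a preferred target implementation (such as CUDA) of a Solver of the given type. A context can be set with nnabla.set_default_context(ctx) or nnabla.context_scope(ctx). See the API documentation for details.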
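
The SGD update rule can be traced by hand on a tiny problem. The sketch below is plain Python rather than NNabla code; it reads \(\Delta w_t\) as the gradient of the loss with respect to \(w_t\), and the function name sgd_step is hypothetical.

```python
def sgd_step(w, grad, lr=0.001):
    """Apply w <- w - lr * grad element-wise and return the new weights."""
    return [p - lr * g for p, g in zip(w, grad)]


# Minimize f(w) = w^2, whose gradient is 2w, starting from w = 1.0.
w = [1.0]
for _ in range(3):
    grad = [2.0 * p for p in w]
    w = sgd_step(w, grad, lr=0.25)
```

With lr=0.25 each step halves the weight, so w moves from 1.0 to 0.5, 0.25, and 0.125.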

nnabla.solvers.Momentum(lr=0.001, momentum=0.9)

SGD with momentum.

\[\begin{split}v_t &\leftarrow \gamma v_{t-1} + \eta \Delta w_t\\ w_{t+1} &\leftarrow w_t - v_t\end{split}\]
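
The two lines of the momentum update can be followed step by step in plain Python. This is a sketch of the formula above, not NNabla code; the function name momentum_step is hypothetical, and \(\Delta w_t\) is read as the gradient.

```python
def momentum_step(w, v, grad, lr=0.001, momentum=0.9):
    """Return (new_w, new_v) after one momentum update."""
    # v_t <- momentum * v_{t-1} + lr * grad
    new_v = [momentum * vi + lr * g for vi, g in zip(v, grad)]
    # w_{t+1} <- w_t - v_t
    new_w = [p - vi for p, vi in zip(w, new_v)]
    return new_w, new_v


w, v = [1.0], [0.0]
w, v = momentum_step(w, v, [2.0], lr=0.25, momentum=0.5)  # v = 0.5, w = 0.5
w, v = momentum_step(w, v, [2.0], lr=0.25, momentum=0.5)  # v = 0.75, w = -0.25
```

The second step is larger than the first even though the gradient is the same, because half of the previous velocity is carried over.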

Parameters:

lr (float) -- Initial learning rate (\(\eta\)).
momentum (float) -- Decay rate of momentum (\(\gamma\)).

Returns:

An instance of the Solver class. See the Solver API guide for details.

Return type: