class nbla::SwapInOutScheduler

A class which manages GPU memory usage and schedules swap in/out throughout network computation.

If GPU memory is insufficient to train your model, SwapInOutScheduler enables you to train it quickly and efficiently by using a memory swapping strategy.

This class schedules the timing to swap out tensors in order to avoid out-of-memory errors, and to swap them back in before they are reused in computation.

The schedule is based on the usage order of tensors in the first training iteration. This means that any deviation from that order in later iterations will cause a slowdown.

A scheduler takes the size of GPU memory that you want it to manage. For example, when you can use up to 4 GB of GPU memory, the initialization is

SwapInOutScheduler scheduler(cpu_ctx, gpu_ctx, 4e9);
If an out-of-memory error still occurs with this configuration, a gradual reduction of the 4e9 budget could solve the problem; for example, let the next size be 3.5e9.
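
When a suitable budget is unknown in advance, one option is to derive it from the memory actually free on the device. The following is a minimal sketch, assuming the CUDA runtime API is available; the 0.9 safety factor is an arbitrary headroom choice, not a value recommended by the library.

#include <cuda_runtime.h>

size_t free_bytes, total_bytes;
cudaMemGetInfo(&free_bytes, &total_bytes);
// Keep some headroom for allocations outside the scheduler's control.
const size_t budget = static_cast<size_t>(free_bytes * 0.9);
SwapInOutScheduler scheduler(cpu_ctx, gpu_ctx, budget);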

This scheduler can be used easily by enclosing a training block between SwapInOutScheduler::start_scheduling() and SwapInOutScheduler::end_scheduling(). You also need to set the callback functions as the arguments of forward, backward, and update. For example, in a training loop,

scheduler.start_scheduling();
// Input next data and label in this line.
loss->forward(false, true, nullptr,
              [&](const CgFunctionPtr &ptr) { scheduler.pre_function_callback(ptr); },
              [&](const CgFunctionPtr &ptr) { scheduler.post_function_callback(ptr); });
loss->variable()->grad()->fill(1.0);
loss->backward(nullptr, true, {},
               [&](const CgFunctionPtr &ptr) { scheduler.pre_function_callback(ptr); },
               [&](const CgFunctionPtr &ptr) { scheduler.post_function_callback(ptr); });
adam->update([&]() { scheduler.pre_update_callback(); },
             [&]() { scheduler.post_update_callback(); });
scheduler.end_scheduling();
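
Because the schedule is recorded in the first iteration and replayed in the following ones, the start/end pair must enclose every iteration of the training loop. The sketch below factors the hooks into reusable lambdas; it assumes a hypothetical max_iter iteration count and the loss, adam, and scheduler objects from the example above.

auto pre_hook = [&](const CgFunctionPtr &ptr) { scheduler.pre_function_callback(ptr); };
auto post_hook = [&](const CgFunctionPtr &ptr) { scheduler.post_function_callback(ptr); };

for (int iter = 0; iter < max_iter; iter++) {
  scheduler.start_scheduling(); // The first iteration records the usage order.
  // Input next data and label here.
  loss->forward(false, true, nullptr, pre_hook, post_hook);
  loss->variable()->grad()->fill(1.0);
  loss->backward(nullptr, true, {}, pre_hook, post_hook);
  adam->update([&]() { scheduler.pre_update_callback(); },
               [&]() { scheduler.post_update_callback(); });
  scheduler.end_scheduling(); // Later iterations replay the recorded schedule.
}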

Public Functions

NBLA_API SwapInOutScheduler(const Context &h_ctx, const Context &d_ctx, const size_t max, const size_t prefetch_max = 0, const bool save_host_mem = true, const bool save_host_mem_no_abort = false)

Constructor.

@param h_ctx Host context used as the destination of swap-out.
@param d_ctx Device context.
@param max Maximum GPU memory size managed by this class [bytes].
@param prefetch_max Maximum prefetch length.
@param save_host_mem The flag to switch the prefetch scheme to one that saves host memory.
@param save_host_mem_no_abort If true, an irregular off-schedule access does not abort the program when a cast prefetch has irreversibly changed the type of an array.
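
For reference, a fully spelled-out construction might look as follows. This is a sketch; the context setup follows common NNabla usage and is an assumption, not something mandated by this class.

Context cpu_ctx{{"cpu:float"}, "CpuCachedArray", "0"};   // swap-out destination
Context gpu_ctx{{"cuda:float"}, "CudaCachedArray", "0"}; // managed device
SwapInOutScheduler scheduler(cpu_ctx, gpu_ctx,
                             4e9,    // max: GPU memory budget [bytes]
                             0,      // prefetch_max (default)
                             true,   // save_host_mem (default)
                             false); // save_host_mem_no_abort (default)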

NBLA_API ~SwapInOutScheduler()

Destructor.

NBLA_API void start_scheduling()

This initializes the scheduler and starts the management of GPU memory in this iteration.

NBLA_API void end_scheduling()

This finalizes the scheduler and stops the management of GPU memory in this iteration.

NBLA_API void pre_function_callback(const CgFunctionPtr &ptr)

To use the scheduler, this callback must be set in the pre-function-hook arguments of the forward and backward functions.

NBLA_API void post_function_callback(const CgFunctionPtr &ptr)

To use the scheduler, this callback must be set in the post-function-hook arguments of the forward and backward functions.

NBLA_API void pre_update_callback()

To use the scheduler, this callback must be set in the pre-update-hook argument of the update method of a solver.

NBLA_API void post_update_callback()

To use the scheduler, this callback must be set in the post-update-hook argument of the update method of a solver.