class nbla::ReduceSetup

class ReduceSetup

This class can be used in Function::setup to prepare for the reduction CUDA kernel called in Function::forward or Function::backward.

The input shape is collapsed into two dimensions (size_y, size_x) according to the reduction axes: one part collects the reduction axes and the other part collects the remaining axes. The names x and y do not indicate which part is reduced; they are assigned as follows to make the implemented algorithm easier to follow.

x: the collapsed dimensions that include the memory-contiguous (innermost) dimension.
y: the remaining dimensions.

For example, take an input shape (2, 3, 4, 5) and reduction axes (0, 2). The x part consists of the dimensions (3, 5) and the y part of the dimensions (2, 4). Then

(ndim_y, ndim_x) = (2, 2)
(size_y, size_x) = (8, 15).

The original strides are (60, 20, 5, 1). Then

strides_x_input = (20, 1)
strides_y_input = (60, 5)
strides_x = (5, 1), which is the strides of the x-part shape (3, 5)
strides_y = (4, 1), which is the strides of the y-part shape (2, 4)
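
The worked example above can be reproduced with a small host-side sketch. This is not the library's implementation: the names Shape, ReduceSplit, contiguous_strides and split_axes are hypothetical, and the sketch assumes a row-major (C-contiguous) input and non-negative, in-range reduction axes.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

using Shape = std::vector<int64_t>;  // hypothetical stand-in for Shape_t

// Hypothetical container for the quantities named in the text above.
struct ReduceSplit {
  Shape shape_x, shape_y;
  Shape strides_x_input, strides_y_input;
  Shape strides_x, strides_y;
  int64_t size_x = 1, size_y = 1;
};

// Row-major (C-contiguous) strides of a shape.
Shape contiguous_strides(const Shape &shape) {
  Shape strides(shape.size(), 1);
  for (int i = static_cast<int>(shape.size()) - 2; i >= 0; --i) {
    strides[i] = strides[i + 1] * shape[i + 1];
  }
  return strides;
}

// Split the axes into the x part (the group containing the last,
// memory-contiguous axis) and the y part (all other axes).
ReduceSplit split_axes(const Shape &shape, const Shape &reduce_axes) {
  const int ndim = static_cast<int>(shape.size());
  const Shape strides = contiguous_strides(shape);

  std::vector<bool> is_reduced(ndim, false);
  for (int64_t axis : reduce_axes) {
    assert(0 <= axis && axis < ndim);  // assumption: axes already normalized
    is_reduced[axis] = true;
  }

  // x is whichever group (reduced or not) owns the last axis.
  const bool last_is_reduced = ndim > 0 && is_reduced[ndim - 1];

  ReduceSplit s;
  for (int i = 0; i < ndim; ++i) {
    if (is_reduced[i] == last_is_reduced) {
      s.shape_x.push_back(shape[i]);
      s.strides_x_input.push_back(strides[i]);
      s.size_x *= shape[i];
    } else {
      s.shape_y.push_back(shape[i]);
      s.strides_y_input.push_back(strides[i]);
      s.size_y *= shape[i];
    }
  }
  s.strides_x = contiguous_strides(s.shape_x);
  s.strides_y = contiguous_strides(s.shape_y);
  return s;
}
```

Calling split_axes({2, 3, 4, 5}, {0, 2}) yields the sizes and strides listed in the example. In this sketch, x is the non-reduced part because the last axis (3) is not among the reduction axes; with reduction axes (1, 3) the roles would swap.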

Public Functions

void operator()(const Shape_t &shape_input, const Shape_t &reduce_axes)

Setup operator.

An empty reduce_axes is acceptable; the input is then simply copied without any reduction.
Negative values in reduce_axes are acceptable. A negative axis counts backward from the last axis, so -1 refers to the last axis.
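
The negative-axis convention can be illustrated with a short sketch. The helper normalize_axes is hypothetical and is not part of ReduceSetup; it only shows one common way to map negative axes into the range [0, ndim), under the assumption that every axis lies in [-ndim, ndim).

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical helper: map possibly negative axes to [0, ndim).
// An axis a < 0 is interpreted as a + ndim, so -1 is the last axis.
std::vector<int64_t> normalize_axes(const std::vector<int64_t> &axes,
                                    int64_t ndim) {
  std::vector<int64_t> normalized;
  normalized.reserve(axes.size());
  for (int64_t axis : axes) {
    const int64_t a = axis < 0 ? axis + ndim : axis;
    assert(0 <= a && a < ndim);  // assumption: input axes are in range
    normalized.push_back(a);
  }
  return normalized;
}
```

For a 4-dimensional input, the axes (-4, -2) map to (0, 2), which are the reduction axes used in the example above. An empty list stays empty, corresponding to the copy-without-reduction case.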