class nbla::ReduceSetup
-
class ReduceSetup
This class can be used in Function::setup to prepare for the reduction CUDA kernel called in Function::forward or Function::backward.
The input shape is collapsed into two dimensions (size_y, size_x) according to the reduction axes: one part collects the reduction axes and the other part collects the remaining axes. The names x and y do not indicate which part is reduced; they are chosen to make the implemented algorithm easier to follow, as follows.
x: the collapsed dimensions that include the memory-contiguous dimension.
y: the other collapsed dimensions.
For example, let the input shape be (2, 3, 4, 5) and the reduction axes be (0, 2). The x part has dimensions (3, 5) and the y part has dimensions (2, 4). Then
(ndim_y, ndim_x) = (2, 2)
(size_y, size_x) = (8, 15).
The original strides are (60, 20, 5, 1). Then
strides_x_input = (20, 1)
strides_y_input = (60, 5)
strides_x = (5, 1), which are the strides of the x-part shape (3, 5)
strides_y = (4, 1), which are the strides of the y-part shape (2, 4)
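The example above can be reproduced with a small host-side sketch. It is not the actual nbla::ReduceSetup implementation: the struct CollapsedShape and the functions contiguous_strides, collapse_for_reduction, partial_offset and input_offset are hypothetical names introduced only for illustration. The sketch assumes a row-major (C-contiguous) input. How the two kinds of strides are combined in input_offset is also an assumption about their typical use, not something stated in this page.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical container; the real nbla::ReduceSetup members may differ.
struct CollapsedShape {
  std::vector<int64_t> shape_x, shape_y;
  std::vector<int64_t> strides_x_input, strides_y_input;
  std::vector<int64_t> strides_x, strides_y;
  int64_t size_x = 1, size_y = 1;
};

// Row-major strides of a shape, e.g. (2, 3, 4, 5) -> (60, 20, 5, 1).
std::vector<int64_t> contiguous_strides(const std::vector<int64_t> &shape) {
  std::vector<int64_t> strides(shape.size(), 1);
  for (int i = static_cast<int>(shape.size()) - 2; i >= 0; --i) {
    strides[i] = strides[i + 1] * shape[i + 1];
  }
  return strides;
}

// Split the axes into the x part (the group containing the last,
// memory-contiguous axis) and the y part (all other axes).
CollapsedShape collapse_for_reduction(const std::vector<int64_t> &shape,
                                      const std::vector<int> &reduce_axes) {
  const int ndim = static_cast<int>(shape.size());
  std::vector<bool> is_reduced(ndim, false);
  for (int axis : reduce_axes) {
    is_reduced[axis] = true;
  }
  const bool x_is_reduced = ndim > 0 && is_reduced[ndim - 1];
  const std::vector<int64_t> strides = contiguous_strides(shape);

  CollapsedShape out;
  for (int i = 0; i < ndim; ++i) {
    if (is_reduced[i] == x_is_reduced) {
      out.shape_x.push_back(shape[i]);
      out.strides_x_input.push_back(strides[i]);
      out.size_x *= shape[i];
    } else {
      out.shape_y.push_back(shape[i]);
      out.strides_y_input.push_back(strides[i]);
      out.size_y *= shape[i];
    }
  }
  out.strides_x = contiguous_strides(out.shape_x);
  out.strides_y = contiguous_strides(out.shape_y);
  return out;
}

// Decompose a flat index with `strides` and re-project the coordinates
// onto the original input with `strides_input`.
int64_t partial_offset(int64_t flat, const std::vector<int64_t> &strides,
                       const std::vector<int64_t> &strides_input) {
  int64_t offset = 0;
  for (std::size_t i = 0; i < strides.size(); ++i) {
    const int64_t coord = flat / strides[i];
    flat -= coord * strides[i];
    offset += coord * strides_input[i];
  }
  return offset;
}

// Offset into the original input for the collapsed index (idx_y, idx_x).
int64_t input_offset(const CollapsedShape &s, int64_t idx_y, int64_t idx_x) {
  return partial_offset(idx_y, s.strides_y, s.strides_y_input) +
         partial_offset(idx_x, s.strides_x, s.strides_x_input);
}
```

Calling collapse_for_reduction({2, 3, 4, 5}, {0, 2}) gives (size_y, size_x) = (8, 15) together with the four stride vectors listed above. In this sketch the collapsed index (5, 7) corresponds to the original coordinates (1, 1, 1, 2), so input_offset returns 60 + 20 + 5 + 2 = 87.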