class nbla::CachingAllocatorWithBucketsBase

class CachingAllocatorWithBucketsBase : public nbla::Allocator 

A base class of CachingAllocatorWithBuckets.

This class implements the caching logic but leaves instantiation of the Memory class to the virtual function CachingAllocatorWithBucketsBase::make_memory_impl. This makes it easy to realize the allocator with any Memory implementation, such as CpuMemory or CudaMemory. The caching algorithm is described below.
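
The division of labor described above can be sketched as follows. This is a minimal illustration, not the actual nbla interface: every `Sketch*` type is a hypothetical stand-in, and only the name `make_memory_impl` comes from the text; its signature here is an assumption.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>

// Hypothetical stand-in for a Memory class; not the real nbla::Memory.
struct SketchMemory {
  explicit SketchMemory(std::size_t bytes) : bytes_(bytes) {}
  virtual ~SketchMemory() = default;
  virtual std::string kind() const = 0;
  std::size_t bytes() const { return bytes_; }

private:
  std::size_t bytes_;
};

// Hypothetical CPU-backed memory.
struct SketchCpuMemory : SketchMemory {
  using SketchMemory::SketchMemory;
  std::string kind() const override { return "cpu"; }
};

// The base class would own the caching logic (omitted here) and defers
// construction of the concrete memory object to a virtual function.
class SketchCachingBase {
public:
  virtual ~SketchCachingBase() = default;

  std::shared_ptr<SketchMemory> alloc(std::size_t bytes) {
    assert(bytes > 0);
    // A real allocator would consult its pools before creating anything.
    return make_memory_impl(bytes);
  }

protected:
  // Assumed signature, for illustration only.
  virtual std::shared_ptr<SketchMemory> make_memory_impl(std::size_t bytes) = 0;
};

// A concrete allocator only has to say which memory type to create.
class SketchCpuCachingAllocator : public SketchCachingBase {
protected:
  std::shared_ptr<SketchMemory> make_memory_impl(std::size_t bytes) override {
    return std::make_shared<SketchCpuMemory>(bytes);
  }
};
```

A device-specific allocator would differ only in the type returned by its `make_memory_impl` override.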

## Caching previously requested memory in a memory pool

This allocator maintains a memory pool as a map from a requested memory configuration to a previously created Memory instance. A cached memory block is reused without a new allocation or free, which significantly reduces the overhead of memory allocation and deallocation and avoids, for example, the implicit synchronization of device execution queues in CUDA.
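
As a rough illustration of the reuse idea, the sketch below keys cached blocks by byte size only. `Block` and `TinyPool` are hypothetical names, and the real pool is keyed by a richer memory configuration than a bare size.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical block type that owns host storage.
struct Block {
  explicit Block(std::size_t n) : storage(n) {}
  std::vector<unsigned char> storage;
};

// Hypothetical pool: freed blocks are cached by size and handed out again.
class TinyPool {
public:
  std::unique_ptr<Block> alloc(std::size_t bytes) {
    assert(bytes > 0);
    auto it = cache_.find(bytes);
    if (it != cache_.end() && !it->second.empty()) {
      // Cache hit: reuse a block without touching the backend.
      std::unique_ptr<Block> block = std::move(it->second.back());
      it->second.pop_back();
      return block;
    }
    // Cache miss: this is the only path that allocates.
    ++backend_allocations_;
    return std::make_unique<Block>(bytes);
  }

  void free(std::unique_ptr<Block> block) {
    assert(block);
    const std::size_t bytes = block->storage.size();
    // Keep the block for later reuse instead of releasing it.
    cache_[bytes].push_back(std::move(block));
  }

  int backend_allocations() const { return backend_allocations_; }

private:
  std::map<std::size_t, std::vector<std::unique_ptr<Block>>> cache_;
  int backend_allocations_ = 0;
};
```

Allocating, freeing, and allocating the same size again touches the backend only once in this sketch, which is the saving the paragraph above describes.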

## Size-dependent memory pools

The memory pool is kept as two separate pools, one for small and one for large memory blocks. By default, a request smaller than 1MB is treated as a small block; anything else is treated as a large block.

## Rounding rules of memory size

A requested memory size is rounded to a multiple of round_small_ (512B by default) for small blocks or round_large_ (128KB by default) for large blocks.
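
The size classification and rounding can be sketched with the default values quoted in the text. Two points are assumptions: rounding goes up to the next multiple, and a request is classified before it is rounded. The constant and function names are hypothetical.

```cpp
#include <cassert>
#include <cstddef>

// Defaults quoted in the text; the names are illustrative.
constexpr std::size_t kSmallAlloc = 1 << 20;    // 1MB boundary between pools
constexpr std::size_t kRoundSmall = 512;        // 512B rounding unit
constexpr std::size_t kRoundLarge = 128 << 10;  // 128KB rounding unit

// A request below the boundary belongs to the small pool.
inline bool is_small(std::size_t bytes) { return bytes < kSmallAlloc; }

// Round up to the next multiple of the unit for the request's size class.
inline std::size_t round_size(std::size_t bytes) {
  assert(bytes > 0);
  const std::size_t unit = is_small(bytes) ? kRoundSmall : kRoundLarge;
  return (bytes + unit - 1) / unit * unit;
}
```

Under these assumptions a 513-byte request becomes 1024 bytes, and a request of 1MB plus one byte becomes 1MB plus 128KB.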

## Creation rules

If no previously created memory block larger than the requested size is found, a new Memory instance is created. If one or more are found, the smallest of them is used after applying the split rule below.
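
A best-fit lookup of this kind can be sketched with an ordered container of free block sizes. The names `Lookup` and `acquire` are hypothetical, and the sketch assumes that a cached block of exactly the requested size also counts as a match.

```cpp
#include <cassert>
#include <cstddef>
#include <set>

// Result of a lookup: the size of the block used and whether it is new.
struct Lookup {
  std::size_t block_bytes;
  bool newly_created;
};

// free_sizes holds the sizes of cached blocks in ascending order.
inline Lookup acquire(std::multiset<std::size_t>& free_sizes,
                      std::size_t rounded_bytes) {
  assert(rounded_bytes > 0);
  // The first element that is not smaller than the request is the
  // smallest block that fits.
  auto it = free_sizes.lower_bound(rounded_bytes);
  if (it == free_sizes.end()) {
    // Nothing fits: the caller would create a new Memory instance.
    return Lookup{rounded_bytes, true};
  }
  const std::size_t found = *it;
  free_sizes.erase(it);  // The block leaves the pool while in use.
  return Lookup{found, false};
}
```

With cached sizes of 512, 2048, and 4096 bytes, a 1024-byte request takes the 2048-byte block, while an 8192-byte request falls through to creation.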

## Split rules

If the size of the found memory block is greater than or equal to round_small_ (512B by default) for a small block, or small_alloc_ + 1 (small_alloc_ is 1MB by default) for a large block, the block is split in two at an offset equal to the requested size after rounding. The first part is used, and the second part is returned to the pool.
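
The split step can be sketched as follows, using the thresholds quoted above. The names `Span`, `split_threshold`, and `split_block` are hypothetical. The comparison against the found block's size follows the text literally; the extra check that the remainder is non-empty is an assumption added so that the sketch never returns an empty block to the pool.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>

// A region inside a larger allocation.
struct Span {
  std::size_t offset;
  std::size_t bytes;
};

constexpr std::size_t kRoundSmall = 512;      // default round_small_
constexpr std::size_t kSmallAlloc = 1 << 20;  // default small_alloc_

inline std::size_t split_threshold(bool small) {
  return small ? kRoundSmall : kSmallAlloc + 1;
}

// Returns {used, remainder}. A remainder of 0 bytes means no split happened
// and the whole block is used.
inline std::pair<Span, Span> split_block(const Span& found,
                                         std::size_t rounded_bytes,
                                         bool small) {
  assert(rounded_bytes > 0 && rounded_bytes <= found.bytes);
  const std::size_t rest = found.bytes - rounded_bytes;
  if (found.bytes < split_threshold(small) || rest == 0) {
    return std::make_pair(found, Span{found.offset + found.bytes, 0});
  }
  // First part is used; second part goes back to the pool.
  return std::make_pair(Span{found.offset, rounded_bytes},
                        Span{found.offset + rounded_bytes, rest});
}
```

For a 2048-byte small block and a 512-byte rounded request, this yields a used span of 512 bytes at offset 0 and a 1536-byte remainder at offset 512.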