ImageNet Models

This subpackage provides a variety of pre-trained state-of-the-art models trained on the ImageNet dataset.

The pre-trained models can be used for both inference and training as follows:

# Create ResNet-18 for inference
import numpy as np
import nnabla as nn
from nnabla.models.imagenet import ResNet
model = ResNet(18)
batch_size = 1
# model.input_shape returns (3, 224, 224) for ResNet-18
x = nn.Variable((batch_size,) + model.input_shape)
y = model(x, training=False)

# Execute inference
# Load an input image as a uint8 array with shape (3, 224, 224)
from nnabla.utils.image_utils import imread
img = imread('example.jpg', size=model.input_shape[1:], channel_first=True)
x.d[0] = img
y.forward()
predicted_label = np.argmax(y.d[0])
print('Predicted label:', model.category_names[predicted_label])


# Create ResNet-18 for fine-tuning
import nnabla.functions as F
import nnabla.parametric_functions as PF
batch_size = 32
x = nn.Variable((batch_size,) + model.input_shape)
# * By training=True, batch normalization is set to training mode
#   and parameters are given trainable attributes (need_grad=True).
# * By use_up_to='pool', it creates a network up to the output of
#   the final global average pooling.
pool = model(x, training=True, use_up_to='pool')

# Add a classification layer for another 10 category dataset
# and loss function
num_classes = 10
y = PF.affine(pool, num_classes, name='classifier10')
t = nn.Variable((batch_size, 1))
loss = F.sum(F.softmax_cross_entropy(y, t))

# Training...
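
A minimal training loop can then follow the usual nnabla pattern. The sketch below is not part of this subpackage: data_iterator is a placeholder for any data source yielding image/label minibatches, and the solver choice and iteration count are arbitrary.

import nnabla.solvers as S

# Set up a solver over all trainable parameters (sketch).
solver = S.Momentum(lr=0.01, momentum=0.9)
solver.set_parameters(nn.get_parameters())

for i in range(1000):  # arbitrary number of iterations
    x.d, t.d = data_iterator.next()  # placeholder data source
    loss.forward()
    solver.zero_grad()
    loss.backward()
    solver.update()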

Available models are summarized in the following table. Error rates are calculated using a single center crop.

Available ImageNet models

Name             Class        Top-1 error (%)  Top-5 error (%)  Trained by/with
ResNet-18        ResNet       30.28            10.90            Neural Network Console
ResNet-34        ResNet       26.72            8.89             Neural Network Console
ResNet-50        ResNet       24.59            7.48             Neural Network Console
ResNet-101       ResNet       23.81            7.01             Neural Network Console
ResNet-152       ResNet       23.48            7.09             Neural Network Console
MobileNet        MobileNet    29.51            10.34            Neural Network Console
MobileNetV2      MobileNetV2  29.94            10.82            Neural Network Console
SENet-154        SENet        22.04            6.29             Neural Network Console
SqueezeNet v1.1  SqueezeNet   41.23            19.18            Neural Network Console

Common interfaces

class nnabla.models.imagenet.base.ImageNetBase[source]

Most ImageNet pretrained models inherit from this class, which provides some common interfaces.

__call__(input_var=None, use_from=None, use_up_to='classifier', training=False, force_global_pooling=False, check_global_pooling=True, returns_net=False, verbose=0)[source]

Create a network (computation graph) from a loaded model.

Parameters:
  • input_var (Variable, optional) – If given, the input variable is replaced with the given variable and a network is constructed on top of it. Otherwise, a variable with a batch size of 1 and the default shape from self.input_shape is used.
  • use_up_to (str) – The network is constructed up to a variable specified by a string. A list of string-variable correspondences for each model is described in the documentation of the model class.
  • training (bool) – This option enables additional training (fine-tuning, transfer learning, etc.) of the constructed network. If True, the batch_stat option in batch normalization is set to True, and the need_grad attribute of trainable variables (convolution weights, gamma and beta of batch normalization, etc.) is set to True. The default is False.
  • force_global_pooling (bool) – Regardless of the input image size, the final average pooling before the classification layer is automatically transformed into a global average pooling (see the sketch after this parameter list). The default is False.
  • check_global_pooling (bool) – If True, an exception is raised when the stride configuration of the final average pooling is not that of global pooling. The default is True. Use False when you want to pool with the trained stride (7, 7) regardless of the input spatial size.
  • returns_net (bool) – When True, an NnpNetwork object is returned. Otherwise, only the last variable of the constructed network is returned. The default is False.
  • verbose (bool or int) – Verbosity level. With 0, nothing is printed during network construction.
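
For instance, force_global_pooling allows feeding images larger than the default input size. The sketch below uses MobileNet and an arbitrary 448x448 input size for illustration:

import nnabla as nn
from nnabla.models.imagenet import MobileNet
model = MobileNet()
x = nn.Variable((1, 3, 448, 448))
# The final average pooling adapts to the larger spatial size instead of
# failing the trained (7, 7) stride check.
y = model(x, training=False, force_global_pooling=True)
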
category_names

Returns the category names of the 1000 ImageNet classes.

input_shape

Returns the default input image size (channel, height, width) as a tuple.
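
For example, a minimal sketch of these two attributes (the printed shape is typical, not guaranteed for every model):

from nnabla.models.imagenet import MobileNet
model = MobileNet()
print(model.input_shape)           # e.g. (3, 224, 224)
print(len(model.category_names))   # 1000 ImageNet classes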

List of models

class nnabla.models.imagenet.ResNet(num_layers=18)[source]

ResNet architectures with 18, 34, 50, 101, or 152 layers.

Parameters: num_layers (int) – Number of layers, chosen from 18, 34, 50, 101, and 152.

The following is a list of strings that can be specified for the use_up_to option in the __call__ method (an example follows the list):

  • 'classifier' (default): The output of the final affine layer for classification.
  • 'pool': The output of the final global average pooling.
  • 'lastconv': The input of the final global average pooling without ReLU activation.
  • 'lastconv+relu': Network up to 'lastconv' followed by ReLU activation.
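
For example, a ResNet-50 feature extractor could be built like this (a sketch):

import nnabla as nn
from nnabla.models.imagenet import ResNet
model = ResNet(50)
x = nn.Variable((1,) + model.input_shape)
# 'lastconv+relu' gives the activation map just before the global average pooling.
features = model(x, training=False, use_up_to='lastconv+relu')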

class nnabla.models.imagenet.MobileNet[source]

MobileNet architecture.

The following is a list of strings that can be specified for the use_up_to option in the __call__ method:

  • 'classifier' (default): The output of the final affine layer for classification.
  • 'pool': The output of the final global average pooling.
  • 'lastconv': The input of the final global average pooling without ReLU activation.
  • 'lastconv+relu': Network up to 'lastconv' followed by ReLU activation.

class nnabla.models.imagenet.MobileNetV2[source]

MobileNetV2 architecture.

The following is a list of strings that can be specified for the use_up_to option in the __call__ method:

  • 'classifier' (default): The output of the final affine layer for classification.
  • 'pool': The output of the final global average pooling.
  • 'lastconv': The input of the final global average pooling without ReLU activation.
  • 'lastconv+relu': Network up to 'lastconv' followed by ReLU activation.

class nnabla.models.imagenet.SENet[source]

SENet-154 model, which integrates SE blocks with a modified ResNeXt architecture.

The following is a list of strings that can be specified for the use_up_to option in the __call__ method:

  • 'classifier' (default): The output of the final affine layer for classification.
  • 'pool': The output of the final global average pooling.
  • 'lastconv': The input of the final global average pooling without ReLU activation.
  • 'lastconv+relu': Network up to 'lastconv' followed by ReLU activation.

class nnabla.models.imagenet.SqueezeNet[source]

SqueezeNet v1.1 model.

The following is a list of strings that can be specified for the use_up_to option in the __call__ method:

  • 'classifier' (default): The output of the final affine layer for classification.
  • 'pool': The output of the final global average pooling.
  • 'lastconv': The input of the final global average pooling without ReLU activation.
  • 'lastconv+relu': Network up to 'lastconv' followed by ReLU activation.
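
Since all of the model classes above share the same common interface, backbones can be swapped with minimal code changes, for example (a sketch):

import nnabla as nn
from nnabla.models.imagenet import SqueezeNet, MobileNetV2

for Model in (SqueezeNet, MobileNetV2):
    model = Model()
    x = nn.Variable((1,) + model.input_shape)
    # Feature vector after the final global average pooling.
    pool = model(x, training=False, use_up_to='pool')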
