Parameter and Block Naming

In Gluon, each Parameter and Block has a name. Parameter and Block names can be created automatically.

In this tutorial we discuss best practices for naming. First, let's import MXNet and Gluon:

[1]:
from __future__ import print_function
import mxnet as mx
from mxnet import gluon

Naming Blocks

When creating a Block, you can simply do as follows:

[2]:
mydense = gluon.nn.Dense(100)
print(mydense.__class__.__name__)
Dense

In MXNet 2, creating more Blocks of the same kind does not append incrementing suffixes to their names. Blocks of the same class share the class name and are distinguished by their position in the network structure (the structure-based names discussed below):

[3]:
dense1 = gluon.nn.Dense(100)
print(dense1.__class__.__name__)
Dense

Naming Parameters

Parameters are automatically assigned a unique name in the format param_{uuid4}_{name}:

[4]:
param = gluon.Parameter(name='bias')
print(param.name)
bias

param.name is used as the name of the parameter's symbol representation, and it cannot be changed once the parameter is created.
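
Since the name cannot be changed later, pick it when you construct the Parameter. A minimal sketch (the name and shape here are illustrative assumptions, not part of the examples above):

# The name is fixed at construction; to use a different name, create a new
# Parameter. shape=(2,) is an illustrative assumption.
p = gluon.Parameter(name='scale', shape=(2,))
p.initialize()
print(p.name, p.data().shape)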

When getting parameters within a Block, you should use the structure-based name as the key:

[5]:
print(dense1.collect_params())
{'weight': Parameter (shape=(100, -1), dtype=float32), 'bias': Parameter (shape=(100,), dtype=float32)}
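
For example, you can look up a single parameter by indexing the returned dict with its structure-based key:

# Fetch one Parameter by its structure-based key ('weight' or 'bias').
weight = dense1.collect_params()['weight']
print(weight)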

Nested Blocks

In MXNet 2, we no longer have to define child Blocks within a name_scope. Let's demonstrate this by defining and initializing a simple neural net:

[6]:
class Model(gluon.HybridBlock):
    def __init__(self):
        super(Model, self).__init__()
        self.dense0 = gluon.nn.Dense(20)
        self.dense1 = gluon.nn.Dense(20)
        self.mydense = gluon.nn.Dense(20)

    def forward(self, x):
        x = mx.npx.relu(self.dense0(x))
        x = mx.npx.relu(self.dense1(x))
        return mx.npx.relu(self.mydense(x))

model0 = Model()
model0.initialize()
model0.hybridize()
model0(mx.np.zeros((1, 20)))
[6]:
array([[0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.,
        0., 0., 0., 0.]])
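
The parameters of the nested model get structure-based names that mirror the attribute paths (dense0, dense1, mydense). You can inspect them through collect_params():

# Structure-based parameter names follow the attribute paths of Model,
# e.g. 'dense0.weight' (see the keys printed in the save/load section below).
print(model0.collect_params().keys())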

The same principle also applies to container Blocks such as Sequential. We can simply do as follows:

[7]:
net = gluon.nn.Sequential()
net.add(gluon.nn.Dense(20))
net.add(gluon.nn.Dense(20))
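
Since a Sequential registers its children by position, the structure-based keys are prefixed with each child's index (an assumption based on the same scheme shown above); a quick check:

# Children of a Sequential are addressed by index, so keys should look
# like '0.weight', '0.bias', '1.weight', '1.bias' (assumed, per the
# structure-based scheme).
print(net.collect_params().keys())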

Saving and loading

For HybridBlock, we use save_parameters/load_parameters, which match parameters by model structure instead of by parameter name.

[8]:
model1 = Model()
model0.save_parameters('model.params')
model1.load_parameters('model.params')
print(mx.npx.load('model.params').keys())
dict_keys(['dense0.bias', 'dense0.weight', 'dense1.bias', 'dense1.weight', 'mydense.bias', 'mydense.weight'])
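
Because matching is structural, the reloaded model should behave exactly like the saved one. A small sanity check (reusing the input shape from the example above):

# Both models should produce identical outputs once the parameters match.
x = mx.np.zeros((1, 20))
print((model0(x) == model1(x)).all())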

For SymbolBlock.imports, we use export to save the parameters, which are stored under their parameter names (param.name).

[9]:
model0.export('model0')
model2 = gluon.SymbolBlock.imports('model0-symbol.json', ['data'], 'model0-0000.params')
/work/mxnet/python/mxnet/gluon/block.py:1943: UserWarning: Cannot decide type for the following arguments. Consider providing them as input:
        data: None
  input_sym_arg_type = in_param.infer_type()[0]
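
A quick sanity check that the imported SymbolBlock reproduces the original model (assuming the same input shape as above):

# The imported SymbolBlock should accept the same input and return an
# output of the same shape as model0, i.e. (1, 20).
print(model2(mx.np.zeros((1, 20))).shape)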

Replacing Blocks in networks and fine-tuning

Sometimes you may want to load a pretrained model, and replace certain Blocks in it for fine-tuning.

For example, resnet50_v2 in the model zoo has 1000 output dimensions, but maybe you only have 100 classes in your application.

To see how to do this, we first load a pretrained ResNet.

  • In the Gluon model zoo, all image classification models follow the convention that the feature extraction layers are named features while the output layer is named output.

  • Note that the output layer is a Dense block with a 1000-dimensional output.

[10]:
resnet = gluon.model_zoo.vision.resnet50_v2()
print(resnet.output)
Dense(2048 -> 1000, linear)

To change the output to 100 dimensions, we replace it with a new Block.

[11]:
resnet.output = gluon.nn.Dense(100)
resnet.output.initialize()
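
From here, a typical fine-tuning setup freezes the pretrained feature layers and trains only the newly initialized output layer. A hedged sketch (the optimizer and learning rate are illustrative assumptions, not part of this tutorial):

# Freeze the pretrained feature extractor: no gradients will be computed
# or applied for these parameters.
for param in resnet.features.collect_params().values():
    param.grad_req = 'null'

# Train only the new output layer (hyperparameters are illustrative).
trainer = gluon.Trainer(resnet.collect_params(), 'sgd', {'learning_rate': 0.01})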