Parameter and Block Naming¶
In Gluon, each Parameter or Block has a name. Parameter and Block names can be created automatically.
In this tutorial, we discuss best practices for naming. First, let's import MXNet and Gluon:
[1]:
from __future__ import print_function
import mxnet as mx
from mxnet import gluon
Naming Blocks¶
When creating a block, you can simply do as follows:
[2]:
mydense = gluon.nn.Dense(100)
print(mydense.__class__.__name__)
Dense
When you create more Blocks of the same kind, they keep the same class name; in MXNet 2, Blocks are no longer given unique names with incrementing suffixes:
[3]:
dense1 = gluon.nn.Dense(100)
print(dense1.__class__.__name__)
Dense
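Although both blocks report the same class name, they are still distinct objects, each owning its own parameters. As a minimal sketch (reusing the mydense and dense1 blocks defined above):
# The two Dense blocks share a class name, but each owns its own parameters
print(mydense is dense1)                # False: distinct Block objects
print(mydense.weight is dense1.weight)  # False: distinct Parameter objects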
Naming Parameters¶
Parameters are automatically given a unique name in the format param_{uuid4}_{name}:
[4]:
param = gluon.Parameter(name='bias')
print(param.name)
bias
param.name is used as the name of the parameter's symbol representation, and it cannot be changed once the parameter is created.
When getting parameters within a Block, you should use the structure-based name as the key:
[5]:
print(dense1.collect_params())
{'weight': Parameter (shape=(100, -1), dtype=float32), 'bias': Parameter (shape=(100,), dtype=float32)}
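The structure-based key (e.g. 'weight') describes a parameter's position within the Block, while param.name is its unique internal name. A minimal sketch to print both side by side, assuming the dense1 block from above:
# Map each structure-based key to the parameter's internal name
for key, param in dense1.collect_params().items():
    print(key, '->', param.name)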
Nested Blocks¶
In MXNet 2, we no longer have to define child Blocks within a name_scope. Let's demonstrate this by defining and initializing a simple neural net:
[6]:
class Model(gluon.HybridBlock):
    def __init__(self):
        super(Model, self).__init__()
        self.dense0 = gluon.nn.Dense(20)
        self.dense1 = gluon.nn.Dense(20)
        self.mydense = gluon.nn.Dense(20)

    def forward(self, x):
        x = mx.npx.relu(self.dense0(x))
        x = mx.npx.relu(self.dense1(x))
        return mx.npx.relu(self.mydense(x))

model0 = Model()
model0.initialize()
model0.hybridize()
model0(mx.np.zeros((1, 20)))
[6]:
array([[0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.,
0., 0., 0., 0.]])
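Inside a nested Block, the structure-based parameter names are composed from the attribute names along the path, joined by dots. A minimal sketch to inspect them, assuming the model0 instance from above:
# Structure-based keys follow the attribute path, e.g. 'dense0.weight'
print(model0.collect_params().keys())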
The same principle also applies to container blocks like Sequential. We can simply do as follows:
[7]:
net = gluon.nn.Sequential()
net.add(gluon.nn.Dense(20))
net.add(gluon.nn.Dense(20))
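Children added to a container are registered by their position, so their structure-based parameter names start with an index instead of an attribute name. A minimal sketch, assuming the net container from above:
# Container children are keyed by insertion order, e.g. '0.weight', '1.weight'
print(net.collect_params().keys())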
Saving and loading¶
For HybridBlock, we use save_parameters/load_parameters, which match parameters by model structure instead of by parameter name.
[8]:
model1 = Model()
model0.save_parameters('model.params')
model1.load_parameters('model.params')
print(mx.npx.load('model.params').keys())
dict_keys(['dense0.bias', 'dense0.weight', 'dense1.bias', 'dense1.weight', 'mydense.bias', 'mydense.weight'])
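Since matching is structural, model1 picks up model0's values by their position in the network, regardless of each parameter's unique internal name. A quick sketch to verify, assuming model0 and model1 from above:
# After loading, model1 holds the same parameter values as model0
print((model0.dense0.weight.data() == model1.dense0.weight.data()).all())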
For SymbolBlock.imports, we use export, which uses the parameter name param.name to save parameters.
[9]:
model0.export('model0')
model2 = gluon.SymbolBlock.imports('model0-symbol.json', ['data'], 'model0-0000.params')
/work/mxnet/python/mxnet/gluon/block.py:1943: UserWarning: Cannot decide type for the following arguments. Consider providing them as input:
data: None
input_sym_arg_type = in_param.infer_type()[0]
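Accordingly, the exported parameter file is keyed by param.name rather than by model structure. A minimal sketch to inspect it, assuming the model0-0000.params file written above:
# Keys in the exported file are derived from param.name
print(mx.npx.load('model0-0000.params').keys())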
Replacing Blocks from networks and fine-tuning¶
Sometimes you may want to load a pretrained model, and replace certain Blocks in it for fine-tuning.
For example, ResNet-50 in the model zoo has 1000 output dimensions, but perhaps you only have 100 classes in your application.
To see how to do this, we first load a ResNet from the model zoo (pass pretrained=True to also download the pretrained weights).
In the Gluon model zoo, all image classification models follow the convention that the feature extraction layers are named features while the output layer is named output. Note that the output layer is a Dense block with 1000-dimensional output.
[10]:
resnet = gluon.model_zoo.vision.resnet50_v2()
print(resnet.output)
Dense(2048 -> 1000, linear)
To change the output to 100 dimensions, we replace it with a new Dense block.
[11]:
resnet.output = gluon.nn.Dense(100)
resnet.output.initialize()
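Only the new output layer needs initialization; the feature layers keep their existing weights. If you want gradients computed only for the new 100-class layer during fine-tuning, one common pattern is to freeze the feature extractor. A minimal sketch:
# Freeze the pretrained feature layers so only the new output layer
# is updated during fine-tuning
for param in resnet.features.collect_params().values():
    param.grad_req = 'null'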