Sets an individual learning rate multiplier for each parameter.
If you specify a learning rate multiplier for a parameter, then
the learning rate for the parameter will be set as the product of
the global learning rate and its multiplier.
Note: The default learning rate multiplier of a Variable can be set with the lr_mult argument in the constructor.
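The product rule described above can be sketched in a few lines of Scala. This is a minimal illustration that does not call the library itself: `LrMultSketch`, `effectiveLr`, and the fallback multiplier of 1.0 for parameters without an explicit entry are assumptions made for the example.

```scala
object LrMultSketch {
  // Hypothetical helper: effective rate = global rate * per-parameter multiplier.
  // Assumption: a parameter with no explicit multiplier uses 1.0.
  def effectiveLr(globalLr: Double, lrMult: Map[String, Double], name: String): Double =
    globalLr * lrMult.getOrElse(name, 1.0)

  def main(args: Array[String]): Unit = {
    // Hypothetical parameter names, chosen only for illustration.
    val lrMult = Map("fc1_weight" -> 0.5)
    println(effectiveLr(0.02, lrMult, "fc1_weight")) // multiplier 0.5 applied
    println(effectiveLr(0.02, lrMult, "fc2_weight")) // no entry, global rate used
  }
}
```

Keeping the multipliers in a map keyed by parameter name means a layer can be slowed down or frozen (multiplier 0) without touching the global learning rate.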
Sets an individual weight decay multiplier for each parameter.
By default, the weight decay multiplier is set to 0 for all
parameters whose names don't end with _weight or _gamma, if
you call the setIdx2Name method to set idx2name.
Note: The default weight decay multiplier for a Variable can be set with its wd_mult argument in the constructor.
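The default rule for weight decay multipliers can be sketched as follows. This is a standalone illustration, not the library's code: `WdMultSketch`, `defaultWdMult`, and `buildWdMult` are hypothetical names, and the multiplier of 1 for names ending in _weight or _gamma is an assumption (the description above only states that all other names get 0).

```scala
object WdMultSketch {
  // Default rule: 0 unless the name ends with _weight or _gamma.
  // Assumption: matching names get a multiplier of 1.
  def defaultWdMult(name: String): Float =
    if (name.endsWith("_weight") || name.endsWith("_gamma")) 1f else 0f

  // Build a name -> multiplier table from an index -> name mapping,
  // letting explicit per-parameter values override the defaults.
  def buildWdMult(idx2name: Map[Int, String],
                  overrides: Map[String, Float] = Map.empty): Map[String, Float] =
    idx2name.values.map(n => n -> overrides.getOrElse(n, defaultWdMult(n))).toMap

  def main(args: Array[String]): Unit = {
    // Hypothetical parameter names, chosen only for illustration.
    val idx2name = Map(0 -> "fc1_weight", 1 -> "fc1_bias", 2 -> "bn1_gamma")
    println(buildWdMult(idx2name))
    println(buildWdMult(idx2name, Map("fc1_bias" -> 0.5f)))
  }
}
```

Under this rule, bias terms and similar parameters are excluded from weight decay unless a multiplier is set for them explicitly.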
Update the parameters.
Updates num_update.
Use setLrMult instead.
DCASGD optimizer with momentum and weight regularization. Implementation of the paper "Asynchronous Stochastic Gradient Descent with Delay Compensation for Distributed Deep Learning".
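As a rough illustration of what a delay-compensated step with momentum and weight decay looks like, here is a scalar sketch. It follows the update as it is commonly written for this method and is not taken from this class's source: `DcasgdSketch`, `State`, `step`, and the parameter name `lambda` for the compensation coefficient are assumptions made for the example.

```scala
object DcasgdSketch {
  // Per-parameter state: current weight, momentum buffer, and the weight
  // value remembered from the previous update.
  final case class State(weight: Double, mom: Double, prevWeight: Double)

  def step(s: State, grad: Double, lr: Double, momentum: Double,
           wd: Double, lambda: Double): State = {
    // Gradient plus weight decay plus the delay compensation term
    // lambda * g * g * (w - w_prev).
    val compensated =
      grad + wd * s.weight + lambda * grad * grad * (s.weight - s.prevWeight)
    val mom = momentum * s.mom - lr * compensated
    // Remember the pre-update weight for the next compensation term.
    State(weight = s.weight + mom, mom = mom, prevWeight = s.weight)
  }

  def main(args: Array[String]): Unit = {
    val s0 = State(weight = 1.0, mom = 0.0, prevWeight = 1.0)
    val s1 = step(s0, grad = 0.5, lr = 0.1, momentum = 0.9, wd = 0.0, lambda = 0.04)
    val s2 = step(s1, grad = 0.5, lr = 0.1, momentum = 0.9, wd = 0.0, lambda = 0.04)
    println(s1)
    println(s2)
  }
}
```

When the weight has not moved since the previous update, the compensation term vanishes and the step reduces to ordinary SGD with momentum and weight decay.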