Step size.
Rescaling factor applied to the gradient.
A small float that keeps the update numerically stable. Default value is 1e-7.
L2 regularization coefficient applied to all the weights.
Step size.
Sets an individual learning rate multiplier for each parameter.
If you specify a learning rate multiplier for a parameter, then
the learning rate for the parameter will be set as the product of
the global learning rate and its multiplier.
Note: The default learning rate multiplier of a Variable
can be set with the lr_mult argument in the constructor.
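The product rule described above can be sketched in a few lines. This is only an illustration, not the library's code: the function `effective_lr`, the dictionary layout, and the fallback multiplier of 1.0 for parameters without an entry are all assumptions made for the example.

```python
def effective_lr(global_lr, lr_mult, name):
    """Return the learning rate used for one parameter.

    lr_mult maps parameter name -> multiplier (hypothetical layout).
    Parameters without an entry fall back to 1.0, which is an assumption.
    """
    return global_lr * lr_mult.get(name, 1.0)


# Hypothetical parameter names, chosen only for illustration.
lr_mult = {"fc1_bias": 2.0}
bias_lr = effective_lr(0.1, lr_mult, "fc1_bias")      # global rate times 2.0
weight_lr = effective_lr(0.1, lr_mult, "fc1_weight")  # no entry, global rate unchanged
```

Keeping the multipliers in a separate mapping means the global learning rate can be changed, for example by a schedule, without touching the per-parameter settings.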
Sets an individual weight decay multiplier for each parameter.
By default, the weight decay multiplier is set to 0 for all
parameters whose name does not end with _weight or _gamma, if
you call the setIdx2Name method to set idx2name.
Note: The default weight decay multiplier for a Variable
can be set with its wd_mult argument in the constructor.
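The name-based default can be sketched as follows. This is an illustration under stated assumptions rather than the library's implementation: `default_wd_mult` and `effective_wd` are hypothetical helpers, the index-to-name mapping is modeled as a plain dictionary, and combining the multiplier with a global weight decay by multiplication is assumed by analogy with the learning rate multiplier.

```python
def default_wd_mult(idx2name):
    """Build default multipliers from an index -> name mapping.

    Names that do not end with _weight or _gamma get a multiplier of 0.
    """
    wd_mult = {}
    for name in idx2name.values():
        if not name.endswith(("_weight", "_gamma")):
            wd_mult[name] = 0.0
    return wd_mult


def effective_wd(global_wd, wd_mult, name):
    """Weight decay for one parameter; a missing entry means 1.0 (assumption)."""
    return global_wd * wd_mult.get(name, 1.0)


# Hypothetical parameter names, chosen only for illustration.
idx2name = {0: "fc1_weight", 1: "fc1_bias", 2: "bn_gamma", 3: "bn_beta"}
mults = default_wd_mult(idx2name)  # entries only for fc1_bias and bn_beta
```

With these defaults, `effective_wd(0.01, mults, "fc1_bias")` is zero, so the bias is not regularized, while `fc1_weight` keeps the full global coefficient.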
Update the parameters.
Update num_update.
(Since version 0.10.0) Use setLrMult instead.
AdaGrad optimizer as described in Matthew D. Zeiler, 2012 (the ADADELTA paper, which reviews AdaGrad). http://arxiv.org/pdf/1212.5701v1.pdf
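A minimal sketch of one AdaGrad-style update shows how a step size, a gradient rescaling factor, a stabilizing epsilon, and an L2 coefficient can enter the computation. It is one common formulation written on plain Python lists, not the library's implementation: the function name `adagrad_step`, the parameter names, the default step size, and the exact placement of epsilon and of the weight decay term are assumptions, and implementations differ on those details.

```python
import math


def adagrad_step(weight, grad, history, learning_rate=0.05,
                 rescale_gradient=1.0, epsilon=1e-7, wd=0.0):
    """Return updated (weight, history) lists after one AdaGrad-style step."""
    new_weight, new_history = [], []
    for w, g, h in zip(weight, grad, history):
        g = g * rescale_gradient              # rescale the raw gradient
        h = h + g * g                         # accumulate squared gradients
        # Per-coordinate scaled gradient plus an L2 penalty term (assumed form).
        step = g / math.sqrt(h + epsilon) + wd * w
        new_weight.append(w - learning_rate * step)
        new_history.append(h)
    return new_weight, new_history


# One step from a zero history: the accumulated value becomes the squared gradient.
w1, h1 = adagrad_step([1.0], [2.0], [0.0], learning_rate=0.1, epsilon=0.0)
```

Because the accumulated history only grows, the effective step for each coordinate shrinks over time, which is the behavior the epsilon term keeps well defined when the history is still zero.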