Float, Step size.
Float, rescaling factor of gradient.
Float, decay factor of the moving averages of the gradient and of gradient^2.
Float, momentum factor of moving average for gradient.
Float, L2 regularization coefficient added to all the weights.
The learning rate scheduler
Float, clip gradient to the range [-clip_gradient, clip_gradient].
Sets an individual learning rate multiplier for each parameter.
If you specify a learning rate multiplier for a parameter, then
the learning rate for the parameter will be set as the product of
the global learning rate and its multiplier.
note:: The default learning rate multiplier of a Variable
can be set with the lr_mult argument in the constructor.
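As a rough illustration of the rule above, the following plain-Python sketch computes an effective per-parameter learning rate as the product of the global learning rate and a multiplier. The function name `effective_lr`, the dictionary layout, the parameter names, and the fallback multiplier of 1.0 for parameters without an entry are all assumptions made for illustration; this is not the library's API.

```python
def effective_lr(global_lr, lr_mult, name):
    """Return global_lr scaled by the multiplier registered for `name`.

    Parameters without an entry fall back to 1.0 (an assumption of this sketch).
    """
    return global_lr * lr_mult.get(name, 1.0)


# Hypothetical parameter names and multipliers.
lr_mult = {"fc1_weight": 0.5, "fc1_bias": 2.0}
rates = {
    name: effective_lr(0.25, lr_mult, name)
    for name in ("fc1_weight", "fc1_bias", "fc2_weight")
}
```

Here `fc2_weight` has no multiplier of its own, so in this sketch it simply uses the global rate.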
Sets an individual weight decay multiplier for each parameter.
By default, the weight decay multiplier is set to 0 for all
parameters whose names don't end with _weight or _gamma, if
you call the setIdx2Name method to set idx2name.
note:: The default weight decay multiplier for a Variable
can be set with its wd_mult argument in the constructor.
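The default described above can be sketched in plain Python as follows. The helper name `default_wd_mult`, the example parameter names, and the choice to leave _weight and _gamma parameters out of the returned mapping (so that they keep whatever multiplier applies otherwise) are assumptions of this sketch, not the library's actual code.

```python
def default_wd_mult(idx2name):
    """Build default weight decay multipliers from an index-to-name mapping.

    Names ending in _weight or _gamma get no entry here; every other
    name is given a multiplier of 0.0, which disables weight decay for it.
    """
    wd_mult = {}
    for name in idx2name.values():
        if not (name.endswith("_weight") or name.endswith("_gamma")):
            wd_mult[name] = 0.0
    return wd_mult


# Hypothetical parameter names.
idx2name = {0: "fc1_weight", 1: "fc1_bias", 2: "bn1_gamma", 3: "bn1_beta"}
wd_mult = default_wd_mult(idx2name)
```

In this example only the bias and beta parameters receive a zero multiplier.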
Update the parameters.
Update num_update.
Use setLrMult instead.
RMSProp optimizer as described in Tieleman & Hinton, 2012. http://arxiv.org/pdf/1308.0850v5.pdf Eq(38) - Eq(45) by Alex Graves, 2013.
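For orientation, here is a minimal plain-Python sketch of one scalar update step in the style of the Graves (2013) formulation: moving averages of the gradient and the squared gradient, followed by a momentum-smoothed step. The function name `rmsprop_step`, the default values, the epsilon term, the order in which rescaling and clipping are applied, and the way weight decay enters the step are assumptions of this sketch and may differ from the actual implementation.

```python
import math


def rmsprop_step(w, grad, state, lr=0.002, gamma1=0.95, gamma2=0.9,
                 rescale_grad=1.0, wd=0.0, clip_gradient=0.0, eps=1e-4):
    """Apply one RMSProp-style update to a single scalar weight.

    state is a tuple (n, g, delta): the moving average of grad^2, the
    moving average of grad, and the previous update.
    """
    n, g, delta = state
    # Rescale, then optionally clip to [-clip_gradient, clip_gradient]
    # (this ordering is an assumption).
    grad = grad * rescale_grad
    if clip_gradient > 0.0:
        grad = max(-clip_gradient, min(clip_gradient, grad))
    # gamma1 is the decay factor of both moving averages.
    n = gamma1 * n + (1.0 - gamma1) * grad * grad
    g = gamma1 * g + (1.0 - gamma1) * grad
    # gamma2 is the momentum factor; wd adds an L2 term (assumed placement).
    delta = gamma2 * delta - lr * (grad / math.sqrt(n - g * g + eps) + wd * w)
    return w + delta, (n, g, delta)


# One step from a zero-initialized state.
w1, state1 = rmsprop_step(1.0, 2.0, (0.0, 0.0, 0.0))
```

The returned state would be passed back in on the next call, so each weight carries its own three running values.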