Weights, gradients, learning rates and weight decays
Clip gradient to the range [-clip_gradient, clip_gradient]. If clip_gradient <= 0, gradient clipping is turned off. Otherwise, grad = max(min(grad, clip_gradient), -clip_gradient).
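The clipping rule can be sketched in plain Python as follows. This is an illustration of the formula only, not the operator's actual kernel; the helper name clip_gradient_values is hypothetical, and gradients are modeled as a flat list of floats.

```python
def clip_gradient_values(grad, clip_gradient):
    """Clamp each element of grad to [-clip_gradient, clip_gradient].

    Hypothetical helper for illustration. A non-positive clip_gradient
    disables clipping, so the values are returned unchanged.
    """
    if clip_gradient <= 0:
        return list(grad)
    # Same expression as in the description: max(min(g, c), -c)
    return [max(min(g, clip_gradient), -clip_gradient) for g in grad]


grads = [-3.0, -0.5, 0.0, 0.7, 2.5]
clipped = clip_gradient_values(grads, 1.0)     # out-of-range values are clamped
unchanged = clip_gradient_values(grads, -1.0)  # clipping is off
```

With clip_gradient = 1.0, only the first and last entries change; with a non-positive value, the input passes through untouched.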
Number of updated weights.
Rescale the gradient: grad = rescale_grad * grad.
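A minimal sketch of the rescaling step, in plain Python. The helper name rescale_gradient is hypothetical, and the use case shown (setting rescale_grad to 1/batch_size to turn a summed gradient into an average) is an assumption about typical usage rather than something stated in this description.

```python
def rescale_gradient(grad, rescale_grad):
    """Multiply every gradient element by rescale_grad (hypothetical helper)."""
    return [rescale_grad * g for g in grad]


batch_size = 4                     # assumed batch size, for illustration
summed_grads = [8.0, -2.0, 0.0]    # gradients summed over the batch
averaged = rescale_gradient(summed_grads, 1.0 / batch_size)
```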
This parameter object is used specifically by preloaded_multi_mp_sgd_update.