Scales the gradient by a float factor.
The instances whose labels == ignore_label will be ignored during backward, if use_ignore is set to true.
If set to true, the softmax function will be computed along axis 1. This is applied when the shape of the input array differs from the shape of the label array.
Normalizes the gradient.
Multiplies gradient with output gradient element-wise.
If set to true, the softmax function will be computed along the last axis (-1).
Constant for computing a label-smoothed version of cross-entropy for the backward pass. This constant gets subtracted from the one-hot encoding of the gold label and distributed uniformly to all other labels.
If set to true, the ignore_label value will not contribute to the backward gradient.
This Param object is used specifically by SoftmaxOutput.
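
The descriptions above can be tied together with a small sketch of how the ignore label, the gradient scale, and the smoothing constant might interact in the backward pass. This is an illustrative pure-Python approximation, not the operator's actual implementation: the function names (softmax, smoothed_target, softmax_output_backward) and the argument names grad_scale and smooth_alpha are assumptions chosen to mirror the descriptions, the gradient is assumed to be the softmax output minus the (smoothed) target, and the normalization and output-gradient options are left out.

```python
import math


def softmax(row):
    """Numerically stable softmax over one row of scores."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def smoothed_target(label, num_classes, smooth_alpha=0.0):
    """One-hot target with smooth_alpha subtracted from the gold label
    and spread uniformly over the remaining labels."""
    off = smooth_alpha / (num_classes - 1) if num_classes > 1 else 0.0
    return [1.0 - smooth_alpha if k == label else off
            for k in range(num_classes)]


def softmax_output_backward(logits, labels, grad_scale=1.0,
                            use_ignore=False, ignore_label=-1,
                            smooth_alpha=0.0):
    """Hypothetical backward pass: grad_scale * (softmax - target) per row.
    Rows whose label equals ignore_label get a zero gradient when
    use_ignore is set."""
    grads = []
    for row, label in zip(logits, labels):
        if use_ignore and label == ignore_label:
            grads.append([0.0] * len(row))  # ignored instance
            continue
        probs = softmax(row)
        target = smoothed_target(label, len(row), smooth_alpha)
        grads.append([grad_scale * (p - t) for p, t in zip(probs, target)])
    return grads


if __name__ == "__main__":
    logits = [[0.0, 0.0], [1.0, 2.0]]
    labels = [0, -1]  # second instance carries the ignore label
    print(softmax_output_backward(logits, labels,
                                  use_ignore=True, ignore_label=-1))
```

With two equal scores and gold label 0, the first row's gradient comes out as -0.5 and 0.5, while the second row is all zeros because its label matches ignore_label. Calling smoothed_target(1, 3, 0.1) gives roughly 0.05, 0.9, 0.05, which is the "subtract from the gold label, distribute to the others" behaviour described for the smoothing constant.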