Scales the gradient by a float factor.
The instances whose labels == ignore_label will be ignored during backward, if use_ignore is set to true.
If set to true, the softmax function will be computed along axis 1. This is applied when the shape of input array differs from the shape of label array.
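As a rough illustration of what computing softmax along axis 1 means, here is a minimal NumPy sketch. It is not the operator's own implementation; the helper name softmax and the input shape (batch, classes, positions) are assumptions chosen for the example.

```python
import numpy as np

def softmax(x, axis):
    """Numerically stable softmax along the given axis (illustrative helper)."""
    # Subtract the per-slice maximum so np.exp cannot overflow.
    shifted = x - np.max(x, axis=axis, keepdims=True)
    exp = np.exp(shifted)
    return exp / np.sum(exp, axis=axis, keepdims=True)

# Hypothetical input: batch of 2, 3 classes, 4 positions per sample.
data = np.arange(24, dtype=np.float64).reshape(2, 3, 4)

# Normalize over the class axis (axis 1), independently for each position.
probs = softmax(data, axis=1)
```

With this layout, each of the 2 x 4 positions gets its own probability distribution over the 3 classes, and the output keeps the input's shape.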
Normalizes the gradient.
Multiplies gradient with output gradient element-wise.
If set to true, the softmax function will be computed along the last axis (-1).
Constant for computing a label smoothed version of cross-entropy for the backwards pass. This constant gets subtracted from the one-hot encoding of the gold label and distributed uniformly to all other labels.
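The smoothing rule described above can be sketched in NumPy as follows. This is a reading of the description, not the operator's source: the function name smoothed_targets and the parameter name alpha are hypothetical, and the sketch assumes the constant is split evenly over the num_classes - 1 non-gold labels.

```python
import numpy as np

def smoothed_targets(labels, num_classes, alpha):
    """Build label-smoothed targets from integer class labels (illustrative)."""
    one_hot = np.eye(num_classes)[labels]
    # Gold label: 1 - alpha. Every other label: alpha / (num_classes - 1).
    return one_hot * (1.0 - alpha) + (1.0 - one_hot) * alpha / (num_classes - 1)

# Two samples with gold labels 0 and 2, three classes, smoothing constant 0.1.
targets = smoothed_targets(np.array([0, 2]), num_classes=3, alpha=0.1)
```

Each row of targets still sums to 1, so it remains a valid distribution to compare against the softmax output in the backward pass.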
If set to true, the ignore_label value will not contribute to the backward gradient.
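To make the effect of ignoring a label concrete, here is a minimal NumPy sketch of a softmax cross-entropy gradient in which rows carrying the ignore label are zeroed. The function softmax_backward is a hypothetical helper written for this example, and the gradient form (probabilities minus one-hot label) is the standard softmax cross-entropy gradient, assumed here rather than taken from the operator's source.

```python
import numpy as np

def softmax_backward(probs, labels, ignore_label, use_ignore):
    """Gradient of cross-entropy w.r.t. the logits, given softmax probs (illustrative)."""
    grad = probs.copy()
    ignored = (labels == ignore_label) if use_ignore else np.zeros(len(labels), dtype=bool)
    # Replace ignored labels with a valid index so the fancy indexing below is safe.
    safe_labels = np.where(ignored, 0, labels)
    grad[np.arange(len(labels)), safe_labels] -= 1.0
    # Rows whose label equals ignore_label contribute nothing to the gradient.
    grad[ignored] = 0.0
    return grad

# Three samples over three classes with uniform predictions; the second is ignored.
probs = np.full((3, 3), 1.0 / 3.0)
labels = np.array([0, -1, 2])
grad = softmax_backward(probs, labels, ignore_label=-1, use_ignore=True)
```

Here the second row of grad is all zeros, while the other rows hold the usual probabilities-minus-one-hot values.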
This Param object is used specifically for SoftmaxOutput.