gluon.metric

Online evaluation metric module.

Classes

Accuracy([axis, name, output_names, label_names])

Computes accuracy classification score.

BinaryAccuracy([name, output_names, …])

Computes the accuracy of a binary or multilabel classification problem.

CompositeEvalMetric([metrics, name, …])

Manages multiple evaluation metrics.

CrossEntropy([eps, ignore_label, axis, …])

Computes Cross Entropy loss.

CustomMetric(feval[, name, …])

Computes a customized evaluation metric.

EvalMetric(name[, output_names, label_names])

Base class for all evaluation metrics.

F1([name, output_names, label_names, …])

Computes the F1 score of a binary classification problem.

Fbeta([name, output_names, label_names, …])

Computes the Fbeta score of a binary classification problem.

Loss([name, output_names, label_names])

Dummy metric for directly printing loss.

MAE([name, output_names, label_names])

Computes Mean Absolute Error (MAE) loss.

MCC([name, output_names, label_names])

Computes the Matthews Correlation Coefficient of a binary classification problem.

MSE([name, output_names, label_names])

Computes Mean Squared Error (MSE) loss.

MeanCosineSimilarity([name, output_names, …])

Computes Mean Cosine Similarity.

MeanPairwiseDistance([name, output_names, …])

Computes Mean Pairwise Distance.

PCC([name, output_names, label_names])

PCC is a multiclass equivalent for the Matthews correlation coefficient derived from a discrete solution to the Pearson correlation coefficient.

PearsonCorrelation([name, output_names, …])

Computes Pearson correlation.

Perplexity([eps, ignore_label, axis, …])

Computes perplexity.

RMSE([name, output_names, label_names])

Computes Root Mean Squared Error (RMSE) loss.

TopKAccuracy([top_k, name, output_names, …])

Computes top k predictions accuracy.

Torch([name, output_names, label_names])

Dummy metric for torch criterions.

Functions

check_label_shapes(labels, preds[, wrap, shape])

Helper function for checking the shapes of labels and predictions.

create(metric, *args, **kwargs)

Creates evaluation metric from metric names or instances of EvalMetric or a custom metric function.

np(numpy_feval[, name, allow_extra_outputs])

Creates a custom evaluation metric that receives its inputs as numpy arrays.

predict_with_threshold(pred[, threshold])

Do thresholding of predictions in binary and multilabel cases.

class Accuracy(axis=1, name='accuracy', output_names=None, label_names=None)[source]

Bases: mxnet.gluon.metric.EvalMetric

Computes accuracy classification score.

The accuracy score is defined as

\[\text{accuracy}(y, \hat{y}) = \frac{1}{n} \sum_{i=0}^{n-1} \text{1}(\hat{y_i} == y_i)\]

Methods

get()

Gets the current evaluation result.

get_config()

Save configurations of metric.

get_name_value()

Returns zipped name and value pairs.

reset()

Resets the internal evaluation result to initial state.

update(labels, preds)

Updates the internal evaluation result.

update_dict(label, pred)

Update the internal evaluation with named label and pred

Parameters
  • axis (int, default=1) – The axis that represents classes

  • name (str) – Name of this metric instance for display.

  • output_names (list of str, or None) – Name of predictions that should be used when updating with update_dict. By default include all predictions.

  • label_names (list of str, or None) – Name of labels that should be used when updating with update_dict. By default include all labels.

Examples

>>> predicts = [mx.np.array([[0.3, 0.7], [0, 1.], [0.4, 0.6]])]
>>> labels   = [mx.np.array([0, 1, 1])]
>>> acc = mx.gluon.metric.Accuracy()
>>> acc.update(preds = predicts, labels = labels)
>>> acc.get()
('accuracy', 0.6666666666666666)
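As described under update() below, predictions may also be passed directly as class indices instead of probability vectors. A brief sketch (illustrative values, expected to give the same result):

>>> predicts = [mx.np.array([1, 1, 1])]   # class indices rather than per-class probabilities
>>> acc = mx.gluon.metric.Accuracy()
>>> acc.update(preds = predicts, labels = labels)
>>> acc.get()   # expected ('accuracy', 0.6666666666666666)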
get()

Gets the current evaluation result.

Returns

  • names (list of str) – Name of the metrics.

  • values (list of float) – Value of the evaluations.

get_config()

Save configurations of metric. Can be recreated from configs with metric.create(**config)

get_name_value()

Returns zipped name and value pairs.

Returns

A (name, value) tuple list.

Return type

list of tuples

reset()

Resets the internal evaluation result to initial state.

update(labels, preds)[source]

Updates the internal evaluation result.

Parameters
  • labels (list of NDArray) – The labels of the data with class indices as values, one per sample.

  • preds (list of NDArray) – Prediction values for samples. Each prediction value can either be the class index, or a vector of likelihoods for all classes.

update_dict(label, pred)

Update the internal evaluation with named label and pred

Parameters
  • labels (OrderedDict of str -> NDArray) – name to array mapping for labels.

  • preds (OrderedDict of str -> NDArray) – name to array mapping of predicted outputs.
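A minimal sketch of update_dict for this metric; the 'softmax_label' and 'softmax_output' keys are purely illustrative, and with output_names/label_names left at None all entries of the dicts are used:

>>> from collections import OrderedDict
>>> acc = mx.gluon.metric.Accuracy()
>>> label_dict = OrderedDict([('softmax_label', mx.np.array([0, 1, 1]))])
>>> pred_dict = OrderedDict([('softmax_output', mx.np.array([[0.3, 0.7], [0., 1.], [0.4, 0.6]]))])
>>> acc.update_dict(label_dict, pred_dict)
>>> acc.get()   # expected to match the update() example above: ('accuracy', 0.6666666666666666)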

class BinaryAccuracy(name='binary_accuracy', output_names=None, label_names=None, threshold=0.5)[source]

Bases: mxnet.gluon.metric.EvalMetric

Computes the accuracy of a binary or multilabel classification problem.

Parameters
  • name (str) – Name of this metric instance for display.

  • output_names (list of str, or None) – Name of predictions that should be used when updating with update_dict. By default include all predictions.

  • label_names (list of str, or None) – Name of labels that should be used when updating with update_dict. By default include all labels.

  • threshold (float or ndarray, default 0.5) – threshold for deciding whether the predictions are positive or negative.

Methods

get()

Gets the current evaluation result.

get_config()

Save configurations of metric.

get_name_value()

Returns zipped name and value pairs.

reset()

Resets the internal evaluation result to initial state.

update(labels, preds)

Updates the internal evaluation result.

update_dict(label, pred)

Update the internal evaluation with named label and pred

Examples

>>> predicts = [mx.np.array([0.7, 1, 0.55])]
>>> labels   = [mx.np.array([0., 1., 0.])]
>>> bacc = mx.gluon.metric.BinaryAccuracy(threshold=0.6)
>>> bacc.update(preds = predicts, labels = labels)
>>> bacc.get()
('binary_accuracy', 0.6666666666666666)
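A sketch of the multilabel case, assuming each entry of pred is thresholded independently and compared element-wise against the label (illustrative values):

>>> predicts = [mx.np.array([[0.6, 0.2], [0.3, 0.8]])]   # per-class confidences for two samples
>>> labels   = [mx.np.array([[1., 0.], [0., 1.]])]
>>> bacc = mx.gluon.metric.BinaryAccuracy()
>>> bacc.update(preds = predicts, labels = labels)
>>> bacc.get()   # all four entries are thresholded correctly, so the expected value is 1.0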
get()

Gets the current evaluation result.

Returns

  • names (list of str) – Name of the metrics.

  • values (list of float) – Value of the evaluations.

get_config()

Save configurations of metric. Can be recreated from configs with metric.create(**config)

get_name_value()

Returns zipped name and value pairs.

Returns

A (name, value) tuple list.

Return type

list of tuples

reset()

Resets the internal evaluation result to initial state.

update(labels, preds)[source]

Updates the internal evaluation result.

Parameters
  • labels (list of NDArray) – Each label denotes positive/negative for each class.

  • preds (list of NDArray) – Each prediction value is a confidence value of being positive for each class.

update_dict(label, pred)

Update the internal evaluation with named label and pred

Parameters
  • labels (OrderedDict of str -> NDArray) – name to array mapping for labels.

  • preds (OrderedDict of str -> NDArray) – name to array mapping of predicted outputs.

class CompositeEvalMetric(metrics=None, name='composite', output_names=None, label_names=None)[source]

Bases: mxnet.gluon.metric.EvalMetric

Manages multiple evaluation metrics.

Parameters
  • metrics (list of EvalMetric) – List of child metrics.

  • name (str) – Name of this metric instance for display.

  • output_names (list of str, or None) – Name of predictions that should be used when updating with update_dict. By default include all predictions.

  • label_names (list of str, or None) – Name of labels that should be used when updating with update_dict. By default include all labels.

Methods

add(metric)

Adds a child metric.

get()

Returns the current evaluation result.

get_config()

Save configurations of metric.

get_metric(index)

Returns a child metric.

reset()

Resets the internal evaluation result to initial state.

update(labels, preds)

Updates the internal evaluation result.

update_dict(labels, preds)

Update the internal evaluation with named label and pred

Examples

>>> predicts = [mx.np.array([[0.3, 0.7], [0, 1.], [0.4, 0.6]])]
>>> labels   = [mx.np.array([0, 1, 1])]
>>> eval_metrics_1 = mx.gluon.metric.Accuracy()
>>> eval_metrics_2 = mx.gluon.metric.F1()
>>> eval_metrics = mx.gluon.metric.CompositeEvalMetric()
>>> for child_metric in [eval_metrics_1, eval_metrics_2]:
>>>     eval_metrics.add(child_metric)
>>> eval_metrics.update(labels = labels, preds = predicts)
>>> eval_metrics.get()
(['accuracy', 'f1'], [0.6666666666666666, 0.8])
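The child metrics remain individually accessible through get_metric; continuing the example above (the expected value simply restates the accuracy already computed):

>>> eval_metrics.get_metric(0).get()   # expected ('accuracy', 0.6666666666666666)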
add(metric)[source]

Adds a child metric.

Parameters

metric – A metric instance.

get()[source]

Returns the current evaluation result.

Returns

  • names (list of str) – Name of the metrics.

  • values (list of float) – Value of the evaluations.

get_config()[source]

Save configurations of metric. Can be recreated from configs with metric.create(**config)

get_metric(index)[source]

Returns a child metric.

Parameters

index (int) – Index of child metric in the list of metrics.

reset()[source]

Resets the internal evaluation result to initial state.

update(labels, preds)[source]

Updates the internal evaluation result.

Parameters
  • labels (list of NDArray) – The labels of the data.

  • preds (list of NDArray) – Predicted values.

update_dict(labels, preds)[source]

Update the internal evaluation with named label and pred

Parameters
  • labels (OrderedDict of str -> NDArray) – name to array mapping for labels.

  • preds (OrderedDict of str -> NDArray) – name to array mapping of predicted outputs.

class CrossEntropy(eps=1e-12, ignore_label=None, axis=-1, from_logits=False, name='cross-entropy', output_names=None, label_names=None)[source]

Bases: mxnet.gluon.metric.EvalMetric

Computes Cross Entropy loss.

The cross entropy over a batch of sample size \(N\) is given by

\[-\sum_{n=1}^{N}\sum_{k=1}^{K}t_{nk}\log (y_{nk}),\]

Methods

get()

Gets the current evaluation result.

get_config()

Save configurations of metric.

get_name_value()

Returns zipped name and value pairs.

reset()

Resets the internal evaluation result to initial state.

update(labels, preds)

Updates the internal evaluation result.

update_dict(label, pred)

Update the internal evaluation with named label and pred

where \(t_{nk}=1\) if and only if sample \(n\) belongs to class \(k\). \(y_{nk}\) denotes the probability of sample \(n\) belonging to class \(k\).

Parameters
  • eps (float, default 1e-12) – Use small constant for the case that predicted value is 0.

  • ignore_label (int or None, default None) – Index of an invalid label to ignore when counting. If set to None (the default), all entries are included.

  • axis (int, default -1) – The axis from prediction that was used to compute softmax. By default use the last axis.

  • from_logits (boolean, default False) – Whether pred is expected to be a logits tensor. By default, we assume that pred encodes a probability distribution.

  • name (str) – Name of this metric instance for display.

  • output_names (list of str, or None) – Name of predictions that should be used when updating with update_dict. By default include all predictions.

  • label_names (list of str, or None) – Name of labels that should be used when updating with update_dict. By default include all labels.

Examples

>>> predicts = [mx.np.array([[0.3, 0.7], [0, 1.], [0.4, 0.6]])]
>>> labels   = [mx.np.array([0, 1, 1])]
>>> ce = mx.gluon.metric.CrossEntropy()
>>> ce.update(labels, predicts)
>>> ce.get()
('cross-entropy', 0.57159948348999023)
get()

Gets the current evaluation result.

Returns

  • names (list of str) – Name of the metrics.

  • values (list of float) – Value of the evaluations.

get_config()

Save configurations of metric. Can be recreated from configs with metric.create(**config)

get_name_value()

Returns zipped name and value pairs.

Returns

A (name, value) tuple list.

Return type

list of tuples

reset()

Resets the internal evaluation result to initial state.

update(labels, preds)[source]

Updates the internal evaluation result.

Parameters
  • labels (list of NDArray) – The labels of the data.

  • preds (list of NDArray) – Predicted values.

update_dict(label, pred)

Update the internal evaluation with named label and pred

Parameters
  • labels (OrderedDict of str -> NDArray) – name to array mapping for labels.

  • preds (OrderedDict of str -> NDArray) – name to array mapping of predicted outputs.

class CustomMetric(feval, name=None, allow_extra_outputs=False, output_names=None, label_names=None)[source]

Bases: mxnet.gluon.metric.EvalMetric

Computes a customized evaluation metric.

The feval function can return either a tuple of (sum_metric, num_inst) or a single numeric value sum_metric.

Parameters
  • feval (callable(label, pred)) – Customized evaluation function.

  • name (str) – Name of this metric instance for display. If None (the default), a name is derived from feval.

  • allow_extra_outputs (bool, optional) – If True, the prediction outputs may contain extra outputs that are ignored by the metric. This is useful in RNNs, where the states are also produced in the outputs for forwarding. Defaults to False.

  • output_names (list of str, or None) – Name of predictions that should be used when updating with update_dict. By default include all predictions.

  • label_names (list of str, or None) – Name of labels that should be used when updating with update_dict. By default include all labels.

Methods

get()

Gets the current evaluation result.

get_config()

Save configurations of metric.

get_name_value()

Returns zipped name and value pairs.

reset()

Resets the internal evaluation result to initial state.

update(labels, preds)

Updates the internal evaluation result.

update_dict(label, pred)

Update the internal evaluation with named label and pred

Examples

>>> predicts = [mx.np.array(np.array([3, -0.5, 2, 7]).reshape(4,1))]
>>> labels = [mx.np.array(np.array([2.5, 0.0, 2, 8]).reshape(4,1))]
>>> feval = lambda x, y : (x + y).mean()
>>> eval_metrics = mx.gluon.metric.CustomMetric(feval=feval)
>>> eval_metrics.update(labels, predicts)
>>> eval_metrics.get()
('custom(<lambda>)', 6.0)
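feval may also return a (sum_metric, num_inst) tuple when the instance count differs from one per update. A hedged sketch reusing the arrays above; the exact-match counting and the 'exact_match' name are purely illustrative:

>>> count_exact = lambda label, pred: (int((label == pred).sum()), label.size)
>>> exact = mx.gluon.metric.CustomMetric(feval=count_exact, name='exact_match')
>>> exact.update(labels, predicts)
>>> exact.get()   # 1 of the 4 predictions matches its label exactly, so expected roughly ('exact_match', 0.25)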
get()

Gets the current evaluation result.

Returns

  • names (list of str) – Name of the metrics.

  • values (list of float) – Value of the evaluations.

get_config()[source]

Save configurations of metric. Can be recreated from configs with metric.create(**config)

get_name_value()

Returns zipped name and value pairs.

Returns

A (name, value) tuple list.

Return type

list of tuples

reset()

Resets the internal evaluation result to initial state.

update(labels, preds)[source]

Updates the internal evaluation result.

Parameters
  • labels (list of NDArray) – The labels of the data.

  • preds (list of NDArray) – Predicted values.

update_dict(label, pred)

Update the internal evaluation with named label and pred

Parameters
  • labels (OrderedDict of str -> NDArray) – name to array mapping for labels.

  • preds (OrderedDict of str -> NDArray) – name to array mapping of predicted outputs.

class EvalMetric(name, output_names=None, label_names=None, **kwargs)[source]

Bases: object

Base class for all evaluation metrics.

Note

This is a base class that provides common metric interfaces. One should not use this class directly, but instead create new metric classes that extend it.
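For example, a minimal subclassing sketch, assuming the conventional sum_metric / num_inst accumulators that the built-in metrics use; the MeanAbsDiff name and logic are purely illustrative:

>>> class MeanAbsDiff(mx.gluon.metric.EvalMetric):
...     """Illustrative metric: mean absolute difference between label and pred."""
...     def __init__(self, name='mean_abs_diff', **kwargs):
...         super().__init__(name, **kwargs)
...     def update(self, labels, preds):
...         for label, pred in zip(labels, preds):
...             # accumulate the running sum and the number of instances seen so far
...             self.sum_metric += float(mx.np.abs(label - pred).sum())
...             self.num_inst += label.size
...
>>> m = MeanAbsDiff()
>>> m.update([mx.np.array([2.5, 0.0, 2, 8])], [mx.np.array([3, -0.5, 2, 7])])
>>> m.get()   # expected roughly ('mean_abs_diff', 0.5)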

Methods

get()

Gets the current evaluation result.

get_config()

Save configurations of metric.

get_name_value()

Returns zipped name and value pairs.

reset()

Resets the internal evaluation result to initial state.

update(labels, preds)

Updates the internal evaluation result.

update_dict(label, pred)

Update the internal evaluation with named label and pred

Parameters
  • name (str) – Name of this metric instance for display.

  • output_names (list of str, or None) – Name of predictions that should be used when updating with update_dict. By default include all predictions.

  • label_names (list of str, or None) – Name of labels that should be used when updating with update_dict. By default include all labels.

get()[source]

Gets the current evaluation result.

Returns

  • names (list of str) – Name of the metrics.

  • values (list of float) – Value of the evaluations.

get_config()[source]

Save configurations of metric. Can be recreated from configs with metric.create(**config)

get_name_value()[source]

Returns zipped name and value pairs.

Returns

A (name, value) tuple list.

Return type

list of tuples

reset()[source]

Resets the internal evaluation result to initial state.

update(labels, preds)[source]

Updates the internal evaluation result.

Parameters
  • labels (list of NDArray) – The labels of the data.

  • preds (list of NDArray) – Predicted values.

update_dict(label, pred)[source]

Update the internal evaluation with named label and pred

Parameters
  • labels (OrderedDict of str -> NDArray) – name to array mapping for labels.

  • preds (OrderedDict of str -> NDArray) – name to array mapping of predicted outputs.

class F1(name='f1', output_names=None, label_names=None, class_type='binary', threshold=0.5, average='micro')[source]

Bases: mxnet.gluon.metric.EvalMetric

Computes the F1 score of a binary classification problem.

The F1 score is the harmonic mean of precision and recall, with a best value of 1.0 and a worst value of 0.0. The formula for the F1 score is:

F1 = 2 * (precision * recall) / (precision + recall)

Methods

get()

Gets the current evaluation result.

get_config()

Save configurations of metric.

get_name_value()

Returns zipped name and value pairs.

reset()

Resets the internal evaluation result to initial state.

update(labels, preds)

Updates the internal evaluation result.

update_dict(label, pred)

Update the internal evaluation with named label and pred

The formula for precision and recall is:

precision = true_positives / (true_positives + false_positives)
recall    = true_positives / (true_positives + false_negatives)

Note

With the default class_type="binary", this F1 score supports binary classification only; set class_type to "multiclass" or "multilabel" for other problems.

Parameters
  • name (str) – Name of this metric instance for display.

  • output_names (list of str, or None) – Name of predictions that should be used when updating with update_dict. By default include all predictions.

  • label_names (list of str, or None) – Name of labels that should be used when updating with update_dict. By default include all labels.

  • class_type (str, default "binary") – "binary": F1 for binary classification. "multiclass": F1 for multiclass classification. "multilabel": F1 for multilabel classification.

  • threshold (float, default 0.5) – threshold for positive confidence values.

  • average (str, default 'micro') –

    Strategy to be used for aggregating across mini-batches.

    "macro": Calculate metrics for each label and return the unweighted mean of their F1 scores. "micro": Calculate metrics globally by counting the total TP, FN and FP. None: Return the F1 score of each class (as a numpy.ndarray).

Examples

>>> predicts = [mx.np.array([[0.3, 0.7], [0., 1.], [0.4, 0.6]])]
>>> labels   = [mx.np.array([0., 1., 1.])]
>>> f1 = mx.gluon.metric.F1()
>>> f1.update(preds = predicts, labels = labels)
>>> f1.get()
('f1', 0.8)
get()

Gets the current evaluation result.

Returns

  • names (list of str) – Name of the metrics.

  • values (list of float) – Value of the evaluations.

get_config()

Save configurations of metric. Can be recreated from configs with metric.create(**config)

get_name_value()

Returns zipped name and value pairs.

Returns

A (name, value) tuple list.

Return type

list of tuples

reset()[source]

Resets the internal evaluation result to initial state.

update(labels, preds)[source]

Updates the internal evaluation result.

Parameters
  • labels (list of NDArray) – The labels of the data.

  • preds (list of NDArray) – Predicted values.

update_dict(label, pred)

Update the internal evaluation with named label and pred

Parameters
  • labels (OrderedDict of str -> NDArray) – name to array mapping for labels.

  • preds (OrderedDict of str -> NDArray) – name to array mapping of predicted outputs.

class Fbeta(name='fbeta', output_names=None, label_names=None, class_type='binary', beta=1, threshold=0.5, average='micro')[source]

Bases: mxnet.gluon.metric.F1

Computes the Fbeta score of a binary classification problem.

The Fbeta score is a weighted harmonic mean of precision and recall, with a best value of 1.0 and a worst value of 0.0. The formula for the Fbeta score is:

Fbeta = (1 + beta ** 2) * (precision * recall) / (beta ** 2 * precision + recall)

Methods

get()

Gets the current evaluation result.

get_config()

Save configurations of metric.

get_name_value()

Returns zipped name and value pairs.

reset()

Resets the internal evaluation result to initial state.

update(labels, preds)

Updates the internal evaluation result.

update_dict(label, pred)

Update the internal evaluation with named label and pred

The formula for precision and recall is:

precision = true_positives / (true_positives + false_positives)
recall    = true_positives / (true_positives + false_negatives)

Note

With the default class_type="binary", this Fbeta score supports binary classification only; set class_type to "multiclass" or "multilabel" for other problems.

Parameters
  • name (str) – Name of this metric instance for display.

  • output_names (list of str, or None) – Name of predictions that should be used when updating with update_dict. By default include all predictions.

  • label_names (list of str, or None) – Name of labels that should be used when updating with update_dict. By default include all labels.

  • class_type (str, default "binary") – "binary": F1 for binary classification. "multiclass": F1 for multiclass classification. "multilabel": F1 for multilabel classification.

  • beta (float, default 1) – weighting between recall and precision in the harmonic mean; values greater than 1 give more weight to recall.

  • threshold (float, default 0.5) – threshold for positive confidence values.

  • average (str, default 'micro') –

    Strategy to be used for aggregating across mini-batches.

    "macro": Calculate metrics for each label and return the unweighted mean of their F1 scores. "micro": Calculate metrics globally by counting the total TP, FN and FP. None: Return the F1 score of each class.

Examples

>>> predicts = [mx.np.array([[0.3, 0.7], [0., 1.], [0.4, 0.6]])]
>>> labels   = [mx.np.array([0., 1., 1.])]
>>> fbeta = mx.gluon.metric.Fbeta(beta=2)
>>> fbeta.update(preds = predicts, labels = labels)
>>> fbeta.get()
('fbeta', 0.9090909090909091)
get()

Gets the current evaluation result.

Returns

  • names (list of str) – Name of the metrics.

  • values (list of float) – Value of the evaluations.

get_config()

Save configurations of metric. Can be recreated from configs with metric.create(**config)

get_name_value()

Returns zipped name and value pairs.

Returns

A (name, value) tuple list.

Return type

list of tuples

reset()

Resets the internal evaluation result to initial state.

update(labels, preds)

Updates the internal evaluation result.

Parameters
  • labels (list of NDArray) – The labels of the data.

  • preds (list of NDArray) – Predicted values.

update_dict(label, pred)

Update the internal evaluation with named label and pred

Parameters
  • labels (OrderedDict of str -> NDArray) – name to array mapping for labels.

  • preds (OrderedDict of str -> NDArray) – name to array mapping of predicted outputs.

class Loss(name='loss', output_names=None, label_names=None)[source]

Bases: mxnet.gluon.metric.EvalMetric

Dummy metric for directly printing loss.

Parameters
  • name (str) – Name of this metric instance for display.

  • output_names (list of str, or None) – Name of predictions that should be used when updating with update_dict. By default include all predictions.

  • label_names (list of str, or None) – Name of labels that should be used when updating with update_dict. By default include all labels.

Methods

get()

Gets the current evaluation result.

get_config()

Save configurations of metric.

get_name_value()

Returns zipped name and value pairs.

reset()

Resets the internal evaluation result to initial state.

update(_, preds)

Updates the internal evaluation result.

update_dict(label, pred)

Update the internal evaluation with named label and pred
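Examples

A minimal sketch, assuming Loss simply averages whatever values are passed as preds (the first argument is ignored):

>>> losses = [mx.np.array([0.5, 1.5])]
>>> loss = mx.gluon.metric.Loss()
>>> loss.update(None, losses)
>>> loss.get()   # expected roughly ('loss', 1.0)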

get()

Gets the current evaluation result.

Returns

  • names (list of str) – Name of the metrics.

  • values (list of float) – Value of the evaluations.

get_config()

Save configurations of metric. Can be recreated from configs with metric.create(**config)

get_name_value()

Returns zipped name and value pairs.

Returns

A (name, value) tuple list.

Return type

list of tuples

reset()

Resets the internal evaluation result to initial state.

update(_, preds)[source]

Updates the internal evaluation result.

Parameters
  • labels (list of NDArray) – The labels of the data.

  • preds (list of NDArray) – Predicted values.

update_dict(label, pred)

Update the internal evaluation with named label and pred

Parameters
  • labels (OrderedDict of str -> NDArray) – name to array mapping for labels.

  • preds (OrderedDict of str -> NDArray) – name to array mapping of predicted outputs.

class MAE(name='mae', output_names=None, label_names=None)[source]

Bases: mxnet.gluon.metric.EvalMetric

Computes Mean Absolute Error (MAE) loss.

The mean absolute error is given by

\[\frac{\sum_i^n |y_i - \hat{y}_i|}{n}\]

Methods

get()

Gets the current evaluation result.

get_config()

Save configurations of metric.

get_name_value()

Returns zipped name and value pairs.

reset()

Resets the internal evaluation result to initial state.

update(labels, preds)

Updates the internal evaluation result.

update_dict(label, pred)

Update the internal evaluation with named label and pred

Parameters
  • name (str) – Name of this metric instance for display.

  • output_names (list of str, or None) – Name of predictions that should be used when updating with update_dict. By default include all predictions.

  • label_names (list of str, or None) – Name of labels that should be used when updating with update_dict. By default include all labels.

Examples

>>> predicts = [mx.np.array([3, -0.5, 2, 7])]
>>> labels = [mx.np.array([2.5, 0.0, 2, 8])]
>>> mean_absolute_error = mx.gluon.metric.MAE()
>>> mean_absolute_error.update(labels = labels, preds = predicts)
>>> mean_absolute_error.get()
('mae', 0.5)
get()

Gets the current evaluation result.

Returns

  • names (list of str) – Name of the metrics.

  • values (list of float) – Value of the evaluations.

get_config()

Save configurations of metric. Can be recreated from configs with metric.create(**config)

get_name_value()

Returns zipped name and value pairs.

Returns

A (name, value) tuple list.

Return type

list of tuples

reset()

Resets the internal evaluation result to initial state.

update(labels, preds)[source]

Updates the internal evaluation result.

Parameters
  • labels (list of NDArray) – The labels of the data.

  • preds (list of NDArray) – Predicted values.

update_dict(label, pred)

Update the internal evaluation with named label and pred

Parameters
  • labels (OrderedDict of str -> NDArray) – name to array mapping for labels.

  • preds (OrderedDict of str -> NDArray) – name to array mapping of predicted outputs.

class MCC(name='mcc', output_names=None, label_names=None)[source]

Bases: mxnet.gluon.metric.EvalMetric

Computes the Matthews Correlation Coefficient of a binary classification problem.

While slower to compute than F1, the MCC can give insight that F1 or Accuracy cannot. For instance, if the network always predicts the same result, the MCC will immediately show this. The MCC is also symmetric with respect to positive and negative categorization; however, the labels must contain both positive and negative examples, or it will always return 0. An MCC of 0 means uncorrelated, 1 completely correlated, and -1 negatively correlated.

\[\text{MCC} = \frac{ TP \times TN - FP \times FN } {\sqrt{ (TP + FP) ( TP + FN ) ( TN + FP ) ( TN + FN ) } }\]

Methods

get()

Gets the current evaluation result.

get_config()

Save configurations of metric.

get_name_value()

Returns zipped name and value pairs.

reset()

Resets the internal evaluation result to initial state.

update(labels, preds)

Updates the internal evaluation result.

update_dict(label, pred)

Update the internal evaluation with named label and pred

where 0 terms in the denominator are replaced by 1.

Note

This version of MCC only supports binary classification. See PCC.

Parameters
  • name (str) – Name of this metric instance for display.

  • output_names (list of str, or None) – Name of predictions that should be used when updating with update_dict. By default include all predictions.

  • label_names (list of str, or None) – Name of labels that should be used when updating with update_dict. By default include all labels.

Examples

>>> # In this example the network almost always predicts positive
>>> false_positives = 1000
>>> false_negatives = 1
>>> true_positives = 10000
>>> true_negatives = 1
>>> predicts = [mx.np.array(
    [[.3, .7]]*false_positives +
    [[.7, .3]]*true_negatives +
    [[.7, .3]]*false_negatives +
    [[.3, .7]]*true_positives
)]
>>> labels  = [mx.np.array(
    [0.]*(false_positives + true_negatives) +
    [1.]*(false_negatives + true_positives)
)]
>>> f1 = mx.gluon.metric.F1()
>>> f1.update(preds = predicts, labels = labels)
>>> mcc = mx.gluon.metric.MCC()
>>> mcc.update(preds = predicts, labels = labels)
>>> f1.get()
('f1', 0.95233560306652054)
>>> mcc.get()
('mcc', 0.01917751877733392)
get()

Gets the current evaluation result.

Returns

  • names (list of str) – Name of the metrics.

  • values (list of float) – Value of the evaluations.

get_config()

Save configurations of metric. Can be recreated from configs with metric.create(**config)

get_name_value()

Returns zipped name and value pairs.

Returns

A (name, value) tuple list.

Return type

list of tuples

reset()[source]

Resets the internal evaluation result to initial state.

update(labels, preds)[source]

Updates the internal evaluation result.

Parameters
  • labels (list of NDArray) – The labels of the data.

  • preds (list of NDArray) – Predicted values.

update_dict(label, pred)

Update the internal evaluation with named label and pred

Parameters
  • labels (OrderedDict of str -> NDArray) – name to array mapping for labels.

  • preds (OrderedDict of str -> NDArray) – name to array mapping of predicted outputs.

class MSE(name='mse', output_names=None, label_names=None)[source]

Bases: mxnet.gluon.metric.EvalMetric

Computes Mean Squared Error (MSE) loss.

The mean squared error is given by

\[\frac{\sum_i^n (y_i - \hat{y}_i)^2}{n}\]

Methods

get()

Gets the current evaluation result.

get_config()

Save configurations of metric.

get_name_value()

Returns zipped name and value pairs.

reset()

Resets the internal evaluation result to initial state.

update(labels, preds)

Updates the internal evaluation result.

update_dict(label, pred)

Update the internal evaluation with named label and pred

Parameters
  • name (str) – Name of this metric instance for display.

  • output_names (list of str, or None) – Name of predictions that should be used when updating with update_dict. By default include all predictions.

  • label_names (list of str, or None) – Name of labels that should be used when updating with update_dict. By default include all labels.

Examples

>>> predicts = [mx.np.array([3, -0.5, 2, 7])]
>>> labels = [mx.np.array([2.5, 0.0, 2, 8])]
>>> mean_squared_error = mx.gluon.metric.MSE()
>>> mean_squared_error.update(labels = labels, preds = predicts)
>>> mean_squared_error.get()
('mse', 0.375)
get()

Gets the current evaluation result.

Returns

  • names (list of str) – Name of the metrics.

  • values (list of float) – Value of the evaluations.

get_config()

Save configurations of metric. Can be recreated from configs with metric.create(**config)

get_name_value()

Returns zipped name and value pairs.

Returns

A (name, value) tuple list.

Return type

list of tuples

reset()

Resets the internal evaluation result to initial state.

update(labels, preds)[source]

Updates the internal evaluation result.

Parameters
  • labels (list of NDArray) – The labels of the data.

  • preds (list of NDArray) – Predicted values.

update_dict(label, pred)

Update the internal evaluation with named label and pred

Parameters
  • labels (OrderedDict of str -> NDArray) – name to array mapping for labels.

  • preds (OrderedDict of str -> NDArray) – name to array mapping of predicted outputs.

class MeanCosineSimilarity(name='cos_sim', output_names=None, label_names=None, eps=1e-08)[source]

Bases: mxnet.gluon.metric.EvalMetric

Computes Mean Cosine Similarity.

The mean cosine similarity is given by

\[\text{cos\_sim}(label, pred) = \frac{label \cdot pred}{\max(\lVert label \rVert \cdot \lVert pred \rVert, eps)}\]

Methods

get()

Gets the current evaluation result.

get_config()

Save configurations of metric.

get_name_value()

Returns zipped name and value pairs.

reset()

Resets the internal evaluation result to initial state.

update(labels, preds)

Updates the internal evaluation result.

update_dict(label, pred)

Update the internal evaluation with named label and pred

Calculation happens on the last dimension of label and pred.

Parameters
  • name (str) – Name of this metric instance for display.

  • output_names (list of str, or None) – Name of predictions that should be used when updating with update_dict. By default include all predictions.

  • label_names (list of str, or None) – Name of labels that should be used when updating with update_dict. By default include all labels.

  • eps (float, default 1e-8) – small value to avoid division by zero.

Examples

>>> predicts = [mx.np.array([[1., 0.], [1., 1.]])]
>>> labels = [mx.np.array([[3., 4.], [2., 2.]])]
>>> mcs = mx.gluon.metric.MeanCosineSimilarity()
>>> mcs.update(labels = labels, preds = predicts)
>>> mcs.get()
('cos_sim', 0.8)
get()

Gets the current evaluation result.

Returns

  • names (list of str) – Name of the metrics.

  • values (list of float) – Value of the evaluations.

get_config()

Save configurations of metric. Can be recreated from configs with metric.create(**config)

get_name_value()

Returns zipped name and value pairs.

Returns

A (name, value) tuple list.

Return type

list of tuples

reset()

Resets the internal evaluation result to initial state.

update(labels, preds)[source]

Updates the internal evaluation result.

Parameters
  • labels (list of NDArray) – The labels of the data.

  • preds (list of NDArray) – Predicted values.

update_dict(label, pred)

Update the internal evaluation with named label and pred

Parameters
  • labels (OrderedDict of str -> NDArray) – name to array mapping for labels.

  • preds (OrderedDict of str -> NDArray) – name to array mapping of predicted outputs.

class MeanPairwiseDistance(name='mpd', output_names=None, label_names=None, p=2)[source]

Bases: mxnet.gluon.metric.EvalMetric

Computes Mean Pairwise Distance.

The mean pairwise distance is given by

\[\sqrt{\frac{(\sum_i^n (y_i - \hat{y}_i)^p)^\frac{1}{p}}{n}}\]

Methods

get()

Gets the current evaluation result.

get_config()

Save configurations of metric.

get_name_value()

Returns zipped name and value pairs.

reset()

Resets the internal evaluation result to initial state.

update(labels, preds)

Updates the internal evaluation result.

update_dict(label, pred)

Update the internal evaluation with named label and pred

Parameters
  • name (str) – Name of this metric instance for display.

  • output_names (list of str, or None) – Name of predictions that should be used when updating with update_dict. By default include all predictions.

  • label_names (list of str, or None) – Name of labels that should be used when updating with update_dict. By default include all labels.

  • p (float, default 2) – calculating distance using the p-norm

Examples

>>> predicts = [mx.np.array([[1., 2.], [3., 4.]])]
>>> labels = [mx.np.array([[1., 0.], [4., 2.]])]
>>> mpd = mx.gluon.metric.MeanPairwiseDistance()
>>> mpd.update(labels = labels, preds = predicts)
>>> mpd.get()
('mpd', 2.1180338859558105)
get()

Gets the current evaluation result.

Returns

  • names (list of str) – Name of the metrics.

  • values (list of float) – Value of the evaluations.

get_config()

Save configurations of metric. Can be recreated from configs with metric.create(**config)

get_name_value()

Returns zipped name and value pairs.

Returns

A (name, value) tuple list.

Return type

list of tuples

reset()

Resets the internal evaluation result to initial state.

update(labels, preds)[source]

Updates the internal evaluation result.

Parameters
  • labels (list of NDArray) – The labels of the data.

  • preds (list of NDArray) – Predicted values.

update_dict(label, pred)

Update the internal evaluation with named label and pred

Parameters
  • labels (OrderedDict of str -> NDArray) – name to array mapping for labels.

  • preds (OrderedDict of str -> NDArray) – name to array mapping of predicted outputs.

class PCC(name='pcc', output_names=None, label_names=None)[source]

Bases: mxnet.gluon.metric.EvalMetric

PCC is a multiclass equivalent for the Matthews correlation coefficient derived from a discrete solution to the Pearson correlation coefficient.

\[\text{PCC} = \frac {\sum _{k}\sum _{l}\sum _{m}C_{kk}C_{lm}-C_{kl}C_{mk}} {{\sqrt {\sum _{k}(\sum _{l}C_{kl})(\sum _{k'|k'\neq k}\sum _{l'}C_{k'l'})}} {\sqrt {\sum _{k}(\sum _{l}C_{lk})(\sum _{k'|k'\neq k}\sum _{l'}C_{l'k'})}}}\]

Methods

get()

Gets the current evaluation result.

get_config()

Save configurations of metric.

get_name_value()

Returns zipped name and value pairs.

reset()

Resets the internal evaluation result to initial state.

update(labels, preds)

Updates the internal evaluation result.

update_dict(label, pred)

Update the internal evaluation with named label and pred

Attributes

sum_metric

Return an attribute of instance, which is of type owner.

The PCC is defined in terms of a K x K confusion matrix C.

When there are more than two labels the PCC will no longer range between -1 and +1. Instead the minimum value will be between -1 and 0 depending on the true distribution. The maximum value is always +1.

Parameters
  • name (str) – Name of this metric instance for display.

  • output_names (list of str, or None) – Name of predictions that should be used when updating with update_dict. By default include all predictions.

  • label_names (list of str, or None) – Name of labels that should be used when updating with update_dict. By default include all labels.

Examples

>>> # In this example the network almost always predicts positive
>>> false_positives = 1000
>>> false_negatives = 1
>>> true_positives = 10000
>>> true_negatives = 1
>>> predicts = [mx.np.array(
    [[.3, .7]]*false_positives +
    [[.7, .3]]*true_negatives +
    [[.7, .3]]*false_negatives +
    [[.3, .7]]*true_positives
)]
>>> labels  = [mx.np.array(
    [0]*(false_positives + true_negatives) +
    [1]*(false_negatives + true_positives)
)]
>>> f1 = mx.gluon.metric.F1()
>>> f1.update(preds = predicts, labels = labels)
>>> pcc = mx.gluon.metric.PCC()
>>> pcc.update(preds = predicts, labels = labels)
>>> f1.get()
('f1', 0.95233560306652054)
>>> pcc.get()
('pcc', 0.01917751877733392)
get()

Gets the current evaluation result.

Returns

  • names (list of str) – Name of the metrics.

  • values (list of float) – Value of the evaluations.

get_config()

Save configurations of metric. Can be recreated from configs with metric.create(**config)

get_name_value()

Returns zipped name and value pairs.

Returns

A (name, value) tuple list.

Return type

list of tuples

reset()[source]

Resets the internal evaluation result to initial state.

property sum_metric

Return an attribute of instance, which is of type owner.

update(labels, preds)[source]

Updates the internal evaluation result.

Parameters
  • labels (list of NDArray) – The labels of the data.

  • preds (list of NDArray) – Predicted values.

update_dict(label, pred)

Update the internal evaluation with named label and pred

Parameters
  • labels (OrderedDict of str -> NDArray) – name to array mapping for labels.

  • preds (OrderedDict of str -> NDArray) – name to array mapping of predicted outputs.

class PearsonCorrelation(name='pearsonr', output_names=None, label_names=None)[source]

Bases: mxnet.gluon.metric.EvalMetric

Computes Pearson correlation.

The Pearson correlation is given by

\[\frac{\mathrm{cov}(y, \hat{y})}{\sigma_{y}\,\sigma_{\hat{y}}}\]

Methods

get()

Gets the current evaluation result.

get_config()

Save configurations of metric.

get_name_value()

Returns zipped name and value pairs.

reset()

Resets the internal evaluation result to initial state.

update(labels, preds)

Updates the internal evaluation result.

update_dict(label, pred)

Update the internal evaluation with named label and pred

Parameters
  • name (str) – Name of this metric instance for display.

  • output_names (list of str, or None) – Name of predictions that should be used when updating with update_dict. By default include all predictions.

  • label_names (list of str, or None) – Name of labels that should be used when updating with update_dict. By default include all labels.

Examples

>>> predicts = [mx.np.array([[0.3, 0.7], [0, 1.], [0.4, 0.6]])]
>>> labels   = [mx.np.array([[1, 0], [0, 1], [0, 1]])]
>>> pr = mx.gluon.metric.PearsonCorrelation()
>>> pr.update(labels, predicts)
>>> pr.get()
('pearsonr', 0.42163704544016178)
get()[source]

Gets the current evaluation result.

Returns

  • names (list of str) – Name of the metrics.

  • values (list of float) – Value of the evaluations.

get_config()

Save configurations of metric. Can be recreated from configs with metric.create(**config)

get_name_value()

Returns zipped name and value pairs.

Returns

A (name, value) tuple list.

Return type

list of tuples

reset()[source]

Resets the internal evaluation result to initial state.

update(labels, preds)[source]

Updates the internal evaluation result.

Parameters
  • labels (list of NDArray) – The labels of the data.

  • preds (list of NDArray) – Predicted values.

update_dict(label, pred)

Update the internal evaluation with named label and pred

Parameters
  • labels (OrderedDict of str -> NDArray) – name to array mapping for labels.

  • preds (OrderedDict of str -> NDArray) – name to array mapping of predicted outputs.

class Perplexity(eps=1e-12, ignore_label=None, axis=-1, from_logits=False, name='perplexity', output_names=None, label_names=None)[source]

Bases: mxnet.gluon.metric.CrossEntropy

Computes perplexity.

Perplexity is a measurement of how well a probability distribution or model predicts a sample. A low perplexity indicates the model is good at predicting the sample.

The perplexity of a model q is defined as

\[b^{\big(-\frac{1}{N} \sum_{i=1}^N \log_b q(x_i) \big)} = \exp \big(-\frac{1}{N} \sum_{i=1}^N \log q(x_i)\big)\]

Methods

get()

Gets the current evaluation result.

get_config()

Save configurations of metric.

get_name_value()

Returns zipped name and value pairs.

reset()

Resets the internal evaluation result to initial state.

update(labels, preds)

Updates the internal evaluation result.

update_dict(label, pred)

Update the internal evaluation with named label and pred

where we let b = e.

\(q(x_i)\) is the predicted value of its ground truth label on sample \(x_i\).

For example, we have three samples \(x_1, x_2, x_3\) and their labels are \([0, 1, 1]\). Suppose our model predicts \(q(x_1) = p(y_1 = 0 | x_1) = 0.3\) and \(q(x_2) = 1.0\), \(q(x_3) = 0.6\). The perplexity of model q is \(exp\big(-(\log 0.3 + \log 1.0 + \log 0.6) / 3\big) = 1.77109762852\).

Parameters
  • eps (float, default 1e-12) – Use small constant for the case that predicted value is 0.

  • ignore_label (int or None, default None) – Index of an invalid label to ignore when counting. If set to None (the default), all entries are included.

  • axis (int, default -1) – The axis from prediction that was used to compute softmax. By default use the last axis.

  • name (str) – Name of this metric instance for display.

  • output_names (list of str, or None) – Name of predictions that should be used when updating with update_dict. By default include all predictions.

  • label_names (list of str, or None) – Name of labels that should be used when updating with update_dict. By default include all labels.

Examples

>>> predicts = [mx.np.array([[0.3, 0.7], [0, 1.], [0.4, 0.6]])]
>>> labels   = [mx.np.array([0, 1, 1])]
>>> perp = mx.gluon.metric.Perplexity(ignore_label=None)
>>> perp.update(labels, predicts)
>>> perp.get()
('Perplexity', 1.7710976285155853)
get()[source]

Gets the current evaluation result.

Returns

  • names (list of str) – Name of the metrics.

  • values (list of float) – Value of the evaluations.

get_config()

Save configurations of metric. Can be recreated from configs with metric.create(**config)

get_name_value()

Returns zipped name and value pairs.

Returns

A (name, value) tuple list.

Return type

list of tuples

reset()

Resets the internal evaluation result to initial state.

update(labels, preds)

Updates the internal evaluation result.

Parameters
  • labels (list of NDArray) – The labels of the data.

  • preds (list of NDArray) – Predicted values.

update_dict(label, pred)

Update the internal evaluation with named label and pred

Parameters
  • labels (OrderedDict of str -> NDArray) – name to array mapping for labels.

  • preds (OrderedDict of str -> NDArray) – name to array mapping of predicted outputs.

class RMSE(name='rmse', output_names=None, label_names=None)[source]

Bases: mxnet.gluon.metric.MSE

Computes Root Mean Squared Error (RMSE) loss.

The root mean squared error is given by

\[\sqrt{\frac{\sum_i^n (y_i - \hat{y}_i)^2}{n}}\]

Methods

get()

Gets the current evaluation result.

get_config()

Save configurations of metric.

get_name_value()

Returns zipped name and value pairs.

reset()

Resets the internal evaluation result to initial state.

update(labels, preds)

Updates the internal evaluation result.

update_dict(label, pred)

Update the internal evaluation with named label and pred

Parameters
  • name (str) – Name of this metric instance for display.

  • output_names (list of str, or None) – Name of predictions that should be used when updating with update_dict. By default include all predictions.

  • label_names (list of str, or None) – Name of labels that should be used when updating with update_dict. By default include all labels.

Examples

>>> predicts = [mx.np.array([3, -0.5, 2, 7])]
>>> labels = [mx.np.array([2.5, 0.0, 2, 8])]
>>> root_mean_squared_error = mx.gluon.metric.RMSE()
>>> root_mean_squared_error.update(labels = labels, preds = predicts)
>>> root_mean_squared_error.get()
('rmse', 0.612372457981)
get()[source]

Gets the current evaluation result.

Returns

  • names (list of str) – Name of the metrics.

  • values (list of float) – Value of the evaluations.

get_config()

Save configurations of metric. Can be recreated from configs with metric.create(**config)

get_name_value()

Returns zipped name and value pairs.

Returns

A (name, value) tuple list.

Return type

list of tuples

reset()

Resets the internal evaluation result to initial state.

update(labels, preds)

Updates the internal evaluation result.

Parameters
  • labels (list of NDArray) – The labels of the data.

  • preds (list of NDArray) – Predicted values.

update_dict(label, pred)

Update the internal evaluation with named label and pred

Parameters
  • labels (OrderedDict of str -> NDArray) – name to array mapping for labels.

  • preds (OrderedDict of str -> NDArray) – name to array mapping of predicted outputs.

class TopKAccuracy(top_k=1, name='top_k_accuracy', output_names=None, label_names=None)[source]

Bases: mxnet.gluon.metric.EvalMetric

Computes top k predictions accuracy.

TopKAccuracy differs from Accuracy in that it considers a prediction to be correct as long as the ground truth label is among the top K predicted labels.

If top_k = 1, then TopKAccuracy is identical to Accuracy.

Parameters
  • top_k (int) – The number of top-scoring predictions within which the true label must appear for a sample to count as correct.

  • name (str) – Name of this metric instance for display.

  • output_names (list of str, or None) – Name of predictions that should be used when updating with update_dict. By default include all predictions.

  • label_names (list of str, or None) – Name of labels that should be used when updating with update_dict. By default include all labels.

Methods

get()

Gets the current evaluation result.

get_config()

Save configurations of metric.

get_name_value()

Returns zipped name and value pairs.

reset()

Resets the internal evaluation result to initial state.

update(labels, preds)

Updates the internal evaluation result.

update_dict(label, pred)

Update the internal evaluation with named label and pred

Examples

>>> np.random.seed(999)
>>> top_k = 3
>>> labels = [mx.np.array([2, 6, 9, 2, 3, 4, 7, 8, 9, 6])]
>>> predicts = [mx.np.array(np.random.rand(10, 10))]
>>> acc = mx.gluon.metric.TopKAccuracy(top_k=top_k)
>>> acc.update(labels, predicts)
>>> acc.get()
('top_k_accuracy', 0.3)
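A smaller, deterministic sketch (illustrative values) of what "in the top K" means; the true label only needs to appear among the k highest-scoring classes:

>>> labels = [mx.np.array([1, 2])]
>>> predicts = [mx.np.array([[0.1, 0.5, 0.4], [0.6, 0.3, 0.1]])]
>>> acc = mx.gluon.metric.TopKAccuracy(top_k=2)
>>> acc.update(labels, predicts)
>>> acc.get()   # sample 0 hits (label 1 is in its top 2), sample 1 misses, so the expected value is 0.5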
get()

Gets the current evaluation result.

Returns

  • names (list of str) – Name of the metrics.

  • values (list of float) – Value of the evaluations.

get_config()

Save configurations of metric. Can be recreated from configs with metric.create(**config)

get_name_value()

Returns zipped name and value pairs.

Returns

A (name, value) tuple list.

Return type

list of tuples

reset()

Resets the internal evaluation result to initial state.

update(labels, preds)[source]

Updates the internal evaluation result.

Parameters
  • labels (list of NDArray) – The labels of the data.

  • preds (list of NDArray) – Predicted values.

update_dict(label, pred)

Update the internal evaluation with named label and pred

Parameters
  • labels (OrderedDict of str -> NDArray) – name to array mapping for labels.

  • preds (OrderedDict of str -> NDArray) – name to array mapping of predicted outputs.

class Torch(name='torch', output_names=None, label_names=None)[source]

Bases: mxnet.gluon.metric.Loss

Dummy metric for torch criterions.

check_label_shapes(labels, preds, wrap=False, shape=False)[source]

Helper function for checking the shapes of labels and predictions.

Parameters
  • labels (list of NDArray) – The labels of the data.

  • preds (list of NDArray) – Predicted values.

  • wrap (boolean) – If True, wrap labels/preds in a list if they are a single NDArray.

  • shape (boolean) – If True, check the shapes of labels and preds; otherwise only check their lengths.
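A brief sketch of the wrap behaviour, assuming the function returns the (possibly wrapped) labels and preds:

>>> label = mx.np.array([0, 1, 1])
>>> pred = mx.np.array([0, 1, 0])
>>> labels, preds = mx.gluon.metric.check_label_shapes(label, pred, wrap=True)
>>> # with wrap=True, single arrays are expected to come back wrapped in lists: [label], [pred]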

create(metric, *args, **kwargs)[source]

Creates evaluation metric from metric names or instances of EvalMetric or a custom metric function.

Parameters
  • metric (str or callable) –

    Specifies the metric to create. This argument must be one of the below:

    • Name of a metric.

    • An instance of EvalMetric.

    • A list, each element of which is a metric or a metric name.

    • An evaluation function that computes custom metric for a given batch of labels and predictions.

  • *args (list) – Additional arguments to the metric constructor. Only used when metric is str.

  • **kwargs (dict) – Additional arguments to the metric constructor. Only used when metric is str.

Examples

>>> def custom_metric(label, pred):
...     return np.mean(np.abs(label - pred))
...
>>> metric1 = mx.gluon.metric.create('acc')
>>> metric2 = mx.gluon.metric.create(custom_metric)
>>> metric3 = mx.gluon.metric.create([metric1, metric2, 'rmse'])
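When metric is given as a name, *args/**kwargs are forwarded to the metric constructor, e.g. to rename the instance for display (assuming 'acc' resolves to Accuracy as above):

>>> metric4 = mx.gluon.metric.create('acc', name='train_accuracy')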
np(numpy_feval, name=None, allow_extra_outputs=False)[source]

Creates a custom evaluation metric that receives its inputs as numpy arrays.

Parameters
  • numpy_feval (callable(label, pred)) – Custom evaluation function that receives labels and predictions for a minibatch as numpy arrays and returns the corresponding custom metric as a floating point number.

  • name (str, optional) – Name of the custom metric.

  • allow_extra_outputs (bool, optional) – Whether the prediction output is allowed to have extra outputs. This is useful in cases like RNNs, where states are also part of the output and can be fed back to the RNN in the next step. By default, extra outputs are not allowed.

Returns

A custom metric (a CustomMetric instance) that wraps numpy_feval.

Return type

CustomMetric

Example

>>> def custom_metric(label, pred):
...     return np.mean(np.abs(label-pred))
...
>>> metric = mx.gluon.metric.np(custom_metric)
predict_with_threshold(pred, threshold=0.5)[source]

Do thresholding of predictions in binary and multilabel cases.

Parameters
  • pred (ndarray) – predictions in shape of (batch_size, …) or (batch_size, …, num_categories)

  • threshold (float or ndarray) – threshold(s), either a single float or an ndarray in shape of (num_categories,)
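A minimal sketch, assuming the function returns the element-wise comparison pred > threshold (a boolean/0-1 array of the same shape):

>>> pred = mx.np.array([[0.7, 0.4], [0.2, 0.9]])
>>> thresholded = mx.gluon.metric.predict_with_threshold(pred, threshold=0.5)
>>> # entries above 0.5 are expected to be marked positive: [[True, False], [False, True]]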