eval_metric

Calculate the specified metric on the raw approximated values of the formula (the model's raw predictions) and the label values.

Method call format

eval_metric(label,
            approx,
            metric,
            weight=None,
            group_id=None,
            thread_count=-1)

Parameters

label
  Possible types:
    • list
    • numpy.array
    • pandas.DataFrame
    • pandas.Series
  Description: A list of target variables (in other words, the label values of the objects).
  Default value: Required parameter

approx
  Possible types:
    • list
    • numpy.array
    • pandas.DataFrame
    • pandas.Series
  Description: A list of approximated values of the formula (raw predictions) for all input objects.
  Default value: Required parameter

metric
  Possible types: string
  Description: The evaluation metric to calculate.
  Supported values:
    • RMSE
    • Logloss
    • MAE
    • CrossEntropy
    • Quantile
    • LogLinQuantile
    • Lq
    • MultiClass
    • MultiClassOneVsAll
    • MAPE
    • Poisson
    • PairLogit
    • PairLogitPairwise
    • QueryRMSE
    • QuerySoftMax
    • SMAPE
    • Recall
    • Precision
    • F1
    • TotalF1
    • Accuracy
    • BalancedAccuracy
    • BalancedErrorRate
    • Kappa
    • WKappa
    • LogLikelihoodOfPrediction
    • AUC
    • R2
    • NumErrors
    • MCC
    • BrierScore
    • HingeLoss
    • HammingLoss
    • ZeroOneLoss
    • MSLE
    • MedianAbsoluteError
    • Huber
    • PairAccuracy
    • AverageGain
    • PFound
    • NDCG
    • PrecisionAt
    • RecallAt
    • MAP
    • CtrFactor
    • YetiRank
    • YetiRankPairwise
  Default value: Required parameter

weight
  Possible types:
    • list
    • numpy.array
    • pandas.DataFrame
    • pandas.Series
  Description: The weights of the objects.
  Default value: None

group_id
  Possible types:
    • list
    • numpy.array
    • pandas.DataFrame
    • pandas.Series
  Description: Group identifiers for all input objects. Supported identifier types:
    • int
    • string types (string or unicode for Python 2; bytes or string for Python 3)
  Default value: None

thread_count
  Possible types: int
  Description: The number of threads to use. Optimizes the speed of execution; this parameter doesn't affect the results.
  Default value: -1 (the number of threads is set to the number of processor cores)

Type of return value

A list with metric values.
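
For a scalar metric such as RMSE the returned list typically holds a single element, so the value is usually extracted by indexing. A minimal sketch:

from catboost.utils import eval_metric

# eval_metric returns a list; for a scalar metric it has one element.
scores = eval_metric([0.2, -1, 0.4], [0.4, 0.1, 0.9], 'RMSE')
rmse_value = scores[0]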

Usage examples

The following is an example of usage with a regression metric:

from catboost.utils import eval_metric

labels = [0.2, -1, 0.4]
predictions = [0.4, 0.1, 0.9]

rmse = eval_metric(labels, predictions, 'RMSE')
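
Per-object weights can be passed via the weight parameter; the weight values below are purely illustrative:

from catboost.utils import eval_metric

labels = [0.2, -1, 0.4]
predictions = [0.4, 0.1, 0.9]

# One weight per input object (illustrative values);
# heavier objects contribute more to the metric.
weights = [1.0, 0.5, 2.0]

weighted_rmse = eval_metric(labels, predictions, 'RMSE', weight=weights)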

The following is an example of usage with a classification metric:

from catboost.utils import eval_metric
from math import log

labels = [1, 0, 1]
probabilities = [0.4, 0.1, 0.9]

# In binary classification it is necessary to apply the logit function
# to the probabilities to get approxes.

def logit(p):
    return log(p / (1 - p))

approxes = [logit(p) for p in probabilities]

accuracy = eval_metric(labels, approxes, 'Accuracy')
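
The same logit-transformed approxes can be passed to other metrics from the supported list that operate on raw values, for example Logloss:

from catboost.utils import eval_metric
from math import log

labels = [1, 0, 1]
probabilities = [0.4, 0.1, 0.9]

# Logloss is also computed on raw approxes, not on probabilities.
approxes = [log(p / (1 - p)) for p in probabilities]

logloss = eval_metric(labels, approxes, 'Logloss')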

The following is an example of usage with a ranking metric:

from catboost.utils import eval_metric

# The dataset consists of five objects. The first two belong to one group
# and the other three to another.
group_ids = [1, 1, 2, 2, 2]

labels = [0.9, 0.1, 0.5, 0.4, 0.8]

# In ranking tasks it is not necessary to reproduce the exact label values.
# What matters is predicting the correct order of the objects within each group.
good_predictions = [0.5, 0.4, 0.2, 0.1, 0.3]
bad_predictions = [0.4, 0.5, 0.2, 0.3, 0.1]

good_ndcg = eval_metric(labels, good_predictions, 'NDCG', group_id=group_ids)
bad_ndcg = eval_metric(labels, bad_predictions, 'NDCG', group_id=group_ids)
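
Since good_predictions ranks the objects within each group in the same order as the labels while bad_predictions does not, the first NDCG score should come out higher. Continuing the snippet above:

# Each call returns a list with the metric value; compare the scalars.
assert good_ndcg[0] > bad_ndcg[0]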