eval_metric

Calculate the specified metric on the raw approximated values of the formula (the model's raw predictions) and the label values.

Method call format

eval_metric(label,
            approx,
            metric,
            weight=None,
            group_id=None,
            thread_count=-1)

Parameters

label
  Possible types:
    • list
    • numpy.array
    • pandas.DataFrame
    • pandas.Series
  Description: A list of target variables (in other words, the label values of the objects).
  Default value: Required parameter

approx
  Possible types:
    • list
    • numpy.array
    • pandas.DataFrame
    • pandas.Series
  Description: A list of approximated values of the formula (raw predictions) for all input objects.
  Default value: Required parameter

metric
  Possible types: string
  Description: The evaluation metric to calculate.
  Supported values:
    • RMSE
    • Logloss
    • MAE
    • CrossEntropy
    • Quantile
    • LogLinQuantile
    • Lq
    • MultiClass
    • MultiClassOneVsAll
    • MAPE
    • Poisson
    • PairLogit
    • PairLogitPairwise
    • QueryRMSE
    • QuerySoftMax
    • SMAPE
    • Recall
    • Precision
    • F1
    • TotalF1
    • Accuracy
    • BalancedAccuracy
    • BalancedErrorRate
    • Kappa
    • WKappa
    • LogLikelihoodOfPrediction
    • AUC
    • R2
    • NumErrors
    • MCC
    • BrierScore
    • HingeLoss
    • HammingLoss
    • ZeroOneLoss
    • MSLE
    • MedianAbsoluteError
    • Huber
    • PairAccuracy
    • AverageGain
    • PFound
    • NDCG
    • PrecisionAt
    • RecallAt
    • MAP
    • CtrFactor
    • YetiRank
    • YetiRankPairwise
  Default value: Required parameter

weight
  Possible types:
    • list
    • numpy.array
    • pandas.DataFrame
    • pandas.Series
  Description: The weights of the objects.
  Default value: None

group_id
  Possible types:
    • list
    • numpy.array
    • pandas.DataFrame
    • pandas.Series
  Description: Group identifiers for all input objects. Supported identifier types:
    • int
    • string types (string or unicode for Python 2; bytes or string for Python 3)
  Default value: None

thread_count
  Possible types: int
  Description: The number of threads to use. Optimizes the speed of execution; this parameter doesn't affect the results.
  Default value: -1 (the number of threads is set to the number of processor cores)

Type of return value

A list with metric values.
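
For a scalar metric such as RMSE the returned list typically holds a single element, so the value is usually extracted by indexing. A minimal sketch:

from catboost.utils import eval_metric

# eval_metric returns a list; for a scalar metric it has one element.
scores = eval_metric([0.2, -1, 0.4], [0.4, 0.1, 0.9], 'RMSE')
rmse_value = scores[0]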

Usage examples

The following is an example of usage with a regression metric:

from catboost.utils import eval_metric

labels = [0.2, -1, 0.4]
predictions = [0.4, 0.1, 0.9]

rmse = eval_metric(labels, predictions, 'RMSE')
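
Per-object weights can be passed via the weight parameter; the weight values below are purely illustrative:

from catboost.utils import eval_metric

labels = [0.2, -1, 0.4]
predictions = [0.4, 0.1, 0.9]

# One weight per input object (illustrative values);
# heavier objects contribute more to the metric.
weights = [1.0, 0.5, 2.0]

weighted_rmse = eval_metric(labels, predictions, 'RMSE', weight=weights)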

The following is an example of usage with a classification metric:

from catboost.utils import eval_metric
from math import log

labels = [1, 0, 1]
probabilities = [0.4, 0.1, 0.9]

# In binary classification it is necessary to apply the logit function
# to the probabilities to get approxes.

def logit(p):
    return log(p / (1 - p))

approxes = [logit(p) for p in probabilities]

accuracy = eval_metric(labels, approxes, 'Accuracy')
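
The same logit-transformed approxes can be passed to other metrics from the supported list that operate on raw values, for example Logloss:

from catboost.utils import eval_metric
from math import log

labels = [1, 0, 1]
probabilities = [0.4, 0.1, 0.9]

# Logloss is also computed on raw approxes, not on probabilities.
approxes = [log(p / (1 - p)) for p in probabilities]

logloss = eval_metric(labels, approxes, 'Logloss')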

The following is an example of usage with a ranking metric:

from catboost.utils import eval_metric

# The dataset consists of five objects. The first two belong to one group
# and the other three to another.
group_ids = [1, 1, 2, 2, 2]

labels = [0.9, 0.1, 0.5, 0.4, 0.8]

# In ranking tasks it is not necessary to reproduce the exact label values.
# What matters is predicting the correct order of the objects within each group.
good_predictions = [0.5, 0.4, 0.2, 0.1, 0.3]
bad_predictions = [0.4, 0.5, 0.2, 0.3, 0.1]

good_ndcg = eval_metric(labels, good_predictions, 'NDCG', group_id=group_ids)
bad_ndcg = eval_metric(labels, bad_predictions, 'NDCG', group_id=group_ids)
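
Since good_predictions ranks the objects within each group in the same order as the labels while bad_predictions does not, the first NDCG score should come out higher. Continuing the snippet above:

# Each call returns a list with the metric value; compare the scalars.
assert good_ndcg[0] > bad_ndcg[0]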