eval_metric
Calculate the specified metric from the raw approximated values of the formula (model predictions) and the label values.
Method call format
eval_metric(label,
approx,
metric,
weight=None,
group_id=None,
subgroup_id=None,
pairs=None,
thread_count=-1)
Parameters
label
Description
A list of target variables (in other words, the label values of the objects).
Possible types
- list
- numpy.ndarray
- pandas.DataFrame
- pandas.Series
Default value
Required parameter
approx
Description
A list of approximate values for all input objects.
Possible types
- list
- numpy.ndarray
- pandas.DataFrame
- pandas.Series
Default value
Required parameter
metric
Description
The evaluation metric to calculate.
Supported metrics
- RMSE
- Logloss
- MAE
- CrossEntropy
- Quantile
- LogLinQuantile
- Lq
- MultiRMSE
- MultiClass
- MultiClassOneVsAll
- MultiLogloss
- MultiCrossEntropy
- MAPE
- Poisson
- PairLogit
- PairLogitPairwise
- QueryRMSE
- QuerySoftMax
- GroupQuantile
- Tweedie
- SMAPE
- Recall
- Precision
- F
- F1
- TotalF1
- Accuracy
- BalancedAccuracy
- BalancedErrorRate
- Kappa
- WKappa
- LogLikelihoodOfPrediction
- AUC
- QueryAUC
- R2
- FairLoss
- NumErrors
- MCC
- BrierScore
- HingeLoss
- HammingLoss
- ZeroOneLoss
- MSLE
- MedianAbsoluteError
- Cox
- Huber
- Expectile
- PairAccuracy
- AverageGain
- PFound
- NDCG
- DCG
- FilteredDCG
- NormalizedGini
- PrecisionAt
- RecallAt
- MAP
- CtrFactor
- YetiRank
- YetiRankPairwise
- StochasticFilter
- StochasticRank
- LambdaMart
Possible types
string
Default value
Required parameter
weight
Description
The weights of objects.
Possible types
- list
- numpy.ndarray
- pandas.DataFrame
- pandas.Series
Default value
None
group_id
Description
Group identifiers for all input objects. Supported identifier types are:
- int
- string types (string or unicode for Python 2 and bytes or string for Python 3).
Possible types
- list
- numpy.ndarray
- pandas.DataFrame
- pandas.Series
Default value
None
subgroup_id
Description
Subgroup identifiers for all input objects.
Possible types
- list
- numpy.ndarray
Default value
None
pairs
Description
The description is different for each group of possible types.
Possible types
list, numpy.ndarray, pandas.DataFrame
The pairs description in the form of a two-dimensional matrix of shape N by 2, where N is the number of pairs:
- The first element of the pair is the zero-based index of the winner object from the input dataset for pairwise comparison.
- The second element of the pair is the zero-based index of the loser object from the input dataset for pairwise comparison.
This information is used for the calculation and optimization of Pairwise metrics (see the pairwise usage example below).
string
The path to the input file that contains the pairs description.
This information is used for the calculation and optimization of Pairwise metrics.
Default value
None
thread_count
Description
The number of threads to use.
Optimizes the speed of execution. This parameter doesn't affect results.
Possible types
int
Default value
-1 (the number of threads is equal to the number of processor cores)
Type of return value
A list of metric values.
Usage examples
The following is an example of usage with a regression metric:
from catboost.utils import eval_metric
labels = [0.2, -1, 0.4]
predictions = [0.4, 0.1, 0.9]
# eval_metric returns a list of metric values.
rmse = eval_metric(labels, predictions, 'RMSE')
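Per-object weights can be supplied through the weight parameter described above. The following is a minimal sketch with hypothetical weights; each object's contribution to the metric is scaled by its weight:
from catboost.utils import eval_metric
labels = [0.2, -1, 0.4]
predictions = [0.4, 0.1, 0.9]
# Hypothetical per-object weights.
weights = [1.0, 0.5, 2.0]
weighted_rmse = eval_metric(labels, predictions, 'RMSE', weight=weights)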
The following is an example of usage with a classification metric:
from catboost.utils import eval_metric
from math import log
labels = [1, 0, 1]
probabilities = [0.4, 0.1, 0.9]
# In binary classification it is necessary to apply the logit function
# to the probabilities to get approxes.
logit = lambda x: log(x / (1 - x))
approxes = list(map(logit, probabilities))
accuracy = eval_metric(labels, approxes, 'Accuracy')
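The same raw approxes work for other binary classification metrics. The following is a minimal sketch computing Logloss on the approxes from the example above; like Accuracy, Logloss expects raw approxes rather than probabilities:
from catboost.utils import eval_metric
from math import log
labels = [1, 0, 1]
probabilities = [0.4, 0.1, 0.9]
logit = lambda x: log(x / (1 - x))
approxes = list(map(logit, probabilities))
# Logloss is likewise calculated from the raw approxes.
logloss = eval_metric(labels, approxes, 'Logloss')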
The following is an example of usage with a ranking metric:
from catboost.utils import eval_metric
# The dataset consists of five objects. The first two belong to one group
# and the other three to another.
group_ids = [1, 1, 2, 2, 2]
labels = [0.9, 0.1, 0.5, 0.4, 0.8]
# In ranking tasks the predictions do not need to match the label values.
# Only the relative order of objects within each group matters.
good_predictions = [0.5, 0.4, 0.2, 0.1, 0.3]
bad_predictions = [0.4, 0.5, 0.2, 0.3, 0.1]
good_ndcg = eval_metric(labels, good_predictions, 'NDCG', group_id=group_ids)
bad_ndcg = eval_metric(labels, bad_predictions, 'NDCG', group_id=group_ids)
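The following is a sketch of usage with a pairwise metric, illustrating the pairs matrix described in the pairs parameter above. The data is hypothetical, and it assumes group_id is supplied alongside pairs, with both objects of each pair belonging to the same group:
from catboost.utils import eval_metric
# Two groups; pairs reference zero-based indices into the whole dataset.
group_ids = [1, 1, 1, 2, 2]
labels = [1, 0, 0, 1, 0]
approxes = [0.9, 0.1, 0.2, 0.7, 0.3]
# Each pair is (winner_index, loser_index).
pairs = [[0, 1], [0, 2], [3, 4]]
pair_accuracy = eval_metric(labels, approxes, 'PairAccuracy',
                            group_id=group_ids, pairs=pairs)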