get_confusion_matrix

Build a confusion matrix $C$, such that $C_{i,j}$ is equal to the number of observations known to be in group $i$ but predicted to be in group $j$.
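
The definition can be illustrated without CatBoost. The following is a minimal sketch in plain Python: rows are indexed by the known group and columns by the predicted group. The helper name `count_confusion` and the sample labels are hypothetical, and this is not necessarily how the library computes the matrix internally.

```python
def count_confusion(true_labels, predicted_labels, classes):
    """Return C where C[i][j] counts objects of class i predicted as class j."""
    index = {cls: k for k, cls in enumerate(classes)}
    n = len(classes)
    matrix = [[0] * n for _ in range(n)]
    for actual, predicted in zip(true_labels, predicted_labels):
        matrix[index[actual]][index[predicted]] += 1
    return matrix


# Hypothetical labels: one "USA" object is misclassified as "UK".
actual = ["France", "USA", "USA", "UK"]
predicted = ["France", "USA", "UK", "UK"]
example_matrix = count_confusion(actual, predicted, ["France", "UK", "USA"])
print(example_matrix)
```

The misclassified object shows up off the diagonal, in the "USA" row and the "UK" column.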

Method call format

get_confusion_matrix(model, data, thread_count)

Parameters

Parameter: model
Possible types: catboost.CatBoost
Description: The trained model.
Default value: Required parameter

Parameter: data
Possible types: catboost.Pool
Description: A set of samples to build the confusion matrix with.
Default value: Required parameter

Parameter: thread_count
Possible types: int
Description: The number of threads to use.
Default value: -1 (the number of threads is set to the number of CPU cores)

Type of return value

confusion matrix : array, shape = [n_classes, n_classes]
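
Because the result is a square array with one row and one column per class, correct predictions lie on the diagonal. As a rough sketch, overall accuracy can be derived from it as shown here. The helper `accuracy_from_matrix` and the sample values are hypothetical; nested lists stand in for the returned array so that the snippet runs without CatBoost.

```python
def accuracy_from_matrix(matrix):
    """Share of correctly classified objects: diagonal sum over total sum."""
    correct = sum(matrix[k][k] for k in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return correct / total if total else 0.0


# Hypothetical 3-class matrix with one misclassified object out of four.
sample = [[1.0, 0.0, 0.0],
          [0.0, 1.0, 0.0],
          [0.0, 1.0, 1.0]]
sample_accuracy = accuracy_from_matrix(sample)
print(sample_accuracy)  # 0.75
```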

Examples

Multiclassification

from catboost import Pool, CatBoostClassifier
from catboost.utils import get_confusion_matrix

train_data = [[1, 1924, 44],
              [1, 1932, 37],
              [0, 1980, 37],
              [1, 2012, 204]]

train_label = ["France", "USA", "USA", "UK"]

train_dataset = Pool(data=train_data,
                     label=train_label)

model = CatBoostClassifier(loss_function='MultiClass',
                           iterations=100,
                           verbose=False)

model.fit(train_dataset)

# Evaluate on the training pool: rows are true classes, columns are predicted classes.
cm = get_confusion_matrix(model, train_dataset)
print(cm)

Output:

[[1. 0. 0.]
 [0. 1. 0.]
 [0. 0. 2.]]
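
The printed matrix carries no class names. The sketch below attaches names to its rows and columns, assuming the order France, UK, USA. That order is an assumption that happens to agree with the counts above; the actual order should be checked on the trained model, for example through its `classes_` attribute if your version exposes it. The helper `label_matrix` is hypothetical.

```python
def label_matrix(matrix, class_names):
    """Return {actual: {predicted: count}} for easier reading of the matrix."""
    return {
        actual: {predicted: matrix[i][j] for j, predicted in enumerate(class_names)}
        for i, actual in enumerate(class_names)
    }


# The matrix printed above; the class order is an assumption, not library output.
cm_values = [[1.0, 0.0, 0.0],
             [0.0, 1.0, 0.0],
             [0.0, 0.0, 2.0]]
labeled = label_matrix(cm_values, ["France", "UK", "USA"])
print(labeled["USA"]["USA"])  # 2.0
```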

Binary classification

from catboost import Pool, CatBoostClassifier
from catboost.utils import get_confusion_matrix

train_data = [[1, 1924, 44],
              [1, 1932, 37],
              [0, 1980, 37],
              [1, 2012, 204]]

train_label = [0, 1, 1, 0]

train_dataset = Pool(data=train_data,
                     label=train_label)

model = CatBoostClassifier(loss_function='Logloss',
                           iterations=100,
                           verbose=False)

model.fit(train_dataset)

# Evaluate on the training pool: rows are true classes, columns are predicted classes.
cm = get_confusion_matrix(model, train_dataset)
print(cm)

Output:

[[2. 0.]
 [0. 2.]]
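
In the binary case the four cells are often read as true negatives, false positives, false negatives, and true positives. The sketch below unpacks them, assuming label 0 is the negative class and label 1 the positive class, with rows as true classes and columns as predicted classes. The helper `binary_counts` is hypothetical and the values are copied from the output above.

```python
def binary_counts(matrix):
    """Unpack a 2x2 matrix, treating class 0 as negative and class 1 as positive."""
    (tn, fp), (fn, tp) = matrix
    return {"tn": tn, "fp": fp, "fn": fn, "tp": tp}


counts = binary_counts([[2.0, 0.0], [0.0, 2.0]])
precision = counts["tp"] / (counts["tp"] + counts["fp"])
recall = counts["tp"] / (counts["tp"] + counts["fn"])
print(counts, precision, recall)
```

Both ratios divide by zero when the model predicts no positives or the data contains none, so a guard is needed for such inputs.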