plot_predictions

Sequentially vary the value of the specified features to put them into all buckets and calculate predictions for the input objects accordingly.

Alert

Only models trained on datasets that do not contain categorical features are supported.
Multiclassification modes are not supported.

Parameters

data

Description

The data to plot predictions for.

For example, use a two-document slice of the original dataset (refer to the example below).

Possible types:

numpy.ndarray
pandas.DataFrame
pandas.SparseDataFrame
scipy.sparse.spmatrix (all subclasses except dia_matrix)
catboost.Pool

Default value

Required parameter

features_to_change

Description

The list of numerical features to vary the prediction value for.

For example, chose the required features by selecting top N most important features that impact the prediction results for a pair of objects according to PredictionDiff (refer to the example below).

Possible types

list of int
string
combination of list of int & string

Default value

Required parameter

plot

Description

Plot a Jupyter Notebook chart based on the calculated predictions.

Possible types

bool

Default value

True

plot_file

Description

The name of the output HTML-file to save the chart to.

Possible types

string

Default value

1 (the trees are applied sequentially: the first tree, then the first two
trees, etc.)

Return value

Dict with two fields:

params — dict of best-found parameters.
cv_results — dict or pandas.core.frame.DataFrame with cross-validation results. Сolumns are: test-error-mean, test-error-std, train-error-mean, train-error-std.

Examples

import numpy as np
from catboost import Pool, CatBoostRegressor

train_data = np.random.randint(0, 100, size=(100, 10))
train_label = np.random.randint(0, 1000, size=(100))
train_pool = Pool(train_data, train_label)
train_pool_slice = train_pool.slice([2, 3])

model = CatBoostRegressor()
model.fit(train_pool)

prediction_diff = model.get_feature_importance(train_pool_slice,
                                               type='PredictionDiff',
                                               prettified=True)

model.plot_predictions(data=train_pool_slice,
                       features_to_change=prediction_diff["Feature Id"][:2],
                       plot=True,
                       plot_file="plot_predictions_file.html")

An example of the first plotted chart:

Was the article helpful?

load_model

plot_tree