catboost.predict

catboost.predict(model,
                 pool,
                 verbose=FALSE,
                 prediction_type=None,
                 ntree_start=0,
                 ntree_end=0,
                 thread_count=-1)

Purpose

Apply the model to the given dataset.

Note

The prediction results are correct only if the feature data in the pool parameter contains all the features used in the model. Typically, the order of these features must match the order of the corresponding columns provided during training. However, if feature names are provided both during training and in the pool parameter when applying the model, the features can be matched by name instead of by column order.

Arguments

model

Description

The model obtained as the result of training.

Default value

Required argument

pool

Description

The input dataset.

Default value

Required argument

verbose

Description

Verbose output to stdout.

Default value

FALSE (not used)

prediction_type

Description

The required prediction type.

Supported prediction types:

Probability
Class
RawFormulaVal
Exponent
LogProbability

Default value

None (Exponent for Poisson and Tweedie, RawFormulaVal for all other loss functions)
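As a rough illustration of how the listed prediction types relate to one another, the following plain R sketch derives them from raw scores without calling CatBoost. It assumes a binary classifier trained with Logloss, where the probability is the logistic sigmoid of the raw value; the raw scores are made up, and the 0.5 class threshold is an assumption of this sketch.

```r
# Made-up raw scores (RawFormulaVal) for three objects.
raw <- c(-2, 0.5, 2)

# Probability of the positive class: logistic sigmoid of the raw score
# (assumes a Logloss binary classifier).
probability <- 1 / (1 + exp(-raw))

# Class: 1 when the probability exceeds 0.5, otherwise 0
# (the 0.5 threshold is an assumption of this sketch).
class_label <- as.integer(probability > 0.5)

# LogProbability: natural logarithm of the probability.
log_probability <- log(probability)

# Exponent: exponent of the raw score, the natural scale for
# Poisson and Tweedie models.
exponent <- exp(raw)
```

Exponent is shown on the same raw vector only to keep the sketch short; in practice it applies to regression models such as Poisson and Tweedie rather than to a classifier.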

ntree_start

Description

To reduce the number of trees to use when the model is applied or the metrics are calculated, set the range of the tree indices to [ntree_start; ntree_end).

This parameter defines the index of the first tree to be used when applying the model or calculating the metrics (the inclusive left border of the range). Indices are zero-based.

Default value

0

ntree_end

Description

To reduce the number of trees to use when the model is applied or the metrics are calculated, set the range of the tree indices to [ntree_start; ntree_end).

This parameter defines the index of the first tree not to be used when applying the model or calculating the metrics (the exclusive right border of the range). Indices are zero-based.

Default value

0 (the index of the last tree to use equals the number of trees in the model minus one)
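The half-open range [ntree_start; ntree_end) can be sketched with a toy additive ensemble in plain R. This is not CatBoost's implementation: tree_values and predict_range are hypothetical names, the per-tree contributions are made up, and any bias or scaling that a real model applies is ignored.

```r
# Hypothetical per-tree contributions for one object (one value per tree).
tree_values <- c(0.5, 0.25, 0.125, 0.0625)

# Sum the contributions of the trees whose zero-based indices fall in
# [ntree_start; ntree_end). ntree_end = 0 means "up to the last tree".
predict_range <- function(tree_values, ntree_start = 0, ntree_end = 0) {
  n_trees <- length(tree_values)
  if (ntree_end == 0) ntree_end <- n_trees
  # An empty range contributes nothing (a choice made by this sketch).
  if (ntree_start >= ntree_end) return(0)
  # R vectors are one-based, so shift the zero-based start by one.
  sum(tree_values[(ntree_start + 1):ntree_end])
}

all_trees <- predict_range(tree_values)        # trees 0, 1, 2, 3
first_two <- predict_range(tree_values, 0, 2)  # trees 0 and 1
last_two  <- predict_range(tree_values, 2, 4)  # trees 2 and 3
```

Because the right border is exclusive, the two partial ranges [0; 2) and [2; 4) do not overlap, and their sums add up to the result for the full model.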

thread_count

Description

The number of threads to use for operation.

Optimizes the speed of execution. This parameter doesn't affect results.

Default value

-1 (the number of threads is equal to the number of processor cores)
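The arguments above might be combined as in the following sketch. It assumes the catboost R package is installed and uses catboost.load_pool and catboost.train to build a small model on synthetic data; the data, the training parameters, and the helper name run_predict_example are illustrative choices, not part of this reference. The CatBoost calls are skipped when the package is not available.

```r
# Hypothetical helper: trains a tiny model and applies it with catboost.predict.
run_predict_example <- function() {
  set.seed(42)
  # Synthetic data: two numeric features and a binary label.
  features <- data.frame(f1 = rnorm(100), f2 = rnorm(100))
  labels <- as.integer(features$f1 + features$f2 > 0)

  pool <- catboost::catboost.load_pool(data = features, label = labels)
  model <- catboost::catboost.train(
    pool, NULL,
    params = list(loss_function = "Logloss",
                  iterations = 20,
                  logging_level = "Silent"))

  # Probabilities computed from the first 10 trees only: indices [0; 10).
  catboost::catboost.predict(model, pool,
                             prediction_type = "Probability",
                             ntree_start = 0,
                             ntree_end = 10,
                             thread_count = 1)
}

# Run the example only when the catboost package is installed.
if (requireNamespace("catboost", quietly = TRUE)) {
  probabilities <- run_predict_example()
  print(head(probabilities))
}
```

The same pool is used for training and prediction only to keep the sketch short, which also guarantees that the features match those used in the model.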

Specifics

For multiclassification, the prediction is returned as a matrix. Each row of this matrix contains the predictions for one object of the input dataset.
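The shape of such a result can be illustrated in plain R without calling CatBoost. The sketch below assumes made-up raw scores for three objects and three classes and converts them to probabilities with a row-wise softmax, which is how multiclass probabilities are commonly derived from raw values; the variable names are illustrative.

```r
# Made-up raw scores: one row per object, one column per class.
raw <- matrix(c( 2.0, 0.5, -1.0,
                 0.0, 1.0,  0.0,
                -1.0, 0.0,  3.0),
              nrow = 3, byrow = TRUE)

# Row-wise softmax: subtract each row's maximum for numerical stability,
# exponentiate, then normalise so that every row sums to one.
exp_raw <- exp(raw - apply(raw, 1, max))
probabilities <- exp_raw / rowSums(exp_raw)

# One-based column index of the most probable class for each object.
most_probable <- max.col(probabilities, ties.method = "first")

print(dim(probabilities))  # rows are objects, columns are classes
```

Each row can then be read independently: the entries of row i are the per-class predictions for the i-th object of the input dataset.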
