catboost.staged_predict

catboost.staged_predict(model, 
                        pool, 
                        verbose = FALSE, 
                        prediction_type = "RawFormulaVal", 
                        ntree_start = 0, 
                        ntree_end = 0, 
                        eval_period = 1, 
                        thread_count = -1)

Purpose

Apply the model to the given dataset and calculate the results for the specified trees only.
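
A minimal usage sketch, resting on assumptions this page does not state: that the catboost R package is installed, that the pool and model are built with catboost.load_pool and catboost.train, and that the returned object exposes a nextElem() method yielding one prediction vector per stage. The toy data and parameter values are illustrative only.

```r
# Toy data; the column names and values are illustrative only.
set.seed(42)
features <- data.frame(x1 = rnorm(100), x2 = rnorm(100))
labels <- as.numeric(features$x1 + features$x2 > 0)

# The catboost-specific part runs only if the package is available.
if (requireNamespace("catboost", quietly = TRUE)) {
  library(catboost)

  pool <- catboost.load_pool(data = features, label = labels)
  model <- catboost.train(
    pool,
    params = list(iterations = 20,
                  loss_function = "Logloss",
                  logging_level = "Silent")
  )

  # Request predictions for the tree ranges [0, 5), [0, 10), [0, 15), [0, 20).
  staged <- catboost.staged_predict(
    model, pool,
    prediction_type = "Probability",
    ntree_start = 0, ntree_end = 20, eval_period = 5
  )

  # Assumption: the result behaves as an iterator with a nextElem() method.
  first_stage <- staged$nextElem()
  print(head(first_stage))
}
```

Each further call to nextElem() would, under the same assumption, return the predictions for the next, larger tree range.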

Note.
The prediction results are correct only if the feature data in the pool parameter contains all the features used in the model. Typically, the order of these features must match the order of the corresponding columns provided during training. However, if feature names are provided both during training and in the pool parameter when applying the model, the features can be matched by name instead of by column order.
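
When feature names are not available on both sides, one way to satisfy the ordering requirement is to reorder the columns before building the pool. The sketch below uses base R only; align_features and the column names are hypothetical and are not part of the catboost package.

```r
# Hypothetical helper: reorder a data frame so that its columns follow the
# order used at training time, and fail loudly if any feature is missing.
align_features <- function(data, training_columns) {
  missing_columns <- setdiff(training_columns, names(data))
  if (length(missing_columns) > 0) {
    stop("Missing features: ", paste(missing_columns, collapse = ", "))
  }
  # Extra columns not used by the model are dropped.
  data[, training_columns, drop = FALSE]
}

# Illustrative column order assumed to have been used during training.
training_columns <- c("age", "income", "city")

new_data <- data.frame(
  city = c("A", "B"),
  age = c(31, 45),
  income = c(50, 72),
  extra = c(1, 2)
)

aligned <- align_features(new_data, training_columns)
print(names(aligned))
```

The aligned data frame can then be passed to the pool constructor in place of the original one.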

Arguments

Argument Description Default value
model

The model obtained as the result of training.

Required argument
pool

The input dataset.

Required argument
verbose

Verbose output to stdout.

FALSE (not used)
prediction_type

The required prediction type.

Supported prediction types:
  • Probability
  • Class
  • RawFormulaVal
RawFormulaVal
ntree_start

To reduce the number of trees used when the model is applied or the metrics are calculated, set the range of tree indices to [ntree_start; ntree_end) and the step between the trees to use to eval_period.

This parameter defines the index of the first tree to be used when applying the model or calculating the metrics (the inclusive left border of the range). Indices are zero-based.

0
ntree_end

To reduce the number of trees used when the model is applied or the metrics are calculated, set the range of tree indices to [ntree_start; ntree_end) and the step between the trees to use to eval_period.

This parameter defines the index of the first tree not to be used when applying the model or calculating the metrics (the exclusive right border of the range). Indices are zero-based.

0 (the index of the last tree to use equals the number of trees in the model minus one)
eval_period

To reduce the number of trees used when the model is applied or the metrics are calculated, set the range of tree indices to [ntree_start; ntree_end) and the step between the trees to use to eval_period.

This parameter defines the step to iterate over the range [ntree_start; ntree_end). For example, let's assume that the following parameter values are set:

  • ntree_start is set to 0
  • ntree_end is set to N (the total tree count)
  • eval_period is set to 2

In this case, the results are returned for the following tree ranges: [0, 2), [0, 4), ... , [0, N).

1 (the trees are applied sequentially: the first tree, then the first two trees, etc.)
thread_count

The number of threads to use when applying the model.

Optimizes the speed of execution. This parameter doesn't affect results.

-1 (the number of threads is equal to the number of processor cores)
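
The interaction of ntree_start, ntree_end and eval_period can be sketched in base R by listing the right borders of the tree ranges for which results are returned. staged_borders is a hypothetical helper, not a catboost function. Two behaviours in it are assumptions rather than statements from this page: that a final partial step is clamped to ntree_end, and that a non-zero ntree_start shifts the left border of every range.

```r
# Hypothetical helper: right borders b of the ranges [ntree_start, b)
# produced for the given eval_period.
staged_borders <- function(ntree_start, ntree_end, eval_period) {
  stopifnot(eval_period >= 1, ntree_end > ntree_start)
  first <- ntree_start + eval_period
  borders <- if (first <= ntree_end) {
    seq(first, ntree_end, by = eval_period)
  } else {
    numeric(0)
  }
  # Assumption: when the step does not divide the range evenly,
  # the last stage is clamped to ntree_end.
  if (length(borders) == 0 || borders[length(borders)] < ntree_end) {
    borders <- c(borders, ntree_end)
  }
  borders
}

# The example from the eval_period description with N = 10:
# ranges [0, 2), [0, 4), ..., [0, 10).
even_steps <- staged_borders(ntree_start = 0, ntree_end = 10, eval_period = 2)
print(even_steps)

# A step that does not divide the range evenly.
uneven_steps <- staged_borders(ntree_start = 0, ntree_end = 10, eval_period = 3)
print(uneven_steps)
```

With the default eval_period of 1, the helper returns every border from ntree_start + 1 to ntree_end, which matches the sequential behaviour described for the default value.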