Apply a model
Model predictions are correct only if the feature data in the input dataset contains all the features used in the model. Typically, the order of these features must match the order of the corresponding columns provided during training. However, if feature names are provided both during training and in the third column of the feature descriptions in the columns description file (specified with the --cd parameter), features can be matched by name instead of by column order.
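The idea of matching features by name rather than by column order can be illustrated with a minimal Python sketch. It is not CatBoost's implementation; the function `reorder_by_name` and the feature names are hypothetical and only show why names make the column order irrelevant, and why a missing feature has to be treated as an error.

```python
def reorder_by_name(train_names, input_names, row):
    """Arrange one input row in the feature order used during training.

    train_names: feature names in training order (hypothetical example data).
    input_names: feature names in the order they appear in the input dataset.
    row: feature values of one object, in input order.
    """
    position = {name: i for i, name in enumerate(input_names)}
    missing = [name for name in train_names if name not in position]
    if missing:
        # The input must contain every feature used in the model.
        raise ValueError(f"input dataset lacks features: {missing}")
    return [row[position[name]] for name in train_names]


train_names = ["age", "height", "weight"]   # hypothetical training order
input_names = ["weight", "age", "height"]   # same features, different order
row = [70.5, 31, 180.0]

reordered = reorder_by_name(train_names, input_names, row)
print(reordered)
```

Without names, the only safe option is to keep the input columns in exactly the order used for training.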
catboost calc [optional parameters]
The name of the input file with the description of the model obtained as the result of training.
The format of the input model.
- CatboostBinary.
- AppleCoreML (only datasets without categorical features are currently supported).
- json (multiclassification models are not currently supported). Refer to the CatBoost JSON model tutorial for format details.
The name of the input file with the dataset description.
The path to the input file that contains the columns description.
If omitted, it is assumed that the first column in the file with the dataset description defines the label value, and the other columns are the values of numerical features.
The path to the input file that contains the pairs description for the dataset.
This information is used for the calculation of Pairwise metrics.
Pairwise metrics require pairs of data. If this data is not provided explicitly by specifying this parameter, pairs are generated automatically in each group using object label values.
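One plausible reading of automatic pair generation is sketched below in Python. It assumes that, within a group, every object with a higher label is treated as the winner over every object with a lower label, and that objects with equal labels form no pair. The exact procedure CatBoost uses may differ (for example, it may sample or limit pairs); `generate_pairs` and the data are hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def generate_pairs(group_ids, labels):
    """Return (winner_index, loser_index) pairs built inside each group."""
    groups = defaultdict(list)
    for idx, gid in enumerate(group_ids):
        groups[gid].append(idx)

    pairs = []
    for members in groups.values():
        for a, b in combinations(members, 2):
            if labels[a] > labels[b]:
                pairs.append((a, b))
            elif labels[b] > labels[a]:
                pairs.append((b, a))
            # Equal labels carry no preference, so no pair is produced.
    return pairs


group_ids = ["g1", "g1", "g1", "g2", "g2"]  # hypothetical group identifiers
labels = [2, 0, 2, 1, 3]

pairs = generate_pairs(group_ids, labels)
print(pairs)
```

Objects from different groups are never paired in this sketch, which matches the statement that pairs are generated in each group.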
Defines the output settings for the resulting values of the model.
Supported value formats and types:
stream://<stream>: Output the results to one of the program's standard output streams.
stream is the name of the output stream. Possible values:
For example, set the following value to output the results of applying the model to
[<path>/]<filename>.tsv: Write the results to the specified file.
path is the optional path to the directory where the resulting file should be saved. By default, the file is saved to the directory from which the application is launched.
filename is the name of the output file.
For example, set the following value to output the results of applying the model to the
The output data format depends on the machine learning task being solved.
A comma-separated list of column names to output when forming the results of applying the model (including the ones obtained for the validation dataset when training).
Prediction and feature values can be output for each object of the input dataset. Additionally, some column types can be output if specified in the input data.
The output columns can be set in any order. Format:
<prediction type 1>,[<prediction type 2> .. <prediction type N>][columns to output],[#<feature index 1>[:<name to output (user-defined)>] .. #<feature index N>[:<column name to output>]]
In this example, features with indices 3 and 4 are output. The header contains the index (#3) for the feature indexed 3 and the string Feature4 for the feature indexed 4.
Probability     #3    Feature4         Label    SampleId
0.4984999565    1     50.7799987793    0        0
0.8543220144    1     48.6333312988    2        1
0.7358535042    1     52.5699996948    1        2
0.8788711681    1     48.1699981689    2        3
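The header naming rule described above can be sketched in Python as follows. This is an illustration only, not CatBoost code: the helper `header_for` is hypothetical, and the value `Probability,#3,#4:Feature4,Label,SampleId` is an assumed input chosen to be consistent with the header shown in the sample output.

```python
def header_for(spec):
    """Derive header names from a comma-separated output columns value.

    A feature entry has the form #<index>[:<name>]. If a name is given,
    it is used in the header; otherwise the #<index> token itself is used.
    Any other entry (prediction type or column type) is kept unchanged.
    """
    header = []
    for item in spec.split(","):
        item = item.strip()
        if item.startswith("#"):
            index, sep, name = item.partition(":")
            header.append(name if sep else index)
        else:
            header.append(item)
    return header


spec = "Probability,#3,#4:Feature4,Label,SampleId"  # assumed example value
header = header_for(spec)
print(header)
```

The resulting list matches the header row of the sample output: the unnamed feature keeps its index token, while the named feature is shown under the user-defined name.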
At least one of the specified columns must contain prediction values. For example, the following value raises an error:
All columns that are supposed to be output according to the chosen parameters are output
The number of threads to use when applying the model.
Optimizes the speed of execution. This parameter doesn't affect results.
The number of processor cores
The number of trees from the model to use when applying. If specified, the first
0 (if the value is 0, this parameter is ignored and all trees from the model are used)
To reduce the number of trees to use when the model is applied or the metrics are calculated, set the step of the trees to use to
This parameter defines the step to iterate over the range [--ntree-start; --ntree-end). For example, let's assume that the following parameter values are set:
- --ntree-start is set to 0
- --ntree-end is set to N (the total tree count)
- --eval-period is set to 2
In this case, the results are returned for the following tree ranges: [0, 2), [0, 4), ... , [0, N).
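The way the step expands into tree ranges can be made concrete with a short Python sketch. It assumes that every range starts at --ntree-start, that the right boundary grows by --eval-period, and that the last range always ends at --ntree-end even when the tree count is not a multiple of the step. The function `staged_ranges` is hypothetical and is not part of CatBoost.

```python
def staged_ranges(ntree_start, ntree_end, eval_period):
    """List the half-open tree ranges [start, end) evaluated with a given step."""
    if eval_period <= 0:
        raise ValueError("eval_period must be positive")
    if ntree_start >= ntree_end:
        raise ValueError("ntree_start must be less than ntree_end")

    # Right boundaries strictly inside the range, spaced by the step.
    ends = list(range(ntree_start + eval_period, ntree_end, eval_period))
    # Assumption: the full range is always reported as the final stage.
    ends.append(ntree_end)
    return [(ntree_start, end) for end in ends]


ranges = staged_ranges(ntree_start=0, ntree_end=10, eval_period=2)
print(ranges)
```

With N = 10 this yields five ranges, the first covering trees 0 and 1 and the last covering all ten trees.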
0 (the staged prediction mode is turned off)
A comma-separated list of prediction types.
Supported prediction types: