Implemented metrics
CatBoost provides built-in metrics for various machine learning problems. These functions can be used for model optimization or reference purposes. See the Objectives and metrics section for details on the calculation principles.
Choose the implementation below for more details.
Python package
The following parameters can be set for the corresponding classes and are used when the model is trained.
Parameters for trained model
Classes:
loss_function
The metric to use in training. The specified value also determines the machine learning problem to solve. Some metrics support optional parameters (see the Objectives and metrics section for details on each metric).
Format:
<Metric>[:<parameter 1>=<value>;..;<parameter N>=<value>]
Supported metrics
- RMSE
- Logloss
- MAE
- CrossEntropy
- Quantile
- LogLinQuantile
- Lq
- MultiRMSE
- MultiClass
- MultiClassOneVsAll
- MultiLogloss
- MultiCrossEntropy
- MAPE
- Poisson
- PairLogit
- PairLogitPairwise
- QueryRMSE
- QuerySoftMax
- GroupQuantile
- Tweedie
- YetiRank
- YetiRankPairwise
- StochasticFilter
- StochasticRank
A custom Python object can also be set as the value of this parameter (see an example).
For example, use the following construction to calculate the value of Quantile with the coefficient 0.1:
Quantile:alpha=0.1
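For instance, a minimal sketch of passing this metric specification to a model in the Python package (the in-line X and y below are placeholder data, not part of the original example):

```python
from catboost import CatBoostRegressor

# Placeholder training data; substitute a real dataset.
X = [[1, 4], [2, 5], [3, 6], [4, 7]]
y = [10.0, 20.0, 30.0, 40.0]

# Optimize the Quantile loss with alpha=0.1 during training.
model = CatBoostRegressor(loss_function='Quantile:alpha=0.1', iterations=100)
model.fit(X, y, verbose=False)
print(model.predict(X))
```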
custom_metric
Metric values to output during training. These functions are not optimized and are displayed for informational purposes only. Some metrics support optional parameters (see the Objectives and metrics section for details on each metric).
Format:
<Metric>[:<parameter 1>=<value>;..;<parameter N>=<value>]
Examples:
- Calculate the value of CrossEntropy:
  CrossEntropy
- Calculate the value of Quantile with the coefficient 0.1:
  Quantile:alpha=0.1
- Calculate the values of Logloss and AUC:
  ['Logloss', 'AUC']
Values of all custom metrics for learn and validation datasets are saved to the Metric output files (learn_error.tsv and test_error.tsv, respectively). The directory for these files is specified in the --train-dir (train_dir) parameter.
Use the visualization tools to see a live chart with the dynamics of the specified metrics.
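As a rough sketch, the list form above can be passed through the custom_metric parameter together with an eval_set so that both metrics are reported for the learn and validation datasets (the tiny in-line datasets and the train_dir value are placeholders):

```python
from catboost import CatBoostClassifier

# Placeholder data; substitute a real dataset.
X_train, y_train = [[0, 1], [1, 0], [2, 3], [3, 2]], [0, 1, 0, 1]
X_val, y_val = [[1, 2], [2, 1]], [0, 1]

model = CatBoostClassifier(
    iterations=50,
    custom_metric=['Logloss', 'AUC'],  # reported for information only, not optimized
    train_dir='catboost_info',         # learn_error.tsv and test_error.tsv are written here
)
model.fit(X_train, y_train, eval_set=(X_val, y_val), verbose=False)
```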
use_best_model
If this parameter is set, the number of trees that are saved in the resulting model is defined as follows:
- Build the number of trees defined by the training parameters.
- Use the validation dataset to identify the iteration with the optimal value of the metric specified in --eval-metric (eval_metric). No trees are saved after this iteration.
This option requires a validation dataset to be provided.
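A minimal sketch of this behaviour, assuming placeholder data: the model is configured for 500 iterations, but only the trees up to the best validation iteration are kept.

```python
from catboost import CatBoostClassifier

# Placeholder data; substitute a real dataset.
X_train, y_train = [[0, 1], [1, 0], [2, 3], [3, 2]], [0, 1, 0, 1]
X_val, y_val = [[1, 2], [2, 1]], [0, 1]

model = CatBoostClassifier(iterations=500, use_best_model=True, eval_metric='Logloss')
# A validation dataset is required when use_best_model is set.
model.fit(X_train, y_train, eval_set=(X_val, y_val), verbose=False)

# Trees after the best validation iteration are not saved in the model.
print(model.tree_count_)
```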
eval_metric
The metric used for overfitting detection (if enabled) and best model selection (if enabled). Some metrics support optional parameters (see the Objectives and metrics section for details on each metric).
Format:
<Metric>[:<parameter 1>=<value>;..;<parameter N>=<value>]
A user-defined function can also be set as the value (see an example).
Examples:
R2
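Beyond the built-in names, a user-defined metric can be supplied as an object. The sketch below assumes the evaluate / is_max_optimal / get_final_error convention described for custom metrics in the Python package reference; the class name and the in-line math are illustrative, so check the package reference for the exact interface before relying on it.

```python
import math

class BinaryLoglossMetric:
    # Sketch of a user-defined eval metric object (hypothetical class name).
    def is_max_optimal(self):
        # Smaller Logloss values are better.
        return False

    def evaluate(self, approxes, target, weight):
        # approxes holds one container of raw predictions per model dimension;
        # the binary case uses approxes[0]. Returns (error_sum, weight_sum).
        approx = approxes[0]
        error_sum, weight_sum = 0.0, 0.0
        for i in range(len(approx)):
            w = 1.0 if weight is None else weight[i]
            p = 1.0 / (1.0 + math.exp(-approx[i]))
            p = min(max(p, 1e-15), 1.0 - 1e-15)
            error_sum += -w * (target[i] * math.log(p) + (1.0 - target[i]) * math.log(1.0 - p))
            weight_sum += w
        return error_sum, weight_sum

    def get_final_error(self, error, weight):
        return error / (weight + 1e-38)

# Usage sketch: CatBoostClassifier(eval_metric=BinaryLoglossMetric())
```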
The following parameters can be set for the corresponding methods and are used when the model is trained or applied.
Parameters for trained or applied model
Classes:
use_best_model
If this parameter is set, the number of trees that are saved in the resulting model is defined as follows:
- Build the number of trees defined by the training parameters.
- Use the validation dataset to identify the iteration with the optimal value of the metric specified in --eval-metric (eval_metric). No trees are saved after this iteration.
This option requires a validation dataset to be provided.
verbose
Output the measured evaluation metric to stderr.
plot
Plot the following information during training:
- the metric values;
- the custom loss values;
- the loss function change during feature selection;
- the time that has passed since training started;
- the remaining time until the end of training.
This option can be used if training is performed in a Jupyter notebook.
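As an illustration (a sketch with placeholder data), both options are passed to fit; the live chart is only drawn when the cell runs inside a Jupyter notebook with the plotting widgets available:

```python
from catboost import CatBoostClassifier

# Placeholder data; substitute a real dataset.
X_train, y_train = [[0, 1], [1, 0], [2, 3], [3, 2]], [0, 1, 0, 1]
X_val, y_val = [[1, 2], [2, 1]], [0, 1]

model = CatBoostClassifier(iterations=200, custom_metric=['Logloss', 'AUC'])
# verbose controls how often metric values are printed;
# plot=True renders a live chart of the metric dynamics in a notebook.
model.fit(X_train, y_train, eval_set=(X_val, y_val), verbose=50, plot=True)
```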
R package
The following parameters can be set for the corresponding methods and are used when the model is trained or applied.
Method: catboost.train
loss_function
The metric to use in training. The specified value also determines the machine learning problem to solve. Some metrics support optional parameters (see the Objectives and metrics section for details on each metric).
Format:
<Metric>[:<parameter 1>=<value>;..;<parameter N>=<value>]
Supported metrics
- RMSE
- Logloss
- MAE
- CrossEntropy
- Quantile
- LogLinQuantile
- Lq
- MultiRMSE
- MultiClass
- MultiClassOneVsAll
- MultiLogloss
- MultiCrossEntropy
- MAPE
- Poisson
- PairLogit
- PairLogitPairwise
- QueryRMSE
- QuerySoftMax
- GroupQuantile
- Tweedie
- YetiRank
- YetiRankPairwise
- StochasticFilter
- StochasticRank
For example, use the following construction to calculate the value of Quantile with the coefficient 0.1:
Quantile:alpha=0.1
custom_loss
Metric values to output during training. These functions are not optimized and are displayed for informational purposes only. Some metrics support optional parameters (see the Objectives and metrics section for details on each metric).
Format:
<Metric>[:<parameter 1>=<value>;..;<parameter N>=<value>]
Examples:
- Calculate the value of CrossEntropy:
  c('CrossEntropy')
  Or simply:
  'CrossEntropy'
- Calculate the values of Logloss and AUC:
  c('Logloss', 'AUC')
- Calculate the value of Quantile with the coefficient 0.1:
  c('Quantile:alpha=0.1')
Values of all custom metrics for learn and validation datasets are saved to the Metric output files (learn_error.tsv and test_error.tsv, respectively). The directory for these files is specified in the --train-dir (train_dir) parameter.
use_best_model
If this parameter is set, the number of trees that are saved in the resulting model is defined as follows:
- Build the number of trees defined by the training parameters.
- Use the validation dataset to identify the iteration with the optimal value of the metric specified in --eval-metric (eval_metric). No trees are saved after this iteration.
This option requires a validation dataset to be provided.
eval_metric
The metric used for overfitting detection (if enabled) and best model selection (if enabled). Some metrics support optional parameters (see the Objectives and metrics section for details on each metric).
Format:
<Metric>[:<parameter 1>=<value>;..;<parameter N>=<value>]
Example:
Quantile:alpha=0.3
Command-line version
The following command keys can be specified for the corresponding commands and are used when the model is trained or applied.
Params for the catboost fit command:
--loss-function
The metric to use in training. The specified value also determines the machine learning problem to solve. Some metrics support optional parameters (see the Objectives and metrics section for details on each metric).
Format:
<Metric>[:<parameter 1>=<value>;..;<parameter N>=<value>]
Supported metrics
- RMSE
- Logloss
- MAE
- CrossEntropy
- Quantile
- LogLinQuantile
- Lq
- MultiRMSE
- MultiClass
- MultiClassOneVsAll
- MultiLogloss
- MultiCrossEntropy
- MAPE
- Poisson
- PairLogit
- PairLogitPairwise
- QueryRMSE
- QuerySoftMax
- GroupQuantile
- Tweedie
- YetiRank
- YetiRankPairwise
- StochasticFilter
- StochasticRank
For example, use the following construction to calculate the value of Quantile with the coefficient 0.1:
Quantile:alpha=0.1
--custom-metric
Metric values to output during training. These functions are not optimized and are displayed for informational purposes only. Some metrics support optional parameters (see the Objectives and metrics section for details on each metric).
Format:
<Metric 1>[:<parameter 1>=<value>;..;<parameter N>=<value>],<Metric 2>[:<parameter 1>=<value>;..;<parameter N>=<value>],..,<Metric N>[:<parameter 1>=<value>;..;<parameter N>=<value>]
Examples:
- Calculate the value of CrossEntropy:
  CrossEntropy
- Calculate the value of Quantile with the coefficient 0.1:
  Quantile:alpha=0.1
Values of all custom metrics for learn and validation datasets are saved to the Metric output files (learn_error.tsv and test_error.tsv, respectively). The directory for these files is specified in the --train-dir (train_dir) parameter.
--use-best-model
If this parameter is set, the number of trees that are saved in the resulting model is defined as follows:
- Build the number of trees defined by the training parameters.
- Use the validation dataset to identify the iteration with the optimal value of the metric specified in --eval-metric. No trees are saved after this iteration.
This option requires a validation dataset to be provided.
--eval-metric
The metric used for overfitting detection (if enabled) and best model selection (if enabled). Some metrics support optional parameters (see the Objectives and metrics section for details on each metric).
Format:
<Metric>[:<parameter 1>=<value>;..;<parameter N>=<value>]
Examples:
R2
Quantile:alpha=0.3
--logging-level
The logging level to output to stdout.
Possible values:
- Silent — Do not output any logging information to stdout.
- Verbose — Output the following data to stdout:
  - optimized metric
  - elapsed time of training
  - remaining time of training
- Info — Output additional information and the number of trees.
- Debug — Output debugging information.