Calculate metrics
Purpose
Calculate metrics for a given dataset using a previously trained model.
Execution format
catboost eval-metrics --metrics <comma-separated list of metrics> [optional parameters]
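As a minimal sketch of the format above, the following script assembles one possible invocation from options documented on this page. The file names (`test.tsv`, `test.cd`, `metrics.tsv`) are hypothetical placeholders, not defaults, and the script only prints the command rather than running it.

```shell
#!/bin/sh
set -eu

# Hypothetical file names; replace them with your own paths.
model="model.bin"
dataset="test.tsv"
columns="test.cd"
output="metrics.tsv"

# Store the command in the positional parameters so that
# paths containing spaces are passed as single arguments.
set -- catboost eval-metrics \
  --metrics AUC,Logloss \
  -m "$model" \
  --input-path "$dataset" \
  --column-description "$columns" \
  -o "$output"

# Print the command for inspection.
echo "$*"
```

Once the input files exist and the `catboost` binary is on the `PATH`, replacing the final `echo "$*"` with `"$@"` would execute the command.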
Options
Option | Description | Default value |
---|---|---|
-m --model-file --model-path | The name of the input file with the description of the model obtained as the result of training. | model.bin |
--model-format | The format of the input model. | CatboostBinary |
--input-path | The name of the input file with the dataset description. | input.tsv |
--column-description --cd | The path to the input file that contains the columns description. | If omitted, it is assumed that the first column in the file with the dataset description defines the label value, and the other columns are the values of numerical features. |
--input-pairs | The path to the input file that contains the pairs description for the dataset. This information is used to calculate Pairwise metrics. | Omitted. Pairwise metrics require pairs of data. If pairs are not provided explicitly through this parameter, they are generated automatically in each group using object label values. |
-o --output-path | The path to the output file with calculated metrics. | output.tsv |
-T --thread-count | The number of threads to use for metric calculation. Optimizes the speed of execution. This parameter doesn't affect results. | The number of processor cores |
--delimiter | The delimiter character used to separate the data in the dataset description input file. Only single-character delimiters are supported. If the specified value contains more than one character, only the first one is used. Note: used only if the dataset is given in the delimiter-separated values format. | The input data is assumed to be tab-separated |
--has-header | Read the column names from the first line of the dataset description file if this parameter is set. | False (the first line is assumed to contain data, like the rest of the lines) |
--ntree-start | To reduce the number of trees used when the model is applied or the metrics are calculated, set the range of tree indices to [--ntree-start; --ntree-end) and the step to --eval-period. This parameter defines the index of the first tree to be used when applying the model or calculating the metrics (the inclusive left border of the range). Indices are zero-based. | 0 |
--ntree-end | To reduce the number of trees used when the model is applied or the metrics are calculated, set the range of tree indices to [--ntree-start; --ntree-end) and the step to --eval-period. This parameter defines the index of the first tree not to be used when applying the model or calculating the metrics (the exclusive right border of the range). Indices are zero-based. | 0 (the index of the last tree to use equals the number of trees in the model minus one) |
--eval-period | To reduce the number of trees used when the model is applied or the metrics are calculated, set the range of tree indices to [--ntree-start; --ntree-end) and the step to --eval-period. This parameter defines the step to iterate over the range; results are returned for a series of tree ranges determined by this step. | 0 (the index of the last tree to use equals the number of trees in the model minus one) |
--metrics | A comma-separated list of metrics to be calculated. For example, to calculate the AUC and Logloss metrics, specify --metrics AUC,Logloss. | Required parameter |
--result-dir | The directory for storing the files generated during metric calculation. | None (current directory) |
--tmp-dir | The directory for storing temporary files that are generated if non-additive metrics are calculated. By default, the directory is generated inside the current one at the start of calculation, and it is removed when the calculation is complete. Otherwise the specified value is used. | - (the directory is generated) |
--verbose | Verbose output to stdout. | False |
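The interaction of --ntree-start, --ntree-end and --eval-period can be sketched as follows. This is an illustration under a stated assumption, not a description of the tool's internals: it assumes that every reported range begins at --ntree-start, that the exclusive right border advances by --eval-period, and that --ntree-end is always the final border. The values 0, 7 and 2 are illustrative, not defaults.

```shell
#!/bin/sh
set -eu

# Illustrative values, not defaults.
ntree_start=0
ntree_end=7
eval_period=2

# Assumption: each range starts at ntree_start; the exclusive right border
# grows by eval_period, and ntree_end closes the last range.
ranges=""
right=$((ntree_start + eval_period))
while [ "$right" -lt "$ntree_end" ]; do
  ranges="${ranges}[${ntree_start}, ${right}) "
  right=$((right + eval_period))
done
ranges="${ranges}[${ntree_start}, ${ntree_end})"
echo "$ranges"

# The matching command line, built from options listed in the table above.
set -- catboost eval-metrics --metrics Logloss \
  --ntree-start "$ntree_start" \
  --ntree-end "$ntree_end" \
  --eval-period "$eval_period"
echo "$*"
```

Under that assumption, the first line printed is `[0, 2) [0, 4) [0, 6) [0, 7)`, that is, one set of metric values per listed range. The script prints the command instead of executing it; replace the last `echo "$*"` with `"$@"` to run it.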