Calculate feature importance
Execution format
catboost fstr [-m <model name>] [--input-path <dataset>] --fstr-type <output format> [other parameters]
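As an illustration of the format above, here is a minimal sketch of one invocation. The file names (`train.tsv`, `train.cd`) are hypothetical placeholders, and the command is only executed if the `catboost` binary and the input files are actually present; otherwise it is just printed.

```shell
# Hypothetical file names; replace them with your own paths.
MODEL="model.bin"
DATASET="train.tsv"
COLUMNS="train.cd"
OUTPUT="feature_strength.tsv"

# Assemble the command as positional parameters so each argument stays one word.
set -- catboost fstr \
    -m "$MODEL" \
    --input-path "$DATASET" \
    --column-description "$COLUMNS" \
    --fstr-type LossFunctionChange \
    -o "$OUTPUT"

CMD="$*"
echo "$CMD"

# Run only if the catboost binary and the input files are available.
if command -v catboost >/dev/null 2>&1 \
    && [ -f "$MODEL" ] && [ -f "$DATASET" ] && [ -f "$COLUMNS" ]; then
    "$@"
fi
```

LossFunctionChange is used here because it is one of the types that needs both the dataset and the columns description.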
Options
--fstr-type
Description
The feature importance output format.
Possible values:
- PredictionValuesChange
- LossFunctionChange
- InternalFeatureImportance
- Interaction
- InternalInteraction
- ShapValues
Default value
Required parameter
-m, --model-file, --model-path
Description
The name of the input file that contains the trained model.
Default value
model.bin
--model-format
Description
The format of the input model.
Possible values:
- CatboostBinary.
- AppleCoreML (only datasets without categorical features are currently supported).
- json (multiclassification models are not currently supported). Refer to the CatBoost JSON model tutorial for format details.
Default value
CatboostBinary
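One possible way to choose the `--model-format` value is from the model file name. The sketch below assumes that JSON models are saved with a `.json` extension and Core ML models with `.mlmodel`; this mapping is an assumption made for the example, not something the tool enforces. The command is printed rather than executed.

```shell
# Hypothetical model file name.
MODEL="model.json"

# Map the file extension to a --model-format value (assumed convention).
case "$MODEL" in
    *.json)    FORMAT="json" ;;
    *.mlmodel) FORMAT="AppleCoreML" ;;
    *)         FORMAT="CatboostBinary" ;;
esac

echo "catboost fstr -m $MODEL --model-format $FORMAT --fstr-type InternalFeatureImportance"
```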
--input-path
Description
The name of the input file with the dataset description.
This parameter is required in the following cases:
- The feature importance format is set to LossFunctionChange or ShapValues.
- The feature importance format is set to PredictionValuesChange and the model does not contain information regarding the weight of leaves. All models trained with CatBoost version 0.9 or higher contain leaf weight information by default.
Default value
input.tsv
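The rule for when the dataset must be supplied can be sketched as a small helper. `needs_dataset` and its second argument are hypothetical: the shell cannot inspect the model file, so the caller has to state whether the model stores leaf weights (`yes` or `no`).

```shell
# needs_dataset TYPE HAS_LEAF_WEIGHTS
# Exit status 0 if the dataset must be passed, 1 otherwise.
needs_dataset() {
    case "$1" in
        LossFunctionChange|ShapValues)
            return 0 ;;
        PredictionValuesChange)
            # Required only when the model has no leaf weight information.
            [ "$2" = "no" ] ;;
        *)
            return 1 ;;
    esac
}

# Example: a model trained with a recent CatBoost version (leaf weights present).
for TYPE in ShapValues PredictionValuesChange Interaction; do
    if needs_dataset "$TYPE" yes; then
        echo "$TYPE: pass --input-path"
    else
        echo "$TYPE: --input-path is not required"
    fi
done
```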
--column-description, --cd
Description
The path to the input file that contains the columns description.
This parameter is required in the following cases:
- The feature importance format is set to LossFunctionChange or ShapValues.
- The feature importance format is set to PredictionValuesChange and the model does not contain information regarding the weight of leaves. All models trained with CatBoost version 0.9 or higher contain leaf weight information by default.
Default value
If omitted, it is assumed that the first column in the file with the dataset description defines the label value, and the other columns are the values of numerical features.
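When the dataset does not follow the default layout, a columns description file is needed. The sketch below assumes the usual CatBoost layout for that file: one tab-separated line per described column, with a zero-based column index followed by the column type, and unlisted columns treated as numerical features. The file names and the column layout are hypothetical, and the command is printed rather than executed.

```shell
# Hypothetical layout: column 0 is a categorical feature, column 1 is the label,
# and all remaining columns are left as numerical features.
CD_FILE="train.cd"
printf '0\tCateg\n1\tLabel\n' > "$CD_FILE"

# Pass the file with the short --cd alias.
set -- catboost fstr -m model.bin --input-path train.tsv --cd "$CD_FILE" --fstr-type ShapValues
CD_CMD="$*"
echo "$CD_CMD"
```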
--input-graph
Description
The path to the input file that contains the graph information for the dataset.
This information is used to calculate graph aggregated features.
Default value
None
-o, --output-path
Description
The path to the output file with data for feature analysis.
Default value
feature_strength.tsv
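Because every run writes to `feature_strength.tsv` by default, computing several importance types in a row overwrites earlier results unless `-o` is set. A minimal sketch that gives each type its own output file follows. The `fstr_<type>.tsv` naming is an arbitrary choice for this example, and the commands are only printed. The three types used are the ones for which the dataset is not listed as required.

```shell
MODEL="model.bin"
COMMANDS=""

for TYPE in InternalFeatureImportance Interaction InternalInteraction; do
    # One output file per importance type (hypothetical naming scheme).
    OUT="fstr_${TYPE}.tsv"
    LINE="catboost fstr -m $MODEL --fstr-type $TYPE -o $OUT"
    COMMANDS="${COMMANDS}${LINE}
"
done

printf '%s' "$COMMANDS"
```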
-T, --thread-count
Description
The number of threads to use for the calculation.
Using more threads can speed up execution. This parameter does not affect the results.
Default value
The number of processor cores
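On a shared machine it may be preferable to use fewer threads than the default. The sketch below caps the thread count at half of the online cores. Halving is an arbitrary policy chosen for illustration, and `getconf _NPROCESSORS_ONLN` is assumed to be available, as it is on most Linux and macOS systems. The command is printed rather than executed.

```shell
# Number of online cores; fall back to 1 if getconf cannot report it.
CORES=$(getconf _NPROCESSORS_ONLN 2>/dev/null || echo 1)

# Use half of the cores, but never fewer than one thread.
THREADS=$(( CORES / 2 ))
[ "$THREADS" -ge 1 ] || THREADS=1

echo "catboost fstr -m model.bin --fstr-type InternalFeatureImportance -T $THREADS"
```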