
# Feature importance

The following types of feature importance files are created depending on the task and the execution parameters:

## Regular feature importance

#### Contains

The individual importance values for each of the input features (the default feature importances calculation method for non-ranking metrics).

#### Format

• The rows are sorted in descending order of the feature importance value.

• Each row contains information related to one feature.

Format:

<feature strength><\t><feature name>

• feature strength is the value of the regular feature importance.
• feature name is the zero-based index of the feature.

If an alphanumeric identifier is specified for the corresponding Num or Categ column, it is used instead of the index.

For example, let's assume that the columns description file has the following structure:

0<\t>Label value<\t>
1<\t>Num
2<\t>Num<\t>ratio
3<\t>Categ
4<\t>Auxiliary
5<\t>Num


The input dataset description file contains the following line:

120<\t>80<\t>0.8<\t>rock<\t>some useless information<\t>12


The table below shows the correspondence between the given feature values and the feature indices.

| Feature value | Feature index or name |
|---|---|
| 80 | 0 |
| 0.8 | ratio |
| rock | 2 |
| 12 | 3 |
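
The mapping in the table above can be reproduced with a short sketch. It assumes that every column is listed in the columns description, that only Num and Categ columns count as features, and that a named column still consumes its zero-based index. The function name `feature_ids` is hypothetical and is not part of CatBoost.

```python
def feature_ids(cd_lines):
    """Return (column index, feature identifier) pairs for feature columns.

    Hypothetical helper: assumes every column appears in the columns
    description and that only Num and Categ columns are features.
    """
    pairs = []
    for line in cd_lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 2:
            continue
        column = int(fields[0])
        col_type = fields[1].strip()
        name = fields[2].strip() if len(fields) > 2 else ""
        if col_type not in ("Num", "Categ"):
            continue  # label and auxiliary columns are not features
        # The zero-based feature index is the number of features seen so far;
        # an alphanumeric name, if present, replaces it in the output.
        pairs.append((column, name or str(len(pairs))))
    return pairs


cd = [
    "0\tLabel value\t",
    "1\tNum",
    "2\tNum\tratio",
    "3\tCateg",
    "4\tAuxiliary",
    "5\tNum",
]
row = "120\t80\t0.8\trock\tsome useless information\t12".split("\t")

ids = feature_ids(cd)
value_to_id = {row[column]: fid for column, fid in ids}
```

Here `value_to_id` pairs each feature value from the sample line with the identifier that would appear in the feature importance file.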

#### Example

8.4<\t>2
5.5<\t>0
2.6<\t>3
1.5<\t>ratio
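
A file in this format can be read with a few lines of Python. The following is a minimal sketch, assuming that `<\t>` in the example stands for a literal tab character; `parse_regular_importance` is a hypothetical helper name, not a CatBoost function.

```python
def parse_regular_importance(text):
    """Parse '<feature strength><tab><feature name>' rows into tuples.

    Hypothetical helper: returns (feature name, strength) pairs in file
    order, which is already descending by strength.
    """
    rows = []
    for line in text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        strength, name = line.split("\t", 1)
        rows.append((name.strip(), float(strength)))
    return rows


sample = "8.4\t2\n5.5\t0\n2.6\t3\n1.5\tratio\n"
importances = parse_regular_importance(sample)
top_feature = importances[0][0]
```

Feature names are kept as strings because a row may carry either a zero-based index or an alphanumeric identifier.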


## InternalFeatureImportance

#### Contains

The importance values both for each of the input features and for their combinations (if any).

#### Format

• The rows are sorted in descending order of the feature importance value.

• Each row contains information related to one feature or a combination of features.

Format:

<feature strength><\t>{<feature name 1>, ..., <feature name n>} pr<value> tb<value> type<value>

• feature strength is the value of the internal feature importance.

• feature name is the zero-based index of the feature.

If an alphanumeric identifier is specified for the corresponding Num or Categ column, it is used instead of the index.

For example, let's assume that the columns description file has the following structure:

0<\t>Label value<\t>
1<\t>Num
2<\t>Num<\t>ratio
3<\t>Categ
4<\t>Auxiliary
5<\t>Num


The input dataset description file contains the following line:

120<\t>80<\t>0.8<\t>rock<\t>some useless information<\t>12


The table below shows the correspondence between the given feature values and the feature indices.

| Feature value | Feature index or name |
|---|---|
| 80 | 0 |
| 0.8 | ratio |
| rock | 2 |
| 12 | 3 |

• pr is the prior value.
• tb is the border of the label value.
• type is the feature border type.

#### Example

8.4<\t>0
5.2<\t>{2, ratio} pr2 tb0 type0
2.6<\t>{2} pr2 tb0 type0
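
The rows above can be parsed with a regular expression. This is a minimal sketch based only on the example: it assumes `<\t>` is a literal tab, that a row without braces describes a single feature with no pr, tb, or type fields, and it keeps those fields as strings because their value types are not specified here. `InternalImportance` and `parse_internal_line` are hypothetical names.

```python
import re
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class InternalImportance:
    strength: float
    features: Tuple[str, ...]
    prior: Optional[str] = None          # pr<value>
    label_border: Optional[str] = None   # tb<value>
    border_type: Optional[str] = None    # type<value>


# Matches '{name 1, ..., name n} pr<value> tb<value> type<value>'.
_COMBINATION = re.compile(
    r"^\{(?P<names>[^}]*)\}\s+pr(?P<pr>\S+)\s+tb(?P<tb>\S+)\s+type(?P<type>\S+)$"
)


def parse_internal_line(line):
    """Parse one row of an internal feature importance file."""
    strength, rest = line.rstrip("\n").split("\t", 1)
    rest = rest.strip()
    match = _COMBINATION.match(rest)
    if match is None:
        # Plain row: a single feature index or name.
        return InternalImportance(float(strength), (rest,))
    names = tuple(n.strip() for n in match.group("names").split(","))
    return InternalImportance(
        float(strength),
        names,
        match.group("pr"),
        match.group("tb"),
        match.group("type"),
    )


single = parse_internal_line("8.4\t0")
combo = parse_internal_line("5.2\t{2, ratio} pr2 tb0 type0")
```

Keeping the feature names as a tuple lets one record type cover both a single feature and a combination of features.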