Feature importance
The following types of feature importance files are created depending on the task and the execution parameters:
Regular feature importance
Contains
The individual importance values for each of the input features (the default feature importances calculation method for non-ranking metrics).
Possible values:
- PredictionValuesChange for non-ranking metrics
- LossFunctionChange for ranking metrics
Format
- The rows are sorted in descending order of the feature importance value.
- Each row contains information related to one feature.
Format:
<feature strength><\t><feature name>
- feature strength is the value of the regular feature importance.
- feature name is the zero-based index of the feature. An alphanumeric identifier is used instead if specified in the corresponding Num or Categ column of the input data.
For example, let's assume that the columns description file has the following structure:
0<\t>Label value<\t>
1<\t>Num
2<\t>Num<\t>ratio
3<\t>Categ
4<\t>Auxiliary
5<\t>Num
The input dataset description file contains the following line:
120<\t>80<\t>0.8<\t>rock<\t>some useless information<\t>12
The table below shows the correspondence between the given feature values and the feature indices.
| Feature value | Feature index or name |
|---|---|
| 80 | 0 |
| 0.8 | ratio |
| rock | 2 |
| 12 | 3 |
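The indexing rule in this example can be sketched in Python. This is a minimal illustration, not part of the tool itself: the helper name `feature_ids` is hypothetical, `<\t>` is taken to be a literal tab character, and only `Num` and `Categ` columns are assumed to count as features, as in the example above.

```python
def feature_ids(cd_lines):
    """Map a column index to its feature identifier.

    Assumption: only Num and Categ columns are features; every other
    column type (label, auxiliary) is skipped and gets no feature index.
    """
    ids = {}
    next_index = 0  # zero-based index over feature columns only
    for line in cd_lines:
        parts = [p.strip() for p in line.rstrip("\n").split("\t")]
        column, col_type = int(parts[0]), parts[1]
        name = parts[2] if len(parts) > 2 else ""
        if col_type in ("Num", "Categ"):
            # Use the alphanumeric identifier if present, else the index.
            ids[column] = name or str(next_index)
            next_index += 1
    return ids


# Columns description and dataset line copied from the example above.
cd = [
    "0\tLabel value\t",
    "1\tNum",
    "2\tNum\tratio",
    "3\tCateg",
    "4\tAuxiliary",
    "5\tNum",
]
row = "120\t80\t0.8\trock\tsome useless information\t12".split("\t")

mapping = {row[col]: fid for col, fid in feature_ids(cd).items()}
print(mapping)
```

Note that the index is counted over feature columns only, so it is consumed even by a column that has its own name: `ratio` occupies index 1, which is why `rock` is reported as feature 2.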
Example
8.4 <\t> 2
5.5 <\t> 0
2.6 <\t> 3
1.5 <\t> ratio
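A file in this format could be read with a few lines of Python. The sketch below assumes that `<\t>` stands for a literal tab character; the function name `parse_regular_importance` is hypothetical and not part of any library.

```python
def parse_regular_importance(lines):
    """Parse '<feature strength><tab><feature name>' rows into tuples."""
    rows = []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # tolerate blank lines
        strength, name = line.split("\t", 1)
        # The name is kept as a string: it is either a zero-based
        # index or an alphanumeric identifier such as 'ratio'.
        rows.append((float(strength), name.strip()))
    return rows


# Rows taken from the example above.
sample = ["8.4\t2", "5.5\t0", "2.6\t3", "1.5\tratio"]
parsed = parse_regular_importance(sample)

# The format states that rows are sorted in descending order of importance.
is_sorted = all(a[0] >= b[0] for a, b in zip(parsed, parsed[1:]))
print(parsed, is_sorted)
```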
InternalFeatureImportance
Contains
The importance values both for each of the input features and for their combinations (if any).
Format
- The rows are sorted in descending order of the feature importance value.
- Each row contains information related to one feature or a combination of features.
Format:
<feature strength><\t>{feature name 1,.., feature name n} pr<value> tb<value> type<value>
- feature strength is the value of the internal feature importance.
- feature name is the zero-based index of the feature. An alphanumeric identifier is used instead if specified in the corresponding Num or Categ column of the input data.
- pr is the prior value.
- tb is the value of the label value border.
- type is the feature border type.
As an example of how feature names are assigned, let's assume that the columns description file has the following structure:
0<\t>Label value<\t>
1<\t>Num
2<\t>Num<\t>ratio
3<\t>Categ
4<\t>Auxiliary
5<\t>Num
The input dataset description file contains the following line:
120<\t>80<\t>0.8<\t>rock<\t>some useless information<\t>12
The table below shows the correspondence between the given feature values and the feature indices.
| Feature value | Feature index or name |
|---|---|
| 80 | 0 |
| 0.8 | ratio |
| rock | 2 |
| 12 | 3 |
Example
8.4<\t>0
5.2<\t>{2, ratio} pr2 tb0 type0
2.6<\t>{2} pr2 tb0 type0
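The rows of this file can be split into their parts with a regular expression. The sketch below is one possible reading of the format, assuming that `<\t>` is a literal tab and that a row may name a single feature without braces, as in the first example row. The function name `parse_internal_importance` is hypothetical, and the `pr`, `tb`, and `type` values are kept as strings because their numeric types are not specified here.

```python
import re

# Matches '{name 1, .., name n} pr<value> tb<value> type<value>'.
_COMBINATION = re.compile(
    r"^\{(?P<names>[^}]*)\}\s+pr(?P<pr>\S+)\s+tb(?P<tb>\S+)\s+type(?P<type>\S+)$"
)


def parse_internal_importance(line):
    """Parse one row of the internal feature importance file."""
    strength, rest = line.strip().split("\t", 1)
    rest = rest.strip()
    match = _COMBINATION.match(rest)
    if match is None:
        # Assumption: anything without braces is a single plain feature.
        return {"strength": float(strength), "features": [rest],
                "pr": None, "tb": None, "type": None}
    names = [n.strip() for n in match.group("names").split(",")]
    return {"strength": float(strength), "features": names,
            "pr": match.group("pr"), "tb": match.group("tb"),
            "type": match.group("type")}


# Rows taken from the example above.
sample = ["8.4\t0", "5.2\t{2, ratio} pr2 tb0 type0", "2.6\t{2} pr2 tb0 type0"]
records = [parse_internal_importance(row) for row in sample]
for record in records:
    print(record)
```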