# Feature interaction

## Interaction

The value of the feature interaction strength for each pair of features.

When calculating the interaction between features $f_1$ and $f_2$, all splits by these features in all trees of the resulting ensemble are taken into account.

If a tree contains splits by both features, the calculation compares the leaf values reached when the two splits have the same value with the leaf values reached when they have opposite values.

See the Interaction file format.

Calculation principles

$interaction(f_{1}, f_{2}) = \sum_{trees} \left| \sum_{leaves:\ split(f_1) = split(f_2)} LeafValue - \sum_{leaves:\ split(f_1) \ne split(f_2)} LeafValue \right|$

The sum inside the absolute value always contains an even number of terms. The first half consists of the leaf values for which the split by $f_1$ has the same value as the split by $f_2$. The second half consists of the leaf values for which the two splits have different values, and it enters the sum with the opposite sign.
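
As a small worked illustration (the notation $v_{ab}$ is introduced here and is not part of the original description), consider a single tree with exactly one split by $f_1$ and one split by $f_2$. Let $v_{ab}$ be the value of the leaf reached when the split by $f_1$ has value $a$ and the split by $f_2$ has value $b$. The contribution of this tree is then:

$\left| (v_{00} + v_{11}) - (v_{01} + v_{10}) \right|$

If the leaf values are purely additive in the two splits, that is, $v_{ab} = c + \alpha a + \beta b$ for some constants $c$, $\alpha$, $\beta$, the contribution is $\left| c + (c + \alpha + \beta) - (c + \beta) - (c + \alpha) \right| = 0$, so such a tree adds nothing to the interaction.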

The larger the difference between the two sums of leaf values, the stronger the interaction. The idea is to fix one feature and check whether changing the other one leads to large changes in the value of the formula.
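
The calculation described above can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions, not the library's implementation: each tree is assumed to be symmetric and is represented by a hypothetical pair `(split_features, leaf_values)`, where bit `k` of a leaf index is the outcome of the split at level `k`, and each feature is assumed to appear at most once per tree. All function and variable names are made up for this example.

```python
def tree_interaction(split_features, leaf_values, f1, f2):
    """Contribution of one tree to interaction(f1, f2).

    split_features[k] is the feature index used by the split at level k.
    leaf_values has 2 ** len(split_features) entries; bit k of a leaf
    index is the outcome (0 or 1) of the split at level k.
    Assumption: each feature occurs at most once in split_features.
    """
    if f1 not in split_features or f2 not in split_features:
        return 0.0  # the tree does not contain splits by both features
    level1 = split_features.index(f1)
    level2 = split_features.index(f2)
    signed_sum = 0.0
    for leaf_index, value in enumerate(leaf_values):
        outcome1 = (leaf_index >> level1) & 1
        outcome2 = (leaf_index >> level2) & 1
        # Same split outcomes enter with "+", different ones with "-".
        signed_sum += value if outcome1 == outcome2 else -value
    return abs(signed_sum)


def interaction(trees, f1, f2):
    """Sum the per-tree contributions over the whole ensemble."""
    return sum(tree_interaction(sf, lv, f1, f2) for sf, lv in trees)


# Leaf value = outcome(f0) + 2 * outcome(f1): purely additive tree.
additive_tree = ([0, 1], [0.0, 1.0, 2.0, 3.0])
# Leaf value depends only on whether the two outcomes agree.
xor_tree = ([0, 1], [1.0, -1.0, -1.0, 1.0])

print(interaction([additive_tree], 0, 1))
print(interaction([xor_tree], 0, 1))
```

In this toy setup the additive tree contributes nothing, because changing one split shifts the leaf value by the same amount regardless of the other split, while the second tree contributes the full sum of the absolute leaf values.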

## InternalInteraction

The value of the feature interaction strength for each pair of features that are used in the model. Internally the model uses feature combinations as separate features. All feature combinations that are used in the model are listed separately. For example, if the model contains a feature named F1 and a combination of features {F2, F3}, the interaction between F1 and the combination of features {F2, F3} is listed in the output file.

The rows are sorted in descending order of the feature interaction strength value.
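
The listing and ordering described above can be pictured with a short Python sketch. Everything in it is hypothetical: the strength values are made up, a feature combination is modelled as a tuple of feature names, and the printed layout is not the actual InternalInteraction file format.

```python
# Hypothetical rows: (first feature, second feature, interaction strength).
# A single feature is a 1-tuple; a feature combination is a longer tuple.
rows = [
    (("F1",), ("F4",), 0.8),
    (("F1",), ("F2", "F3"), 2.5),
    (("F4",), ("F2", "F3"), 1.1),
]


def format_feature(feature):
    """Render a single feature as its name and a combination as {A, B}."""
    if len(feature) == 1:
        return feature[0]
    return "{" + ", ".join(feature) + "}"


# Strongest interaction first, as in the described output ordering.
sorted_rows = sorted(rows, key=lambda row: row[2], reverse=True)

for first, second, strength in sorted_rows:
    print(format_feature(first), format_feature(second), strength)
```

Here the pair of F1 and the combination {F2, F3} is treated exactly like a pair of ordinary features and appears first because it has the largest value.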

See the InternalInteraction file format.

Calculation principles

$interaction(f_{1}, f_{2}) = \sum_{trees} \left| \sum_{leaves:\ split(f_1) = split(f_2)} LeafValue - \sum_{leaves:\ split(f_1) \ne split(f_2)} LeafValue \right|$

See the detailed information regarding usage specifics for different CatBoost implementations.