Feature interaction strength

The following types of feature interaction strength files are created depending on the task and the execution parameters:

Interaction

Contains

The value of the feature interaction strength for each pair of features.

Format

  • The rows are sorted in descending order of the feature interaction strength value.

  • Each row contains information related to one pair of features.

    Format:

    <feature interaction strength><\t><feature name 1><\t><feature name 2>
    
    • feature interaction strength is the value of the feature interaction strength.

    • feature name is the zero-based index of the feature.

    If a name is specified for the corresponding Num or Categ column in the columns description file, that alphanumeric identifier is used instead of the index.

    For example, assume that the columns description file has the following structure:

    0<\t>Label value<\t>
    1<\t>Num
    2<\t>Num<\t>ratio
    3<\t>Categ
    4<\t>Auxiliary
    5<\t>Num
    

    The input dataset description file contains the following line:

    120<\t>80<\t>0.8<\t>rock<\t>some useless information<\t>12
    

    The table below shows the correspondence between the given feature values and the feature indices or names.

    Feature value    Feature index or name
    80               0
    0.8              ratio
    rock             2
    12               3
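The index assignment in this example can be sketched in Python as follows. This is a minimal illustration, not the library's implementation: the helper name `feature_ids` is hypothetical, and it assumes that only Num and Categ columns count as features, as in the example.

```python
def feature_ids(cd_lines):
    """Map each feature column number to its feature index or name."""
    ids = {}
    next_index = 0
    for line in cd_lines:
        fields = line.rstrip("\n").split("\t")
        column, col_type = int(fields[0]), fields[1]
        name = fields[2] if len(fields) > 2 else ""
        # Assumption: only Num and Categ columns are features; other
        # column types (label, Auxiliary) do not consume an index.
        if col_type in ("Num", "Categ"):
            ids[column] = name if name else str(next_index)
            next_index += 1
    return ids


# Columns description and dataset line from the example above.
cd = ["0\tLabel value\t", "1\tNum", "2\tNum\tratio",
      "3\tCateg", "4\tAuxiliary", "5\tNum"]
row = "120\t80\t0.8\trock\tsome useless information\t12".split("\t")

# Pair each feature value in the row with its index or name.
mapping = {row[col]: fid for col, fid in feature_ids(cd).items()}
```

Note that a named column still consumes an index: `ratio` takes the place of index 1, so the next feature gets index 2.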

Example

30<\t>1<\t>2
0.5<\t>0<\t>1
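A file in this format can be read with a few lines of Python. The sketch below assumes that the `<\t>` placeholders stand for literal tab characters; the function names and the sample rows are illustrative only.

```python
def parse_interactions(lines):
    """Parse rows of '<strength>\\t<feature 1>\\t<feature 2>' into tuples."""
    rows = []
    for line in lines:
        line = line.rstrip("\n")
        if not line:
            continue  # skip blank lines
        strength, first, second = line.split("\t")
        # Feature identifiers stay strings: they may be indices or names.
        rows.append((float(strength), first, second))
    return rows


def is_sorted_descending(rows):
    """Check the documented ordering: strongest interaction first."""
    return all(a[0] >= b[0] for a, b in zip(rows, rows[1:]))


sample = ["30\t1\t2\n", "0.5\t0\t1\n"]
rows = parse_interactions(sample)
```

Keeping the feature identifiers as strings avoids special-casing columns that carry an alphanumeric name instead of an index.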

InternalInteraction

Contains

The value of the feature interaction strength for each pair of features that are used in the model. Internally, the model treats feature combinations as separate features, and every combination used in the model is listed separately. For example, if the model contains a feature named F1 and a combination of features {F2, F3}, the interaction between F1 and the combination {F2, F3} is listed in the output file.

Format

  • The rows are sorted in descending order of the feature interaction strength value.

  • Each row contains information related to one pair of features and/or their combinations.

    Format:

    <feature interaction strength><\t><feature name 1><\t><feature name 2>
    
    • feature interaction strength is the value of the internal feature interaction strength.
    • feature name is the zero-based index of the feature.

    If a name is specified for the corresponding Num or Categ column in the columns description file, that alphanumeric identifier is used instead of the index.

    For example, assume that the columns description file has the following structure:

    0<\t>Label value<\t>
    1<\t>Num
    2<\t>Num<\t>ratio
    3<\t>Categ
    4<\t>Auxiliary
    5<\t>Num
    

    The input dataset description file contains the following line:

    120<\t>80<\t>0.8<\t>rock<\t>some useless information<\t>12
    

    The table below shows the correspondence between the given feature values and the feature indices or names.

    Feature value    Feature index or name
    80               0
    0.8              ratio
    rock             2
    12               3

Example

0.4004860988<\t>15<\t>13
0.1134764975<\t>{4} prior_num=0 prior_denom=1 targetborder=0 type=Borders<\t>15
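Rows of an InternalInteraction file can be handled the same way as Interaction rows, except that a field may hold a longer description containing spaces. The Python sketch below is illustrative only: the function names are hypothetical, and treating any field that starts with `{` as an internal feature description is an assumption inferred from the example above, not a documented rule.

```python
def parse_internal(lines):
    """Parse rows of '<strength>\\t<feature 1>\\t<feature 2>' into tuples."""
    rows = []
    for line in lines:
        line = line.rstrip("\n")
        if not line:
            continue  # skip blank lines
        # Split on tabs only: descriptions may contain spaces.
        strength, first, second = line.split("\t", 2)
        rows.append((float(strength), first, second))
    return rows


def is_internal_description(name):
    # Assumption: fields starting with "{" describe an internal feature
    # or combination rather than a plain feature index or name.
    return name.startswith("{")


def partners(rows, name):
    """List (other feature, strength) for every row that involves `name`."""
    result = []
    for strength, first, second in rows:
        if first == name:
            result.append((second, strength))
        elif second == name:
            result.append((first, strength))
    return result


sample = [
    "0.4004860988\t15\t13",
    "0.1134764975\t{4} prior_num=0 prior_denom=1 targetborder=0 type=Borders\t15",
]
rows = parse_internal(sample)
with_15 = partners(rows, "15")
```

Because the rows are already sorted by strength, `partners` returns the strongest interactions of a feature first.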