A vector $v$ of length $M + 1$ for every input object, where $M$ is the number of input features. It holds the contribution of each feature to the prediction and the expected value of the model prediction for the object (the average prediction given no knowledge about the object).
- $v_i$, for $i \in [0, M - 1]$, is the contribution of the $i$-th feature.
- $v_M$ is the expected value of the model prediction.
For a given object, the sum $\sum_{i=0}^{M} v_i$ is equal to the prediction on this object.
This is an implementation of the Consistent Individualized Feature Attribution for Tree Ensembles approach.
See the ShapValues file format.
Use the SHAP package to plot the returned values.
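The contribution $\phi_i$ of the $i$-th feature is its Shapley value:

$$\phi_i = \sum_{S \subseteq N \setminus \{i\}} \frac{|S|! \, (M - |S| - 1)!}{M!} \left[ f_x(S \cup \{i\}) - f_x(S) \right]$$

- $M$ is the number of input features.
- $N$ is the set of all input features.
- $S$ is the set of non-zero feature indices (the features that are being observed and not unknown).
- $f_x(S) = E[f(x) \mid x_S]$ is the model's prediction for the input $x$, where $E[f(x) \mid x_S]$ is the expected value of the function conditioned on a subset $S$ of the input features.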
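To make the definitions above concrete, here is a minimal brute-force sketch of the Shapley value computation over all feature subsets. It is not the tree-specific algorithm used in the library. The toy `model`, the `background` rows, and the way the conditional expectation is estimated (fixing the features in $S$ and averaging the rest over the background rows) are assumptions chosen for illustration.

```python
from itertools import combinations
from math import factorial

def model(x):
    # Hypothetical model: one additive term and one interaction term.
    return 2.0 * x[0] + x[1] * x[2]

# Hypothetical background dataset used to average out "unknown" features.
background = [
    (0.0, 1.0, 2.0),
    (1.0, 0.0, 1.0),
    (2.0, 3.0, 0.0),
    (1.0, 2.0, 3.0),
]

def f_x(x, S):
    """Estimate E[f(x) | x_S]: keep features in S from x, average the others."""
    total = 0.0
    for row in background:
        mixed = [x[j] if j in S else row[j] for j in range(len(x))]
        total += model(mixed)
    return total / len(background)

def shapley_values(x):
    """Return the contribution of every feature, enumerating all subsets S."""
    M = len(x)
    phi = []
    for i in range(M):
        others = [j for j in range(M) if j != i]
        value = 0.0
        for size in range(M):
            # Weight |S|! (M - |S| - 1)! / M! depends only on the subset size.
            weight = factorial(size) * factorial(M - size - 1) / factorial(M)
            for subset in combinations(others, size):
                S = set(subset)
                value += weight * (f_x(x, S | {i}) - f_x(x, S))
        phi.append(value)
    return phi

x = (3.0, 1.0, 4.0)
phi = shapley_values(x)
expected_value = f_x(x, set())      # prediction with no features known
result = phi + [expected_value]     # M contributions, then the expected value
```

Summing `result` recovers `model(x)`, which mirrors the property that the contributions plus the expected value equal the prediction. The enumeration is exponential in the number of features, so this only works for very small $M$.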
If the mean leaf count in the tree is less than the number of documents and trees are oblivious:
In all other cases:
- samples_count is the number of documents in the dataset.
- dimension is the dimensionality for Multiclassification and Multiregression.
- trees_count is the number of trees.
- depth is the depth of the trees.
- average_depth is the average depth of the trees.
- leaves_in_tree is the number of leaves in the tree.
- features_in_tree_count is the number of features in the tree.
The feature importance is calculated as follows for each feature :