Explaining interaction values with SHAP
Goal¶
This post shows how to explain a model's predictions with SHAP interaction values. We will use the NHANES I (1971-1974) dataset from the National Health and Nutrition Examination Survey.
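SHAP interaction values generalize SHAP values from a per-feature vector to a per-pair matrix: off-diagonal entries attribute pairwise interaction effects, while diagonal entries hold each feature's remaining main effect. Following the Shapley interaction index used by SHAP (Lundberg et al., 2018), the interaction value between features $i \neq j$ is

$$\Phi_{i,j} = \sum_{S \subseteq \mathcal{M} \setminus \{i,j\}} \frac{|S|!\,(M - |S| - 2)!}{2\,(M-1)!} \Big[ f_x(S \cup \{i,j\}) - f_x(S \cup \{i\}) - f_x(S \cup \{j\}) + f_x(S) \Big],$$

where $M$ is the number of features and $f_x(S)$ is the expected model output given only the features in $S$. Summing row $i$ of the matrix recovers feature $i$'s ordinary SHAP value.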
Libraries¶
In [1]:
import shap
import xgboost
from sklearn.model_selection import train_test_split
import matplotlib.pyplot as plt
%matplotlib inline
Configuration¶
In [8]:
test_size = 0.2   # fraction of the data held out for testing
random_state = 1  # seed for a reproducible split
Load data for NHANES I¶
In [5]:
X, y = shap.datasets.nhanesi()
X.head()
Out[5]:
In [7]:
y[:5]
Out[7]:
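For XGBoost's survival:cox objective (used below), a negative label conventionally marks a right-censored subject, with its magnitude giving the follow-up time. A quick check, assuming this dataset follows that convention:

import numpy as np

y_arr = np.asarray(y)
# Negative times = right-censored subjects; non-negative times = observed deaths
print(f"observed: {(y_arr >= 0).sum()}, censored: {(y_arr < 0).sum()}")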
Split the data into training and test sets¶
In [9]:
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=test_size, random_state=random_state)

# Wrap the splits in XGBoost's native DMatrix format
xgb_train = xgboost.DMatrix(X_train, label=y_train)
xgb_test = xgboost.DMatrix(X_test, label=y_test)
Create an XGBoost model¶
Model Configuration¶
In [10]:
# Training parameters
params_train = {
    "eta": 0.002,                 # small learning rate, offset by many boosting rounds
    "max_depth": 3,               # shallow trees limit the order of interactions
    "objective": "survival:cox",  # Cox proportional hazards loss; the margin is the log relative hazard
    "subsample": 0.5              # row subsampling for regularization
}
Train a model¶
In [11]:
model_train = xgboost.train(
    params_train, xgb_train,
    num_boost_round=10000,
    evals=[(xgb_test, "test")],  # monitor the loss on the held-out set
    verbose_eval=1000            # report the evaluation metric every 1000 rounds
)
Create an explainer¶
In [14]:
# TreeExplainer computes exact SHAP values for tree ensembles;
# with survival:cox they are in margin units (log relative hazard)
explainer = shap.TreeExplainer(model_train)
shap_values = explainer.shap_values(X_test)
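As a quick sanity check (a sketch, assuming a single-output model), each row's SHAP values plus the explainer's expected value should reconstruct the model's raw margin prediction:

import numpy as np

margin = model_train.predict(xgb_test, output_margin=True)
# base value + sum of attributions should equal the margin for every row
print(np.allclose(shap_values.sum(axis=1) + explainer.expected_value,
                  margin, atol=1e-3))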
Compute SHAP interaction values¶
In [17]:
# Interaction values cost O(M^2) per sample, so restrict to the first 1000 rows
shap_interaction_values = explainer.shap_interaction_values(X_test.iloc[:1000, :])
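The result has shape (1000, M, M): one interaction matrix per row. Summing each matrix over its last axis should recover that row's SHAP values, which verifies the decomposition:

import numpy as np

print(np.allclose(shap_interaction_values.sum(axis=2),
                  shap_values[:1000, :], atol=1e-3))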
Interaction Values across variables¶
In [18]:
shap.summary_plot(shap_interaction_values, X_test.iloc[:1000, :])
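The summary plot is visual; to pull out the strongest pairwise interaction numerically, one option (a sketch using only numpy) is to rank the mean absolute off-diagonal entries:

import numpy as np

mean_abs = np.abs(shap_interaction_values).mean(axis=0)
np.fill_diagonal(mean_abs, 0)  # ignore the main effects on the diagonal
i, j = np.unravel_index(mean_abs.argmax(), mean_abs.shape)
print(f"strongest interaction: {X_test.columns[i]} x {X_test.columns[j]}")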
Interaction Value Dependence¶
In [19]:
shap.dependence_plot(
    ("Age", "Sex"),
    shap_interaction_values, X_test.iloc[:1000, :],
    display_features=X_test.iloc[:1000, :]
)
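Passing the pair ("Age", "Sex") plots the off-diagonal interaction entry. Passing the same feature twice plots the diagonal entry instead, i.e. the feature's main effect with all pairwise interactions removed:

# Main effect of Age, with interaction effects factored out
shap.dependence_plot(
    ("Age", "Age"),
    shap_interaction_values, X_test.iloc[:1000, :],
    display_features=X_test.iloc[:1000, :]
)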