Interactions plot

Most explainability plots only allow the user to analyze one variable at a time.

Interactions plots are a useful way to visualize a pair of variables and their combined contribution to the model output.

Shapash provides two methods for displaying these interactions across multiple individuals: interactions_plot and top_interactions_plot.

This tutorial shows how to use both methods to gain more insight into your model and into how pairs of variables interact within it.

Content:
- Loading the dataset and fitting a model
- Declaring and compiling the Shapash SmartExplainer
- Plotting top interaction values
- Plotting a chosen pair of variables

We use Kaggle's Titanic dataset.

[1]:
import pandas as pd
from category_encoders import OrdinalEncoder
from xgboost import XGBClassifier
from sklearn.model_selection import train_test_split

Building a Supervised Model

Load Titanic data

[3]:
from shapash.data.data_loader import data_loading
titanic_df, titanic_dict = data_loading('titanic')
del titanic_df['Name']
y_df = titanic_df['Survived']
X_df = titanic_df[titanic_df.columns.difference(['Survived'])]
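
As an aside, the titanic_dict returned by data_loading maps raw column names to human-readable labels; it is what we pass to Shapash later as features_dict. A quick way to peek at it (optional, output not shown):

[ ]:
# Inspect a few entries of the column-name -> label mapping
print(list(titanic_dict.items())[:3])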
[4]:
titanic_df.head()
[4]:
Survived Pclass Sex Age SibSp Parch Fare Embarked Title
PassengerId
1 0 Third class male 22.0 1 0 7.25 Southampton Mr
2 1 First class female 38.0 1 0 71.28 Cherbourg Mrs
3 1 Third class female 26.0 0 0 7.92 Southampton Miss
4 1 First class female 35.0 1 0 53.10 Southampton Mrs
5 0 Third class male 35.0 0 0 8.05 Southampton Mr
[5]:
categorical_features = [col for col in X_df.columns if X_df[col].dtype == 'object']

# Encode categorical columns as integers; keeping the fitted encoder lets
# Shapash map encoded values back to the original labels later
encoder = OrdinalEncoder(
    cols=categorical_features,
    handle_unknown='ignore',
    return_df=True).fit(X_df)

X_df = encoder.transform(X_df)
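
Optionally, you can verify that the encoding round-trips cleanly, since Shapash relies on the encoder's inverse_transform during the compile step to recover the original category labels (a quick sanity check; output not shown):

[ ]:
# Recover the original categories from the encoded data
decoded = encoder.inverse_transform(X_df)
print(decoded.head())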

Train / Test Split + model fitting

[6]:
Xtrain, Xtest, ytrain, ytest = train_test_split(X_df, y_df, train_size=0.75, random_state=7)
[7]:
clf = XGBClassifier(n_estimators=200, min_child_weight=2).fit(Xtrain, ytrain)
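
Before explaining the model, a quick look at its test-set accuracy can be worthwhile (an optional check; output not shown):

[ ]:
from sklearn.metrics import accuracy_score
# Evaluate the classifier on the held-out test set
print(accuracy_score(ytest, clf.predict(Xtest)))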

Declare and Compile SmartExplainer

[8]:
from shapash import SmartExplainer
[9]:
response_dict = {0: 'Death', 1: 'Survival'}
[10]:
xpl = SmartExplainer(
    model=clf,
    preprocessing=encoder,      # Optional: compile step can use inverse_transform method
    features_dict=titanic_dict, # Optional parameters
    label_dict=response_dict    # Optional parameters, dicts specify labels
)
[11]:
xpl.compile(x=Xtest)
Backend: Shap TreeExplainer
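
compile also accepts optional arguments. For instance, passing y_pred lets Shapash display the model's predictions alongside the explanations; a minimal sketch, assuming the optional y_pred parameter of compile:

[ ]:
# Optional: supply predictions so they appear alongside the explanations
y_pred = pd.Series(clf.predict(Xtest), index=Xtest.index)
xpl.compile(x=Xtest, y_pred=y_pred)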

Plot top interactions

Now we may want to analyze our model, and in particular how certain combinations of variables influence its output.

Shapash lets you quickly inspect your model by showing the variable pairs most likely to exhibit interesting interactions.

To do so, use the following method (use the button to browse the different variable interactions):

[12]:
xpl.plot.top_interactions_plot(nb_top_interactions=5)
../../_images/tutorials_plots_and_charts_tuto-plot05-interactions-plot_16_0.png

Plot interactions between two selected variables

To display the interactions between a particular pair of variables, use the following method with the chosen features (here 'Sex' and 'Pclass'):

[13]:
xpl.plot.interactions_plot('Sex', 'Pclass')
../../_images/tutorials_plots_and_charts_tuto-plot05-interactions-plot_19_0.png
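
The same method works for any pair of columns in X_df. For example, to inspect how passenger class interacts with fare (image not shown here):

[ ]:
xpl.plot.interactions_plot('Pclass', 'Fare')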

As a quick analysis, we can see on the plot that the model learned the following points:

- Female passengers have:
  - The highest chance of surviving when belonging to first or second class
  - A lower chance of surviving when belonging to third class
- On the contrary, male passengers have:
  - The highest chance of surviving when belonging to third class
  - A lower chance of surviving when belonging to first or second class
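
Keep in mind that interaction values capture the joint effect of two variables beyond their individual contributions, so they need not mirror raw survival rates. If you still want to compare against the raw data, a simple groupby gives the empirical rates (a rough cross-check, not a SHAP analysis; output not shown):

[ ]:
# Empirical survival rate per (Sex, Pclass) group in the raw data
print(titanic_df.groupby(['Sex', 'Pclass'])['Survived'].mean())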