Interactions plot

Most explainability plots only allow the user to analyze one variable at a time.

Interactions plots are a useful way to visualize a pair of variables and their combined contribution to the model output.

Shapash provides two methods for displaying these interactions across multiple individuals: interactions_plot and top_interactions_plot.

This tutorial shows how to use both methods to gain more insight into your model and into how pairs of variables interact within it.

Content:
- Loading the dataset and fitting a model
- Declaring and compiling the Shapash SmartExplainer
- Plotting top interaction values
- Plotting a chosen pair of variables

We use Kaggle's Titanic dataset.

[1]:
import pandas as pd
from category_encoders import OrdinalEncoder
from xgboost import XGBClassifier
from sklearn.model_selection import train_test_split

Building a Supervised Model

Load Titanic data

[3]:
from shapash.data.data_loader import data_loading
titanic_df, titanic_dict = data_loading('titanic')
del titanic_df['Name']
y_df = titanic_df['Survived']
X_df = titanic_df[titanic_df.columns.difference(['Survived'])]
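
As an aside, the titanic_dict returned by data_loading maps raw column names to human-readable labels; it is what we pass to Shapash later as features_dict. A quick way to peek at it (optional, output not shown):

[ ]:
# Inspect a few entries of the column-name -> label mapping
print(list(titanic_dict.items())[:3])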
[4]:
titanic_df.head()
[4]:
Survived Pclass Sex Age SibSp Parch Fare Embarked Title
PassengerId
1 0 Third class male 22.0 1 0 7.25 Southampton Mr
2 1 First class female 38.0 1 0 71.28 Cherbourg Mrs
3 1 Third class female 26.0 0 0 7.92 Southampton Miss
4 1 First class female 35.0 1 0 53.10 Southampton Mrs
5 0 Third class male 35.0 0 0 8.05 Southampton Mr
[5]:
categorical_features = [col for col in X_df.columns if X_df[col].dtype == 'object']

# Encode categorical columns as integers; keeping the fitted encoder lets
# Shapash map encoded values back to the original labels later
encoder = OrdinalEncoder(
    cols=categorical_features,
    handle_unknown='ignore',
    return_df=True).fit(X_df)

X_df = encoder.transform(X_df)
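
Optionally, you can verify that the encoding round-trips cleanly, since Shapash relies on the encoder's inverse_transform during the compile step to recover the original category labels (a quick sanity check; output not shown):

[ ]:
# Recover the original categories from the encoded data
decoded = encoder.inverse_transform(X_df)
print(decoded.head())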

Train / Test Split + model fitting

[6]:
Xtrain, Xtest, ytrain, ytest = train_test_split(X_df, y_df, train_size=0.75, random_state=7)
[7]:
clf = XGBClassifier(n_estimators=200, min_child_weight=2).fit(Xtrain, ytrain)
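
Before explaining the model, a quick look at its test-set accuracy can be worthwhile (an optional check; output not shown):

[ ]:
from sklearn.metrics import accuracy_score
# Evaluate the classifier on the held-out test set
print(accuracy_score(ytest, clf.predict(Xtest)))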

Declare and Compile SmartExplainer

[8]:
from shapash import SmartExplainer
[9]:
response_dict = {0: 'Death', 1: 'Survival'}
[10]:
xpl = SmartExplainer(
    model=clf,
    preprocessing=encoder,      # Optional: compile step can use inverse_transform method
    features_dict=titanic_dict, # Optional parameters
    label_dict=response_dict    # Optional parameters, dicts specify labels
)
[11]:
xpl.compile(x=Xtest)
Backend: Shap TreeExplainer
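
compile also accepts optional arguments. For instance, passing y_pred lets Shapash display the model's predictions alongside the explanations; a minimal sketch, assuming the optional y_pred parameter of compile:

[ ]:
# Optional: supply predictions so they appear alongside the explanations
y_pred = pd.Series(clf.predict(Xtest), index=Xtest.index)
xpl.compile(x=Xtest, y_pred=y_pred)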

Plot top interactions

Now we may want to analyze our model, and in particular how certain combinations of variables influence its output.

Shapash lets you quickly inspect your model by showing the variable pairs most likely to exhibit interesting interactions.

To do so, use the following method (use the button to browse the different variable interactions):

[12]:
xpl.plot.top_interactions_plot(nb_top_interactions=5)
../../_images/tutorials_plots_and_charts_tuto-plot05-interactions-plot_16_0.png

Plot interactions between two selected variables

To display the interactions between a particular pair of variables, use the following method with the chosen features (here 'Sex' and 'Pclass'):

[13]:
xpl.plot.interactions_plot('Sex', 'Pclass')
../../_images/tutorials_plots_and_charts_tuto-plot05-interactions-plot_19_0.png
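
The same method works for any pair of columns in X_df. For example, to inspect how passenger class interacts with fare (image not shown here):

[ ]:
xpl.plot.interactions_plot('Pclass', 'Fare')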

As a quick analysis, we can see on the plot that the model learned the following points:

- Female passengers have:
  - The highest chance of surviving when belonging to first or second class
  - A lower chance of surviving when belonging to third class
- On the contrary, male passengers have:
  - The highest chance of surviving when belonging to third class
  - A lower chance of surviving when belonging to first or second class
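
Keep in mind that interaction values capture the joint effect of two variables beyond their individual contributions, so they need not mirror raw survival rates. If you still want to compare against the raw data, a simple groupby gives the empirical rates (a rough cross-check, not a SHAP analysis; output not shown):

[ ]:
# Empirical survival rate per (Sex, Pclass) group in the raw data
print(titanic_df.groupby(['Sex', 'Pclass'])['Survived'].mean())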