{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Compute Contributions with Shap - Summarize Them With Shapash\n",
"\n",
"Shapash uses Shap backend to compute the Shapley contributions
\n",
"in order to satisfy the most hurry users who wish to display
\n",
"results with little lines of code.\n",
"\n",
"But we recommend you to refer to the excellent [Shap library](https://github.com/slundberg/shap).\n",
"\n",
"This tutorial shows how to use precalculated contributions with Shap in Shapash \n",
"\n",
"Contents:\n",
"- Build a Binary Classifier\n",
"- Use Shap KernelExplainer\n",
"- Compile Shapash SmartExplainer\n",
"- Display local_plot\n",
"- to_pandas export\n",
"\n",
"We used Kaggle's [Titanic](https://www.kaggle.com/c/titanic) dataset"
]
},
{
"cell_type": "code",
"execution_count": 1,
"metadata": {},
"outputs": [],
"source": [
"import pandas as pd\n",
"from category_encoders import OrdinalEncoder\n",
"from sklearn.ensemble import RandomForestClassifier\n",
"from sklearn.model_selection import train_test_split\n",
"import shap"
]
},
{
"cell_type": "code",
"execution_count": 2,
"metadata": {},
"outputs": [],
"source": [
"from shapash.data.data_loader import data_loading"
]
},
{
"cell_type": "code",
"execution_count": 3,
"metadata": {},
"outputs": [],
"source": [
"titan_df, titan_dict = data_loading('titanic')\n",
"del titan_df['Name']"
]
},
{
"cell_type": "code",
"execution_count": 4,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"
\n", " | Survived | \n", "Pclass | \n", "Sex | \n", "Age | \n", "SibSp | \n", "Parch | \n", "Fare | \n", "Embarked | \n", "Title | \n", "
---|---|---|---|---|---|---|---|---|---|
PassengerId | \n", "\n", " | \n", " | \n", " | \n", " | \n", " | \n", " | \n", " | \n", " | \n", " |
1 | \n", "0 | \n", "Third class | \n", "male | \n", "22.0 | \n", "1 | \n", "0 | \n", "7.25 | \n", "Southampton | \n", "Mr | \n", "
2 | \n", "1 | \n", "First class | \n", "female | \n", "38.0 | \n", "1 | \n", "0 | \n", "71.28 | \n", "Cherbourg | \n", "Mrs | \n", "
3 | \n", "1 | \n", "Third class | \n", "female | \n", "26.0 | \n", "0 | \n", "0 | \n", "7.92 | \n", "Southampton | \n", "Miss | \n", "
4 | \n", "1 | \n", "First class | \n", "female | \n", "35.0 | \n", "1 | \n", "0 | \n", "53.10 | \n", "Southampton | \n", "Mrs | \n", "
5 | \n", "0 | \n", "Third class | \n", "male | \n", "35.0 | \n", "0 | \n", "0 | \n", "8.05 | \n", "Southampton | \n", "Mr | \n", "
RandomForestClassifier(min_samples_leaf=3)