{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Shapash in Jupyter - Overview\n",
"\n",
"This tutorial shows how Shapash works in a Jupyter Notebook through a simple use case.\n",
"\n",
"Contents:\n",
"- Build a Regressor\n",
"- Compile Shapash SmartExplainer\n",
"- Display global and local explainability\n",
"- Export local summarized explainability with the `to_pandas` method\n",
"- Save the Shapash object in a pickle file\n",
"\n",
"Data from Kaggle [House Prices](https://www.kaggle.com/c/house-prices-advanced-regression-techniques/data)"
]
},
{
"cell_type": "code",
"execution_count": 1,
"metadata": {},
"outputs": [],
"source": [
"import pandas as pd\n",
"from category_encoders import OrdinalEncoder\n",
"from lightgbm import LGBMRegressor\n",
"from sklearn.model_selection import train_test_split"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Building a Supervised Model"
]
},
{
"cell_type": "code",
"execution_count": 2,
"metadata": {},
"outputs": [],
"source": [
"from shapash.data.data_loader import data_loading\n",
"house_df, house_dict = data_loading('house_prices')"
]
},
{
"cell_type": "code",
"execution_count": 3,
"metadata": {},
"outputs": [],
"source": [
"y_df = house_df['SalePrice'].to_frame()\n",
"X_df = house_df[house_df.columns.difference(['SalePrice'])]"
]
},
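{
"cell_type": "markdown",
"metadata": {},
"source": [
"A note on the split above, as a minimal sketch on a hypothetical toy frame (not the Kaggle data): `Index.difference` returns its result sorted by default, so `X_df` ends up with alphabetically ordered columns, whereas `DataFrame.drop(columns=...)` keeps the original order. Either approach works here; the choice only affects column order."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import pandas as pd\n",
"\n",
"# Hypothetical toy frame, used only to illustrate column ordering\n",
"toy_df = pd.DataFrame({'b': [1], 'a': [2], 'SalePrice': [3]})\n",
"\n",
"# columns.difference sorts the remaining column names\n",
"print(list(toy_df[toy_df.columns.difference(['SalePrice'])].columns))\n",
"\n",
"# drop(columns=...) preserves the original column order\n",
"print(list(toy_df.drop(columns=['SalePrice']).columns))"
]
},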
{
"cell_type": "code",
"execution_count": 4,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"<table>\n",
"<thead>\n",
"<tr><th>Id</th><th>MSSubClass</th><th>MSZoning</th><th>LotArea</th><th>Street</th><th>LotShape</th><th>LandContour</th><th>Utilities</th><th>LotConfig</th><th>LandSlope</th><th>Neighborhood</th><th>...</th><th>EnclosedPorch</th><th>3SsnPorch</th><th>ScreenPorch</th><th>PoolArea</th><th>MiscVal</th><th>MoSold</th><th>YrSold</th><th>SaleType</th><th>SaleCondition</th><th>SalePrice</th></tr>\n",
"</thead>\n",
"<tbody>\n",
"<tr><th>1</th><td>2-Story 1946 &amp; Newer</td><td>Residential Low Density</td><td>8450</td><td>Paved</td><td>Regular</td><td>Near Flat/Level</td><td>All public Utilities (E,G,W,&amp; S)</td><td>Inside lot</td><td>Gentle slope</td><td>College Creek</td><td>...</td><td>0</td><td>0</td><td>0</td><td>0</td><td>0</td><td>2</td><td>2008</td><td>Warranty Deed - Conventional</td><td>Normal Sale</td><td>208500</td></tr>\n",
"<tr><th>2</th><td>1-Story 1946 &amp; Newer All Styles</td><td>Residential Low Density</td><td>9600</td><td>Paved</td><td>Regular</td><td>Near Flat/Level</td><td>All public Utilities (E,G,W,&amp; S)</td><td>Frontage on 2 sides of property</td><td>Gentle slope</td><td>Veenker</td><td>...</td><td>0</td><td>0</td><td>0</td><td>0</td><td>0</td><td>5</td><td>2007</td><td>Warranty Deed - Conventional</td><td>Normal Sale</td><td>181500</td></tr>\n",
"<tr><th>3</th><td>2-Story 1946 &amp; Newer</td><td>Residential Low Density</td><td>11250</td><td>Paved</td><td>Slightly irregular</td><td>Near Flat/Level</td><td>All public Utilities (E,G,W,&amp; S)</td><td>Inside lot</td><td>Gentle slope</td><td>College Creek</td><td>...</td><td>0</td><td>0</td><td>0</td><td>0</td><td>0</td><td>9</td><td>2008</td><td>Warranty Deed - Conventional</td><td>Normal Sale</td><td>223500</td></tr>\n",
"<tr><th>4</th><td>2-Story 1945 &amp; Older</td><td>Residential Low Density</td><td>9550</td><td>Paved</td><td>Slightly irregular</td><td>Near Flat/Level</td><td>All public Utilities (E,G,W,&amp; S)</td><td>Corner lot</td><td>Gentle slope</td><td>Crawford</td><td>...</td><td>272</td><td>0</td><td>0</td><td>0</td><td>0</td><td>2</td><td>2006</td><td>Warranty Deed - Conventional</td><td>Abnormal Sale</td><td>140000</td></tr>\n",
"<tr><th>5</th><td>2-Story 1946 &amp; Newer</td><td>Residential Low Density</td><td>14260</td><td>Paved</td><td>Slightly irregular</td><td>Near Flat/Level</td><td>All public Utilities (E,G,W,&amp; S)</td><td>Frontage on 2 sides of property</td><td>Gentle slope</td><td>Northridge</td><td>...</td><td>0</td><td>0</td><td>0</td><td>0</td><td>0</td><td>12</td><td>2008</td><td>Warranty Deed - Conventional</td><td>Normal Sale</td><td>250000</td></tr>\n",
"</tbody>\n",
"</table>\n",
"<p>5 rows × 73 columns</p>\n",
"<table>\n",
"<thead>\n",
"<tr><th></th><th>pred</th><th>feature_1</th><th>value_1</th><th>contribution_1</th><th>feature_2</th><th>value_2</th><th>contribution_2</th><th>feature_3</th><th>value_3</th><th>contribution_3</th></tr>\n",
"</thead>\n",
"<tbody>\n",
"<tr><th>259</th><td>209141.256921</td><td>Ground living area square feet</td><td>1792</td><td>13710.4</td><td>Overall material and finish of the house</td><td>7</td><td>12776.3</td><td>Total square feet of basement area</td><td>963</td><td>-5103.03</td></tr>\n",
"<tr><th>268</th><td>178734.474531</td><td>Ground living area square feet</td><td>2192</td><td>29747</td><td>Overall material and finish of the house</td><td>5</td><td>-26151.3</td><td>Overall condition of the house</td><td>8</td><td>9190.84</td></tr>\n",
"<tr><th>289</th><td>113950.844570</td><td>Overall material and finish of the house</td><td>5</td><td>-24730</td><td>Ground living area square feet</td><td>900</td><td>-16342.6</td><td>Total square feet of basement area</td><td>882</td><td>-5922.64</td></tr>\n",
"<tr><th>650</th><td>74957.162142</td><td>Overall material and finish of the house</td><td>4</td><td>-33927.7</td><td>Ground living area square feet</td><td>630</td><td>-23234.4</td><td>Total square feet of basement area</td><td>630</td><td>-11687.9</td></tr>\n",
"<tr><th>1234</th><td>135305.243500</td><td>Overall material and finish of the house</td><td>5</td><td>-25445.7</td><td>Ground living area square feet</td><td>1188</td><td>-11476.6</td><td>Condition of sale</td><td>Abnormal Sale</td><td>-5071.82</td></tr>\n",
"</tbody>\n",
"</table>\n",