{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Shapash model in production - Overview\n", "\n", "With this tutorial you:
\n", "Understand how to create a Shapash SmartPredictor to make predictions and get local explanations in production\n", "with a simple use case.
\n", "\n", "This tutorial describes the different steps from training the model to Shapash SmartPredictor deployment.\n", "A more detailed tutorial covers the SmartPredictor object in greater depth.\n", "\n", "Contents:\n", "- Build a Regressor\n", "- Compile Shapash SmartExplainer\n", "- From Shapash SmartExplainer to SmartPredictor\n", "- Save Shapash SmartPredictor Object in pickle file\n", "- Make a prediction\n", "\n", "Data from Kaggle [House Prices](https://www.kaggle.com/c/house-prices-advanced-regression-techniques/data)" ] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [], "source": [ "import pandas as pd\n", "from category_encoders import OrdinalEncoder\n", "from lightgbm import LGBMRegressor\n", "from sklearn.model_selection import train_test_split" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Step 1 : Exploration and training of the model" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Building a Supervised Model " ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "In this section, we train a supervised machine learning model on the House Prices data." 
] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [], "source": [ "from shapash.data.data_loader import data_loading\n", "house_df, house_dict = data_loading('house_prices')" ] }, { "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [], "source": [ "y_df=house_df['SalePrice'].to_frame()\n", "X_df=house_df[house_df.columns.difference(['SalePrice'])]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### Preprocessing step " ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Encoding Categorical Features" ] }, { "cell_type": "code", "execution_count": 4, "metadata": {}, "outputs": [], "source": [ "from category_encoders import OrdinalEncoder\n", "\n", "categorical_features = [col for col in X_df.columns if X_df[col].dtype == 'object']\n", "\n", "encoder = OrdinalEncoder(cols=categorical_features,\n", " handle_unknown='ignore',\n", " return_df=True).fit(X_df)\n", "\n", "X_encoded=encoder.transform(X_df)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### Train / Test Split" ] }, { "cell_type": "code", "execution_count": 5, "metadata": {}, "outputs": [], "source": [ "Xtrain, Xtest, ytrain, ytest = train_test_split(X_encoded, y_df, train_size=0.75, random_state=1)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### Model Fitting" ] }, { "cell_type": "code", "execution_count": 6, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "[LightGBM] [Info] Auto-choosing row-wise multi-threading, the overhead of testing was 0.002116 seconds.\n", "You can set `force_row_wise=true` to remove the overhead.\n", "And if memory is not enough, you can set `force_col_wise=true`.\n", "[LightGBM] [Info] Total Bins 2986\n", "[LightGBM] [Info] Number of data points in the train set: 1095, number of used features: 66\n", "[LightGBM] [Info] Start training from score 182319.757078\n" ] } ], "source": [ "regressor = LGBMRegressor(n_estimators=200).fit(Xtrain, 
ytrain)" ] }, { "cell_type": "code", "execution_count": 7, "metadata": {}, "outputs": [], "source": [ "y_pred = pd.DataFrame(regressor.predict(Xtest), columns=['pred'], index=Xtest.index)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Understand my model with shapash" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "In this section, we use the SmartExplainer Object from shapash.\n", "- It allows users to understand how the model works with the specified data. \n", "- This object should be used only for the data mining step. Shapash provides another object for deployment.\n", "- In this tutorial, we do not explore all the possibilities of the SmartExplainer; other tutorials cover them." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### Declare and Compile SmartExplainer " ] }, { "cell_type": "code", "execution_count": 8, "metadata": {}, "outputs": [], "source": [ "from shapash import SmartExplainer" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### Use wording on feature names to better understand results" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Here, we use wording to rename our feature labels with more understandable terms. This is useful to make local explainability more operational and understandable for users.\n", "- To do this, we use the house_dict dictionary, which maps a description to each feature.\n", "- We can then pass it as the features_dict parameter of the SmartExplainer." ] }, { "cell_type": "code", "execution_count": 9, "metadata": {}, "outputs": [], "source": [ "xpl = SmartExplainer(\n", " model=regressor,\n", " preprocessing=encoder, # Optional: compile step can use inverse_transform method\n", " features_dict=house_dict\n", ")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**compile()
** This method is the first step toward understanding the model and its predictions.
It performs the sorting\n", "of contributions, the reverse preprocessing steps and all the calculations necessary for\n", "a quick display of plots and an efficient summary of the explanation. (See the SmartExplainer documentation and tutorials.)" ] }, { "cell_type": "code", "execution_count": 10, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "INFO: Shap explainer type - \n" ] } ], "source": [ "xpl.compile(x=Xtest,\n", " y_pred=y_pred,\n", " y_target=ytest, # Optional: allows displaying True Values vs Predicted Values\n", " )" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### Understand results of your trained model" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Then, we can easily get a first summary of the explanation of the model results.\n", "- Here, we chose to get the 3 features that contribute most to each prediction.\n", "- We used wording to make feature names more understandable in an operational context." ] }, { "cell_type": "code", "execution_count": 11, "metadata": {}, "outputs": [ { "data": { "application/vnd.microsoft.datawrangler.viewer.v0+json": { "columns": [ { "name": "index", "rawType": "int64", "type": "integer" }, { "name": "pred", "rawType": "float64", "type": "float" }, { "name": "feature_1", "rawType": "object", "type": "string" }, { "name": "value_1", "rawType": "object", "type": "unknown" }, { "name": "contribution_1", "rawType": "object", "type": "unknown" }, { "name": "feature_2", "rawType": "object", "type": "string" }, { "name": "value_2", "rawType": "object", "type": "unknown" }, { "name": "contribution_2", "rawType": "object", "type": "unknown" }, { "name": "feature_3", "rawType": "object", "type": "string" }, { "name": "value_3", "rawType": "object", "type": "unknown" }, { "name": "contribution_3", "rawType": "object", "type": "unknown" } ], "ref": "6e37ae85-7725-4a4e-ae5d-597f0e89bd6d", "rows": [ [ "259", "211538.7421568184", "Ground living area square feet", "1792", 
"13995.651927455732", "Overall material and finish of the house", "7", "13539.4413532986", "Total square feet of basement area", "963", "-5652.2068542521465" ], [ "268", "178786.67725671502", "Ground living area square feet", "2192", "27967.96627840816", "Overall material and finish of the house", "5", "-26133.987558629702", "Overall condition of the house", "8", "7799.924797678616" ], [ "289", "111985.32465961165", "Overall material and finish of the house", "5", "-25571.34831492946", "Ground living area square feet", "900", "-16006.763921397876", "Total square feet of basement area", "882", "-5456.989325357975" ], [ "650", "73456.52251534877", "Overall material and finish of the house", "4", "-34517.0736758386", "Ground living area square feet", "630", "-21350.707866325673", "Total square feet of basement area", "630", "-12699.371235500535" ], [ "1234", "136249.55731584632", "Overall material and finish of the house", "5", "-26469.23540518885", "Ground living area square feet", "1188", "-10980.550285312522", "Condition of sale", "Abnormal Sale", "-5240.009373119406" ] ], "shape": { "columns": 10, "rows": 5 } }, "text/html": [ "
" ], "text/plain": [ " pred feature_1 value_1 \\\n", "259 211538.742157 Ground living area square feet 1792 \n", "268 178786.677257 Ground living area square feet 2192 \n", "289 111985.324660 Overall material and finish of the house 5 \n", "650 73456.522515 Overall material and finish of the house 4 \n", "1234 136249.557316 Overall material and finish of the house 5 \n", "\n", " contribution_1 feature_2 value_2 \\\n", "259 13995.651927 Overall material and finish of the house 7 \n", "268 27967.966278 Overall material and finish of the house 5 \n", "289 -25571.348315 Ground living area square feet 900 \n", "650 -34517.073676 Ground living area square feet 630 \n", "1234 -26469.235405 Ground living area square feet 1188 \n", "\n", " contribution_2 feature_3 value_3 \\\n", "259 13539.441353 Total square feet of basement area 963 \n", "268 -26133.987559 Overall condition of the house 8 \n", "289 -16006.763921 Total square feet of basement area 882 \n", "650 -21350.707866 Total square feet of basement area 630 \n", "1234 -10980.550285 Condition of sale Abnormal Sale \n", "\n", " contribution_3 \n", "259 -5652.206854 \n", "268 7799.924798 \n", "289 -5456.989325 \n", "650 -12699.371236 \n", "1234 -5240.009373 " ] }, "execution_count": 11, "metadata": {}, "output_type": "execute_result" } ], "source": [ "xpl.to_pandas(max_contrib=3).head()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Step 2 : SmartPredictor in production" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Switch from SmartExplainer to SmartPredictor" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "When you are satisfied with your results and the explainability provided by Shapash, you can use the SmartPredictor object for deployment. 
\n", "- In this section, we learn how to easily switch from a SmartExplainer to a SmartPredictor.\n", "- SmartPredictor allows you to make predictions, detail and summarize contributions on new data automatically.\n", "- It only keeps the attributes needed for deployment to be lighter than the SmartExplainer object. \n", "- SmartPredictor performs additional consistency checks before deployment.\n", "- SmartPredictor allows you to configure how explanations are summarized to suit your use cases.\n", "- It can be used through an API or in batch mode." ] }, { "cell_type": "code", "execution_count": 12, "metadata": {}, "outputs": [], "source": [ "predictor = xpl.to_smartpredictor()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### Save and Load your SmartPredictor" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "You can easily save your SmartPredictor object to a pickle file and load it back." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### Save your SmartPredictor to a Pickle File" ] }, { "cell_type": "code", "execution_count": 13, "metadata": {}, "outputs": [], "source": [ "predictor.save('./predictor.pkl')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### Load your SmartPredictor from a Pickle File" ] }, { "cell_type": "code", "execution_count": 14, "metadata": {}, "outputs": [], "source": [ "from shapash.utils.load_smartpredictor import load_smartpredictor" ] }, { "cell_type": "code", "execution_count": 15, "metadata": {}, "outputs": [], "source": [ "predictor_load = load_smartpredictor('./predictor.pkl')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Make a prediction with your SmartPredictor" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "To make new predictions and summarize the local explainability of your model on new datasets, you can use the add_input method of the SmartPredictor.\n", "- The add_input method is the first step to add a dataset for prediction and explainability.\n", "- It checks the 
structure of the dataset, the prediction and the contributions, if specified. \n", "- It applies the preprocessing specified at initialisation and reorders the features to match the order used by the model. (see the documentation of this method)\n", "- In API mode, this method can handle data passed as dictionaries, such as those received from a GET or a POST request." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### Add data" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The x input of the add_input method doesn't have to be encoded: add_input applies the preprocessing." ] }, { "cell_type": "code", "execution_count": 16, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "INFO: Shap explainer type - \n" ] } ], "source": [ "predictor_load.add_input(x=X_df, ypred=y_df)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### Make prediction" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Then, we can check that ypred is the one given to the add_input method by inspecting the attribute data[\"ypred\"]. If ypred is not specified, it is computed automatically by the method. " ] }, { "cell_type": "code", "execution_count": 17, "metadata": {}, "outputs": [ { "data": { "application/vnd.microsoft.datawrangler.viewer.v0+json": { "columns": [ { "name": "Id", "rawType": "int64", "type": "integer" }, { "name": "SalePrice", "rawType": "int64", "type": "integer" } ], "ref": "18ab9cd5-904c-4a57-8d39-710d5f4d3b55", "rows": [ [ "1", "208500" ], [ "2", "181500" ], [ "3", "223500" ], [ "4", "140000" ], [ "5", "250000" ] ], "shape": { "columns": 1, "rows": 5 } }, "text/html": [ "
" ], "text/plain": [ " SalePrice\n", "Id \n", "1 208500\n", "2 181500\n", "3 223500\n", "4 140000\n", "5 250000" ] }, "execution_count": 17, "metadata": {}, "output_type": "execute_result" } ], "source": [ "predictor_load.data[\"ypred\"].head()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### Get detailed explainability associated with the prediction" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "You can use the method detail_contributions to see the detailed contributions of each of your features for each row of your new dataset.\n", "- For classification problems, it automatically associates contributions with the right predicted label. \n", "- The predicted label can be computed automatically by the method, or you can specify a ypred with the add_input method." ] }, { "cell_type": "code", "execution_count": 18, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "INFO: Shap explainer type - \n" ] } ], "source": [ "detailed_contributions = predictor_load.detail_contributions()" ] }, { "cell_type": "code", "execution_count": 19, "metadata": {}, "outputs": [ { "data": { "application/vnd.microsoft.datawrangler.viewer.v0+json": { "columns": [ { "name": "Id", "rawType": "int64", "type": "integer" }, { "name": "SalePrice", "rawType": "int64", "type": "integer" }, { "name": "1stFlrSF", "rawType": "float64", "type": "float" }, { "name": "2ndFlrSF", "rawType": "float64", "type": "float" }, { "name": "3SsnPorch", "rawType": "float64", "type": "float" }, { "name": "BedroomAbvGr", "rawType": "float64", "type": "float" }, { "name": "BldgType", "rawType": "float64", "type": "float" }, { "name": "BsmtCond", "rawType": "float64", "type": "float" }, { "name": "BsmtExposure", "rawType": "float64", "type": "float" }, { "name": "BsmtFinSF1", "rawType": "float64", "type": "float" }, { "name": "BsmtFinSF2", "rawType": "float64", "type": "float" }, { "name": "BsmtFinType1", "rawType": "float64", "type": "float" }, { "name": 
"BsmtFinType2", "rawType": "float64", "type": "float" }, { "name": "BsmtFullBath", "rawType": "float64", "type": "float" }, { "name": "BsmtHalfBath", "rawType": "float64", "type": "float" }, { "name": "BsmtQual", "rawType": "float64", "type": "float" }, { "name": "BsmtUnfSF", "rawType": "float64", "type": "float" }, { "name": "CentralAir", "rawType": "float64", "type": "float" }, { "name": "Condition1", "rawType": "float64", "type": "float" }, { "name": "Condition2", "rawType": "float64", "type": "float" }, { "name": "Electrical", "rawType": "float64", "type": "float" }, { "name": "EnclosedPorch", "rawType": "float64", "type": "float" }, { "name": "ExterCond", "rawType": "float64", "type": "float" }, { "name": "ExterQual", "rawType": "float64", "type": "float" }, { "name": "Exterior1st", "rawType": "float64", "type": "float" }, { "name": "Exterior2nd", "rawType": "float64", "type": "float" }, { "name": "Fireplaces", "rawType": "float64", "type": "float" }, { "name": "Foundation", "rawType": "float64", "type": "float" }, { "name": "FullBath", "rawType": "float64", "type": "float" }, { "name": "Functional", "rawType": "float64", "type": "float" }, { "name": "GarageArea", "rawType": "float64", "type": "float" }, { "name": "GarageCond", "rawType": "float64", "type": "float" }, { "name": "GarageFinish", "rawType": "float64", "type": "float" }, { "name": "GarageQual", "rawType": "float64", "type": "float" }, { "name": "GarageType", "rawType": "float64", "type": "float" }, { "name": "GarageYrBlt", "rawType": "float64", "type": "float" }, { "name": "GrLivArea", "rawType": "float64", "type": "float" }, { "name": "HalfBath", "rawType": "float64", "type": "float" }, { "name": "Heating", "rawType": "float64", "type": "float" }, { "name": "HeatingQC", "rawType": "float64", "type": "float" }, { "name": "HouseStyle", "rawType": "float64", "type": "float" }, { "name": "KitchenAbvGr", "rawType": "float64", "type": "float" }, { "name": "KitchenQual", "rawType": "float64", "type": 
"float" }, { "name": "LandContour", "rawType": "float64", "type": "float" }, { "name": "LandSlope", "rawType": "float64", "type": "float" }, { "name": "LotArea", "rawType": "float64", "type": "float" }, { "name": "LotConfig", "rawType": "float64", "type": "float" }, { "name": "LotShape", "rawType": "float64", "type": "float" }, { "name": "LowQualFinSF", "rawType": "float64", "type": "float" }, { "name": "MSSubClass", "rawType": "float64", "type": "float" }, { "name": "MSZoning", "rawType": "float64", "type": "float" }, { "name": "MasVnrArea", "rawType": "float64", "type": "float" }, { "name": "MasVnrType", "rawType": "float64", "type": "float" }, { "name": "MiscVal", "rawType": "float64", "type": "float" }, { "name": "MoSold", "rawType": "float64", "type": "float" }, { "name": "Neighborhood", "rawType": "float64", "type": "float" }, { "name": "OpenPorchSF", "rawType": "float64", "type": "float" }, { "name": "OverallCond", "rawType": "float64", "type": "float" }, { "name": "OverallQual", "rawType": "float64", "type": "float" }, { "name": "PavedDrive", "rawType": "float64", "type": "float" }, { "name": "PoolArea", "rawType": "float64", "type": "float" }, { "name": "RoofMatl", "rawType": "float64", "type": "float" }, { "name": "RoofStyle", "rawType": "float64", "type": "float" }, { "name": "SaleCondition", "rawType": "float64", "type": "float" }, { "name": "SaleType", "rawType": "float64", "type": "float" }, { "name": "ScreenPorch", "rawType": "float64", "type": "float" }, { "name": "Street", "rawType": "float64", "type": "float" }, { "name": "TotRmsAbvGrd", "rawType": "float64", "type": "float" }, { "name": "TotalBsmtSF", "rawType": "float64", "type": "float" }, { "name": "Utilities", "rawType": "float64", "type": "float" }, { "name": "WoodDeckSF", "rawType": "float64", "type": "float" }, { "name": "YearBuilt", "rawType": "float64", "type": "float" }, { "name": "YearRemodAdd", "rawType": "float64", "type": "float" }, { "name": "YrSold", "rawType": "float64", "type": 
"float" } ], "ref": "e7b446e6-2965-4edd-bd09-a56b05983cdc", "rows": [ [ "1", "208500", "-864.3026659940512", "1089.4290097002256", "0.0", "337.5211663230402", "-1.949170382996597", "156.11146918254568", "-361.26238867270416", "605.4995026166981", "-62.668440413082514", "1984.828756317101", "-41.87191392870519", "1990.8962387696397", "-27.086896993715374", "-226.3454719925297", "1692.9479665063932", "114.1567072126657", "428.8721362712976", "0.0", "-80.69057447920923", "39.85508813252803", "21.036903532226745", "551.0343800650712", "794.5610197366157", "637.0358656485056", "-709.724191485144", "1059.1897263739058", "-220.9679051180128", "293.0509088279874", "3052.5047633325353", "4.125024876858291", "-148.69484626232196", "215.63307630308492", "869.5363419493488", "-453.2973028466613", "2269.7615034289643", "209.9725793225271", "-17.568215035146185", "49.9980482281087", "186.06378863332415", "82.51807616973737", "-1114.2945761635829", "174.96720068992707", "-31.730156428546405", "-292.49010715619903", "-34.98346771960558", "106.00359423382098", "0.0", "2191.6266960272774", "388.2694298567851", "59.670954039163824", "116.41049296934622", "-2.2274476865793833", "-475.2161218894182", "541.1655124649254", "975.0493471847424", "-1215.8028095881073", "6920.473837235947", "50.86094616396314", "0.0", "0.0", "-113.93639750857912", "320.4016799287954", "-121.38644827319511", "-340.8928055547724", "0.0", "-353.0017433937678", "-4739.814199902538", "0.0", "-595.9655099580244", "3880.341977513828", "2553.0541726345436", "-181.63161914344218" ], [ "2", "181500", "3350.8449326960717", "-584.3690970808641", "0.0", "205.51638410771704", "-5.940831374481436", "123.97569987808198", "3533.45329452523", "4220.333951722205", "-73.55810960243943", "918.062823986119", "-39.59741857339008", "-783.4678938705949", "408.04043341620894", "313.2291777970262", "332.76817074869996", "239.35357228708835", "-1998.207862212061", "0.0", "-103.3211346594825", "-3.625026509541008", "32.5428143240956", 
"-999.004089446092", "172.4883942679527", "81.17120371843218", "3853.403738689955", "-597.7475404444759", "441.36984524368336", "379.93599634282185", "-635.0457193613751", "9.69689548598594", "612.8761939825548", "209.55686713884168", "879.9895469832468", "-710.52146666323", "-9032.425475725368", "-216.11043404348774", "-17.120976331602666", "-117.03045661573194", "-135.32447173368377", "112.11302117668802", "-1124.9779178975286", "138.39005750865107", "-33.945592010224", "-477.82138063794815", "-660.683473261485", "-384.9053653618811", "0.0", "727.0932985727308", "450.4363825052229", "-416.4638032772958", "-71.92163190319326", "-5.863356313638359", "462.34253818549456", "1315.754367226545", "-232.2972670244968", "4246.185116148694", "-12277.905073344118", "55.925058441407785", "0.0", "0.0", "-59.30830405958524", "261.09243860032075", "-163.67075602268966", "-245.50118220891375", "0.0", "-614.916747659277", "3362.7911936254413", "0.0", "2428.197528104645", "960.4554756471458", "-3867.310294377886", "503.3149156201383" ], [ "3", "223500", "-1262.62867169775", "324.3961569352599", "0.0", "337.1416782259848", "-1.9491703829965963", "155.61745436544433", "517.916354983222", "897.5975305518241", "-67.81176580037655", "1414.0658345116715", "-39.46676448152517", "1442.0817795414287", "-26.33596550034477", "-232.4688693577726", "-112.81472564190192", "126.54523746977499", "311.564259285928", "0.0", "-76.5471741772432", "44.65790666222109", "17.11897752222626", "1131.4650141949785", "518.0287134999645", "477.5194832290621", "1484.5091671374873", "687.327053053561", "-236.96393815579165", "368.3473888663167", "6193.320014648451", "12.259785379537488", "-187.6797726379213", "223.4352533273805", "665.2909909962592", "-374.86770071456203", "16380.55427958463", "209.06637397212944", "-18.28018177362612", "76.06394506482359", "190.7897824498622", "89.49705097473442", "-548.4812969976356", "135.620242373347", "-32.3981896458561", "-146.25497617530218", "-253.85535266408579", 
"268.51993586677634", "0.0", "2560.462801007062", "232.38693914581734", "-258.52923524191795", "115.49010989448611", "-5.7980242882095", "-679.2349149367981", "306.62778778643604", "104.14540705368294", "-1120.9909244618832", "10036.003237560606", "52.60061327084994", "0.0", "0.0", "-131.10772678556413", "461.9977819539519", "-130.96278598695207", "-370.4609887544106", "0.0", "-224.93108870786142", "-5755.026659486052", "0.0", "-560.7410895442018", "3005.6878921006214", "2812.095217624558", "-439.7907040364498" ], [ "4", "140000", "-1480.7905663836862", "76.48056805786511", "0.0", "288.0910465421073", "-9.575974649454466", "315.4466397224675", "-688.8452364993326", "-2484.2130944444684", "-96.56033434789927", "1861.6503394050262", "-63.60627664780205", "525.7902131391522", "-26.928316518608703", "-517.1303663888157", "-124.37520861797813", "164.6273650582279", "305.7408660213683", "0.0", "-86.96259386813765", "-185.40407980626452", "37.640788836857", "-1466.3937083849803", "-361.92848693176256", "-569.3106987773348", "1777.0372147490525", "-504.2481566263623", "-338.77810314404724", "285.3920256076982", "5581.668158460934", "9.539855301857358", "-95.47511973944466", "206.15738474054103", "-1788.152484030531", "-112.44250637122697", "570.3920390330692", "-195.07111357012985", "-17.213181217475153", "-51.005282763602295", "33.51439948919779", "93.76088260125748", "-1135.2865690140966", "87.0827608588021", "-35.09810810208022", "-1041.6364406175721", "-282.59041832807554", "608.0905345092713", "0.0", "-3501.224520100811", "414.2938037562967", "-683.8514595109987", "-69.43763803958032", "-4.552640922619866", "-1377.739701997249", "3310.3928194854234", "-229.66035043935403", "-1915.5433773632337", "5283.399570125182", "58.4060329755488", "0.0", "0.0", "-41.274746120408594", "-4135.783182913163", "-107.29072719514006", "-332.888191195801", "0.0", "-576.0928585240883", "-6153.292222224339", "0.0", "-713.0471015287014", "-4324.383049373135", "-4434.789606161131", 
"1122.3012309613277" ], [ "5", "250000", "-9853.70872551561", "-1625.9570998471436", "0.0", "-528.7451687106447", "0.8314914297734498", "100.19498510037327", "374.71471952877044", "-1575.576971251935", "-74.6046279057801", "747.0370266577119", "-25.261572795784303", "1492.1978596882027", "-34.459513643380845", "-264.602511635346", "-196.3947303494027", "127.74888705041351", "106.15545950221048", "0.0", "-43.999480285329156", "36.0871024168967", "22.339785247283533", "-1574.886683069872", "300.91842882633085", "341.9285063494034", "1731.0987208098788", "459.2914495854218", "-391.5876180940073", "202.64238474581222", "12725.437382586935", "6.573078119159417", "-239.08788475377764", "121.71698102900949", "209.90620854432663", "-8.978266010195787", "15518.770217742456", "443.53556601601895", "-20.858769255629376", "77.96691839344453", "99.5361711215409", "49.34608088611008", "-5253.631561291575", "-29.650696484757873", "-54.64314482564241", "4019.4489457418795", "-728.6360726888216", "417.64720065152505", "0.0", "1336.1210893318664", "248.29840961961727", "4750.915661211558", "661.0641061957863", "-1.0244688345881765", "-3797.3829652807995", "164.45580006149154", "-89.5134488071986", "-1692.4453666306808", "59198.997537704374", "26.206015686304816", "0.0", "0.0", "-279.9112850241152", "-469.76411496977016", "-737.8651562311171", "-365.37427418213167", "0.0", "-4273.31308449945", "-4544.413358154178", "0.0", "-271.37683574699116", "1685.4594579538561", "1548.5981373376067", "-352.41656882533704" ] ], "shape": { "columns": 73, "rows": 5 } }, "text/html": [ "
" ], "text/plain": [ " SalePrice 1stFlrSF 2ndFlrSF 3SsnPorch BedroomAbvGr BldgType \\\n", "Id \n", "1 208500 -864.302666 1089.429010 0.0 337.521166 -1.949170 \n", "2 181500 3350.844933 -584.369097 0.0 205.516384 -5.940831 \n", "3 223500 -1262.628672 324.396157 0.0 337.141678 -1.949170 \n", "4 140000 -1480.790566 76.480568 0.0 288.091047 -9.575975 \n", "5 250000 -9853.708726 -1625.957100 0.0 -528.745169 0.831491 \n", "\n", " BsmtCond BsmtExposure BsmtFinSF1 BsmtFinSF2 ... SaleType \\\n", "Id ... \n", "1 156.111469 -361.262389 605.499503 -62.668440 ... -121.386448 \n", "2 123.975700 3533.453295 4220.333952 -73.558110 ... -163.670756 \n", "3 155.617454 517.916355 897.597531 -67.811766 ... -130.962786 \n", "4 315.446640 -688.845236 -2484.213094 -96.560334 ... -107.290727 \n", "5 100.194985 374.714720 -1575.576971 -74.604628 ... -737.865156 \n", "\n", " ScreenPorch Street TotRmsAbvGrd TotalBsmtSF Utilities WoodDeckSF \\\n", "Id \n", "1 -340.892806 0.0 -353.001743 -4739.814200 0.0 -595.965510 \n", "2 -245.501182 0.0 -614.916748 3362.791194 0.0 2428.197528 \n", "3 -370.460989 0.0 -224.931089 -5755.026659 0.0 -560.741090 \n", "4 -332.888191 0.0 -576.092859 -6153.292222 0.0 -713.047102 \n", "5 -365.374274 0.0 -4273.313084 -4544.413358 0.0 -271.376836 \n", "\n", " YearBuilt YearRemodAdd YrSold \n", "Id \n", "1 3880.341978 2553.054173 -181.631619 \n", "2 960.455476 -3867.310294 503.314916 \n", "3 3005.687892 2812.095218 -439.790704 \n", "4 -4324.383049 -4434.789606 1122.301231 \n", "5 1685.459458 1548.598137 -352.416569 \n", "\n", "[5 rows x 73 columns]" ] }, "execution_count": 19, "metadata": {}, "output_type": "execute_result" } ], "source": [ "detailed_contributions.head()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Summarize explainability of the predictions" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "- You can use the summarize method to summarize your local explainability.\n", "- This summary can be configured with modify_mask 
method so that you have explainability that meets your operational needs.\n", "- When you initialize the SmartPredictor, you can also specify:\n", ">- postprocessing: to apply a wording to several values of your dataset.\n", ">- label_dict: to rename your labels for classification problems.\n", ">- features_dict: to rename your features." ] }, { "cell_type": "code", "execution_count": 20, "metadata": {}, "outputs": [], "source": [ "predictor_load.modify_mask(max_contrib=3)" ] }, { "cell_type": "code", "execution_count": 21, "metadata": {}, "outputs": [], "source": [ "explanation = predictor_load.summarize()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Here, for example, we chose to build a summary with the 3 most contributive features of your dataset.\n", "- As you can see below, the wording defined in the first step of this tutorial has been kept by the SmartPredictor and used in the summarize method." ] }, { "cell_type": "code", "execution_count": 22, "metadata": {}, "outputs": [ { "data": { "application/vnd.microsoft.datawrangler.viewer.v0+json": { "columns": [ { "name": "index", "rawType": "int64", "type": "integer" }, { "name": "SalePrice", "rawType": "int64", "type": "integer" }, { "name": "feature_1", "rawType": "object", "type": "string" }, { "name": "value_1", "rawType": "object", "type": "unknown" }, { "name": "contribution_1", "rawType": "object", "type": "unknown" }, { "name": "feature_2", "rawType": "object", "type": "string" }, { "name": "value_2", "rawType": "object", "type": "unknown" }, { "name": "contribution_2", "rawType": "object", "type": "unknown" }, { "name": "feature_3", "rawType": "object", "type": "string" }, { "name": "value_3", "rawType": "object", "type": "unknown" }, { "name": "contribution_3", "rawType": "object", "type": "unknown" } ], "ref": "82cc1867-9e03-460c-9b82-a26fd24ea53f", "rows": [ [ "1", "208500", "Overall material and finish of the house", "7", "6920.473837235947", "Total square feet of basement area", 
"856", "-4739.814199902538", "Original construction date", "2003", "3880.341977513828" ], [ "2", "181500", "Overall material and finish of the house", "6", "-12277.905073344118", "Ground living area square feet", "1262", "-9032.425475725368", "Overall condition of the house", "8", "4246.185116148694" ], [ "3", "223500", "Ground living area square feet", "1786", "16380.55427958463", "Overall material and finish of the house", "7", "10036.003237560606", "Size of garage in square feet", "608", "6193.320014648451" ], [ "4", "140000", "Total square feet of basement area", "756", "-6153.292222224339", "Size of garage in square feet", "642", "5581.668158460934", "Overall material and finish of the house", "7", "5283.399570125182" ], [ "5", "250000", "Overall material and finish of the house", "8", "59198.997537704374", "Ground living area square feet", "2198", "15518.770217742456", "Size of garage in square feet", "836", "12725.437382586935" ] ], "shape": { "columns": 10, "rows": 5 } }, "text/html": [ "
<div>\n",
"<table border=\"1\" class=\"dataframe\">\n",
"  <thead>\n",
"    <tr><th></th><th>SalePrice</th><th>feature_1</th><th>value_1</th><th>contribution_1</th><th>feature_2</th><th>value_2</th><th>contribution_2</th><th>feature_3</th><th>value_3</th><th>contribution_3</th></tr>\n",
"  </thead>\n",
"  <tbody>\n",
"    <tr><th>1</th><td>208500</td><td>Overall material and finish of the house</td><td>7</td><td>6920.473837</td><td>Total square feet of basement area</td><td>856</td><td>-4739.8142</td><td>Original construction date</td><td>2003</td><td>3880.341978</td></tr>\n",
"    <tr><th>2</th><td>181500</td><td>Overall material and finish of the house</td><td>6</td><td>-12277.905073</td><td>Ground living area square feet</td><td>1262</td><td>-9032.425476</td><td>Overall condition of the house</td><td>8</td><td>4246.185116</td></tr>\n",
"    <tr><th>3</th><td>223500</td><td>Ground living area square feet</td><td>1786</td><td>16380.55428</td><td>Overall material and finish of the house</td><td>7</td><td>10036.003238</td><td>Size of garage in square feet</td><td>608</td><td>6193.320015</td></tr>\n",
"    <tr><th>4</th><td>140000</td><td>Total square feet of basement area</td><td>756</td><td>-6153.292222</td><td>Size of garage in square feet</td><td>642</td><td>5581.668158</td><td>Overall material and finish of the house</td><td>7</td><td>5283.39957</td></tr>\n",
"    <tr><th>5</th><td>250000</td><td>Overall material and finish of the house</td><td>8</td><td>59198.997538</td><td>Ground living area square feet</td><td>2198</td><td>15518.770218</td><td>Size of garage in square feet</td><td>836</td><td>12725.437383</td></tr>\n",
"  </tbody>\n",
"</table>\n",
"</div>
" ], "text/plain": [ " SalePrice feature_1 value_1 contribution_1 \\\n", "1 208500 Overall material and finish of the house 7 6920.473837 \n", "2 181500 Overall material and finish of the house 6 -12277.905073 \n", "3 223500 Ground living area square feet 1786 16380.55428 \n", "4 140000 Total square feet of basement area 756 -6153.292222 \n", "5 250000 Overall material and finish of the house 8 59198.997538 \n", "\n", " feature_2 value_2 contribution_2 \\\n", "1 Total square feet of basement area 856 -4739.8142 \n", "2 Ground living area square feet 1262 -9032.425476 \n", "3 Overall material and finish of the house 7 10036.003238 \n", "4 Size of garage in square feet 642 5581.668158 \n", "5 Ground living area square feet 2198 15518.770218 \n", "\n", " feature_3 value_3 contribution_3 \n", "1 Original construction date 2003 3880.341978 \n", "2 Overall condition of the house 8 4246.185116 \n", "3 Size of garage in square feet 608 6193.320015 \n", "4 Overall material and finish of the house 7 5283.39957 \n", "5 Size of garage in square feet 836 12725.437383 " ] }, "execution_count": 22, "metadata": {}, "output_type": "execute_result" } ], "source": [ "explanation.head()" ] } ], "metadata": { "celltoolbar": "Aucun(e)", "hide_input": false, "kernelspec": { "display_name": "survptf_312", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.12.12" }, "pycharm": { "stem_cell": { "cell_type": "raw", "metadata": { "collapsed": false }, "source": [] } } }, "nbformat": 4, "nbformat_minor": 4 }