Overview

Installation

You can install Shapash using pip:

pip install shapash

How does Shapash work?

Shapash is an overlay package for libraries dedicated to model interpretability. It uses a Shap or Lime backend to compute contributions. Shapash relies on the different steps needed to build a machine learning model to make the results understandable.

[Diagram: how Shapash works (shapash-diagram.png)]
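
As a minimal sketch of this idea, you can pick the contribution engine when declaring the explainer. This assumes the backend argument of SmartExplainer, which accepts 'shap' (the default) or 'lime', and uses a toy scikit-learn model:

import pandas as pd
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor
from shapash import SmartExplainer

# Toy data and model; any supported regressor or classifier works here
X, y = make_regression(n_samples=100, n_features=4, random_state=0)
X = pd.DataFrame(X, columns=['f1', 'f2', 'f3', 'f4'])
model = RandomForestRegressor(random_state=0).fit(X, y)

# backend selects the library used to compute contributions: 'shap' or 'lime'
xpl = SmartExplainer(model=model, backend='shap')
xpl.compile(x=X)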

User Manual

Shapash works for regression, binary classification or multiclass problems. It is compatible with many models: CatBoost, XGBoost, LightGBM, scikit-learn ensembles, linear models and SVM.

Shapash can use a category_encoders object, a scikit-learn ColumnTransformer or simply a features dictionary (a preprocessing sketch follows the version list below).

  • category_encoders: OneHotEncoder, OrdinalEncoder, BaseNEncoder, BinaryEncoder or TargetEncoder

  • scikit-learn ColumnTransformer: OneHotEncoder, OrdinalEncoder, StandardScaler, QuantileTransformer or PowerTransformer

Compatible versions for those dependencies:

  • scikit-learn==0.23.2

  • catboost==0.22

  • xgboost==1.0.0

  • lightgbm==2.3.1

  • shap==0.37.0

  • category-encoders==2.1.0
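
As a preprocessing sketch, assuming a category_encoders OrdinalEncoder and illustrative column names, the fitted encoder is passed to SmartExplainer so that encoded categories can be displayed in their original form:

import pandas as pd
import category_encoders as ce
from sklearn.ensemble import RandomForestRegressor
from shapash import SmartExplainer

# Illustrative dataset with one categorical column
df = pd.DataFrame({
    'neighborhood': ['A', 'B', 'A', 'C', 'B', 'C'],
    'surface': [120, 80, 95, 150, 60, 110],
    'price': [300, 180, 220, 410, 150, 260],
})
X, y = df[['neighborhood', 'surface']], df['price']

# Fit the encoder, then train the model on encoded features
encoder = ce.OrdinalEncoder(cols=['neighborhood']).fit(X)
X_encoded = encoder.transform(X)
regressor = RandomForestRegressor(random_state=0).fit(X_encoded, y)

# Pass the encoder so Shapash can inverse-transform encoded values in its outputs
xpl = SmartExplainer(model=regressor, preprocessing=encoder)
xpl.compile(x=X_encoded)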

Using Shapash is simple and requires only a few lines of code. Most parameters are optional, and you can easily display plots.

You can also tune plots and outputs by specifying a labels dict, a features dict, encoders, predictions, …: the more parameters, options and dictionaries you specify, the more understandable the outputs will be.
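
For instance, here is a sketch of the dictionaries involved (the column and class names are hypothetical; features_dict and label_dict are the corresponding SmartExplainer parameters):

# features_dict maps technical column names to the readable labels shown in plots
house_dict = {
    '1stFlrSF': 'First floor square feet',
    'OverallQual': 'Overall material and finish quality',
}

# label_dict (classification only) maps target class codes to readable class names
label_dict = {0: 'Moderately priced', 1: 'Expensive'}

# Both can be passed to SmartExplainer, e.g.
# SmartExplainer(model=model, features_dict=house_dict, label_dict=label_dict)
# as shown in Step 1 below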

Getting Started: 3 minutes to Shapash

The four steps to display results, plus a fifth step for deployment:

  • Step 1: Declare SmartExplainer Object > There is one mandatory parameter: the model > You can declare a features dict here to specify the labels Shapash will display

from shapash import SmartExplainer
xpl = SmartExplainer(
  features_dict=house_dict,  # Optional parameter
  model=regressor,
  preprocessing=encoder, # Optional: compile step can use inverse_transform method
  postprocessing=postprocess # Optional: see tutorial postprocessing
)
  • Step 2: Compile Dataset, … > There is one mandatory parameter in the compile method: the dataset

xpl.compile(
    x=Xtest,
    y_pred=y_pred, # Optional: for your own prediction (by default: model.predict)
    y_target=yTest, # Optional: allows to display True Values vs Predicted Values
    additional_data=X_additional, # Optional: additional dataset of features for Webapp
    additional_features_dict=features_dict_additional, # Optional: features dict for the additional data
)
  • Step 3: Display output > There are several outputs and plots available (a few more are sketched after the snippet below). For example, you can launch the web app this way:

app = xpl.run_app()
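
A few other outputs from the same compiled SmartExplainer, sketched below ('OverallQual' stands in for any feature name in your dataset):

# Global feature importance
xpl.plot.features_importance()

# Contributions of a single feature across the dataset
xpl.plot.contribution_plot('OverallQual')

# Local explanation for one row (index refers to a row label of the dataset)
xpl.plot.local_plot(index=0)

# Stop the web app when you are done
app.kill()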
  • Step 4: Generate the Shapash Report > This step allows you to generate a standalone HTML report of your project using the different splits of your dataset and the metrics you used:

xpl.generate_report(
    output_file='path/to/output/report.html',
    project_info_file='path/to/project_info.yml',
    x_train=Xtrain,
    y_train=ytrain,
    y_test=ytest,
    title_story="House prices report",
    title_description="""This document is a data science report of the kaggle house prices tutorial project.
        It was generated using the Shapash library.""",
    metrics=[{'name': 'MSE', 'path': 'sklearn.metrics.mean_squared_error'}]
)
  • Step 5: From training to deployment: SmartPredictor Object

Shapash provides a SmartPredictor object to deploy the summary of local explanations for operational needs. It is an object dedicated to deployment, lighter than SmartExplainer, with additional consistency checks. SmartPredictor can be used with an API or in batch mode. It provides predictions and detailed or summarized local explainability using appropriate wording.

predictor = xpl.to_smartpredictor()
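
For example, here is a sketch of feeding the predictor new observations (X_new is assumed to be a DataFrame preprocessed like the training data; add_input, detail_contributions and summarize are the SmartPredictor methods used in the Shapash tutorials):

# Provide the new observations to explain
predictor.add_input(x=X_new)

# Predictions with one contribution value per feature
detailed_output = predictor.detail_contributions()

# Predictions with the top contributions only, using the configured wording
summary_output = predictor.summarize()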

From data mining to deployment

Shapash isn’t just a tool for data mining purposes. Thanks to its different objects, such as SmartExplainer and SmartPredictor, you can easily switch from data mining to deployment. The Shapash SmartPredictor object can be used through an API or in batch mode.

It allows you to produce predictions and detailed or summarized explanations on new datasets. In fact, Shapash summarizes the explainability you configured to make it operational for your needs.
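
A sketch of the batch pattern, assuming the predictor is persisted with save and reloaded with the load_smartpredictor helper from the Shapash tutorials (X_new again stands for the new dataset):

# At the end of the data science phase: persist the predictor
predictor.save('./predictor.pkl')

# In the batch job or API service: reload it and explain new data
from shapash.utils.load_smartpredictor import load_smartpredictor

predictor_loaded = load_smartpredictor('./predictor.pkl')
predictor_loaded.add_input(x=X_new)   # X_new: new dataset to score and explain (assumed)
output = predictor_loaded.summarize()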

See the tutorials section of this documentation for more details.