SmartPredictor Object
- The SmartPredictor object lets you:
compute predictions
configure summary of the local explanation
deploy interpretability of your model for operational needs
It can be used in API mode and batch mode.
- class shapash.explainer.smart_predictor.SmartPredictor(features_dict, model, columns_dict, backend, features_types, label_dict=None, preprocessing=None, postprocessing=None, features_groups=None, mask_params=None)
Bases: object
The SmartPredictor class is a lighter object than SmartExplainer, with additional consistency checks.
The SmartPredictor object is provided to deploy the summary of local explanations for operational needs.
Switching from SmartExplainer to SmartPredictor lets users automatically reproduce the same results on new datasets that have the expected structure.
- SmartPredictor is designed to make new results understandable:
It checks consistency of all parameters
It applies preprocessing and postprocessing
It computes the model's contributions
It makes predictions
It summarizes local explainability
This class lets users automatically summarize the results of their model on new datasets (prediction, chaining of preprocessing and postprocessing, explainability). SmartPredictor has several methods, described below.
SmartPredictor attributes:
- features_dict: dict
Dictionary mapping technical feature names to domain names.
- model: model object
Model used to compute the target estimates (predictions and, for classification, predict_proba values).
- backend: str or backend object
backend (explainer) used to compute contributions
- columns_dict: dict
Dictionary mapping integer column positions (in the same order as the training dataset) to technical feature names.
- features_types: dict
Dictionary mapping each feature to the type it is expected to have.
- label_dict: dict (optional)
Dictionary mapping integer labels to domain names (classification - target values).
- preprocessing: category_encoders, ColumnTransformer, list or dict (optional)
The preprocessing applied to the original data.
- postprocessing: dict (optional)
Dictionary of postprocessing modifications to apply to the x_init dataframe.
- _case: string
String indicating whether the model addresses a classification or a regression problem.
- _classes: list, None
List of labels if the model addresses a classification problem, None otherwise.
- mask_params: dict (optional)
Dictionary that specifies how to summarize the explainability.
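To make the dictionary-valued attributes more concrete, the sketch below shows one plausible shape for them, together with a simple cross-check. Every feature name, label and type string is a made-up example, and the check_consistency helper is hypothetical: it only illustrates the kind of consistency check described above and is not Shapash's own validation code.

```python
# Hypothetical values; none of these names or types come from Shapash itself.
columns_dict = {0: "Pclass", 1: "Sex", 2: "Age"}   # column position -> technical name
features_dict = {"Pclass": "Ticket class", "Sex": "Sex", "Age": "Age (years)"}
features_types = {"Pclass": "int64", "Sex": "object", "Age": "float64"}
label_dict = {0: "Died", 1: "Survived"}            # integer label -> domain name


def check_consistency(columns_dict, features_dict, features_types):
    """Return the technical names that lack a domain name or a declared type."""
    technical_names = set(columns_dict.values())
    missing_label = technical_names - set(features_dict)
    missing_type = technical_names - set(features_types)
    return sorted(missing_label | missing_type)


problems = check_consistency(columns_dict, features_dict, features_types)
print(problems)  # an empty list means the three dictionaries agree
```

Keeping all three dictionaries keyed on the same technical names is what allows a check like this to be a plain set comparison.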
How to declare a new SmartPredictor object?
Example
>>> predictor = SmartPredictor(features_dict=my_features_dict,
...                            model=my_model,
...                            backend=my_backend,
...                            columns_dict=my_columns_dict,
...                            features_types=my_features_type_dict,
...                            label_dict=my_label_dict,
...                            preprocessing=my_preprocess,
...                            postprocessing=my_postprocess)
or, with the most common syntax:
>>> predictor = xpl.to_smartpredictor()
- xpl, explainer: object
SmartExplainer instance to point to.
- add_input(x=None, ypred=None, contributions=None)
The add_input method is the first step: it adds a dataset for prediction and explainability.
- add_input applies the following to the x parameter:
consistency checks
the preprocessing and postprocessing specified at initialization
feature reordering to match the order expected by the model
If you don’t specify ypred or contributions, add_input computes them. A parameter can be omitted if it has already been defined. For example, the user can specify a new ypred without re-supplying a dataset x that is already defined. If the user declares a new input x, all stored parameters are reset.
Example
>>> predictor.add_input(x=xtest_df)
>>> predictor.add_input(ypred=ytest_df)
- Parameters
x (dict, pandas.DataFrame (optional)) – Raw dataset used by the model to perform the prediction (not preprocessed).
ypred (pandas.DataFrame (optional)) – User-specified prediction values.
contributions (pandas.DataFrame (regression) or list (classification) (optional)) – local contributions aggregated if the preprocessing part requires it (e.g. one-hot encoding).
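The feature reordering step performed by add_input can be pictured with a small pure-Python sketch. It assumes a single row represented as a plain dict and a columns_dict like the one described in the attributes; the reorder_row helper and the feature names are hypothetical, and Shapash's actual implementation works on dataframes.

```python
def reorder_row(row, columns_dict):
    """Return the row's values in the column order the model was trained on.

    Raises KeyError if a feature expected by the model is missing.
    """
    ordered = {}
    for position in sorted(columns_dict):
        name = columns_dict[position]
        if name not in row:
            raise KeyError(f"missing feature: {name}")
        ordered[name] = row[name]
    return ordered


columns_dict = {0: "Pclass", 1: "Sex", 2: "Age"}     # hypothetical training order
raw_row = {"Age": 22.0, "Pclass": 3, "Sex": "male"}  # same features, other order
ordered_row = reorder_row(raw_row, columns_dict)
print(list(ordered_row))
```

Failing loudly on a missing feature, rather than silently filling it, mirrors the idea that consistency checks run before any prediction is made.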
- detail_contributions(contributions=None, use_groups=None)
The detail_contributions method associates each prediction (ypred specified in add_input or computed automatically) with its corresponding contributions.
- Parameters
contributions (object (optional)) – Local contributions, or list of local contributions.
use_groups (bool (optional)) – Whether to compute contributions for groups of features.
- Returns
A dataset with ypred and the associated contributions.
- Return type
pandas.DataFrame
Example
>>> predictor.add_input(x=xtest_df)
>>> predictor.detail_contributions()
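For classification, contributions are described above as a list, which suggests one set of contributions per class. Under that assumption, associating each prediction with its contributions amounts to picking, for every row, the contributions of the class that was predicted. The sketch below illustrates this with plain Python structures; the helper name and the data are hypothetical and this is not Shapash's code.

```python
def contributions_for_prediction(ypred, contributions_per_class, classes):
    """For each row, keep the contributions of the class that was predicted."""
    detailed = []
    for row_index, label in enumerate(ypred):
        class_position = classes.index(label)
        row_contrib = contributions_per_class[class_position][row_index]
        detailed.append({"ypred": label, **row_contrib})
    return detailed


classes = [0, 1]                 # hypothetical labels
ypred = [1, 0]                   # one prediction per row
contributions_per_class = [      # one list of rows per class
    [{"Sex": -0.2, "Age": 0.1}, {"Sex": 0.3, "Age": 0.05}],    # class 0
    [{"Sex": 0.2, "Age": -0.1}, {"Sex": -0.3, "Age": -0.05}],  # class 1
]
detail = contributions_for_prediction(ypred, contributions_per_class, classes)
for line in detail:
    print(line)
```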
- modify_mask(features_to_hide=None, threshold=None, positive=None, max_contrib=None)
This method lets users modify the mask_params values. Every parameter is optional; modify_mask changes only the values that are passed.
Use this method to configure the summary displayed by the summarize method.
- Parameters
features_to_hide (list, optional (default: None)) – List of strings, containing features to hide.
threshold (float, optional (default: None)) – Absolute threshold below which any contribution is hidden.
positive (bool, optional (default: None)) – If True, hide negative values. If False, hide positive values. If None, hide nothing.
max_contrib (int, optional (default: None)) – Maximum number of contributions to show.
Examples
>>> predictor.modify_mask(max_contrib=1)
>>> summary_df = predictor.summarize()
>>> summary_df
   pred     proba feature_1 value_1 contribution_1
0     0  0.756416       Sex     1.0       0.322308
1     3  0.628911       Sex     2.0       0.585475
2     0  0.543308       Sex     2.0      -0.486667
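The effect of the four mask parameters on a single row can be sketched in a few lines of plain Python. This is a simplified illustration of the parameter descriptions above, assuming contributions are ranked by absolute value before truncation; apply_mask and the sample values are hypothetical and do not reproduce Shapash's internals.

```python
def apply_mask(contributions, features_to_hide=None, threshold=None,
               positive=None, max_contrib=None):
    """Filter and rank one row of contributions (feature name -> value)."""
    hidden = set(features_to_hide or [])
    kept = []
    for feature, value in contributions.items():
        if feature in hidden:
            continue  # explicitly hidden feature
        if threshold is not None and abs(value) < threshold:
            continue  # contribution too small to show
        if positive is True and value < 0:
            continue  # positive=True hides negative values
        if positive is False and value > 0:
            continue  # positive=False hides positive values
        kept.append((feature, value))
    # Assumption: rank by magnitude, largest first, then truncate.
    kept.sort(key=lambda item: abs(item[1]), reverse=True)
    return kept if max_contrib is None else kept[:max_contrib]


row = {"Sex": 0.32, "Pclass": 0.16, "Age": -0.05, "Fare": -0.2}  # hypothetical
top = apply_mask(row, features_to_hide=["Fare"], threshold=0.1, max_contrib=1)
print(top)
```

Leaving a parameter at None disables the corresponding filter, which matches the documented behaviour that only the values specified are taken into account.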
- predict()
The predict method computes the predicted value for each row of the x dataset defined in add_input.
- Returns
A dataset with predicted values for each x row.
- Return type
pandas.DataFrame
Example
>>> predictor.add_input(x=xtest_df)
>>> predictor.predict()
- predict_proba()
The predict_proba method computes the predicted probabilities for each row of the x dataset defined in add_input.
- Returns
A dataset with the probabilities of every label if no ypred is defined; otherwise, a dataset with ypred and its associated probability.
- Return type
pandas.DataFrame
Example
>>> predictor.add_input(x=xtest_df)
>>> predictor.predict_proba()
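When a ypred is available, the returned dataset pairs each predicted label with the probability of that label. A minimal sketch of that pairing, using plain lists instead of dataframes, could look as follows; the helper name, the labels and the probabilities are all hypothetical.

```python
def proba_for_prediction(ypred, probabilities, classes):
    """Pair each predicted label with the probability of that label."""
    result = []
    for label, row_proba in zip(ypred, probabilities):
        result.append({"ypred": label, "proba": row_proba[classes.index(label)]})
    return result


classes = [0, 1, 3]                                  # hypothetical labels
probabilities = [[0.7, 0.2, 0.1], [0.1, 0.3, 0.6]]   # one row per observation
ypred = [0, 3]
paired = proba_for_prediction(ypred, probabilities, classes)
print(paired)
```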
- save(path)
The save method stores the SmartPredictor object on disk as a pickle file. This is useful because you don’t have to recompute anything to display results later.
The load_smartpredictor function loads a saved SmartPredictor object (see the example below).
- Parameters
path (str) – File path to store the pickle file
Example
>>> predictor.save('path_to_pkl/predictor.pkl')
>>> from shapash.utils.load_smartpredictor import load_smartpredictor
>>> predictor_load = load_smartpredictor('path_to_pkl/predictor.pkl')
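Since the object is stored as a pickle file, the save and load steps follow the usual pickle round trip. The sketch below shows that round trip with the standard library only, using a plain dict as a hypothetical stand-in for the predictor's state; it does not call Shapash.

```python
import os
import pickle
import tempfile

# Hypothetical stand-in for the state a predictor would carry.
state = {"mask_params": {"max_contrib": 1}, "label_dict": {0: "Died", 1: "Survived"}}

with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, "predictor.pkl")
    with open(path, "wb") as handle:   # write the object to disk
        pickle.dump(state, handle)
    with open(path, "rb") as handle:   # read it back later
        restored = pickle.load(handle)

print(restored == state)
```

As with any pickle file, only load files from a source you trust, because unpickling can execute arbitrary code.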
- summarize(use_groups=None)
The summarize method displays the summary of local explainability. It can be configured with the modify_mask method so that the summary suits your needs.
If the user doesn’t call modify_mask, summarize uses the mask_params specified when the SmartPredictor was initialized.
- In the classification case, the summarize method summarizes the explainability corresponding to:
the predicted values, specified by the user or computed automatically (with the add_input method)
the probabilities from predict_proba associated with those predicted values
the contributions, ranked and filtered as specified with the modify_mask method
- Parameters
use_groups (bool (optional)) – Whether to compute contributions for groups of features.
- Returns
Selected explanation of each row, for the classification case.
- Return type
pandas.DataFrame
Examples
>>> summary_df = predictor.summarize()
>>> summary_df
   pred     proba feature_1 value_1 contribution_1 feature_2 value_2 contribution_2
0     0  0.756416       Sex     1.0       0.322308    Pclass     3.0       0.155069
1     3  0.628911       Sex     2.0       0.585475    Pclass     1.0       0.370504
2     0  0.543308       Sex     2.0      -0.486667    Pclass     3.0       0.255072
>>> predictor.modify_mask(max_contrib=1)
>>> summary_df = predictor.summarize()
>>> summary_df
   pred     proba feature_1 value_1 contribution_1
0     0  0.756416       Sex     1.0       0.322308
1     3  0.628911       Sex     2.0       0.585475
2     0  0.543308       Sex     2.0      -0.486667
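The column layout of the summary (pred, proba, then feature_n, value_n and contribution_n for each rank) can be reproduced with a short sketch. It assumes the contributions have already been ranked and filtered, and flattens them into numbered columns for one row; summary_row is a hypothetical helper, and the sample values are taken from the first row of the output above.

```python
def summary_row(pred, proba, ranked, values):
    """Flatten ranked (feature, contribution) pairs into numbered columns."""
    row = {"pred": pred, "proba": proba}
    for rank, (feature, contribution) in enumerate(ranked, start=1):
        row[f"feature_{rank}"] = feature
        row[f"value_{rank}"] = values[feature]
        row[f"contribution_{rank}"] = contribution
    return row


ranked = [("Sex", 0.322308), ("Pclass", 0.155069)]  # already ranked and filtered
values = {"Sex": 1.0, "Pclass": 3.0}                # feature values for this row
row = summary_row(0, 0.756416, ranked, values)
print(list(row))
```

Truncating ranked to a single pair before calling the helper yields the narrower layout shown after modify_mask(max_contrib=1).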