01. A practical example

This script provides a practical, hands-on introduction to the SHAP (SHapley Additive exPlanations) library, a leading tool for machine learning model interpretability.

It demonstrates a complete workflow for understanding the predictions of a “black box” model:

  • Data Preparation: Loads the standard breast cancer dataset and splits it for training.

  • Model Training: Trains an XGBoost classifier, a powerful gradient boosting model.

  • Explainability: Uses a shap.Explainer to compute SHAP values, which quantify the contribution of each feature to a model’s prediction (see the additivity check sketched after this list).

  • Visualization: Generates a SHAP summary plot (beeswarm plot) to visualize global feature importance and the impact of feature values on the model’s output.

The primary goal is to illustrate how to move beyond simple accuracy metrics and gain deeper insights into why a model makes the decisions it does.
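Because SHAP values are additive, each prediction decomposes exactly into a base value (the average model output over the background data) plus one contribution per feature. The snippet below is a minimal sketch of that sanity check, assuming the clf, explainer and X defined in the full script further down; for XGBoost, the tree explainer works in the raw margin (log-odds) space by default, and the exact tolerance may vary across shap versions.

    import numpy as np

    # SHAP additivity check (sketch): the base value plus the per-feature
    # contributions should reconstruct the model's raw output per sample.
    sv = explainer(X)                            # shap.Explanation object
    raw = clf.predict(X, output_margin=True)     # XGBoost log-odds output

    reconstructed = sv.base_values + sv.values.sum(axis=1)
    print(np.allclose(reconstructed, raw, atol=1e-4))   # expected: True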

[Figure: SHAP summary plot (beeswarm) for the XGBoost model on the breast cancer features]

Out:

Explainer type: <class 'shap.explainers._tree.TreeExplainer'>
.values =
array([[-0.07339976,  1.23534454, -4.69448856],
       [-0.28925516, -0.13902985, -4.83976683],
       [-0.23044579, -0.75952114, -4.40960412],
       ...,
       [ 0.41172362,  0.30341838,  1.756622  ],
       [-0.28925516, -0.13902985, -4.83976683],
       [-0.23044579, -0.75952114, -4.40960412]])

.base_values =
array([0.80274342, 0.80274342, 0.80274342, ..., 0.80274342,
       0.80274342, 0.80274342])

.data =
array([[ 17.99,  10.38, 122.8 ],
       [ 20.57,  17.77, 132.9 ],
       [ 19.69,  21.25, 130.  ],
       ...,
       [ 12.47,  17.31,  80.45],
       [ 18.49,  17.52, 121.3 ],
       [ 20.59,  21.24, 137.8 ]])
shap_values (shape): (500, 3)
<IPython.core.display.HTML object>
[[-0.26615972 -0.68707205 -4.41828418]
 [-0.48923345  0.89701019  3.28383803]
 [ 0.25992588  2.43031889 -3.09868905]
 ...
 [ 0.73170752 -0.09166955 -2.45801772]
 [ 0.14106127 -0.04692391  0.63537647]
 [ 0.07274038  1.37873462 -3.73005894]]
C:\Users\kelda\Desktop\repositories\github\python-spare-code\main\examples\shap\plot_main01.py:155: FutureWarning:

The NumPy global RNG was seeded by calling `np.random.seed`. In a future version this function will no longer use the global RNG. Pass `rng` explicitly to opt-in to the new behaviour and silence this warning.

C:\Users\kelda\Desktop\repositories\github\python-spare-code\main\examples\shap\plot_main01.py:161: UserWarning:

FigureCanvasAgg is non-interactive, and thus cannot be shown

None


 24 # Libraries
 25 import numpy as np
 26 import pandas as pd
 27 import matplotlib.pyplot as plt
 28
 29 # Sklearn
 30 from sklearn.model_selection import train_test_split
 31 from sklearn.datasets import load_iris
 32 from sklearn.datasets import load_breast_cancer
 33 from sklearn.naive_bayes import GaussianNB
 34 from sklearn.linear_model import LogisticRegression
 35 from sklearn.tree import DecisionTreeClassifier
 36 from sklearn.ensemble import RandomForestClassifier
 37
 38 # Xgboost
 39 from xgboost import XGBClassifier
 40
 41 # ----------------------------------------
 42 # Load data
 43 # ----------------------------------------
 44 # Seed
 45 seed = 0
 46
 47 # Load dataset
 48 #bunch = load_iris()    # alternative toy dataset (overridden below)
 49 bunch = load_breast_cancer()
 50 features = list(bunch['feature_names'])
 51
 52 # Create DataFrame
 53 data = pd.DataFrame(data=np.c_[bunch['data'], bunch['target']],
 54                     columns=features + ['target'])
 55
 56 # Create X, y
 57 X = data[bunch['feature_names']]
 58 y = data['target']
 59
 60 # Keep a small subset: first 500 samples, first 3 features
 61 X = X.iloc[:500, :3]
 62 y = y.iloc[:500]
 63
 64 # Split dataset
 65 X_train, X_test, y_train, y_test = \
 66     train_test_split(X, y, random_state=seed)
 67
 68
 69 # ----------------------------------------
 70 # Classifiers
 71 # ----------------------------------------
 72 # Train classifier
 73 gnb = GaussianNB()
 74 llr = LogisticRegression()
 75 dtc = DecisionTreeClassifier(random_state=seed)
 76 rfc = RandomForestClassifier(random_state=seed)
 77 xgb = XGBClassifier(
 78     min_child_weight=0.005,
 79     eta= 0.05, gamma= 0.2,
 80     max_depth= 4,
 81     n_estimators= 100)
 82
 83 # Select one
 84 clf = xgb
 85
 86 # Fit
 87 clf.fit(X_train, y_train)
 88
 89 # ----------------------------------------
 90 # Find shap values
 91 # ----------------------------------------
 92 # Import
 93 import shap
 94
 95 """
 96 # Create shap explainer
 97 if isinstance(clf,
 98     (DecisionTreeClassifier,
 99      RandomForestClassifier,
100      XGBClassifier)):
101     # Set Tree explainer
102     explainer = shap.TreeExplainer(clf)
103 elif isinstance(clf, int):
104     # Set NN explainer
105     explainer = shap.DeepExplainer(clf)
106 else:
107     # Set generic kernel explainer
108     explainer = shap.KernelExplainer(clf.predict_proba, X_train)
109 """
110
111 # Get generic explainer
112 explainer = shap.Explainer(clf, X_train)
113
 114 # Show explainer type
 115 print("\nExplainer type: %s" % type(explainer))
116
117 # Get shap values
118 shap_values = explainer(X)
119
120 print(shap_values)
121
 122 # For interaction values, see:
123 # https://github.com/slundberg/shap/issues/501
124
125 # Get shap values
126 #shap_values = \
127 #    explainer.shap_values(X_train)
128 #shap_interaction_values = \
129 #    explainer.shap_interaction_values(X_train)
130
131 # Show information
132 print("shap_values (shape): %s" % \
133       str(shap_values.shape))
134 #print("shap_values_interaction (shape): %s" % \
135 #      str(shap_interaction_values.shape))
136
137
138 # ----------------------------------------
139 # Visualize
140 # ----------------------------------------
141 # Initialise
142 shap.initjs()
143
144 """
145 # Dependence plot
146 shap.dependence_plot(0, shap_values,
147     X_train, interaction_index=None, dot_size=5,
148     alpha=0.5, color='#3F75BC', show=False)
149 plt.tight_layout()
150 """
151
152 print(explainer.shap_values(X_train))
153
154 # Summary plot
155 plot_summary = shap.summary_plot( \
156     explainer.shap_values(X_train),
157     X_train, cmap='viridis',
158     show=False)
159
160 plt.tight_layout()
161 plt.show()
162
163 print(plot_summary)
164
165
166 import seaborn as sns
167 sv = explainer.shap_values(X_train)
168 sv = pd.DataFrame(sv, columns=X.columns)
169 sv = sv.stack().reset_index()
170 sv['val'] = X_train.stack().reset_index()[0]
171
172 #import plotly.express as px
173
174 #f = px.strip(data_frame=sv, x=0, y='level_1', color='val')
175 #f.show()
176
177 """
178 print(sv)
179 #sns.swarmplot(data=sv, x=0, y='level_1', color='viridis', palette='viridis')
180 #sns.stripplot(data=sv, x=0, y='level_1', color='viridis', palette='viridis')
181 #plt.show()
182 import sys
183 sys.exit()
184 #sns.swarmplot(x=)
185
186 import sys
187 sys.exit()
188
189 #html = f"<head>{shap.getjs()}</head><body>"
190 # Bee swarm
 191 # .. note: draws directly via matplotlib!
 192 # .. note: does not return a plot object!
193 plot_bee = shap.plots.beeswarm(shap_values, show=False)
194
 195 # Show
196 print("\nBEE")
197 print(plot_bee)
198
199 #print(f)
200 # Waterfall
201 # .. note: not working!
202 #shap.plots.waterfall(shap_values[0], max_display=14)
203
204 # Force plot
205 # .. note: not working!
206 plot_force = shap.plots.force(explainer.expected_value,
207     explainer.shap_values(X_train), X_train,
208     matplotlib=False, show=False)
209
210 # Show
211 print("\nFORCE:")
212 print(plot_force)
213 print(plot_force.html())
214 print(shap.save_html('e.html', plot_force))
215 """

Total running time of the script: (0 minutes 3.036 seconds)

Gallery generated by Sphinx-Gallery