Local Interpretable Model-agnostic Explanations

API reference: LimeTabular

See the backing repository for LIME for the underlying implementation.

Summary

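Local Interpretable Model-agnostic Explanations (LIME) [1] is a method that fits a surrogate white-box model around the decision space of any black-box model's prediction. LIME explicitly tries to model the local neighborhood of a prediction: by focusing on a narrow enough region of the decision surface, even a simple linear model can approximate the black-box model's behavior well. Users can then inspect the white-box model to understand how the black-box model behaves in that region.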
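
The local-surrogate idea can be written compactly. As a sketch in the notation of the LIME paper [1] (the symbols come from that paper, not from this page): $f$ is the black-box model, $g$ is a candidate surrogate from a class $G$ of interpretable models, $\pi_x$ is a proximity kernel centered on the instance $x$, and $\Omega(g)$ penalizes surrogate complexity.

```latex
\xi(x) = \operatorname*{arg\,min}_{g \in G} \; \mathcal{L}(f, g, \pi_x) + \Omega(g),
\qquad
\mathcal{L}(f, g, \pi_x) = \sum_{z \in \mathcal{Z}} \pi_x(z)\,\bigl(f(z) - g(z)\bigr)^2,
\qquad
\pi_x(z) = \exp\!\left(-\frac{D(x, z)^2}{\sigma^2}\right)
```

Here $\mathcal{Z}$ is the set of perturbed samples, $D$ is a distance function, and $\sigma$ is the kernel width. The paper evaluates $g$ on an interpretable representation $z'$ of each sample; that distinction is dropped here for brevity.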

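LIME works by perturbing an individual data point to generate synthetic data, which is evaluated by the black-box system and ultimately used as the training set for the white-box model. Its advantages are that an explanation can be interpreted the same way as a linear model, and that it can be applied to almost any model. On the other hand, explanations are sometimes unstable and depend heavily on the perturbation process.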
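
The perturb, query, and fit loop described above can be sketched in a few lines of NumPy. This is a simplified illustration and not the implementation used by `interpret` or the `lime` package: the helper `lime_sketch`, its parameters, the Gaussian perturbation scheme, the default kernel width, and the toy `blackbox` function are all assumptions made for the example.

```python
import numpy as np


def lime_sketch(predict_fn, x, feature_scale, n_samples=2000, kernel_width=None, seed=0):
    """Fit a proximity-weighted linear surrogate around the instance x.

    Hypothetical helper for illustration only; not part of interpret or lime.
    Returns (intercept, coefficients), with each coefficient expressed per
    `feature_scale` unit of its feature.
    """
    rng = np.random.default_rng(seed)
    x = np.asarray(x, dtype=float)
    scale = np.asarray(feature_scale, dtype=float)
    n_features = x.shape[0]
    if kernel_width is None:
        # Assumed heuristic: widen the kernel as dimensionality grows.
        kernel_width = 0.75 * np.sqrt(n_features)

    # 1. Perturb: draw synthetic points around x with Gaussian noise.
    offsets = rng.normal(size=(n_samples, n_features))  # in units of `scale`
    Z = x + offsets * scale

    # 2. Query the black box on the synthetic points.
    y = np.asarray(predict_fn(Z), dtype=float)

    # 3. Weight each point by its proximity to x (exponential kernel).
    dist = np.linalg.norm(offsets, axis=1)
    weights = np.exp(-(dist ** 2) / kernel_width ** 2)

    # 4. Weighted least squares: scale rows by sqrt(weight), then solve.
    design = np.hstack([np.ones((n_samples, 1)), offsets])
    sw = np.sqrt(weights)
    coef, *_ = np.linalg.lstsq(design * sw[:, None], y * sw, rcond=None)
    return coef[0], coef[1:]


def blackbox(Z):
    """Toy nonlinear black box: a logistic function of two features."""
    return 1.0 / (1.0 + np.exp(-(3.0 * Z[:, 0] - 2.0 * Z[:, 1])))


intercept, coefs = lime_sketch(blackbox, x=[0.5, 0.2], feature_scale=[1.0, 1.0])
print("local intercept:", intercept)
print("local coefficients:", coefs)
```

The signs and relative sizes of the coefficients are the explanation: they indicate how the black-box output moves when each feature changes near `x`. Changing `seed`, `n_samples`, or `kernel_width` changes the fitted values, which is one way to see the instability mentioned above.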

How it Works

Christoph Molnar's e-book "Interpretable Machine Learning" [2], which is available online, has an excellent overview of LIME.

The paper that introduced the idea, "'Why Should I Trust You?': Explaining the Predictions of Any Classifier" [1], is available on arXiv.

For readers who find video a better medium for learning the algorithm, a conceptual overview by its author, Marco Tulio Ribeiro, is also available.

Code Example

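The following code trains a black-box pipeline on the breast cancer dataset, then uses LIME to explain the pipeline and its decisions. The visualizations it produces are local explanations.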

from interpret import set_visualize_provider
from interpret.provider import InlineProvider
set_visualize_provider(InlineProvider())

import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split

from sklearn.ensemble import RandomForestClassifier
from sklearn.decomposition import PCA
from sklearn.pipeline import Pipeline

from interpret import show
from interpret.blackbox import LimeTabular

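# Fix the seed for reproducibility
seed = 42
np.random.seed(seed)

# Load the data as a DataFrame and hold out 20% for testing
X, y = load_breast_cancer(return_X_y=True, as_frame=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.20, random_state=seed)

# Black-box pipeline: PCA followed by a random forest
pca = PCA()
rf = RandomForestClassifier(random_state=seed)

blackbox_model = Pipeline([('pca', pca), ('rf', rf)])
blackbox_model.fit(X_train, y_train)

# Build the LIME explainer from the fitted pipeline and the training data
lime = LimeTabular(blackbox_model, X_train)

# Explain the first five test rows and display the explanation at index 0
show(lime.explain_local(X_test[:5], y_test[:5]), 0)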
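
Before reading the local explanations, it can help to see what the pipeline actually predicts for those same five rows, since those predictions are what LIME is explaining. The sketch below uses only scikit-learn and makes no `interpret` calls. It rebuilds the same pipeline on NumPy arrays rather than a DataFrame, which is an assumption made to keep the example self-contained.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.decomposition import PCA
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline

seed = 42
X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.20, random_state=seed)

# Same black-box pipeline as in the example above
blackbox_model = Pipeline([("pca", PCA()), ("rf", RandomForestClassifier(random_state=seed))])
blackbox_model.fit(X_train, y_train)

# Held-out accuracy and class probabilities for the first five test rows
accuracy = blackbox_model.score(X_test, y_test)
proba = blackbox_model.predict_proba(X_test[:5])
print(f"held-out accuracy: {accuracy:.3f}")
for row, (p, label) in enumerate(zip(proba, y_test[:5])):
    print(f"row {row}: P(class 1) = {p[1]:.3f}, actual = {label}")
```

A local explanation for a row whose predicted probability is close to 0.5 is often the more informative one to inspect, because small feature changes can flip the predicted class there.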

References

[1] Marco Tulio Ribeiro, Sameer Singh, and Carlos Guestrin. "Why Should I Trust You?": Explaining the Predictions of Any Classifier. In Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 1135–1144. 2016.

[2] Christoph Molnar. Interpretable Machine Learning. Lulu.com, 2020.
