A Modified Perturbed Sampling Method for Local Interpretable Model-agnostic Explanation

Paper Title

A Modified Perturbed Sampling Method for Local Interpretable Model-agnostic Explanation

Authors

Sheng Shi, Xinfeng Zhang, Wei Fan

Abstract

Explainability is a gateway between artificial intelligence and society, because today's popular deep learning models are generally weak at explaining their reasoning process and prediction results. Local Interpretable Model-agnostic Explanation (LIME) is a recent technique that faithfully explains the predictions of any classifier by learning an interpretable model locally around the prediction. However, the sampling operation in the standard implementation of LIME is flawed: perturbed samples are generated from a uniform distribution, ignoring the complex correlations between features. This paper proposes a novel Modified Perturbed Sampling operation for LIME (MPS-LIME), which is formalized as a clique set construction problem. In image classification, MPS-LIME converts the superpixel image into an undirected graph. Various experiments show that the MPS-LIME explanation of the black-box model achieves much better performance in terms of understandability, fidelity, and efficiency.
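The abstract only names the idea (superpixels become nodes of an undirected graph, and perturbed samples come from cliques rather than from a uniform draw), so the following is a minimal sketch under stated assumptions, not the authors' implementation. The 4-neighbour adjacency rule, the `max_size` cap, the toy `labels` grid, and the function names `superpixel_graph` and `clique_samples` are all hypothetical choices made for illustration.

```python
from itertools import combinations


def superpixel_graph(labels):
    """Build an undirected adjacency map from a 2D grid of superpixel ids.

    Assumption: two superpixels are linked when any of their pixels
    are 4-neighbours (share an edge in the pixel grid).
    """
    adj = {}
    rows, cols = len(labels), len(labels[0])
    for r in range(rows):
        for c in range(cols):
            a = labels[r][c]
            adj.setdefault(a, set())
            # Looking right and down is enough to visit every pixel pair once.
            for dr, dc in ((0, 1), (1, 0)):
                rr, cc = r + dr, c + dc
                if rr < rows and cc < cols:
                    b = labels[rr][cc]
                    if a != b:
                        adj.setdefault(b, set())
                        adj[a].add(b)
                        adj[b].add(a)
    return adj


def clique_samples(adj, max_size=3):
    """Enumerate every clique with at most max_size nodes.

    Each clique becomes one perturbed sample: a binary mask over the
    sorted superpixel ids, with 1 meaning "superpixel kept".
    Assumption: brute-force enumeration, suitable only for small graphs.
    """
    nodes = sorted(adj)
    index = {n: i for i, n in enumerate(nodes)}
    samples = []
    for size in range(1, max_size + 1):
        for group in combinations(nodes, size):
            # A group is a clique when every pair of its nodes is adjacent.
            if all(v in adj[u] for u, v in combinations(group, 2)):
                mask = [0] * len(nodes)
                for n in group:
                    mask[index[n]] = 1
                samples.append(mask)
    return samples


# Toy 3x3 "image" already segmented into four superpixels (ids 0..3).
labels = [
    [0, 0, 1],
    [2, 2, 1],
    [2, 3, 3],
]
adj = superpixel_graph(labels)
samples = clique_samples(adj)
print(len(samples), "clique-based samples for", len(adj), "superpixels")
```

In this toy grid, superpixels 0 and 3 never touch, so no sample keeps both of them while dropping everything else, whereas uniform sampling over the four on/off switches could produce that mask. This is one way to read the abstract's point that uniform perturbation ignores correlations between features.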
