MEME: Generating RNN Model Explanations via Model Extraction

Paper Title

MEME: Generating RNN Model Explanations via Model Extraction

Authors

Dmitry Kazhdan, Botty Dimanov, Mateja Jamnik, Pietro Liò

Abstract

Recurrent Neural Networks (RNNs) have achieved remarkable performance on a range of tasks. A key step to further empowering RNN-based approaches is improving their explainability and interpretability. In this work we present MEME: a model extraction approach capable of approximating RNNs with interpretable models represented by human-understandable concepts and their interactions. We demonstrate how MEME can be applied to two multivariate, continuous data case studies: Room Occupation Prediction, and In-Hospital Mortality Prediction. Using these case-studies, we show how our extracted models can be used to interpret RNNs both locally and globally, by approximating RNN decision-making via interpretable concept interactions.
