Paper Title

MEME: Generating RNN Model Explanations via Model Extraction

Authors

Dmitry Kazhdan, Botty Dimanov, Mateja Jamnik, Pietro Liò

Abstract

Recurrent Neural Networks (RNNs) have achieved remarkable performance on a range of tasks. A key step to further empowering RNN-based approaches is improving their explainability and interpretability. In this work we present MEME: a model extraction approach capable of approximating RNNs with interpretable models represented by human-understandable concepts and their interactions. We demonstrate how MEME can be applied to two multivariate, continuous data case studies: Room Occupation Prediction, and In-Hospital Mortality Prediction. Using these case-studies, we show how our extracted models can be used to interpret RNNs both locally and globally, by approximating RNN decision-making via interpretable concept interactions.
