Paper Title
Are Shortest Rationales the Best Explanations for Human Understanding?
Paper Authors
Paper Abstract
Existing self-explaining models typically favor extracting the shortest possible rationales (snippets of input text "responsible for" the corresponding output) to explain model predictions, on the assumption that shorter rationales are more intuitive to humans. However, this assumption has yet to be validated. Is the shortest rationale indeed the most human-understandable? To answer this question, we design a self-explaining model, LimitedInk, which allows users to extract rationales at any target length. Compared to existing baselines, LimitedInk achieves comparable end-task performance and human-annotated rationale agreement, making it a suitable representative of recent self-explaining models. Using LimitedInk, we conduct a user study on the impact of rationale length, in which we ask human judges to predict the sentiment label of a document based only on LimitedInk-generated rationales of varying lengths. We show that rationales that are too short do not help humans predict labels any better than randomly masked text, suggesting the need for more careful design of rationales for human understanding.
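To make the length-controlled extraction concrete, below is a minimal sketch of the general idea: keep the top-k highest-importance tokens as the rationale and mask everything else, with k set by a user-chosen target length. The function name extract_rationale, the importance scores, and the masking format are illustrative assumptions for this sketch, not the authors' LimitedInk implementation (which learns the selection within the model rather than post-hoc).

```python
# Minimal sketch of length-controlled rationale extraction (illustrative only).
# Assumes per-token importance scores are already available from some
# underlying self-explaining model.

from typing import List, Tuple

def extract_rationale(tokens: List[str],
                      scores: List[float],
                      target_ratio: float = 0.2,
                      mask_token: str = "[MASK]") -> Tuple[List[str], List[int]]:
    """Return the masked document and the indices kept as the rationale.

    target_ratio controls rationale length as a fraction of the document,
    mirroring the user-specified target length described in the abstract.
    """
    k = max(1, int(len(tokens) * target_ratio))
    # Indices of the k highest-scoring tokens, restored to document order.
    kept = sorted(sorted(range(len(tokens)), key=lambda i: scores[i], reverse=True)[:k])
    kept_set = set(kept)
    # A human judge in the user study would see only the unmasked tokens.
    masked = [tok if i in kept_set else mask_token for i, tok in enumerate(tokens)]
    return masked, kept

# Hypothetical example document and scores:
tokens = "the plot was dull but the acting was truly wonderful".split()
scores = [0.1, 0.2, 0.1, 0.6, 0.3, 0.1, 0.7, 0.2, 0.5, 0.9]
masked, kept = extract_rationale(tokens, scores, target_ratio=0.3)
print(" ".join(masked))
# -> [MASK] [MASK] [MASK] dull [MASK] [MASK] acting [MASK] [MASK] wonderful
```

Under this framing, the paper's random-masking baseline corresponds to choosing the kept indices uniformly at random instead of by score, which is what a too-short learned rationale failed to beat in the user study.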