Paper Title

WinoWhy: A Deep Diagnosis of Essential Commonsense Knowledge for Answering Winograd Schema Challenge

Paper Authors

Hongming Zhang, Xinran Zhao, Yangqiu Song

Paper Abstract

In this paper, we present the first comprehensive categorization of essential commonsense knowledge for answering the Winograd Schema Challenge (WSC). For each question, we invite annotators to first provide reasons for making the correct decision and then categorize them into six major knowledge categories. By doing so, we better understand the limitations of existing methods (i.e., what kind of knowledge cannot be effectively represented or inferred with existing methods) and shed some light on the commonsense knowledge we will need to acquire in the future for better commonsense reasoning. Moreover, to investigate whether current WSC models truly understand commonsense or simply solve the WSC questions based on statistical biases in the dataset, we leverage the collected reasons to develop a new task called WinoWhy, which requires models to distinguish plausible reasons from very similar but wrong reasons for all WSC questions. Experimental results show that even though pre-trained language representation models have achieved promising progress on the original WSC dataset, they still struggle on WinoWhy. Further experiments show that although supervised models can achieve better performance, their performance can be sensitive to the dataset distribution. WinoWhy and all code are available at: https://github.com/HKUST-KnowComp/WinoWhy.
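To make the plausibility-discrimination setup concrete, below is a minimal, illustrative sketch of one possible unsupervised baseline: score each candidate reason by the perplexity a pre-trained language model (here GPT-2 via the Hugging Face transformers library) assigns to the sentence paired with that reason, and take the lower-perplexity candidate as the plausible one. This is not the authors' implementation; the example sentence and reason strings are hypothetical, and the actual WinoWhy data format is defined in the linked repository.

```python
# Illustrative sketch (not the authors' method): rank candidate reasons
# for a WSC-style sentence by GPT-2 perplexity; lower perplexity is
# treated as more plausible. Example strings below are hypothetical.
import torch
from transformers import GPT2LMHeadModel, GPT2TokenizerFast

tokenizer = GPT2TokenizerFast.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")
model.eval()

def perplexity(text: str) -> float:
    """Average-token perplexity of `text` under GPT-2."""
    enc = tokenizer(text, return_tensors="pt")
    with torch.no_grad():
        out = model(**enc, labels=enc["input_ids"])
    return torch.exp(out.loss).item()

sentence = "The trophy doesn't fit into the suitcase because it is too small."
reasons = {
    "plausible": "the suitcase is too small to hold the trophy",
    "implausible": "the trophy is too small to hold the suitcase",
}
scores = {
    name: perplexity(f"{sentence} This is because {reason}.")
    for name, reason in reasons.items()
}
prediction = min(scores, key=scores.get)  # lower perplexity -> more plausible
print(scores, "->", prediction)
```

As the abstract notes, scoring heuristics like this inherit the statistical biases of the underlying language model, which is exactly what WinoWhy is designed to probe.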
