Paper Title


Can Pre-trained Language Models Interpret Similes as Smart as Human?

Authors

Qianyu He, Sijie Cheng, Zhixu Li, Rui Xie, Yanghua Xiao

Abstract


Simile interpretation is a crucial task in natural language processing. Nowadays, pre-trained language models (PLMs) have achieved state-of-the-art performance on many tasks. However, it remains under-explored whether PLMs can interpret similes or not. In this paper, we investigate the ability of PLMs in simile interpretation by designing a novel task named Simile Property Probing, i.e., to let the PLMs infer the shared properties of similes. We construct our simile property probing datasets from both general textual corpora and human-designed questions, containing 1,633 examples covering seven main categories. Our empirical study based on the constructed datasets shows that PLMs can infer similes' shared properties while still underperforming humans. To bridge the gap with human performance, we additionally design a knowledge-enhanced training objective by incorporating the simile knowledge into PLMs via knowledge embedding methods. Our method results in a gain of 8.58% in the probing task and 1.37% in the downstream task of sentiment classification. The datasets and code are publicly available at https://github.com/Abbey4799/PLMs-Interpret-Simile.
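The Simile Property Probing task described in the abstract can be illustrated with a minimal sketch: given a simile whose shared property is masked out, the model must pick that property from a set of candidates. The example sentence, candidate options, and the keyword-association scorer below are all hypothetical placeholders; in the paper, a pre-trained language model's masked-token probabilities would play the scorer's role.

```python
# Sketch of the Simile Property Probing format: a multiple-choice
# question where the simile's shared property is replaced by [MASK].
# The scorer is a toy association heuristic, standing in for a PLM.

def score_candidate(sentence: str, candidate: str, hints: dict) -> float:
    """Toy scorer: reward candidates associated with words in the sentence."""
    words = [w.strip(".,?!").lower() for w in sentence.split()]
    return sum(1.0 for w in words if candidate in hints.get(w, []))

def probe(sentence: str, candidates: list) -> str:
    """Return the highest-scoring candidate property for the masked simile."""
    # Hypothetical association table; a real PLM learns such
    # vehicle-property associations from large text corpora.
    hints = {
        "snow": ["white", "cold"],
        "rose": ["red", "beautiful"],
    }
    scores = {c: score_candidate(sentence, c, hints) for c in candidates}
    return max(scores, key=scores.get)

# Example probing question in the multiple-choice style:
question = "Her skin was as [MASK] as snow."
options = ["white", "loud", "fast", "sour"]
print(probe(question, options))  # -> "white" under the toy hints
```

The paper's knowledge-enhanced training objective goes further, injecting simile knowledge into the PLM itself via knowledge embedding methods rather than relying on a fixed lookup like this one.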
