Paper Title

Assessing Phrasal Representation and Composition in Transformers

Authors

Lang Yu, Allyson Ettinger

Abstract

Deep transformer models have pushed performance on NLP tasks to new limits, suggesting sophisticated treatment of complex linguistic inputs, such as phrases. However, we have limited understanding of how these models handle representation of phrases, and whether this reflects sophisticated composition of phrase meaning like that done by humans. In this paper, we present systematic analysis of phrasal representations in state-of-the-art pre-trained transformers. We use tests leveraging human judgments of phrase similarity and meaning shift, and compare results before and after control of word overlap, to tease apart lexical effects versus composition effects. We find that phrase representation in these models relies heavily on word content, with little evidence of nuanced composition. We also identify variations in phrase representation quality across models, layers, and representation types, and make corresponding recommendations for usage of representations from these models.
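The lexical-overlap confound the abstract describes can be illustrated with a minimal sketch (hypothetical; toy orthogonal word vectors stand in for a transformer's contextual hidden states, and real experiments would use layer outputs from a pretrained model): if a phrase representation is just a mean-pool of word vectors, two phrase pairs get very different similarity scores purely because one pair shares a word, regardless of composed meaning.

```python
import numpy as np

def mean_pool(token_vecs):
    # Phrase vector as the average of its token vectors (one of the
    # representation types the paper compares across models and layers).
    return np.mean(token_vecs, axis=0)

def cosine(a, b):
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# Toy orthogonal word vectors standing in for contextual hidden states.
e = np.eye(4)
heavy, light, traffic, rain = e

# Pair with one shared word vs. pair with no shared words.
sim_overlap = cosine(mean_pool([heavy, traffic]), mean_pool([heavy, rain]))
sim_no_overlap = cosine(mean_pool([heavy, traffic]), mean_pool([light, rain]))

print(sim_overlap, sim_no_overlap)  # 0.5 vs. 0.0
```

With purely lexical (non-compositional) representations, the overlapping pair scores strictly higher; this is why the paper compares results before and after controlling word overlap to separate lexical effects from composition effects.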
