Paper Title

Understanding and Improving Encoder Layer Fusion in Sequence-to-Sequence Learning

Authors

Xuebo Liu, Longyue Wang, Derek F. Wong, Liang Ding, Lidia S. Chao, Zhaopeng Tu

Abstract

Encoder layer fusion (EncoderFusion) is a technique that fuses all the encoder layers (instead of only the uppermost layer) for sequence-to-sequence (Seq2Seq) models, and it has proven effective on various NLP tasks. However, it is still not entirely clear why and when EncoderFusion should work. In this paper, our main contribution is to take a step further in understanding EncoderFusion. Many previous studies believe that the success of EncoderFusion comes from exploiting surface and syntactic information embedded in lower encoder layers. Unlike them, we find that the encoder embedding layer is more important than other intermediate encoder layers. In addition, the uppermost decoder layer consistently pays more attention to the encoder embedding layer across NLP tasks. Based on this observation, we propose a simple fusion method, SurfaceFusion, which fuses only the encoder embedding layer into the softmax layer. Experimental results show that SurfaceFusion outperforms EncoderFusion on several NLP benchmarks, including machine translation, text summarization, and grammatical error correction. It achieves state-of-the-art performance on the WMT16 Romanian-English and WMT14 English-French translation tasks. Extensive analyses reveal that SurfaceFusion learns more expressive bilingual word embeddings by building a closer relationship between relevant source and target embeddings. Source code is freely available at https://github.com/SunbowLiu/SurfaceFusion.
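The contrast drawn in the abstract can be illustrated with a rough NumPy sketch. It is not the authors' implementation (see the linked repository for that). Everything here is an assumption made for illustration: the function names `encoder_fusion` and `surface_fusion_logits`, the scalar-weight mixing of layers, the additive combination of logits, and the interpolation weight `lam`. The first function mixes every encoder layer into one representation. The second uses only the source embeddings, and only at the output (pre-softmax) step.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def encoder_fusion(layer_outputs, layer_scores):
    """Fuse all encoder layers with softmax-normalized scalar weights.

    layer_outputs: (num_layers, src_len, d_model), embedding layer included
    layer_scores:  (num_layers,) learnable scores (an assumed parameterization)
    """
    weights = softmax(layer_scores)
    # Weighted sum over the layer axis -> (src_len, d_model)
    return np.tensordot(weights, layer_outputs, axes=1)


def surface_fusion_logits(decoder_state, target_emb, src_emb, attn, lam=0.5):
    """Add a term built only from source embeddings to the output logits.

    decoder_state: (d_model,)        top decoder state for one target position
    target_emb:    (vocab, d_model)  output projection / target embeddings
    src_emb:       (src_len, d_model) encoder embedding layer output
    attn:          (src_len,)        attention weights over source positions
    lam:           hypothetical interpolation weight (an assumption)
    """
    model_logits = target_emb @ decoder_state      # (vocab,)
    surface_context = attn @ src_emb               # (d_model,)
    surface_logits = target_emb @ surface_context  # (vocab,)
    return model_logits + lam * surface_logits
```

Applying `softmax` to the result of `surface_fusion_logits` gives a distribution over the target vocabulary. Setting `lam=0.0` recovers the unfused model, which is a convenient sanity check.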
