Paper Title

Norm-Based Curriculum Learning for Neural Machine Translation

Paper Authors

Xuebo Liu, Houtim Lai, Derek F. Wong, Lidia S. Chao

Paper Abstract

A neural machine translation (NMT) system is expensive to train, especially with high-resource settings. As NMT architectures become deeper and wider, this issue gets worse and worse. In this paper, we aim to improve the efficiency of training an NMT by introducing a novel norm-based curriculum learning method. We use the norm (aka length or module) of a word embedding as a measure of 1) the difficulty of the sentence, 2) the competence of the model, and 3) the weight of the sentence. The norm-based sentence difficulty combines the advantages of both linguistically motivated and model-based sentence difficulties. It is easy to determine and contains learning-dependent features. The norm-based model competence makes NMT learn the curriculum in a fully automated way, while the norm-based sentence weight further enhances the learning of the vector representation of the NMT. Experimental results for the WMT'14 English-German and WMT'17 Chinese-English translation tasks demonstrate that the proposed method outperforms strong baselines in terms of BLEU score (+1.17/+1.56) and training speedup (2.22x/3.33x).
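The norm-based sentence difficulty described in the abstract can be sketched as follows. This is a minimal illustration only: the toy embedding table and the choice to sum the per-word L2 norms over a sentence are assumptions for demonstration, not the paper's exact formulation (the method uses the NMT model's learned embeddings).

```python
import math

def sentence_difficulty(sentence, embeddings):
    """Score a sentence by the L2 norms (vector lengths) of its word embeddings.

    Intuition from the paper: embedding norms carry learning-dependent
    information, so larger norms tend to indicate harder sentences.
    Aggregation by summation is an illustrative assumption.
    """
    norms = [
        math.sqrt(sum(x * x for x in embeddings[word]))
        for word in sentence
        if word in embeddings  # skip out-of-vocabulary words
    ]
    return float(sum(norms))

# Hypothetical toy embeddings, chosen so the norms are easy to verify.
toy_emb = {"a": [3.0, 4.0], "b": [0.0, 0.0]}
print(sentence_difficulty(["a", "b"], toy_emb))  # → 5.0
```

In a curriculum-learning loop, sentences would be sorted by such a score so that training starts from low-norm (easier) sentences and gradually admits harder ones as the model's competence grows.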
