Paper Title
Focus on the Target's Vocabulary: Masked Label Smoothing for Machine Translation
Paper Authors
Paper Abstract
Label smoothing and vocabulary sharing are two widely used techniques in neural machine translation models. However, we argue that naively applying both techniques together can be conflicting and can even lead to sub-optimal performance. When allocating smoothed probability, original label smoothing treats source-side words that would never appear in the target language equally to real target-side words, which can bias the translation model. To address this issue, we propose Masked Label Smoothing (MLS), a new mechanism that masks the soft-label probability of source-side words to zero. Simple yet effective, MLS manages to better integrate label smoothing with vocabulary sharing. Our extensive experiments show that MLS consistently yields improvements over original label smoothing on different datasets, including bilingual and multilingual translation, in terms of both translation quality and model calibration. Our code is released at https://github.com/PKUnlp-icler/MLS
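The core idea above can be sketched in code: where original label smoothing spreads the smoothing mass ε uniformly over the whole shared vocabulary, MLS restricts that mass to tokens that can actually occur in the target language. The following is a minimal PyTorch sketch under that reading of the abstract; the function name and the precomputed `target_mask` are illustrative assumptions, not the authors' released implementation.

```python
import torch


def masked_label_smoothing_targets(gold, vocab_size, target_mask, eps=0.1):
    """Build soft-label distributions in the spirit of MLS (a sketch).

    gold:        (batch,) LongTensor of gold target token ids
    target_mask: (vocab_size,) bool tensor, True for tokens that can
                 appear in the target language (illustrative assumption:
                 such a mask is precomputed from the target-side corpus)
    eps:         total smoothing mass
    """
    # Number of target-language tokens that share the smoothing mass.
    n_target = int(target_mask.sum())

    # Spread eps uniformly over target-language tokens only;
    # source-only tokens keep probability exactly zero (the "mask").
    dist = torch.zeros(gold.size(0), vocab_size)
    dist[:, target_mask] = eps / n_target

    # The gold token receives the remaining confidence. scatter_
    # overwrites its uniform share, so each row still sums to 1.
    dist.scatter_(1, gold.unsqueeze(1), 1.0 - eps + eps / n_target)
    return dist
```

These soft targets can then be fed to a standard KL-divergence or cross-entropy-style loss against the model's log-probabilities, exactly as with ordinary label smoothing; only the construction of the target distribution changes.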