Paper Title
Knowledge Based Template Machine Translation In Low-Resource Setting
Paper Authors
Paper Abstract
Incorporating tagging into neural machine translation (NMT) systems has shown promising results in helping translate rare words such as named entities (NEs). However, translating NEs in low-resource settings remains a challenge. In this work, we investigate the effect of using tags and NE hypernyms from knowledge graphs (KGs) in a parallel corpus under different levels of resource conditions. We find that the tag-and-copy mechanism (tagging the NEs in the source sentence and copying them to the target sentence) improves translation in high-resource settings only. Introducing copying also has polarizing effects on the translation of different parts of speech (POS). Interestingly, we find that copy accuracy for hypernyms is consistently higher than that for entities. As a way of avoiding "hard" copying and of utilizing hypernyms to bootstrap rare entities, we introduce a "soft" tagging mechanism and find consistent improvement in both high-resource and low-resource settings.
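As a rough illustration of the tag-and-copy idea described in the abstract, the sketch below tags known entities in a source sentence with a hypernym and then lists the strings that a "hard" copy step would carry over unchanged. Everything specific here is an assumption made for illustration and is not taken from the paper: the `<ne type="...">` tag format, the toy `TOY_KG` dictionary standing in for a knowledge graph, and the helper names `tag_entities` and `copy_entities`.

```python
import re

# Hypothetical toy knowledge graph: entity surface form -> hypernym.
# A real system would query an actual KG; this dictionary is an assumption.
TOY_KG = {
    "Marie Curie": "physicist",
    "Warsaw": "city",
}


def tag_entities(sentence, kg=TOY_KG):
    """Wrap each known entity in <ne> tags annotated with its hypernym."""
    if not kg:
        return sentence
    # Try longer surface forms first so overlapping names resolve greedily.
    names = sorted(kg, key=len, reverse=True)
    pattern = re.compile("|".join(re.escape(name) for name in names))

    def wrap(match):
        entity = match.group(0)
        return f'<ne type="{kg[entity]}"> {entity} </ne>'

    return pattern.sub(wrap, sentence)


def copy_entities(tagged_source):
    """Return the entity strings that a 'hard' copy step would move to the target."""
    spans = re.findall(r"<ne[^>]*>(.*?)</ne>", tagged_source)
    return [span.strip() for span in spans]


tagged = tag_entities("Marie Curie was born in Warsaw .")
print(tagged)
print(copy_entities(tagged))
```

The sketch covers only the preprocessing side. The NMT model itself, and the "soft" tagging variant mentioned in the abstract, are outside its scope.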