Paper Title
Alibaba-Translate China's Submission for WMT 2022 Quality Estimation Shared Task
Paper Authors
Paper Abstract
In this paper, we present our submission, named UniTE (Unified Translation Evaluation), to the sentence-level MQM benchmark of the WMT 2022 Quality Estimation Shared Task. Specifically, our systems employ the UniTE framework, which combines three types of input formats during training with a pre-trained language model. First, we use pseudo-labeled data examples in a continued pre-training phase. Notably, to reduce the gap between pre-training and fine-tuning, we apply data pruning and a ranking-based score normalization strategy. For the fine-tuning phase, we use both Direct Assessment (DA) and Multidimensional Quality Metrics (MQM) data from past years' WMT competitions. Finally, we collect the source-only evaluation results and ensemble the predictions generated by two UniTE models, whose backbones are XLM-R and InfoXLM, respectively. Results show that our models reach 1st overall ranking in the Multilingual and English-Russian settings, and 2nd overall ranking in the English-German and Chinese-English settings, showing relatively strong performance in this year's quality estimation competition.
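The abstract mentions a ranking-based score normalization (to align pseudo-labels with human scores) and an ensemble of two UniTE models. The snippet below is a minimal Python sketch of both ideas under assumptions not stated in the abstract: raw scores are replaced by their rank percentiles, and the final prediction is a plain average of the XLM-R- and InfoXLM-backed outputs. The exact normalization and ensemble weights used by the authors may differ, and all function names here are illustrative, not taken from the paper's code.

```python
import numpy as np

def rank_normalize(scores):
    """Map raw segment-level scores to rank percentiles in [0, 1].

    Hypothetical illustration of a ranking-based score normalization:
    only the relative ordering of the raw scores is kept, which reduces
    the distribution gap between pseudo-labels used for continued
    pre-training and the human DA/MQM scores used for fine-tuning.
    """
    scores = np.asarray(scores, dtype=float)
    if len(scores) < 2:
        return np.zeros_like(scores)
    # argsort applied twice yields each element's rank (0 = lowest score)
    ranks = scores.argsort().argsort().astype(float)
    return ranks / (len(scores) - 1)

def ensemble(pred_xlmr, pred_infoxlm):
    """Average the sentence-level predictions of the two UniTE models."""
    return (np.asarray(pred_xlmr) + np.asarray(pred_infoxlm)) / 2.0

# Toy usage with made-up scores.
raw = [72.0, 55.5, 88.0, 61.0]
print(rank_normalize(raw))                    # [0.667 0.    1.    0.333]
print(ensemble([0.61, 0.42], [0.57, 0.48]))   # [0.59 0.45]
```

A simple unweighted average is used here only to make the ensemble step concrete; the submission itself may combine the two models' predictions differently.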