Paper Title

A Token-level Contrastive Framework for Sign Language Translation

Authors

Biao Fu, Peigen Ye, Liang Zhang, Pei Yu, Cong Hu, Yidong Chen, Xiaodong Shi

Abstract

Sign Language Translation (SLT) is a promising technology to bridge the communication gap between deaf and hearing people. Recently, researchers have adopted Neural Machine Translation (NMT) methods, which usually require a large-scale corpus for training, to achieve SLT. However, publicly available SLT corpora are very limited, which causes the collapse of token representations and inaccuracy in the generated tokens. To alleviate this issue, we propose ConSLT, a novel token-level Contrastive learning framework for Sign Language Translation, which learns effective token representations by incorporating token-level contrastive learning into the SLT decoding process. Concretely, ConSLT treats each token and its counterpart generated by different dropout masks as a positive pair during decoding, and then randomly samples K tokens from the vocabulary that are not in the current sentence to construct negative examples. We conduct comprehensive experiments on two benchmarks (PHOENIX14T and CSL-Daily) in both end-to-end and cascaded settings. The experimental results demonstrate that ConSLT achieves better translation quality than strong baselines.
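The abstract's core idea (dropout-based positive pairs per token, plus K vocabulary negatives drawn from outside the current sentence) can be illustrated with a minimal NumPy sketch of an InfoNCE-style loss. This is not the authors' implementation; the function name, cosine similarity, temperature, and tensor shapes are assumptions made for illustration.

```python
import numpy as np

def token_contrastive_loss(h1, h2, vocab_emb, sent_ids, k=4, tau=0.1, rng=None):
    """Illustrative token-level contrastive loss (not the paper's code).

    h1, h2:     (T, d) decoder hidden states for the same T tokens under
                two different dropout masks (the positive pairs).
    vocab_emb:  (V, d) vocabulary embedding table used to build negatives.
    sent_ids:   token ids appearing in the current sentence (excluded
                from negative sampling, as described in the abstract).
    k:          number of sampled negative tokens.
    tau:        softmax temperature (assumed hyperparameter).
    """
    rng = np.random.default_rng(0) if rng is None else rng

    # Sample k negative token ids from the vocabulary, excluding the
    # ids that occur in the current sentence.
    candidates = np.setdiff1d(np.arange(vocab_emb.shape[0]), np.asarray(sent_ids))
    neg = vocab_emb[rng.choice(candidates, size=k, replace=False)]  # (k, d)

    def unit(x):
        return x / np.linalg.norm(x, axis=-1, keepdims=True)

    h1n, h2n, negn = unit(h1), unit(h2), unit(neg)
    pos_sim = np.sum(h1n * h2n, axis=-1)      # (T,)   cosine with positive
    neg_sim = h1n @ negn.T                    # (T, k) cosine with negatives

    # InfoNCE: positive logit at column 0, negatives after it.
    logits = np.concatenate([pos_sim[:, None], neg_sim], axis=1) / tau
    log_z = np.log(np.exp(logits).sum(axis=1))
    return float(np.mean(log_z - logits[:, 0]))
```

In a real system the two hidden states would come from two forward passes of the decoder with dropout enabled, and this loss would be added to the usual cross-entropy translation objective.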
