Optimizing Bilingual Neural Transducer with Synthetic Code-switching Text Generation

Paper title

Optimizing Bilingual Neural Transducer with Synthetic Code-switching Text Generation

Authors

Nguyen, Thien; Tran, Nathalie; Deng, Liuhui; da Silva, Thiago Fraga; Radzihovsky, Matthew; Hsiao, Roger; Mason, Henry; Braun, Stefan; McDermott, Erik; Can, Dogan; Swietojanski, Pawel; Verwimp, Lyan; Oyman, Sibel; Arvizo, Tresi; Silovsky, Honza; Ghoshal, Arnab; Martel, Mathieu; Ambati, Bharat Ram; Ali, Mohamed

Abstract

Code-switching describes the practice of using more than one language in the same sentence. In this study, we investigate how to optimize a neural transducer based bilingual automatic speech recognition (ASR) model for code-switching speech. Focusing on the scenario where the ASR model is trained without supervised code-switching data, we found that semi-supervised training and synthetic code-switched data can improve the bilingual ASR system on code-switching speech. We analyze how each of the neural transducer's encoders contributes towards code-switching performance by measuring encoder-specific recall values, and evaluate our English/Mandarin system on the ASCEND data set. Our final system achieves 25% mixed error rate (MER) on the ASCEND English/Mandarin code-switching test set -- reducing the MER by 2.1% absolute compared to the previous literature -- while maintaining good accuracy on the monolingual test sets.
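The abstract reports results as mixed error rate (MER) but does not spell out how it is computed. The sketch below shows one common convention for English/Mandarin evaluation, which is an assumption here and not taken from the paper: each Mandarin character and each English word counts as one token, and MER is the total token-level edit distance divided by the number of reference tokens. The function names and the tokenization regex are hypothetical, chosen for illustration only.

```python
import re

# Assumed tokenization (not specified in the abstract): each CJK character
# is one token; each run of Latin letters, digits, or apostrophes is one token.
_TOKEN = re.compile(r"[\u4e00-\u9fff]|[A-Za-z0-9']+")


def mixed_tokens(text):
    """Split mixed English/Mandarin text into words and single characters."""
    return _TOKEN.findall(text.lower())


def edit_distance(ref, hyp):
    """Levenshtein distance between two token lists, computed row by row."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion of a reference token
                            curr[j - 1] + 1,      # insertion of a hypothesis token
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def mixed_error_rate(refs, hyps):
    """Total edit distance over all utterances divided by reference token count."""
    errors = sum(edit_distance(mixed_tokens(r), mixed_tokens(h))
                 for r, h in zip(refs, hyps))
    total = sum(len(mixed_tokens(r)) for r in refs)
    return errors / total
```

For example, with the reference "我想去 Starbucks 买 coffee" (6 tokens) and the hypothesis "我想去 Starbucks 买咖啡", one substitution and one insertion give an MER of 2/6 under these assumptions.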

Paper title