Title
The ASRU 2019 Mandarin-English Code-Switching Speech Recognition Challenge: Open Datasets, Tracks, Methods and Results
Authors
Abstract
Code-switching (CS) is a common phenomenon, and recognizing CS speech is challenging. However, CS speech data is scarce, and there is no common testbed in the relevant research. This paper describes the design and main outcomes of the ASRU 2019 Mandarin-English code-switching speech recognition challenge, which aims to improve ASR performance in Mandarin-English code-switching scenarios. 500 hours of Mandarin speech data and 240 hours of Mandarin-English intra-sentential CS data were released to the participants. Three tracks were set up to advance the acoustic model (AM) and language model (LM) components of traditional DNN-HMM ASR systems and to explore the performance of end-to-end (E2E) models. The paper then presents an overview of the results and system performance in the three tracks. The results show that traditional ASR systems benefit from pronunciation lexicons, CS text generation, and data augmentation. In the E2E track, however, the results highlight the importance of language identification, a well-designed set of modeling units, and SpecAugment. Further details of model training and method comparison are also discussed.
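To make the notion of "modeling units" for Mandarin-English speech more concrete, the sketch below splits a code-switched transcript into one unit per Mandarin character and one unit per English word. This is only an illustrative assumption about what such a unit set could look like, not the inventory used by the challenge or by any participant; the function name `split_units` and the regular expression are hypothetical.

```python
import re

# Assumed unit scheme (illustrative only):
#   - each CJK Unified Ideograph is its own unit
#   - each English word (letters, optional internal apostrophes) is one unit
_UNIT_PATTERN = re.compile(r"[\u4e00-\u9fff]|[A-Za-z]+(?:'[A-Za-z]+)*")


def split_units(transcript: str) -> list[str]:
    """Split a Mandarin-English transcript into characters and lowercased words."""
    # findall returns whole matches because the only group is non-capturing.
    # lower() leaves Chinese characters unchanged and normalizes English case.
    return [unit.lower() for unit in _UNIT_PATTERN.findall(transcript)]


if __name__ == "__main__":
    print(split_units("我想听Taylor Swift的歌"))
```

A mixed character/word split like this is also a natural basis for scoring code-switched output, since Mandarin has no whitespace word boundaries while English does. Real systems may instead use subword units for English, which this sketch does not attempt.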