Paper Title

Improved End-to-End Dysarthric Speech Recognition via Meta-learning Based Model Re-initialization

Paper Authors

Disong Wang, Jianwei Yu, Xixin Wu, Lifa Sun, Xunying Liu, Helen Meng

Paper Abstract

Dysarthric speech recognition is a challenging task, as dysarthric data is limited and its acoustics deviate significantly from normal speech. Model-based speaker adaptation is a promising approach: the limited dysarthric speech is used to fine-tune a base model pre-trained on large amounts of normal speech, yielding speaker-dependent models. However, statistical distribution mismatches between the normal and dysarthric speech data limit the adaptation performance of the base model. To address this problem, we propose to re-initialize the base model via meta-learning to obtain a better model initialization. Specifically, we focus on end-to-end models and extend the model-agnostic meta-learning (MAML) and Reptile algorithms to meta-update the base model by repeatedly simulating adaptation to different dysarthric speakers. As a result, the re-initialized model acquires dysarthric speech knowledge and learns how to perform fast adaptation to unseen dysarthric speakers with improved performance. Experimental results on the UASpeech dataset show that the best model with the proposed methods achieves 54.2% and 7.6% relative word error rate reductions compared with the base model without fine-tuning and the model directly fine-tuned from the base model, respectively, and it is comparable with the state-of-the-art hybrid DNN-HMM model.
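To make the meta re-initialization step concrete, the sketch below illustrates a Reptile-style meta-update loop of the kind the abstract describes: the base model is repeatedly adapted to individual dysarthric speakers, and its parameters are nudged toward each adapted solution. This is a minimal illustration, not the authors' implementation; it assumes a PyTorch end-to-end ASR model, and the names `speaker_loaders`, `loss_fn`, and the hyperparameter values are hypothetical placeholders.

```python
# Minimal Reptile-style meta re-initialization sketch (not the authors' code).
# Assumes `base_model` is a PyTorch end-to-end ASR model and `speaker_loaders`
# is a list of DataLoaders, one per dysarthric training speaker.
import copy
import random

import torch


def reptile_reinitialize(base_model, speaker_loaders, loss_fn,
                         meta_steps=1000, inner_steps=5,
                         inner_lr=1e-3, meta_lr=0.1):
    """Simulate adaptation to individual dysarthric speakers and move the
    base parameters toward each adapted solution (Reptile outer update)."""
    for _ in range(meta_steps):
        # Sample one dysarthric speaker's adaptation data.
        loader = random.choice(speaker_loaders)

        # Inner loop: clone the base model and fine-tune it on this speaker.
        adapted = copy.deepcopy(base_model)
        inner_opt = torch.optim.SGD(adapted.parameters(), lr=inner_lr)
        for step, (feats, targets) in enumerate(loader):
            if step >= inner_steps:
                break
            inner_opt.zero_grad()
            loss_fn(adapted(feats), targets).backward()
            inner_opt.step()

        # Outer (meta) update: theta <- theta + meta_lr * (theta_adapted - theta).
        with torch.no_grad():
            for p_base, p_adapt in zip(base_model.parameters(),
                                       adapted.parameters()):
                p_base.add_(meta_lr * (p_adapt - p_base))

    # The re-initialized model is then fine-tuned on each unseen test speaker.
    return base_model
```

The MAML variant mentioned in the abstract differs mainly in the outer step: instead of interpolating parameters, it backpropagates the post-adaptation loss through the inner updates to obtain a meta-gradient for the base model.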
