Speech Emotion Recognition Based on Multi-feature and Multi-lingual Fusion

Paper Title

Speech Emotion Recognition Based on Multi-feature and Multi-lingual Fusion

Authors

Wang, Chunyi

Abstract

A speech emotion recognition algorithm based on multi-feature and multi-lingual fusion is proposed to address the low recognition accuracy caused by the lack of large speech datasets and the low robustness of acoustic features. First, handcrafted features and automatically learned deep features are extracted from existing Chinese and English emotional speech data. Then, the different feature types are fused within each language. Finally, the fused features of the two languages are fused again and used to train a classification model. A comparison of the fused and unfused features shows that fusion significantly improves the accuracy of speech emotion recognition. The proposed solution is evaluated on two Chinese corpora and two English corpora and gives more accurate predictions than the original solution. The study concludes that multi-feature and multi-lingual fusion can significantly improve speech emotion recognition accuracy when the dataset is small.
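The three steps in the abstract (extract features, fuse them per language, fuse across languages and train a classifier) can be illustrated with a minimal sketch. The abstract does not say how either fusion step is implemented, so everything here is an assumption: feature fusion is shown as plain concatenation, cross-language fusion as pooling the fused samples of both languages into one training set, and a nearest-centroid rule stands in for the unspecified classification model. The function names and the toy feature values are hypothetical and are not taken from the paper.

```python
from collections import defaultdict
from typing import Dict, List, Sequence, Tuple

Vector = List[float]


def fuse_features(handcrafted: Sequence[Sequence[float]],
                  deep: Sequence[Sequence[float]]) -> List[Vector]:
    """Assumed feature-level fusion: concatenate both vectors of each utterance."""
    if len(handcrafted) != len(deep):
        raise ValueError("both feature sets must cover the same utterances")
    return [list(h) + list(d) for h, d in zip(handcrafted, deep)]


def fuse_languages(zh: Tuple[List[Vector], List[str]],
                   en: Tuple[List[Vector], List[str]]
                   ) -> Tuple[List[Vector], List[str]]:
    """Assumed cross-language fusion: pool the fused samples of both languages."""
    (x_zh, y_zh), (x_en, y_en) = zh, en
    if x_zh and x_en and len(x_zh[0]) != len(x_en[0]):
        raise ValueError("fused vectors must share one dimensionality")
    return x_zh + x_en, y_zh + y_en


def fit_centroids(x: List[Vector], y: List[str]) -> Dict[str, Vector]:
    """Stand-in classifier: compute one mean vector per emotion label."""
    groups: Dict[str, List[Vector]] = defaultdict(list)
    for vec, label in zip(x, y):
        groups[label].append(vec)
    return {label: [sum(col) / len(vecs) for col in zip(*vecs)]
            for label, vecs in groups.items()}


def predict(centroids: Dict[str, Vector], vec: Sequence[float]) -> str:
    """Return the label whose centroid is nearest in squared Euclidean distance."""
    return min(
        centroids,
        key=lambda lab: sum((a - b) ** 2 for a, b in zip(centroids[lab], vec)),
    )


# Toy, made-up feature values (not data from the paper):
# two handcrafted dimensions and one deep dimension per utterance.
zh_fused = fuse_features([[0.9, 0.8], [0.1, 0.2]], [[1.0], [0.0]])
en_fused = fuse_features([[0.8, 0.9], [0.2, 0.1]], [[0.9], [0.1]])

x_all, y_all = fuse_languages((zh_fused, ["angry", "neutral"]),
                              (en_fused, ["angry", "neutral"]))
model = fit_centroids(x_all, y_all)
print(predict(model, [0.85, 0.85, 0.95]))
```

Pooling requires the fused vectors of both languages to have the same length, which is why fuse_languages checks dimensionality before merging. A real system would likely also normalize each feature block before concatenation so that no single feature type dominates the distance computation.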
