Exploring linguistic feature and model combination for speech recognition based automatic AD detection

Paper title

Exploring linguistic feature and model combination for speech recognition based automatic AD detection

Authors

Wang, Yi; Wang, Tianzi; Ye, Zi; Meng, Lingwei; Hu, Shoukang; Wu, Xixin; Liu, Xunying; Meng, Helen

Abstract

Early diagnosis of Alzheimer's disease (AD) is crucial for facilitating preventive care and delaying progression. Speech-based automatic AD screening systems provide a non-intrusive and more scalable alternative to other clinical screening techniques. The scarcity of such specialist data leads to uncertainty in both model selection and feature learning when developing these systems. To this end, this paper investigates the use of feature and model combination approaches to improve the robustness of domain fine-tuning of BERT and RoBERTa pre-trained text encoders on limited data. The resulting embedding features are then fed into an ensemble of backend classifiers, which produces the final AD detection decision via majority voting. Experiments conducted on the ADReSS20 Challenge dataset suggest that consistent performance improvements were obtained using model and feature combination in system development. State-of-the-art AD detection accuracies of 91.67 percent and 93.75 percent were obtained using manual and ASR speech transcripts respectively on the ADReSS20 test set, which consists of 48 elderly speakers.
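The majority-voting step mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `majority_vote` and `accuracy`, the label convention (1 = AD, 0 = non-AD), and the toy predictions are all assumptions made for illustration, and the backend classifiers themselves are left out.

```python
from collections import Counter
from typing import List, Sequence


def majority_vote(predictions: Sequence[Sequence[int]]) -> List[int]:
    """Combine per-classifier labels into one decision per speaker.

    predictions[k][i] is the label (assumed 1 = AD, 0 = non-AD) that
    hypothetical classifier k assigns to speaker i.
    """
    n_speakers = len(predictions[0])
    decisions = []
    for i in range(n_speakers):
        votes = Counter(p[i] for p in predictions)
        # Take the label with the most votes; an odd number of binary
        # classifiers guarantees there is no tie.
        decisions.append(votes.most_common(1)[0][0])
    return decisions


def accuracy(decisions: Sequence[int], labels: Sequence[int]) -> float:
    """Fraction of speakers whose decision matches the reference label."""
    return sum(d == y for d, y in zip(decisions, labels)) / len(labels)


# Toy example: three classifiers, four speakers (made-up values).
toy_predictions = [
    [1, 0, 1, 0],
    [1, 1, 0, 0],
    [0, 0, 1, 0],
]
toy_decisions = majority_vote(toy_predictions)
toy_accuracy = accuracy(toy_decisions, [1, 0, 1, 1])
```

On a 48-speaker test set, the reported accuracies of 91.67 percent and 93.75 percent are consistent with 44 and 45 correctly classified speakers respectively (44/48 ≈ 0.9167, 45/48 = 0.9375).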

