Paper title
Evaluation of Automated Speech Recognition Systems for Conversational Speech: A Linguistic Perspective
Paper authors
Paper abstract
Automatic speech recognition (ASR) meets more informal and free-form input data as voice user interfaces and conversational agents, such as the voice assistants Alexa and Google Home, gain popularity. Conversational speech is both the most difficult and the most environmentally relevant kind of data for speech recognition. In this paper, we take a linguistic perspective and use French as a case study, working toward the disambiguation of French homophones. Our contribution aims to provide more insight into human speech transcription accuracy under conditions that reproduce those of state-of-the-art ASR systems, although in a much more focused setting. We investigate a case study involving the most common errors encountered in the automatic transcription of French.