Paper Title

CorrectSpeech: A Fully Automated System for Speech Correction and Accent Reduction

Authors

Tan, Daxin; Deng, Liqun; Zheng, Nianzu; Yeung, Yu Ting; Jiang, Xin; Chen, Xiao; Lee, Tan

Abstract

This study proposes a fully automated system for speech correction and accent reduction. Consider an application scenario in which a recorded speech audio contains errors, e.g., inappropriate words or mispronunciations, that need to be corrected. The proposed system, named CorrectSpeech, performs the correction in three steps: recognizing the recorded speech and converting it into a time-stamped symbol sequence, aligning the recognized symbol sequence with the target text to determine the locations and types of the required edit operations, and generating the corrected speech. Experiments show that the quality and naturalness of the corrected speech depend on the performance of the speech recognition and alignment modules, as well as on the granularity of the editing operations. The proposed system is evaluated on two corpora: a manually perturbed version of VCTK and L2-ARCTIC. The results demonstrate that our system is able to correct mispronunciations and reduce accent in speech recordings. Audio samples are available online for demonstration at https://daxintan-cuhk.github.io/CorrectSpeech/ .
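The second step, aligning the recognized symbol sequence with the target text to find where edits are needed and of what type, can be illustrated with a standard edit-distance (Levenshtein) alignment. The abstract does not say which alignment algorithm CorrectSpeech uses, so the sketch below is an assumption for illustration only. The function name `align_symbols` and the operation labels are hypothetical, and the time stamps attached to the recognized symbols are left out for brevity.

```python
from typing import List, Tuple

# (operation, index into recognized, index into target); labels are hypothetical
EditOp = Tuple[str, int, int]


def align_symbols(recognized: List[str], target: List[str]) -> List[EditOp]:
    """Align two symbol sequences with unit-cost Levenshtein distance.

    This is an illustrative stand-in, not the alignment method of the paper.
    """
    n, m = len(recognized), len(target)
    # dist[i][j] = minimum number of edits turning recognized[:i] into target[:j]
    dist = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        dist[i][0] = i
    for j in range(1, m + 1):
        dist[0][j] = j
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = 0 if recognized[i - 1] == target[j - 1] else 1
            dist[i][j] = min(
                dist[i - 1][j - 1] + cost,  # match or substitute
                dist[i - 1][j] + 1,         # delete a recognized symbol
                dist[i][j - 1] + 1,         # insert a target symbol
            )

    # Walk back from the bottom-right corner to recover one optimal edit path.
    ops: List[EditOp] = []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0:
            cost = 0 if recognized[i - 1] == target[j - 1] else 1
            if dist[i][j] == dist[i - 1][j - 1] + cost:
                ops.append(("match" if cost == 0 else "substitute", i - 1, j - 1))
                i, j = i - 1, j - 1
                continue
        if i > 0 and dist[i][j] == dist[i - 1][j] + 1:
            ops.append(("delete", i - 1, j))
            i -= 1
        else:
            ops.append(("insert", i, j - 1))
            j -= 1
    ops.reverse()
    return ops


if __name__ == "__main__":
    recognized = ["the", "cat", "sat"]
    target = ["the", "dog", "sat"]
    for op in align_symbols(recognized, target):
        print(op)
```

Only the non-matching operations mark regions that would need regenerating. In a pipeline like the one described, their indices into the recognized sequence could be mapped back to audio positions through the time stamps from the recognition step. The same routine works at word or phoneme level, which is one way to vary the granularity of the editing operations mentioned in the abstract.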
