Robust Prediction of Punctuation and Truecasing for Medical ASR

Paper title

Robust Prediction of Punctuation and Truecasing for Medical ASR

Authors

Monica Sunkara, Srikanth Ronanki, Kalpit Dixit, Sravan Bodapati, Katrin Kirchhoff

Abstract

Automatic speech recognition (ASR) systems in the medical domain that focus on transcribing clinical dictations and doctor-patient conversations often pose many challenges due to the complexity of the domain. ASR output typically undergoes automatic punctuation to enable users to speak naturally, without having to vocalise awkward and explicit punctuation commands, such as "period", "add comma" or "exclamation point", while truecasing enhances user readability and improves the performance of downstream NLP tasks. This paper proposes a conditional joint modeling framework for prediction of punctuation and truecasing using pretrained masked language models such as BERT, BioBERT and RoBERTa. We also present techniques for domain and task specific adaptation by fine-tuning masked language models with medical domain data. Finally, we improve the robustness of the model against common errors made in ASR by performing data augmentation. Experiments performed on dictation and conversational style corpora show that our proposed model achieves ~5% absolute improvement on ground truth text and ~10% improvement on ASR outputs over baseline models under F1 metric.
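
To make the task in the abstract concrete, the following is a minimal sketch of how punctuated, cased text might be turned into per-word training targets for joint punctuation and truecasing prediction, and how predicted labels could be applied back to raw ASR-style output. The label names (`COMMA`, `PERIOD`, `QUESTION`, `LC`, `UC`, `CA`, `MC`) and the helper functions `case_label`, `make_labels` and `apply_labels` are assumptions made for illustration; the abstract does not specify the paper's label set, and this sketch does not implement the conditional joint model itself.

```python
# Illustrative only: label inventory and helper names are hypothetical,
# not taken from the paper.
PUNCT_LABELS = {",": "COMMA", ".": "PERIOD", "?": "QUESTION"}
PUNCT_MARKS = {label: mark for mark, label in PUNCT_LABELS.items()}


def case_label(word):
    """Classify the casing pattern of a single word."""
    if word.islower() or not any(c.isalpha() for c in word):
        return "LC"  # all lowercase, or no letters at all (e.g. "120")
    if word.isupper() and len(word) > 1:
        return "CA"  # all caps, e.g. "COPD"
    if word[0].isupper() and word[1:] == word[1:].lower():
        return "UC"  # first letter capitalised, e.g. "Patient"
    return "MC"      # mixed case, e.g. "BioBERT"


def make_labels(text):
    """Turn cased, punctuated text into (word, punct, case) triples.

    The word is lowercased and stripped of trailing punctuation so that
    it resembles raw ASR output; the two labels are what a model would
    be trained to predict for that word.
    """
    examples = []
    for token in text.split():
        punct = "O"  # "O" means no punctuation follows this word
        if len(token) > 1 and token[-1] in PUNCT_LABELS:
            punct = PUNCT_LABELS[token[-1]]
            token = token[:-1]
        examples.append((token.lower(), punct, case_label(token)))
    return examples


def apply_labels(examples):
    """Rebuild readable text from (word, punct, case) triples.

    Mixed-case words ("MC") cannot be restored from the label alone and
    are left lowercase in this sketch.
    """
    words = []
    for word, punct, case in examples:
        if case == "UC":
            word = word[0].upper() + word[1:]
        elif case == "CA":
            word = word.upper()
        words.append(word + PUNCT_MARKS.get(punct, ""))
    return " ".join(words)


sentence = "The patient has COPD, no fever. Continue Lisinopril?"
triples = make_labels(sentence)
restored = apply_labels(triples)
```

In this sketch, `triples` holds the unpunctuated lowercase words a recognizer might emit together with the two targets per word, and `apply_labels` shows how predicted labels would be rendered for the reader. A joint model in the spirit of the abstract would predict both labels for each word from a pretrained masked language model's representations.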
