Towards Relation Extraction From Speech

Paper Title

Towards Relation Extraction From Speech

Authors

Tongtong Wu, Guitao Wang, Jinming Zhao, Zhaoran Liu, Guilin Qi, Yuan-Fang Li, Gholamreza Haffari

Abstract

Relation extraction typically aims to extract semantic relationships between entities from unstructured text. One of the most essential data sources for relation extraction is spoken language, such as interviews and dialogues. However, the error propagation introduced by automatic speech recognition (ASR) has been ignored in relation extraction, and end-to-end speech-based relation extraction methods have rarely been explored. In this paper, we propose a new listening information extraction task, i.e., speech relation extraction. We construct the training dataset for speech relation extraction via text-to-speech systems, and we construct the testing dataset via crowd-sourcing with native English speakers. We explore speech relation extraction via two approaches: a pipeline approach that conducts text-based extraction with a pretrained ASR module, and an end-to-end approach via a newly proposed encoder-decoder model, which we call SpeechRE. We conduct comprehensive experiments to identify the challenges in speech relation extraction, which may shed light on future explorations. We share the code and data at https://github.com/wutong8023/SpeechRE.
