Reasoning over Hybrid Chain for Table-and-Text Open Domain QA

Paper title

Reasoning over Hybrid Chain for Table-and-Text Open Domain QA

Authors

Wanjun Zhong, Junjie Huang, Qian Liu, Ming Zhou, Jiahai Wang, Jian Yin, Nan Duan

Abstract

Tabular and textual question answering requires systems to reason over heterogeneous information, taking into account both table structure and the connections between tables and text. In this paper, we propose a ChAin-centric Reasoning and Pre-training framework (CARP). CARP uses a hybrid chain to model the explicit intermediate reasoning process across table and text for question answering. We also propose a novel chain-centric pre-training method that enhances the pre-trained model's ability to identify the cross-modality reasoning process and alleviates the data sparsity problem. This method constructs a large-scale reasoning corpus by synthesizing pseudo heterogeneous reasoning paths from Wikipedia and generating corresponding questions. We evaluate our system on OTT-QA, a large-scale table-and-text open-domain question answering benchmark, where it achieves state-of-the-art performance. Further analyses show that the explicit hybrid chain offers substantial performance improvement and interpretability of the intermediate reasoning process, and that the chain-centric pre-training boosts chain extraction performance.
