Paper Title

Reasoning over Hybrid Chain for Table-and-Text Open Domain QA

Authors

Wanjun Zhong, Junjie Huang, Qian Liu, Ming Zhou, Jiahai Wang, Jian Yin, Nan Duan

Abstract

Tabular and textual question answering requires systems to reason over heterogeneous information, taking into account both table structure and the connections between tables and text. In this paper, we propose a ChAin-centric Reasoning and Pre-training framework (CARP). CARP uses a hybrid chain to model the explicit intermediate reasoning process across table and text for question answering. We also propose a novel chain-centric pre-training method that helps the pre-trained model identify the cross-modality reasoning process and alleviates the data sparsity problem. This method constructs a large-scale reasoning corpus by synthesizing pseudo heterogeneous reasoning paths from Wikipedia and generating corresponding questions. We evaluate our system on OTT-QA, a large-scale table-and-text open-domain question answering benchmark, where it achieves state-of-the-art performance. Further analyses show that the explicit hybrid chain offers substantial performance improvement and interpretability of the intermediate reasoning process, and that chain-centric pre-training boosts performance on chain extraction.
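To make the idea of a hybrid chain more concrete, the following is a minimal sketch of how a reasoning chain that crosses from a table into text might be represented and flattened into a single model input. It is not the authors' implementation: the names `ChainNode`, `is_hybrid`, and `linearize`, the separator tokens, and the toy data are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ChainNode:
    """One step of a reasoning chain (hypothetical structure)."""
    source: str   # "table" or "text": which modality the evidence comes from
    content: str  # the evidence itself, e.g. a table cell or a passage sentence


def is_hybrid(chain: List[ChainNode]) -> bool:
    """A chain is hybrid if it contains both table and text evidence."""
    return {node.source for node in chain} >= {"table", "text"}


def linearize(question: str, chain: List[ChainNode]) -> str:
    """Flatten question and chain into one string (separators are assumed)."""
    steps = " -> ".join(f"[{node.source}] {node.content}" for node in chain)
    return f"{question} || {steps}"


# Toy, invented example: a table cell links to a passage holding the answer.
chain = [
    ChainNode("table", "Host city: Springfield"),
    ChainNode("text", "Springfield has a population of 30,000."),
]
print(is_hybrid(chain))
print(linearize("What is the population of the host city?", chain))
```

Keeping the modality tag on every step is one simple way to expose the intermediate reasoning path for inspection, which is the kind of interpretability the abstract attributes to explicit chains.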
