Paper Title

Dense Paraphrasing for Textual Enrichment

Authors

Jingxuan Tu, Kyeongmin Rim, Eben Holderness, James Pustejovsky

Abstract

Understanding inferences and answering questions from text requires more than merely recovering surface arguments, adjuncts, or strings associated with the query terms. As humans, we interpret sentences as contextualized components of a narrative or discourse, by both filling in missing information, and reasoning about event consequences. In this paper, we define the process of rewriting a textual expression (lexeme or phrase) such that it reduces ambiguity while also making explicit the underlying semantics that is not (necessarily) expressed in the economy of sentence structure as Dense Paraphrasing (DP). We build the first complete DP dataset, provide the scope and design of the annotation task, and present results demonstrating how this DP process can enrich a source text to improve inferencing and QA task performance. The data and the source code will be publicly available.
