Title
From Unstructured Text to Causal Knowledge Graphs: A Transformer-Based Approach
Authors
Abstract
Qualitative causal relationships compactly express the direction, dependency, temporal constraints, and monotonicity constraints of discrete or continuous interactions in the world. In everyday or academic language, we may express interactions between quantities (e.g., sleep decreases stress), between discrete events or entities (e.g., a protein inhibits another protein's transcription), or between intentional or functional factors (e.g., hospital patients pray to relieve their pain). Extracting and representing these diverse causal relations are critical for cognitive systems that operate in domains spanning from scientific discovery to social science. This paper presents a transformer-based NLP architecture that jointly extracts knowledge graphs including (1) variables or factors described in language, (2) qualitative causal relationships over these variables, (3) qualifiers and magnitudes that constrain these causal relationships, and (4) word senses to localize each extracted node within a large ontology. We do not claim that our transformer-based architecture is itself a cognitive system; however, we provide evidence of its accurate knowledge graph extraction in real-world domains and the practicality of its resulting knowledge graphs for cognitive systems that perform graph-based reasoning. We demonstrate this approach and include promising results in two use cases, processing textual inputs from academic publications, news articles, and social media.
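The abstract describes knowledge graphs whose nodes are extracted variables (localized by word sense) and whose edges are qualitative causal relations constrained by qualifiers and magnitudes. A minimal sketch of such a structure, using the abstract's "sleep decreases stress" example, might look as follows; the class names, field names, and synset-style sense identifiers here are illustrative assumptions, not the paper's actual representation:

```python
from dataclasses import dataclass, field
from typing import List, Optional

@dataclass
class Node:
    """A variable or factor extracted from text."""
    label: str
    word_sense: str  # assumed WordNet-style sense id locating the node in an ontology

@dataclass
class CausalEdge:
    """A qualitative causal relation between two extracted nodes."""
    cause: Node
    effect: Node
    polarity: str                                   # monotonic direction: "increase" / "decrease"
    qualifiers: List[str] = field(default_factory=list)  # e.g., hedges or conditions
    magnitude: Optional[str] = None                 # e.g., "slight", "large"

# "Sleep decreases stress" (example from the abstract); sense ids are made up
sleep = Node("sleep", "sleep.n.01")
stress = Node("stress", "stress.n.02")
edge = CausalEdge(cause=sleep, effect=stress, polarity="decrease")

graph = {"nodes": [sleep, stress], "edges": [edge]}
print(f"{edge.cause.label} -[{edge.polarity}]-> {edge.effect.label}")
# → sleep -[decrease]-> stress
```

A graph-based reasoner, as the abstract envisions, could then traverse such edges, combining polarities along paths to answer qualitative queries about indirect effects.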