Paper Title

Life is a Circus and We are the Clowns: Automatically Finding Analogies between Situations and Processes

Authors

Oren Sultan, Dafna Shahaf

Abstract

Analogy-making gives rise to reasoning, abstraction, flexible categorization and counterfactual inference -- abilities lacking in even the best AI systems today. Much research has suggested that analogies are key to non-brittle systems that can adapt to new domains. Despite their importance, analogies received little attention in the NLP community, with most research focusing on simple word analogies. Work that tackled more complex analogies relied heavily on manually constructed, hard-to-scale input representations. In this work, we explore a more realistic, challenging setup: our input is a pair of natural language procedural texts, describing a situation or a process (e.g., how the heart works/how a pump works). Our goal is to automatically extract entities and their relations from the text and find a mapping between the different domains based on relational similarity (e.g., blood is mapped to water). We develop an interpretable, scalable algorithm and demonstrate that it identifies the correct mappings 87% of the time for procedural texts and 94% for stories from cognitive-psychology literature. We show it can extract analogies from a large dataset of procedural texts, achieving 79% precision (analogy prevalence in data: 3%). Lastly, we demonstrate that our algorithm is robust to paraphrasing the input texts.
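
To make the idea of mapping entities by relational similarity concrete, here is a minimal sketch. It is not the authors' algorithm: the abstract does not specify how relations are extracted or scored, so this example assumes hand-written relation sets for each entity, uses Jaccard overlap as a stand-in similarity measure, and pairs entities greedily. The names `jaccard`, `map_entities`, `heart`, and `pump` are all hypothetical.

```python
from itertools import product


def jaccard(a, b):
    """Jaccard overlap between two sets of relation labels."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def map_entities(base, target):
    """Greedily build a one-to-one mapping from base to target entities.

    Assumption: each domain is a dict of entity -> set of relation labels.
    Pairs are visited from most to least similar; zero-overlap pairs are skipped.
    """
    scored = sorted(
        ((jaccard(base[b], target[t]), b, t) for b, t in product(base, target)),
        key=lambda item: -item[0],
    )
    mapping, used_targets = {}, set()
    for score, b, t in scored:
        if score > 0 and b not in mapping and t not in used_targets:
            mapping[b] = t
            used_targets.add(t)
    return mapping


# Hypothetical, hand-written relation sets (not extracted from real text).
heart = {
    "blood": {"flows through", "is pushed by"},
    "heart": {"pushes", "contracts"},
    "vessels": {"carry"},
}
pump = {
    "water": {"flows through", "is pushed by"},
    "pump": {"pushes", "rotates"},
    "pipes": {"carry"},
}

mapping = map_entities(heart, pump)
print(mapping)
```

With these toy inputs, blood is paired with water because the two entities take part in the same relations, even though the words themselves are unrelated. A real system would need to extract the entities and relations automatically from the two texts, which is the harder part of the problem the abstract describes.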
