Paper Title
Few-Shot Complex Knowledge Base Question Answering via Meta Reinforcement Learning
Paper Authors
Paper Abstract
Complex question-answering (CQA) involves answering complex natural-language questions on a knowledge base (KB). However, the conventional neural program induction (NPI) approach exhibits uneven performance when the questions have different types, harboring inherently different characteristics, e.g., difficulty level. This paper proposes a meta-reinforcement learning approach to program induction in CQA to tackle the potential distributional bias in questions. Our method quickly and effectively adapts the meta-learned programmer to new questions based on the most similar questions retrieved from the training data. The meta-learned policy is then used to learn a good programming policy, utilizing the trial trajectories and their rewards for similar questions in the support set. Our method achieves state-of-the-art performance on the CQA dataset (Saha et al., 2018) while using only five trial trajectories for the top-5 retrieved questions in each support set, and meta-training on tasks constructed from only 1% of the training set. We have released our code at https://github.com/DevinJake/MRL-CQA.
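The sketch below illustrates the adaptation step the abstract describes: given a new question, retrieve the top-5 most similar training questions as a support set, sample five trial trajectories per support question, and take one policy-gradient step from the meta-learned parameters before answering. It is a minimal, hypothetical sketch, not the authors' released implementation: the policy architecture, the `ProgrammerPolicy`, `adapt_to_question`, and `reward_fn` names, the single-action "trajectories", and the first-order REINFORCE inner update are all assumptions made for illustration.

```python
# Hypothetical sketch of retrieval-based meta-RL adaptation (not the paper's code).
import torch
import torch.nn as nn


class ProgrammerPolicy(nn.Module):
    """Toy stand-in for the NPI 'programmer': maps a question embedding
    to a distribution over program actions."""

    def __init__(self, emb_dim: int = 32, num_actions: int = 10):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(emb_dim, 64), nn.Tanh(),
                                 nn.Linear(64, num_actions))

    def forward(self, question_emb: torch.Tensor) -> torch.Tensor:
        return self.net(question_emb)  # unnormalized action logits


def adapt_to_question(meta_policy: ProgrammerPolicy,
                      support_embs: torch.Tensor,  # embeddings of top-5 retrieved questions
                      reward_fn,                   # executes a trial program, returns its reward
                      trials_per_question: int = 5,
                      inner_lr: float = 0.1) -> ProgrammerPolicy:
    """One first-order inner-loop update: sample trial trajectories for the
    support questions, score them with the KB reward, and take a REINFORCE
    gradient step starting from the meta-learned parameters."""
    adapted = ProgrammerPolicy()
    adapted.load_state_dict(meta_policy.state_dict())  # start from meta-params
    opt = torch.optim.SGD(adapted.parameters(), lr=inner_lr)

    loss = torch.zeros(())
    for emb in support_embs:
        dist = torch.distributions.Categorical(logits=adapted(emb))
        for _ in range(trials_per_question):
            action = dist.sample()           # one trial "trajectory" (single action here)
            reward = reward_fn(emb, action)  # e.g., answer accuracy against the KB
            loss = loss - reward * dist.log_prob(action)  # REINFORCE objective

    opt.zero_grad()
    loss.backward()
    opt.step()
    return adapted


if __name__ == "__main__":
    torch.manual_seed(0)
    meta_policy = ProgrammerPolicy()
    support = torch.randn(5, 32)  # stand-in embeddings for the 5 retrieved questions
    # Dummy reward: pretend action 3 is always the correct program.
    reward_fn = lambda emb, a: 1.0 if a.item() == 3 else 0.0
    adapted = adapt_to_question(meta_policy, support, reward_fn)
    new_question = torch.randn(32)
    print(adapted(new_question).argmax().item())  # program predicted after adaptation
```

In the paper's actual setting the trajectories are multi-step program sketches executed against the KB, and the meta-parameters are themselves trained across many such retrieval-built tasks (here only the inner adaptation step is shown).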