Paper Title

Compositional Generalization in Semantic Parsing: Pre-training vs. Specialized Architectures

Paper Authors

Daniel Furrer, Marc van Zee, Nathan Scales, Nathanael Schärli

Paper Abstract

While mainstream machine learning methods are known to have limited ability to compositionally generalize, new architectures and techniques continue to be proposed to address this limitation. We investigate state-of-the-art techniques and architectures in order to assess their effectiveness in improving compositional generalization in semantic parsing tasks based on the SCAN and CFQ datasets. We show that masked language model (MLM) pre-training rivals SCAN-inspired architectures on primitive holdout splits. On a more complex compositional task, we show that pre-training leads to significant improvements in performance vs. comparable non-pre-trained models, whereas architectures proposed to encourage compositional generalization on SCAN or in the area of algorithm learning fail to lead to significant improvements. We establish a new state of the art on the CFQ compositional generalization benchmark using MLM pre-training together with an intermediate representation.
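To make the "primitive holdout splits" mentioned in the abstract concrete, the sketch below shows what such a split looks like on SCAN-style data, where natural-language commands map to action sequences (e.g. "jump twice" -> "JUMP JUMP"). This is an illustrative example only, not the paper's code or data; the toy command set and the helper `primitive_holdout_split` are assumptions for demonstration.

```python
# Illustrative sketch of a SCAN-style "primitive holdout" split
# (not the paper's code). The primitive "jump" is seen only in
# isolation at training time; compositions containing it are held
# out for testing, probing compositional generalization.

EXAMPLES = {
    "walk": "WALK",
    "walk twice": "WALK WALK",
    "run": "RUN",
    "run twice": "RUN RUN",
    "jump": "JUMP",              # primitive seen alone in training
    "jump twice": "JUMP JUMP",   # composition held out for testing
    "jump and walk": "JUMP WALK",
}

def primitive_holdout_split(examples, primitive):
    """Train on everything except compositions containing `primitive`;
    test on exactly those held-out compositions."""
    train, test = {}, {}
    for command, actions in examples.items():
        if primitive in command.split() and command != primitive:
            test[command] = actions
        else:
            train[command] = actions
    return train, test

train, test = primitive_holdout_split(EXAMPLES, "jump")
print("train:", sorted(train))  # includes "jump" alone
print("test:", sorted(test))    # "jump twice", "jump and walk"
```

A model that has learned the meaning of "jump" and the composition rules from the other primitives should, in principle, handle the held-out compositions; the paper reports that MLM pre-training rivals SCAN-inspired specialized architectures on exactly this kind of split.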
