Paper Title
DABERT: Dual Attention Enhanced BERT for Semantic Matching
Paper Authors
Paper Abstract
Transformer-based pre-trained language models such as BERT have achieved remarkable results in semantic sentence matching. However, existing models still lack the ability to capture subtle differences: minor noise such as word additions, deletions, and modifications may flip their predictions. To alleviate this problem, we propose a novel Dual Attention Enhanced BERT (DABERT) that strengthens BERT's ability to capture fine-grained differences in sentence pairs. DABERT comprises (1) a Dual Attention module, which measures soft word matches through a new dual-channel alignment mechanism that models both affinity and difference attention, and (2) an Adaptive Fusion module, which uses attention to learn how to aggregate the difference and affinity features and generates a vector describing the matching details of a sentence pair. We conduct extensive experiments on well-studied semantic matching and robustness test datasets, and the experimental results show the effectiveness of our proposed method.
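The abstract does not give the model's equations, so the following PyTorch sketch only illustrates the general dual-channel idea it describes: an affinity attention channel, a difference attention channel, and an adaptive gate that fuses the two. The class name, the negated-similarity form of the difference channel, and the sigmoid-gate fusion are all assumptions made for illustration, not DABERT's actual formulation.

```python
import torch
import torch.nn as nn


class DualAttentionSketch(nn.Module):
    """Minimal sketch of a dual-channel alignment over two sentences'
    token states. Affinity channel: standard scaled dot-product
    attention. Difference channel and gated fusion: assumed stand-ins
    for the paper's difference attention and Adaptive Fusion module."""

    def __init__(self, hidden: int = 768):
        super().__init__()
        self.q = nn.Linear(hidden, hidden)
        self.k = nn.Linear(hidden, hidden)
        self.v = nn.Linear(hidden, hidden)
        # Gate that adaptively mixes the affinity and difference features.
        self.gate = nn.Linear(2 * hidden, hidden)

    def forward(self, a: torch.Tensor, b: torch.Tensor) -> torch.Tensor:
        # a: [batch, len_a, hidden], b: [batch, len_b, hidden]
        q, k, v = self.q(a), self.k(b), self.v(b)
        scores = q @ k.transpose(-2, -1) / q.size(-1) ** 0.5

        # Affinity channel: soft-align each token of a against b.
        affinity_feat = torch.softmax(scores, dim=-1) @ v

        # Difference channel: attend with negated similarities so that
        # dissimilar tokens receive the most weight (an assumption).
        difference_feat = torch.softmax(-scores, dim=-1) @ v

        # Adaptive fusion: a sigmoid gate weighs the two channels per
        # token (again an assumed form of the fusion).
        g = torch.sigmoid(
            self.gate(torch.cat([affinity_feat, difference_feat], dim=-1))
        )
        return g * affinity_feat + (1 - g) * difference_feat


# Usage on dummy BERT-sized token states:
enc = DualAttentionSketch(hidden=768)
a = torch.randn(2, 12, 768)  # sentence A token representations
b = torch.randn(2, 15, 768)  # sentence B token representations
fused = enc(a, b)            # -> [2, 12, 768]
```

The gated sum mirrors the abstract's claim that fusion is learned rather than fixed: when the gate saturates toward 1 the output follows the affinity view, and toward 0 it follows the difference view, so the mix can vary token by token.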