Paper Title
Diagnosing BERT with Retrieval Heuristics
Paper Authors
Paper Abstract
Word embeddings, made widely popular in 2013 with the release of word2vec, have become a mainstay of NLP engineering pipelines. Recently, with the release of BERT, word embeddings have moved from the term-based embedding space to the contextual embedding space -- each term is no longer represented by a single low-dimensional vector but instead each term and \emph{its context} determine the vector weights. BERT's setup and architecture have been shown to be general enough to be applicable to many natural language tasks. Importantly for Information Retrieval (IR), in contrast to prior deep learning solutions to IR problems which required significant tuning of neural net architectures and training regimes, "vanilla BERT" has been shown to outperform existing retrieval algorithms by a wide margin, including on tasks and corpora that have long resisted retrieval effectiveness gains over traditional IR baselines (such as Robust04). In this paper, we employ the recently proposed axiomatic dataset analysis technique -- that is, we create diagnostic datasets that each fulfil a retrieval heuristic (both term matching and semantic-based) -- to explore what BERT is able to learn. In contrast to our expectations, we find BERT, when applied to a recently released large-scale web corpus with ad-hoc topics, to \emph{not} adhere to any of the explored axioms. At the same time, BERT outperforms the traditional query likelihood retrieval model by 40\%. This means that the axiomatic approach to IR (and its extension of diagnostic datasets created for retrieval heuristics) may in its current form not be applicable to large-scale corpora. Additional -- different -- axioms are needed.
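To make the axiomatic diagnostic idea concrete, the sketch below illustrates one term-matching heuristic (a TFC1-style check: all else being equal, a document containing more occurrences of a query term should score higher) and measures how often a retrieval model satisfies it over a set of diagnostic document pairs. This is a minimal illustration, not the paper's implementation; the Dirichlet-smoothed query likelihood scorer and the toy diagnostic pairs are placeholders, and any scoring function (e.g., a BERT re-ranker) could be plugged in instead.

```python
# Minimal sketch of an axiomatic diagnostic check (illustrative only, not the authors' code).
# TFC1-style heuristic: all else being equal, a document with more occurrences of a
# query term should receive a higher retrieval score.

from collections import Counter
import math


def query_likelihood_score(query, doc, mu=2000, collection_stats=None):
    """Dirichlet-smoothed query likelihood; stands in for any scoring model."""
    doc_terms = doc.split()
    doc_len = len(doc_terms)
    tf = Counter(doc_terms)
    stats = collection_stats or {}
    score = 0.0
    for term in query.split():
        # Background (collection) probability of the term; small default if unseen.
        p_c = stats.get(term, 1e-6)
        score += math.log((tf[term] + mu * p_c) / (doc_len + mu))
    return score


def fraction_satisfying_tfc1(diagnostic_pairs, score_fn):
    """diagnostic_pairs: (query, doc_more_tf, doc_less_tf) triples, where the two
    documents differ only in the frequency of a query term.
    Returns the fraction of pairs for which the model ranks doc_more_tf higher."""
    satisfied = sum(
        1
        for query, doc_hi, doc_lo in diagnostic_pairs
        if score_fn(query, doc_hi) > score_fn(query, doc_lo)
    )
    return satisfied / len(diagnostic_pairs)


# Hypothetical usage with toy data:
pairs = [
    ("cheap flights", "cheap cheap flights to rome", "cheap flights to rome"),
]
print(fraction_satisfying_tfc1(pairs, query_likelihood_score))  # 1.0 for this scorer
```

A model that adheres to the heuristic yields a fraction close to 1; the paper's finding is that BERT, despite strong effectiveness, does not exhibit this adherence on the explored axioms.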