Paper title
On Parsing as Tagging
Paper authors
Paper abstract
There have been many proposals to reduce constituency parsing to tagging in the literature. To better understand what these approaches have in common, we cast several existing proposals into a unifying pipeline consisting of three steps: linearization, learning, and decoding. In particular, we show how to reduce tetratagging, a state-of-the-art constituency tagger, to shift-reduce parsing by performing a right-corner transformation on the grammar and making a specific independence assumption. Furthermore, we empirically evaluate our taxonomy of tagging pipelines with different choices of linearizers, learners, and decoders. Based on the results in English and a set of 8 typologically diverse languages, we conclude that the linearization of the derivation tree and its alignment with the input sequence is the most critical factor in achieving accurate taggers.
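To make the linearization and decoding steps concrete, here is a minimal Python sketch of a tetratagging-style round trip on an unlabeled binary tree. It is an illustration under stated assumptions, not the paper's implementation: the function names `linearize` and `decode` and the tag letters `l`, `r`, `L`, `R` are hypothetical choices, nonterminal labels are ignored, and the learning step is skipped by feeding the gold tags straight to the decoder.

```python
# Trees are nested pairs: a leaf is a str, an internal node is (left, right).

def linearize(tree, is_left=True):
    """In-order traversal emitting one tag per leaf and per internal node.

    'l'/'r': leaf that is a left/right child.
    'L'/'R': internal node that is a left/right child (the root counts as left).
    """
    if isinstance(tree, str):
        return ["l" if is_left else "r"]
    left, right = tree
    return (linearize(left, True)
            + ["L" if is_left else "R"]
            + linearize(right, False))


def _freeze(node):
    """Convert the decoder's mutable [left, right] lists back to tuples."""
    if isinstance(node, str):
        return node
    return (_freeze(node[0]), _freeze(node[1]))


def decode(tags, words):
    """Shift-reduce style decoder: rebuild the tree from tags and words.

    Each stack entry is [root, open_node]; open_node is the node inside
    root that still lacks a right child, or None if root is complete.
    Assumes the tag sequence is well-formed (no validity checks).
    """
    words = iter(words)
    stack = []
    for tag in tags:
        if tag == "l":                      # shift a new leaf
            stack.append([next(words), None])
        elif tag == "r":                    # leaf fills the open slot on top
            entry = stack[-1]
            entry[1][1] = next(words)
            entry[1] = None
        elif tag == "L":                    # wrap top in a new node with an open right child
            root, _ = stack.pop()
            node = [root, None]
            stack.append([node, node])
        elif tag == "R":                    # same, but attach the new node below
            root, _ = stack.pop()
            node = [root, None]
            entry = stack[-1]
            entry[1][1] = node
            entry[1] = node
        else:
            raise ValueError(f"unknown tag: {tag}")
    assert len(stack) == 1 and stack[0][1] is None, "incomplete tag sequence"
    return _freeze(stack[0][0])


example_tree = (("A", ("B", "C")), "D")
example_tags = linearize(example_tree)      # ['l', 'L', 'l', 'R', 'r', 'L', 'r']
assert decode(example_tags, ["A", "B", "C", "D"]) == example_tree
```

A sentence of n words yields 2n - 1 tags, so a tagger can predict them in pairs aligned with the input words. In a full pipeline a learned tagger would replace the gold tags, and the decoder would need to enforce validity constraints that this sketch omits.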