Paper title
Supertagging-based Parsing with Linear Context-free Rewriting Systems
Paper authors
Paper abstract
We present the first supertagging-based parser for linear context-free rewriting systems (LCFRS). It utilizes neural classifiers and tremendously outperforms previous LCFRS-based parsers in both accuracy and parsing speed. Moreover, our results keep up with the best (general) discontinuous parsers; in particular, the scores for discontinuous constituents are excellent. The heart of our approach is an efficient lexicalization procedure that induces a lexical LCFRS from any discontinuous treebank. It is an adaptation of previous work by Mörbitz and Ruprecht (2020). We also describe a modification to standard chart-based LCFRS parsing that accounts for supertagging, and we introduce a procedure for transforming lexical LCFRS derivations into equivalent parse trees of the original treebank. Our approach is implemented and evaluated on the English Discontinuous Penn Treebank and the German corpora NeGra and Tiger.