Paper Title
Legal Case Document Summarization: Extractive and Abstractive Methods and their Evaluation
Paper Authors
Paper Abstract
Summarization of legal case judgement documents is a challenging problem in Legal NLP. However, few analyses exist of how different families of summarization models (e.g., extractive vs. abstractive) perform when applied to legal case documents. This question is particularly important since many recent transformer-based abstractive summarization models restrict the number of input tokens, and legal documents are known to be very long. How best to evaluate legal case document summarization systems also remains an open question. In this paper, we carry out extensive experiments with several extractive and abstractive summarization methods (both supervised and unsupervised) over three legal summarization datasets that we have developed. Our analyses, which include evaluation by law practitioners, lead to several interesting insights on legal summarization in particular and long document summarization in general.