Paper Title
A Neural Pairwise Ranking Model for Readability Assessment
Paper Authors
Paper Abstract
Automatic Readability Assessment (ARA), the task of assigning a reading level to a text, is traditionally treated as a classification problem in NLP research. In this paper, we propose the first neural, pairwise ranking approach to ARA and compare it with existing classification, regression, and (non-neural) ranking methods. We establish the performance of our model by conducting experiments with three English, one French, and one Spanish dataset. We demonstrate that our approach performs well in monolingual single- and cross-corpus testing scenarios and achieves a zero-shot cross-lingual ranking accuracy of over 80% for both French and Spanish when trained on English data. Additionally, we release a new parallel bilingual readability dataset in English and French. To our knowledge, this paper proposes the first neural pairwise ranking model for ARA, and shows the first results of cross-lingual, zero-shot evaluation of ARA with neural models.
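To make the pairwise ranking formulation concrete, the sketch below shows one common way such a model can be set up: a shared encoder scores each text, and a margin ranking loss pushes the score of the harder text above that of the easier one. This is only a minimal illustrative assumption (the encoder, scorer, and hyperparameters here are hypothetical), not the architecture actually used in the paper.

```python
# Minimal sketch of a neural pairwise ranking setup for readability,
# assuming a shared encoder that maps each text to a scalar difficulty score.
# Hypothetical components; not the authors' exact model.
import torch
import torch.nn as nn

class PairwiseReadabilityRanker(nn.Module):
    def __init__(self, vocab_size: int = 30000, emb_dim: int = 128):
        super().__init__()
        # Hypothetical encoder: embedding + mean pooling + linear scorer.
        self.embed = nn.Embedding(vocab_size, emb_dim, padding_idx=0)
        self.scorer = nn.Linear(emb_dim, 1)

    def forward(self, token_ids: torch.Tensor) -> torch.Tensor:
        # token_ids: (batch, seq_len) -> one readability score per text.
        pooled = self.embed(token_ids).mean(dim=1)
        return self.scorer(pooled).squeeze(-1)

def pairwise_loss(score_a, score_b, target, margin: float = 1.0):
    # target = +1 if text A should rank as harder than text B, else -1.
    return nn.functional.margin_ranking_loss(score_a, score_b, target, margin=margin)

# Toy usage with random token ids standing in for tokenized document pairs.
model = PairwiseReadabilityRanker()
harder = torch.randint(1, 30000, (4, 50))
easier = torch.randint(1, 30000, (4, 50))
target = torch.ones(4)  # every "harder" text outranks its "easier" pair
loss = pairwise_loss(model(harder), model(easier), target)
loss.backward()
```

Because the model only learns relative difficulty from pairs, inference on a new corpus (or a new language, given a multilingual encoder) amounts to scoring texts and sorting them, which is what makes the zero-shot cross-lingual ranking evaluation described in the abstract possible.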