Text Difficulty Study: Do machines behave the same as humans regarding text difficulty?

Paper Title

Text Difficulty Study: Do machines behave the same as humans regarding text difficulty?

Authors

Chen, Bowen; Ding, Xiao; Du, Li; Qin, Bing; Liu, Ting

Abstract

Given a task, humans learn from easy to hard, whereas models learn randomly. Undeniably, difficulty-insensitive learning has led to great success in NLP, but little attention has been paid to the effect of text difficulty in NLP. In this research, we propose the Human Learning Matching Index (HLM Index) to investigate the effect of text difficulty. Experimental results show: (1) LSTM exhibits more human-like learning behavior than BERT. (2) UID-SuperLinear gives the best evaluation of text difficulty among the four text difficulty criteria. (3) Among nine tasks, performance on some tasks is related to text difficulty, whereas on others it is not. (4) A model trained on easy data performs best on easy and medium data, whereas a model trained on hard data performs well only on hard data. (5) Training the model from easy to hard leads to fast convergence.

Paper Title