Paper Title

Automatic coding of students' writing via Contrastive Representation Learning in the Wasserstein space

Authors

Ruijie Jiang, Julia Gouvea, David Hammer, Eric Miller, Shuchin Aeron

Abstract

Qualitative analysis of verbal data is of central importance in the learning sciences. It is labor-intensive and time-consuming, however, which limits the amount of data researchers can include in studies. This work is a step towards building a statistical machine learning (ML) method that provides automated support for qualitative analyses of students' writing, here specifically for scoring laboratory reports in introductory biology on the sophistication of argumentation and reasoning. We start with a set of lab reports from an undergraduate biology course, scored by a four-level scheme that considers the complexity of argument structure, the scope of evidence, and the care and nuance of conclusions. Using this set of labeled data, we show that a popular natural language processing pipeline, namely vector representations of words (a.k.a. word embeddings) followed by a Long Short-Term Memory (LSTM) model that captures language generation as a state-space model, is able to quantitatively capture the scoring, with a high Quadratic Weighted Kappa (QWK) prediction score, when trained via a novel contrastive learning set-up. We show that the ML algorithm approaches the inter-rater reliability of human analysis. Ultimately, we conclude that machine learning (ML) for natural language processing (NLP) holds promise for assisting learning sciences researchers in conducting qualitative studies at much larger scales than is currently possible.
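
The abstract reports agreement as a Quadratic Weighted Kappa (QWK) score on a four-level scheme. For readers unfamiliar with the metric, the following is a minimal sketch of the standard QWK formula, not the authors' evaluation code. The function name `quadratic_weighted_kappa`, the encoding of the four levels as integers 0 to 3, and the example scores are all assumptions made for illustration.

```python
def quadratic_weighted_kappa(rater_a, rater_b, num_levels=4):
    """Standard QWK between two equal-length lists of integer scores.

    Scores are assumed (for this sketch) to be integers in 0..num_levels-1.
    """
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("score lists must be non-empty and of equal length")
    n = len(rater_a)

    # Observed confusion matrix: observed[i][j] counts items scored i by A and j by B.
    observed = [[0] * num_levels for _ in range(num_levels)]
    for a, b in zip(rater_a, rater_b):
        observed[a][b] += 1

    # Marginal score histograms for each rater.
    hist_a = [sum(row) for row in observed]
    hist_b = [sum(observed[i][j] for i in range(num_levels))
              for j in range(num_levels)]

    weighted_observed = 0.0
    weighted_expected = 0.0
    for i in range(num_levels):
        for j in range(num_levels):
            # Quadratic penalty grows with the squared distance between levels.
            weight = (i - j) ** 2 / (num_levels - 1) ** 2
            weighted_observed += weight * observed[i][j]
            # Expected count if the two raters scored independently.
            weighted_expected += weight * hist_a[i] * hist_b[j] / n

    if weighted_expected == 0:
        # Both raters used one and the same level throughout; treated here
        # as perfect agreement (a convention chosen for this sketch).
        return 1.0
    return 1.0 - weighted_observed / weighted_expected


# Hypothetical scores for eight reports, for illustration only.
human_scores = [0, 1, 2, 3, 1, 2, 2, 3]
model_scores = [0, 1, 2, 2, 1, 3, 2, 3]
print(quadratic_weighted_kappa(human_scores, model_scores))
```

A value of 1 indicates perfect agreement, 0 indicates agreement no better than chance, and negative values indicate systematic disagreement. Because the penalty is quadratic, confusing adjacent levels costs far less than confusing the lowest and highest levels, which suits an ordinal scoring scheme.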
