Paper title
niksss at HinglishEval: Language-agnostic BERT-based Contextual Embeddings with Catboost for Quality Evaluation of the Low-Resource Synthetically Generated Code-Mixed Hinglish Text
Paper authors
Paper abstract
This paper describes our system for the HinglishEval challenge at INLG 2022. The goal of the task was to investigate the factors that influence the quality of code-mixed text generation systems. The task was divided into two subtasks on a synthetic Hinglish dataset: quality rating prediction and annotator disagreement prediction. We approached both subtasks with sentence-level embeddings, obtained by mean pooling the contextualized word embeddings of all input tokens in the text. We then experimented with various classifiers on top of the embeddings produced for each task. Our best-performing system ranked 1st on subtask B and 3rd on subtask A.
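The mean-pooling step mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' code: the function name `mean_pool`, the toy vectors, and the use of an attention mask to skip padding positions are all assumptions made for illustration. In a real pipeline the token vectors would come from the language-agnostic BERT encoder named in the title, and the pooled vector would be passed to a classifier such as CatBoost.

```python
from typing import List, Sequence


def mean_pool(token_embeddings: Sequence[Sequence[float]],
              attention_mask: Sequence[int]) -> List[float]:
    """Average the vectors of the tokens whose mask entry is 1.

    Hypothetical helper: skipping padding via a mask is an assumption,
    not something the abstract specifies.
    """
    if len(token_embeddings) != len(attention_mask):
        raise ValueError("one mask entry is required per token")
    # Keep only real (unmasked) token vectors.
    kept = [vec for vec, m in zip(token_embeddings, attention_mask) if m]
    if not kept:
        raise ValueError("no unmasked tokens to pool")
    dim = len(kept[0])
    # Component-wise mean over the kept token vectors.
    return [sum(vec[i] for vec in kept) / len(kept) for i in range(dim)]


# Toy input: 3 real tokens plus 1 padding position, hidden size 2.
tokens = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0], [0.0, 0.0]]
mask = [1, 1, 1, 0]
sentence_embedding = mean_pool(tokens, mask)
```

The result is a single fixed-size vector per sentence, whatever the sentence length, which is what allows a standard tabular classifier to be trained on top of it.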