Paper Title
Detecting Harmful Online Conversational Content towards LGBTQIA+ Individuals
Paper Authors
Paper Abstract
Online discussions, panels, talk page edits, etc., often contain harmful conversational content, i.e., hate speech, death threats, and offensive language, especially towards certain demographic groups. For example, individuals who identify as members of the LGBTQIA+ community and/or BIPOC (Black, Indigenous, People of Color) are at higher risk for abuse and harassment online. In this work, we first introduce a real-world dataset that enables us to study and understand harmful online conversational content. Then, we conduct several exploratory data analysis experiments to gain deeper insights from the dataset. Next, we describe our approach for detecting harmful online Anti-LGBTQIA+ conversational content, and finally, we implement two baseline machine learning models (i.e., Support Vector Machine and Logistic Regression) and fine-tune three pre-trained large language models (BERT, RoBERTa, and HateBERT). Our findings verify that large language models can achieve very promising performance on the task of detecting online Anti-LGBTQIA+ conversational content.
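To make the two modeling routes mentioned in the abstract concrete, the sketch below pairs a TF-IDF Logistic Regression baseline with fine-tuning a pre-trained transformer for binary harmful/benign classification. This is a minimal illustration, not the paper's actual pipeline: the example texts, labels, and hyperparameters are placeholders, and the only assumption beyond standard scikit-learn and Hugging Face APIs is the `GroNLP/hateBERT` checkpoint, the publicly released HateBERT model on the Hugging Face Hub.

```python
# Sketch of the abstract's two modeling routes: a classic TF-IDF +
# Logistic Regression baseline, and fine-tuning a pre-trained language
# model (HateBERT) as a binary harmful-content classifier.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

texts = ["example comment one", "example comment two"]  # placeholder data
labels = [0, 1]                                          # 1 = harmful

# --- Baseline: TF-IDF features + Logistic Regression ---
baseline = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)),
                         LogisticRegression(max_iter=1000))
baseline.fit(texts, labels)

# --- Fine-tuning a pre-trained language model ---
import torch
from torch.utils.data import Dataset
from transformers import (AutoModelForSequenceClassification,
                          AutoTokenizer, Trainer, TrainingArguments)

model_name = "GroNLP/hateBERT"  # released HateBERT checkpoint (assumed here)
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(
    model_name, num_labels=2)

class CommentDataset(Dataset):
    """Wraps tokenized comments and labels for the Trainer."""
    def __init__(self, texts, labels):
        self.enc = tokenizer(texts, truncation=True, padding=True,
                             max_length=128)
        self.labels = labels

    def __len__(self):
        return len(self.labels)

    def __getitem__(self, idx):
        item = {k: torch.tensor(v[idx]) for k, v in self.enc.items()}
        item["labels"] = torch.tensor(self.labels[idx])
        return item

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="out", num_train_epochs=3,
                           per_device_train_batch_size=16),
    train_dataset=CommentDataset(texts, labels),
)
trainer.train()
```

The same skeleton covers BERT and RoBERTa by swapping `model_name` for `bert-base-uncased` or `roberta-base`; the Support Vector Machine baseline follows the same scikit-learn pipeline shape with `LinearSVC` in place of `LogisticRegression`.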