Paper Title
Generalizing Hate Speech Detection Using Multi-Task Learning: A Case Study of Political Public Figures
Paper Authors
Paper Abstract
Automatic identification of hateful and abusive content is vital in combating the spread of harmful online content and its damaging effects. Most existing works evaluate models by examining the generalization error on train-test splits of hate speech datasets. These datasets often differ in their definitions and labeling criteria, which leads to poor generalization performance when predicting across new domains and datasets. This work proposes a new Multi-task Learning (MTL) pipeline that trains simultaneously across multiple hate speech datasets to construct a more encompassing classification model. Using a dataset-level leave-one-out evaluation (designating one dataset for testing and jointly training on all the others), we evaluate MTL detection on new, previously unseen datasets. Our results consistently outperform a large sample of existing work: we show strong results when examining the generalization error in train-test splits and substantial improvements when predicting on previously unseen datasets. Furthermore, we assemble a novel dataset, dubbed PubFigs, focusing on the problematic speech of American public political figures. Using Amazon MTurk, we crowdsource labels for more than $20,000$ tweets and machine-label problematic speech in all $305,235$ tweets in PubFigs. We find that abusive and hateful tweeting mainly originates from right-leaning figures and relates to six topics, including Islam, women, ethnicity, and immigrants. We show that MTL builds embeddings that can simultaneously separate abusive speech from hate speech and identify their topics.
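To make the training and evaluation protocol concrete, below is a minimal sketch (in PyTorch) of a multi-task setup with a shared encoder, one classification head per dataset, and a dataset-level leave-one-out loop. The encoder, hidden size, and dataset names are illustrative assumptions; the abstract does not specify the paper's exact architecture or datasets.

import torch
import torch.nn as nn

class MultiTaskHateSpeechModel(nn.Module):
    """Shared encoder with one classification head per training dataset."""

    def __init__(self, feature_dim: int, dataset_label_counts: dict):
        super().__init__()
        # Shared representation layer (a stand-in for a pretrained
        # language-model encoder; the abstract does not name one).
        self.encoder = nn.Sequential(nn.Linear(feature_dim, 256), nn.ReLU())
        # One head per dataset, since each dataset has its own
        # definitions and labeling scheme.
        self.heads = nn.ModuleDict(
            {name: nn.Linear(256, n) for name, n in dataset_label_counts.items()}
        )

    def forward(self, features: torch.Tensor, dataset: str) -> torch.Tensor:
        # Route each batch through the shared encoder, then through
        # the head belonging to the batch's source dataset.
        return self.heads[dataset](self.encoder(features))

# Dataset-level leave-one-out evaluation: designate one dataset for
# testing and jointly train on all the others.
datasets = {"datasetA": 2, "datasetB": 3, "datasetC": 4}  # hypothetical names / label counts
for held_out in datasets:
    train_sets = {k: v for k, v in datasets.items() if k != held_out}
    model = MultiTaskHateSpeechModel(feature_dim=768, dataset_label_counts=train_sets)
    # ... train on train_sets (mixing batches across datasets, each
    # routed to its own head), then test the resulting model on the
    # held-out, previously unseen dataset ...

The per-dataset heads let the shared encoder learn a common representation of problematic speech while absorbing each dataset's incompatible label scheme, which is what allows the held-out evaluation to probe cross-dataset generalization.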