Detect All Abuse! Toward Universal Abusive Language Detection Models

Paper Title

Detect All Abuse! Toward Universal Abusive Language Detection Models

Authors

Wang, Kunze; Lu, Dong; Han, Soyeon Caren; Long, Siqu; Poon, Josiah

Abstract

Online abusive language detection (ALD) has become a societal issue of increasing importance in recent years. Several previous works in online ALD focused on solving a single abusive language problem in a single domain, like Twitter, and have not been successfully transferable to the general ALD task or domain. In this paper, we introduce a new generic ALD framework, MACAS, which is capable of addressing several types of ALD tasks across different domains. Our generic framework covers multi-aspect abusive language embeddings that represent the target and content aspects of abusive language and applies a textual graph embedding that analyses the user's linguistic behaviour. Then, we propose and use the cross-attention gate flow mechanism to embrace multiple aspects of abusive language. Quantitative and qualitative evaluation results show that our ALD algorithm rivals or exceeds the six state-of-the-art ALD algorithms across seven ALD datasets covering multiple aspects of abusive language and different online community domains.
