Towards Ethics by Design in Online Abusive Content Detection

Paper Title

Towards Ethics by Design in Online Abusive Content Detection

Authors

Svetlana Kiritchenko, Isar Nejadgholi

Abstract

To support safety and inclusion in online communications, significant efforts in NLP research have been put towards addressing the problem of abusive content detection, commonly defined as a supervised classification task. The research effort has spread out across several closely related sub-areas, such as detection of hate speech, toxicity, cyberbullying, etc. There is a pressing need to consolidate the field under a common framework for task formulation, dataset design and performance evaluation. Further, despite current technologies achieving high classification accuracies, several ethical issues have been revealed. We bring ethical issues to the forefront and propose a unified framework as a two-step process. First, online content is categorized around personal and identity-related subject matters. Second, the severity of abuse is identified through comparative annotation within each category. The novel framework is guided by the Ethics by Design principle and is a step towards building more accurate and trusted models.
