Paper title
Suum Cuique: Studying Bias in Taboo Detection with a Community Perspective
Paper authors
Paper abstract
Prior research has discussed and illustrated the need to consider linguistic norms at the community level when studying taboo (hateful, offensive, toxic, etc.) language. However, a methodology for doing so that is firmly founded on community language norms is still largely absent. This can lead both to biases in taboo text classification and to limitations in our understanding of the causes of bias. We propose a method to study bias in taboo classification and annotation in which a community perspective is front and center. This is accomplished by using special classifiers tuned for each community's language. In essence, these classifiers represent community-level language norms. We use them to study bias and find, for example, that biases are largest against African Americans (in 7 of 10 datasets and for all 3 classifiers examined). In contrast to previous papers, we also study other communities and find, for example, strong biases against South Asians. In a small-scale user study, we illustrate our key idea: common utterances, i.e., those with high alignment scores with a community (community classifier confidence scores), are unlikely to be regarded as taboo. Annotators who are community members contradict taboo classification decisions and annotations in a majority of instances. This paper is a significant step toward reducing false-positive taboo decisions that, over time, harm minority communities.