Paper Title
Towards classification parity across cohorts
Paper Authors
Paper Abstract
Recently, there has been a lot of interest in ensuring algorithmic fairness in machine learning, where the central question is how to prevent sensitive information (e.g., knowledge about the ethnic group of an individual) from adding "unfair" bias to a learning algorithm (Feldman et al. (2015), Zemel et al. (2013)). This has led to several debiasing algorithms for word embeddings (Qian et al. (2019), Bolukbasi et al. (2016)), coreference resolution (Zhao et al. (2018a)), semantic role labeling (Zhao et al. (2017)), etc. Most of this existing work deals with explicit sensitive features such as gender, occupation, or race, and therefore does not apply to data where such features are not captured due to privacy concerns. In this research work, we aim to achieve classification parity across explicit as well as implicit sensitive features. We define explicit cohorts as groups of people based on explicit sensitive attributes provided in the data (age, gender, race), whereas implicit cohorts are defined as groups of people with similar language usage. We obtain implicit cohorts by clustering embeddings of each individual trained on the language they generated using a language model. We achieve two primary objectives in this work: (1) we experimentally demonstrate classification performance differences across cohorts based on implicit and explicit features, and (2) we improve classification parity by modifying the loss function to minimize the range of model performance across cohorts.
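The two technical ideas in the abstract (clustering per-user embeddings into implicit cohorts, and penalizing the range of per-cohort performance) can be sketched as follows. This is a minimal illustration, not the paper's implementation: the user embeddings are random stand-ins for language-model-derived ones, `KMeans` is one plausible clustering choice, and `parity_loss` is a hedged reading of "minimize the range of model performances across cohorts" as a max-minus-min penalty on mean per-cohort loss.

```python
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)

# Hypothetical per-user embeddings; in the paper these come from a
# language model trained on each individual's text.
user_embeddings = rng.normal(size=(100, 16))

# Implicit cohorts: cluster users by the similarity of their embeddings.
cohort_ids = KMeans(n_clusters=4, n_init=10, random_state=0).fit_predict(user_embeddings)

def parity_loss(per_example_loss, cohort_ids, lam=1.0):
    """Mean loss plus lam times the range (max - min) of mean
    per-cohort losses, pushing cohort performances together."""
    cohort_means = np.array([
        per_example_loss[cohort_ids == c].mean()
        for c in np.unique(cohort_ids)
    ])
    return per_example_loss.mean() + lam * (cohort_means.max() - cohort_means.min())

# Stand-in per-example classification losses.
per_example_loss = rng.uniform(0.0, 1.0, size=100)
total = parity_loss(per_example_loss, cohort_ids, lam=0.5)
```

Setting `lam=0` recovers the ordinary mean loss; larger values trade average accuracy for parity across cohorts.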