Individually Fair Learning with One-Sided Feedback

Paper Title

Individually Fair Learning with One-Sided Feedback

Authors

Bechavod, Yahav; Roth, Aaron

Abstract

We consider an online learning problem with one-sided feedback, in which the learner is able to observe the true label only for positively predicted instances. On each round, $k$ instances arrive and receive classification outcomes according to a randomized policy deployed by the learner, whose goal is to maximize accuracy while deploying individually fair policies. We first extend the framework of Bechavod et al. (2020), which relies on the existence of a human fairness auditor for detecting fairness violations, to instead incorporate feedback from dynamically selected panels of multiple, possibly inconsistent, auditors. We then construct an efficient reduction from our problem of online learning with one-sided feedback and a panel reporting fairness violations to the contextual combinatorial semi-bandit problem (Cesa-Bianchi & Lugosi, 2009; György et al., 2007). Finally, we show how to leverage the guarantees of two algorithms in the contextual combinatorial semi-bandit setting, Exp2 (Bubeck et al., 2012) and the oracle-efficient Context-Semi-Bandit-FTPL (Syrgkanis et al., 2016), to provide multi-criteria no-regret guarantees simultaneously for accuracy and fairness. Our results eliminate two potential sources of bias from prior work: the "hidden outcomes" that are not available to an algorithm operating in the full information setting, and human biases that might be present in any single human auditor, but can be mitigated by selecting a well-chosen panel.
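The one-sided feedback model described in the abstract can be illustrated with a minimal Python sketch. This is not the paper's algorithm: it only simulates a single round in which $k$ instances receive 0/1 predictions from a randomized policy and the true label is revealed only for positive predictions. The function names (`sample_predictions`, `one_sided_feedback`, `observed_mistakes`) and the representation of the policy as a list of acceptance probabilities are assumptions made for illustration.

```python
import random
from typing import List, Optional, Sequence


def sample_predictions(probabilities: Sequence[float], rng: random.Random) -> List[int]:
    """Draw one 0/1 prediction per instance.

    Assumption: the randomized policy is given as a probability of
    predicting positive for each of the k instances in the round.
    """
    return [1 if rng.random() < p else 0 for p in probabilities]


def one_sided_feedback(
    predictions: Sequence[int], true_labels: Sequence[int]
) -> List[Optional[int]]:
    """Reveal the true label only where the prediction was positive.

    Negative predictions yield None: their outcomes stay hidden.
    """
    return [y if pred == 1 else None for pred, y in zip(predictions, true_labels)]


def observed_mistakes(
    predictions: Sequence[int], feedback: Sequence[Optional[int]]
) -> int:
    """Count only the errors the learner can actually see (false positives)."""
    return sum(
        1
        for pred, obs in zip(predictions, feedback)
        if obs is not None and pred != obs
    )


if __name__ == "__main__":
    rng = random.Random(0)
    true_labels = [1, 1, 0, 0]                  # hypothetical ground truth, k = 4
    policy_probabilities = [0.9, 0.2, 0.7, 0.1]  # hypothetical randomized policy
    predictions = sample_predictions(policy_probabilities, rng)
    feedback = one_sided_feedback(predictions, true_labels)
    print("predictions:", predictions)
    print("feedback:   ", feedback)
    print("visible mistakes:", observed_mistakes(predictions, feedback))
```

In this sketch, false negatives never appear in the feedback, which is the "hidden outcomes" issue the abstract refers to: a learner that counts only visible mistakes would underestimate its true error.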
