Paper Title

Achieving User-Side Fairness in Contextual Bandits

Paper Authors

Wen Huang, Kevin Labille, Xintao Wu, Dongwon Lee, Neil Heffernan

Paper Abstract

Personalized recommendation based on multi-armed bandit (MAB) algorithms has been shown to lead to high utility and efficiency, as it can dynamically adapt the recommendation strategy based on feedback. However, unfairness can arise in personalized recommendation. In this paper, we study how to achieve user-side fairness in personalized recommendation. We formulate fair personalized recommendation as a modified contextual bandit and focus on achieving fairness for the individual who is being recommended an item, as opposed to achieving fairness for the items being recommended. We introduce and define a metric that captures fairness in terms of the rewards received by both the privileged and protected groups. We develop a fair contextual bandit algorithm, Fair-LinUCB, that improves upon the traditional LinUCB algorithm to achieve group-level fairness of users. Our algorithm detects and monitors unfairness while it learns to recommend personalized videos to students to achieve high efficiency. We provide a theoretical regret analysis and show that our algorithm has a slightly higher regret bound than LinUCB. We conduct extensive experimental evaluations comparing the performance of our fair contextual bandit to that of LinUCB, and show that our approach achieves group-level fairness while maintaining high utility.
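For orientation, the sketch below illustrates the standard LinUCB baseline that Fair-LinUCB improves upon, together with a hypothetical monitor for the reward gap between the privileged and protected user groups, the quantity the paper's fairness metric is defined over. This is a minimal illustration under assumed names (`LinUCB`, `GroupRewardGapMonitor`, `alpha`), not the authors' implementation; in particular, it does not reproduce how Fair-LinUCB actually adjusts arm selection to close the gap.

```python
import numpy as np

class LinUCB:
    """Standard LinUCB: one ridge-regression model per arm, with an
    upper-confidence-bound exploration bonus (Li et al., 2010)."""

    def __init__(self, n_arms, dim, alpha=1.0):
        self.alpha = alpha                                # exploration strength
        self.A = [np.eye(dim) for _ in range(n_arms)]     # per-arm d x d design matrix
        self.b = [np.zeros(dim) for _ in range(n_arms)]   # per-arm reward vector

    def select(self, x):
        """Return the arm with the highest UCB score for context vector x."""
        scores = []
        for A, b in zip(self.A, self.b):
            A_inv = np.linalg.inv(A)
            theta = A_inv @ b                             # ridge estimate of arm weights
            scores.append(theta @ x + self.alpha * np.sqrt(x @ A_inv @ x))
        return int(np.argmax(scores))

    def update(self, arm, x, reward):
        """Rank-one update of the chosen arm's statistics."""
        self.A[arm] += np.outer(x, x)
        self.b[arm] += reward * x


class GroupRewardGapMonitor:
    """Hypothetical helper (not from the paper): tracks mean cumulative reward
    per user group so the privileged-vs-protected reward gap can be observed
    as recommendations are made."""

    def __init__(self):
        self.total = {"privileged": 0.0, "protected": 0.0}
        self.count = {"privileged": 0, "protected": 0}

    def record(self, group, reward):
        self.total[group] += reward
        self.count[group] += 1

    def gap(self):
        mean = {g: self.total[g] / max(self.count[g], 1) for g in self.total}
        return mean["privileged"] - mean["protected"]
```

Per the abstract, Fair-LinUCB augments this kind of learning loop so that arm selection reacts to the observed group-level reward gap; the exact adjustment and its regret analysis are given in the paper.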
