Understanding or Manipulation: Rethinking Online Performance Gains of Modern Recommender Systems

Paper Title

Understanding or Manipulation: Rethinking Online Performance Gains of Modern Recommender Systems

Authors

Zhengbang Zhu, Rongjun Qin, Junjie Huang, Xinyi Dai, Yang Yu, Yong Yu, Weinan Zhang

Abstract

Recommender systems are expected to be assistants that help human users find relevant information automatically without explicit queries. As recommender systems evolve, increasingly sophisticated learning techniques are applied and have achieved better performance on user engagement metrics such as clicks and browsing time. The increase in measured performance, however, has two possible attributions: a better understanding of user preferences, and a more proactive ability to exploit human bounded rationality to induce user over-consumption. A natural follow-up question is whether current recommendation algorithms are manipulating user preferences. If so, can we measure the level of manipulation? In this paper, we present a general framework for benchmarking the degree of manipulation of recommendation algorithms in both slate recommendation and sequential recommendation scenarios. The framework consists of four stages: initial preference calculation, training data collection, algorithm training and interaction, and metrics calculation, which involves two proposed metrics. We benchmark some representative recommendation algorithms on both synthetic and real-world datasets under the proposed framework. We observe that a high online click-through rate does not necessarily mean a better understanding of users' initial preferences, but instead ends up prompting users to choose more documents they initially did not favor. Moreover, we find that the training data has a notable impact on the degree of manipulation, and algorithms with more powerful modeling abilities are more sensitive to this impact. The experiments also verify the usefulness of the proposed metrics for measuring the degree of manipulation. We advocate that future recommendation algorithm research be treated as an optimization problem with constrained user preference manipulation.
