Two-Stage Neural Contextual Bandits for Personalised News Recommendation

Paper Title

Two-Stage Neural Contextual Bandits for Personalised News Recommendation

Authors

Mengyan Zhang, Thanh Nguyen-Tang, Fangzhao Wu, Zhenyu He, Xing Xie, Cheng Soon Ong

Abstract

We consider the problem of personalised news recommendation where each user consumes news in a sequential fashion. Existing personalised news recommendation methods focus on exploiting user interests and ignore exploration in recommendation, which leads to biased feedback loops and hurts recommendation quality in the long term. We build on contextual bandit recommendation strategies, which naturally address the exploitation-exploration trade-off. The main challenges are exploring the large-scale item space in a computationally efficient way and utilising deep representations with uncertainty. We propose a two-stage hierarchical topic-news deep contextual bandits framework to efficiently learn user preferences when there are many news items. We use deep learning representations for users and news, and generalise the neural upper confidence bound (UCB) policies to generalised additive UCB and bilinear UCB. Empirical results on a large-scale news recommendation dataset show that our proposed policies are efficient and outperform the baseline bandit policies.
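
The two-stage idea in the abstract (choose a topic first, then a news item within that topic, each by an upper confidence bound) can be illustrated with a minimal sketch. This is not the authors' method: it swaps the neural, additive and bilinear UCB policies for a plain per-arm linear UCB, and the names `LinUCBArm`, `two_stage_select`, `two_stage_update` and the parameter `alpha` are hypothetical, introduced only for this illustration.

```python
import math


class LinUCBArm:
    """Per-arm ridge-regression reward model with a UCB bonus (assumed stand-in)."""

    def __init__(self, dim, alpha=1.0):
        self.alpha = alpha  # exploration weight (hypothetical parameter)
        # A^{-1} starts as the identity (ridge parameter 1); b starts at zero.
        self.A_inv = [[1.0 if i == j else 0.0 for j in range(dim)]
                      for i in range(dim)]
        self.b = [0.0] * dim

    def _A_inv_times(self, x):
        return [sum(a * xi for a, xi in zip(row, x)) for row in self.A_inv]

    def ucb(self, x):
        # Estimated reward theta^T x plus alpha * sqrt(x^T A^{-1} x).
        theta = self._A_inv_times(self.b)
        mean = sum(t * xi for t, xi in zip(theta, x))
        Ax = self._A_inv_times(x)
        var = sum(xi * v for xi, v in zip(x, Ax))
        return mean + self.alpha * math.sqrt(max(var, 0.0))

    def update(self, x, reward):
        # Sherman-Morrison rank-one update of A^{-1} after A += x x^T.
        Ax = self._A_inv_times(x)
        denom = 1.0 + sum(xi * v for xi, v in zip(x, Ax))
        dim = len(x)
        for i in range(dim):
            for j in range(dim):
                self.A_inv[i][j] -= Ax[i] * Ax[j] / denom
        for i in range(dim):
            self.b[i] += reward * x[i]


def two_stage_select(user_vec, topic_arms, news_arms_by_topic):
    """Stage 1: pick the topic with the highest UCB. Stage 2: pick a news item in it."""
    topic = max(topic_arms, key=lambda t: topic_arms[t].ucb(user_vec))
    candidates = news_arms_by_topic[topic]
    news = max(candidates, key=lambda n: candidates[n].ucb(user_vec))
    return topic, news


def two_stage_update(user_vec, topic, news, reward, topic_arms, news_arms_by_topic):
    """Feed the observed click (reward) back to both the topic and the news model."""
    topic_arms[topic].update(user_vec, reward)
    news_arms_by_topic[topic][news].update(user_vec, reward)
```

Scoring topics before items means each round evaluates only the topics plus the items in one topic rather than every news item, which is the efficiency motivation the abstract gives for the hierarchy. In this sketch an arm that has been tried without a click gets a smaller confidence bonus, so untried topics and items are explored next.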
