Paper Title

Deep Retrieval: Learning A Retrievable Structure for Large-Scale Recommendations

Authors

Weihao Gao, Xiangjun Fan, Chong Wang, Jiankai Sun, Kai Jia, Wenzhi Xiao, Ruofan Ding, Xingyan Bin, Hui Yang, Xiaobing Liu

Abstract

One of the core problems in large-scale recommendations is to retrieve top relevant candidates accurately and efficiently, preferably in sub-linear time. Previous approaches are mostly based on a two-step procedure: first learn an inner-product model, and then use some approximate nearest neighbor (ANN) search algorithm to find top candidates. In this paper, we present Deep Retrieval (DR), to learn a retrievable structure directly with user-item interaction data (e.g. clicks) without resorting to the Euclidean space assumption in ANN algorithms. DR's structure encodes all candidate items into a discrete latent space. Those latent codes for the candidates are model parameters and learnt together with other neural network parameters to maximize the same objective function. With the model learnt, a beam search over the structure is performed to retrieve the top candidates for reranking. Empirically, we first demonstrate that DR, with sub-linear computational complexity, can achieve almost the same accuracy as the brute-force baseline on two public datasets. Moreover, we show that, in a live production recommendation system, a deployed DR approach significantly outperforms a well-tuned ANN baseline in terms of engagement metrics. To the best of our knowledge, DR is among the first non-ANN algorithms successfully deployed at the scale of hundreds of millions of items for industrial recommendation systems.
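
The retrieval step mentioned in the abstract (a beam search over a structure of discrete latent codes) can be illustrated with a minimal sketch. The abstract does not spell out the structure, so the code below makes its own assumptions: each item is indexed by a path of `num_layers` discrete codes, and a model scores the next code given the user and the path prefix. The names `beam_search`, `retrieve`, `toy_scorer`, and `path_to_items`, as well as all probabilities and item IDs, are hypothetical and exist only for illustration; `toy_scorer` stands in for the learnt network.

```python
import math
from typing import Callable, Dict, List, Sequence, Tuple

Path = Tuple[int, ...]


def beam_search(
    step_log_probs: Callable[[Path], Sequence[float]],
    num_layers: int,
    beam_size: int,
) -> List[Tuple[Path, float]]:
    """Keep the beam_size highest-scoring code paths, one layer at a time."""
    beams: List[Tuple[Path, float]] = [((), 0.0)]  # (path prefix, log-probability)
    for _ in range(num_layers):
        candidates: List[Tuple[Path, float]] = []
        for prefix, score in beams:
            # Extend every surviving prefix by every possible next code.
            for code, log_p in enumerate(step_log_probs(prefix)):
                candidates.append((prefix + (code,), score + log_p))
        candidates.sort(key=lambda c: c[1], reverse=True)
        beams = candidates[:beam_size]
    return beams


def retrieve(
    beams: List[Tuple[Path, float]],
    path_to_items: Dict[Path, List[str]],
) -> List[str]:
    """Collect the items stored on the retrieved paths, without duplicates."""
    seen = set()
    items: List[str] = []
    for path, _ in beams:
        for item in path_to_items.get(path, []):
            if item not in seen:
                seen.add(item)
                items.append(item)
    return items


def toy_scorer(prefix: Path) -> List[float]:
    """Hypothetical stand-in for a learnt model; a real one would also use the user."""
    if not prefix:
        probs = [0.6, 0.3, 0.1]
    elif prefix[-1] == 0:
        probs = [0.1, 0.7, 0.2]
    else:
        probs = [0.5, 0.4, 0.1]
    return [math.log(p) for p in probs]


# Hypothetical index from code paths to candidate items.
path_to_items = {
    (0, 1): ["item_a", "item_b"],
    (1, 0): ["item_b", "item_c"],
    (2, 2): ["item_d"],
}

beams = beam_search(toy_scorer, num_layers=2, beam_size=2)
items = retrieve(beams, path_to_items)
print(beams)
print(items)
```

Each layer scores at most `beam_size` times the number of codes per layer, so under these assumptions the cost depends on the beam width and the structure size rather than on the number of items. The items returned here would then be passed to a reranking stage.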
