Improving Multi-Interest Network with Stable Learning

Paper Title

Improving Multi-Interest Network with Stable Learning

Authors

Zhaocheng Liu, Yingtao Luo, Di Zeng, Qiang Liu, Daqing Chang, Dongying Kong, Zhi Chen

Abstract

Modeling users' dynamic preferences from historical behaviors lies at the core of modern recommender systems. Because user interests are diverse, recent work proposes multi-interest networks that encode historical behaviors into multiple interest vectors. In real scenarios, the items corresponding to the captured interests are usually retrieved together to get exposure and are then collected into the training data, which produces dependencies among interests. Unfortunately, multi-interest networks may incorrectly concentrate on these subtle dependencies among captured interests. Misled by them, the model captures spurious correlations between irrelevant interests and targets, which makes its predictions unstable when the training and test distributions do not match. In this paper, we introduce the widely used Hilbert-Schmidt Independence Criterion (HSIC) to measure the degree of independence among captured interests and empirically show that a continuous increase of HSIC may harm model performance. Based on this, we propose a novel multi-interest network, named DEep Stable Multi-Interest Learning (DESMIL), which tries to eliminate the influence of subtle dependencies among captured interests by learning weights for training samples, so that the model concentrates more on the underlying true causation. We conduct extensive experiments on public recommendation datasets, a large-scale industrial dataset, and synthetic datasets that simulate out-of-distribution data. Experimental results demonstrate that our proposed DESMIL outperforms state-of-the-art models by a significant margin. In addition, we conduct a comprehensive model analysis to reveal, to a certain extent, why DESMIL works.
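To make the HSIC measurement mentioned in the abstract more concrete, the following is a minimal sketch of the standard biased empirical HSIC estimator, trace(KHLH) / (n - 1)^2, computed between two sets of vectors. It is not the authors' implementation: the Gaussian kernel, the bandwidth `sigma`, the function names, and the toy "interest" vectors are all assumptions made for illustration, and the sample-reweighting step that DESMIL learns is not shown.

```python
import math
import random


def gaussian_gram(xs, sigma=1.0):
    """Gram matrix with entries exp(-||x_i - x_j||^2 / (2 * sigma^2))."""
    n = len(xs)
    gram = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            sq_dist = sum((a - b) ** 2 for a, b in zip(xs[i], xs[j]))
            gram[i][j] = math.exp(-sq_dist / (2.0 * sigma ** 2))
    return gram


def double_center(gram):
    """Return H K H, where H = I - (1/n) * ones is the centering matrix."""
    n = len(gram)
    row_means = [sum(row) / n for row in gram]
    col_means = [sum(gram[i][j] for i in range(n)) / n for j in range(n)]
    total_mean = sum(row_means) / n
    return [
        [gram[i][j] - row_means[i] - col_means[j] + total_mean for j in range(n)]
        for i in range(n)
    ]


def hsic(xs, ys, sigma=1.0):
    """Biased empirical HSIC between paired samples xs and ys."""
    n = len(xs)
    k_centered = double_center(gaussian_gram(xs, sigma))
    l_centered = double_center(gaussian_gram(ys, sigma))
    # trace(K H L H) equals the elementwise product sum of the two
    # centered (symmetric) Gram matrices.
    total = sum(
        k_centered[i][j] * l_centered[i][j] for i in range(n) for j in range(n)
    )
    return total / (n - 1) ** 2


# Hypothetical toy data: interest_b depends on interest_a, interest_c does not.
rng = random.Random(0)
interest_a = [[rng.gauss(0, 1), rng.gauss(0, 1)] for _ in range(100)]
interest_b = [[a0 + 0.1 * rng.gauss(0, 1), a1 + 0.1 * rng.gauss(0, 1)]
              for a0, a1 in interest_a]
interest_c = [[rng.gauss(0, 1), rng.gauss(0, 1)] for _ in range(100)]

dependent_score = hsic(interest_a, interest_b)
independent_score = hsic(interest_a, interest_c)
```

In this sketch, `dependent_score` should come out clearly larger than `independent_score`, which is the sense in which a rising HSIC signals growing dependence among interest vectors.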
