Paper Title
Attention-based Multimodal Feature Representation Model for Micro-video Recommendation
Author
Abstract
In recommender systems, most models combine an embedding layer with a multilayer feedforward neural network: high-dimensional sparse raw features are projected to low-dimensional dense embeddings in the embedding layer and then fed into fully connected layers to produce predictions. However, this approach has an obvious shortcoming: the input features are treated as independent of one another, when in fact there are internal correlations among features, and different features carry different importance for the recommendation. To address this, this paper adopts a self-attention mechanism to mine the internal correlations among features as well as their relative importance. In recent years, self-attention, a special form of the attention mechanism, has been favored by many researchers: it captures the internal correlations of data or features by attending to the input itself, reducing the dependence on external sources. This paper therefore adopts a multi-head self-attention mechanism to mine the internal correlations among features and thereby learn an internal representation of the features. At the same time, rich information is often hidden in the interactions between features: a new representation obtained by crossing two features may capture a new reason why a user likes an item. However, not all cross features are meaningful; that is, feature combinations have limited expressive power. This paper therefore adopts an attention-based method to learn an external cross representation of the features.
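The abstract does not give the model's equations, but the multi-head self-attention over feature embeddings it describes can be sketched in NumPy, in the spirit of AutoInt-style models. All names and shapes here (`multihead_self_attention`, the per-head slicing of `W_q`/`W_k`/`W_v`, the field count) are illustrative assumptions, not the paper's actual implementation:

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def multihead_self_attention(E, W_q, W_k, W_v, num_heads):
    """Multi-head self-attention over feature-field embeddings.

    E: (num_fields, d) matrix, one embedding row per feature field.
    W_q, W_k, W_v: (d, d) projection matrices, split column-wise per head.
    Returns a (num_fields, d) matrix of attention-refined field embeddings.
    """
    num_fields, d = E.shape
    d_h = d // num_heads  # per-head dimension
    heads = []
    for h in range(num_heads):
        cols = slice(h * d_h, (h + 1) * d_h)
        Q = E @ W_q[:, cols]
        K = E @ W_k[:, cols]
        V = E @ W_v[:, cols]
        # (num_fields, num_fields) matrix of field-to-field correlations.
        A = softmax(Q @ K.T / np.sqrt(d_h), axis=-1)
        heads.append(A @ V)
    # Concatenating the heads restores the original embedding dimension.
    return np.concatenate(heads, axis=1)

# Toy usage: 4 feature fields, embedding dimension 8, 2 heads (all hypothetical).
rng = np.random.default_rng(0)
F, d, H = 4, 8, 2
E = rng.normal(size=(F, d))
W_q, W_k, W_v = [rng.normal(size=(d, d)) for _ in range(3)]
out = multihead_self_attention(E, W_q, W_k, W_v, H)
print(out.shape)  # (4, 8)
```

Each output row is a re-weighted mixture of all field embeddings, which is how the mechanism encodes the "internal correlations among features" the abstract refers to.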
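The attention-based feature crossing can likewise be sketched, here in the spirit of an AFM-style attentive pooling over pairwise element-wise crosses: each pair of field embeddings forms a cross vector, a small attention network scores its usefulness, and the scores weight the final pooled representation. The function name, the ReLU attention network, and the weight shapes are assumptions for illustration:

```python
import numpy as np

def attentive_feature_crossing(E, W, h_vec):
    """Attention-weighted pooling of pairwise feature crosses.

    E: (num_fields, d) field embeddings.
    W: (d, t) attention projection; h_vec: (t,) attention scoring vector.
    Returns (pair_weights, pooled) where pooled is the (d,) attended cross vector.
    """
    F, d = E.shape
    # All C(F, 2) element-wise products, one cross vector per field pair.
    crosses = np.array([E[i] * E[j] for i in range(F) for j in range(i + 1, F)])
    # One-layer ReLU attention network scores each cross's importance.
    scores = np.maximum(crosses @ W, 0.0) @ h_vec
    a = np.exp(scores - scores.max())
    a = a / a.sum()  # softmax: low-weight pairs are suppressed
    pooled = (a[:, None] * crosses).sum(axis=0)
    return a, pooled

# Toy usage: 4 fields, embedding dim 8, attention size 6 (all hypothetical).
rng = np.random.default_rng(1)
F, d, t = 4, 8, 6
E = rng.normal(size=(F, d))
W = rng.normal(size=(d, t))
h_vec = rng.normal(size=(t,))
weights, pooled = attentive_feature_crossing(E, W, h_vec)
print(weights.shape, pooled.shape)  # (6,) (8,)
```

Because the softmax weights down uninformative pairs, this addresses the abstract's point that "not all cross features are meaningful" while still letting useful crosses contribute to the prediction.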