Paper Title
ViT-CAT: Parallel Vision Transformers with Cross Attention Fusion for Popularity Prediction in MEC Networks
Paper Authors
Paper Abstract
Mobile Edge Caching (MEC) is a revolutionary technology for the Sixth Generation (6G) of wireless networks, promising to significantly reduce users' latency by offering storage capacity at the edge of the network. The efficiency of the MEC network, however, critically depends on its ability to dynamically predict/update the storage of caching nodes with the top-K popular contents. Conventional statistical caching schemes are not robust to the time-variant nature of the underlying pattern of content requests, resulting in a surge of interest in using Deep Neural Networks (DNNs) for time-series popularity prediction in MEC networks. However, existing DNN models within the context of MEC fail to simultaneously capture both the temporal correlations of historical request patterns and the dependencies between multiple contents. This necessitates the development of a new and innovative popularity prediction architecture to tackle this critical challenge. This paper addresses the gap by proposing a novel hybrid caching framework based on the attention mechanism. Referred to as parallel Vision Transformers with Cross Attention (ViT-CAT) Fusion, the proposed architecture consists of two parallel ViT networks, one for capturing temporal correlations and the other for capturing dependencies between different contents. The two branches are followed by a Cross Attention (CA) module acting as the Fusion Center (FC), which allows the proposed ViT-CAT to learn the mutual information between temporal and spatial correlations, improving classification accuracy and reducing the model's complexity by about 8 times. Based on the simulation results, the proposed ViT-CAT architecture outperforms its counterparts in terms of classification accuracy, complexity, and cache-hit ratio.
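To make the described architecture concrete, the sketch below shows one possible reading of the abstract in PyTorch: one transformer branch tokenizes the request history along the time axis, a second branch tokenizes it along the content axis, and a cross-attention module fuses the two before a per-content popularity classifier. All layer sizes, the input layout (a history-by-contents request matrix), and names such as ViTCATSketch, num_contents, and history_len are illustrative assumptions, not the authors' implementation.

```python
# Hedged sketch of a "parallel transformers + cross-attention fusion" popularity
# predictor, loosely following the abstract. Dimensions and names are assumptions.
import torch
import torch.nn as nn


class ViTCATSketch(nn.Module):
    def __init__(self, num_contents=64, history_len=32, dim=64,
                 depth=2, heads=4, num_classes=2):
        super().__init__()
        # Branch 1: tokens along the time axis (temporal correlations).
        self.temporal_proj = nn.Linear(num_contents, dim)
        # Branch 2: tokens along the content axis (dependencies between contents).
        self.content_proj = nn.Linear(history_len, dim)

        def encoder():
            layer = nn.TransformerEncoderLayer(
                d_model=dim, nhead=heads, dim_feedforward=2 * dim, batch_first=True
            )
            return nn.TransformerEncoder(layer, num_layers=depth)

        self.temporal_encoder = encoder()
        self.content_encoder = encoder()

        # Cross-attention fusion: content tokens query the temporal tokens.
        self.cross_attention = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.head = nn.Linear(dim, num_classes)  # per-content popularity class

    def forward(self, requests):
        # requests: (batch, history_len, num_contents) request-count matrix.
        temporal_tokens = self.temporal_encoder(self.temporal_proj(requests))
        content_tokens = self.content_encoder(
            self.content_proj(requests.transpose(1, 2))
        )
        fused, _ = self.cross_attention(
            query=content_tokens, key=temporal_tokens, value=temporal_tokens
        )
        return self.head(fused)  # (batch, num_contents, num_classes)


# Usage with random data: 8 samples, 32 time steps, 64 contents.
model = ViTCATSketch()
logits = model(torch.randn(8, 32, 64))
print(logits.shape)  # torch.Size([8, 64, 2])
```

The per-branch token layout (time-wise vs. content-wise) is what lets one encoder focus on temporal patterns and the other on cross-content dependencies; the exact fusion direction and classification head in the paper may differ from this sketch.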