Paper Title
Reinforcement Learning Based Approaches to Adaptive Context Caching in Distributed Context Management Systems
Authors
Abstract
Performance-metric-driven context caching has a profound impact on throughput and response time in distributed context management systems serving real-time context queries. This paper proposes a reinforcement learning-based approach to adaptively cache context, with the objective of minimizing the cost incurred by context management systems in responding to context queries. Our novel algorithms enable context queries and sub-queries to reuse and repurpose cached context efficiently. The approach differs from traditional data caching approaches in three main ways. First, we make selective context cache admissions without prior knowledge of the context or the context query load. Second, we develop and incorporate innovative heuristic models that estimate the expected performance of caching an item at decision time. Third, our strategy defines a time-aware continuous cache action space. We present two reinforcement learning agents: a value-function-estimating actor-critic agent and a policy-search agent using the deep deterministic policy gradient (DDPG) method. The paper also proposes adaptive policies, such as eviction and cache memory scaling, to complement our objective. Our method is evaluated using a synthetically generated load of context sub-queries and a synthetic data set inspired by real-world data and query samples. We further investigate optimal adaptive caching configurations under different settings. This paper presents, compares, and discusses our findings, showing that the proposed selective caching methods achieve short- and long-term cost- and performance-efficiency. The proposed methods outperform other modes of context management, such as redirector mode and database mode, as well as the cache-all policy, by up to 60% in cost efficiency.
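To make the "time-aware continuous cache action space" idea concrete, the following is a minimal, hedged sketch (not the paper's actual algorithm or heuristic): the continuous action is the planned cache lifetime of an item, and a simple expected-benefit model trades the retrieval cost saved by predicted hits (under an assumed exponentially decaying access rate) against the storage cost of keeping the item cached. All function names and parameters here are illustrative assumptions.

```python
import math

def expected_benefit(rate0, decay, retrieval_cost, storage_cost, lifetime):
    """Expected net cost saving of caching an item for `lifetime` seconds.

    Illustrative model only: access rate starts at `rate0` requests/sec and
    decays exponentially with constant `decay`; each hit saves
    `retrieval_cost`; keeping the item cached costs `storage_cost` per sec.
    """
    # Expected number of hits over [0, lifetime] under the decaying rate.
    expected_hits = rate0 * (1.0 - math.exp(-decay * lifetime)) / decay
    return expected_hits * retrieval_cost - storage_cost * lifetime

def choose_lifetime(rate0, decay, retrieval_cost, storage_cost,
                    horizon=300.0, step=1.0):
    """Scan the continuous lifetime action over [0, horizon].

    Returns (lifetime, benefit); lifetime 0.0 means "do not admit",
    i.e. selective cache admission falls out of the same decision.
    """
    best_t, best_b = 0.0, 0.0
    t = step
    while t <= horizon:
        b = expected_benefit(rate0, decay, retrieval_cost, storage_cost, t)
        if b > best_b:
            best_t, best_b = t, b
        t += step
    return best_t, best_b

# A frequently queried context item is admitted with a long lifetime...
hot_t, hot_b = choose_lifetime(rate0=2.0, decay=0.05,
                               retrieval_cost=1.0, storage_cost=0.01)
# ...while a rarely queried, expensive-to-store item is rejected (lifetime 0).
cold_t, cold_b = choose_lifetime(rate0=0.001, decay=0.05,
                                 retrieval_cost=1.0, storage_cost=0.5)
```

In the paper's setting, an RL agent would learn such a lifetime action from observed query load rather than from a hand-fixed decay model; the sketch only shows why a continuous, time-aware action space makes admission and lifetime a single decision.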