Paper Title


Effective Multi-User Delay-Constrained Scheduling with Deep Recurrent Reinforcement Learning

Authors

Pihe Hu, Ling Pan, Yu Chen, Zhixuan Fang, Longbo Huang

Abstract


Multi-user delay-constrained scheduling is important in many real-world applications, including wireless communication, live streaming, and cloud computing. Yet it poses a critical challenge, since the scheduler must make real-time decisions that guarantee the delay and resource constraints simultaneously, without prior knowledge of the system dynamics, which can be time-varying and hard to estimate. Moreover, many practical scenarios suffer from partial observability, e.g., due to sensing noise or hidden correlations. To tackle these challenges, we propose a deep reinforcement learning (DRL) algorithm named Recurrent Softmax Delayed Deep Double Deterministic Policy Gradient ($\mathtt{RSD4}$), a data-driven method based on a Partially Observable Markov Decision Process (POMDP) formulation. $\mathtt{RSD4}$ guarantees resource and delay constraints via a Lagrangian dual and delay-sensitive queues, respectively. It also efficiently handles partial observability through a memory mechanism enabled by a recurrent neural network (RNN), and introduces user-level decomposition and node-level merging to ensure scalability. Extensive experiments on simulated and real-world datasets demonstrate that $\mathtt{RSD4}$ is robust to system dynamics and partially observable environments, and achieves superior performance over existing DRL and non-DRL-based methods.
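The abstract mentions enforcing the resource constraint through a Lagrangian dual. A minimal sketch of that idea, independent of the paper's actual $\mathtt{RSD4}$ update rules: maintain a multiplier $\lambda$ that prices resource consumption, and raise or lower it by dual ascent depending on whether the scheduler's average usage exceeds the budget. The toy policy, step sizes, and reward shape below are illustrative assumptions, not the paper's method.

```python
import math

# Assumed toy setup: maximize a concave reward subject to an average
# per-step resource budget, using dual ascent on a Lagrange multiplier.
resource_budget = 1.0   # allowed average resource consumption per step
lam = 0.0               # Lagrange multiplier (price of resource use)
lr_lam = 0.05           # dual-ascent step size

def act(lam):
    """Toy policy: serve less aggressively as the resource price rises.

    Returns (reward, resource_used) for one step. This stands in for a
    learned policy; in constrained RL the policy would instead be trained
    against the Lagrangian reward  r - lam * resource_used.
    """
    intensity = 2.0 / (1.0 + lam)      # effort shrinks as price grows
    reward = math.log1p(intensity)     # diminishing returns on effort
    return reward, intensity

for _ in range(2000):
    _, used = act(lam)
    # Dual ascent: raise the price when the budget is exceeded,
    # lower it (clipped at 0) when there is slack.
    lam = max(0.0, lam + lr_lam * (used - resource_budget))

# At the fixed point the policy's resource use matches the budget.
_, used = act(lam)
print(round(lam, 2), round(used, 2))
```

With this toy policy the multiplier settles where usage equals the budget (here $\lambda = 1$, since $2/(1+\lambda) = 1$), illustrating how the dual variable automatically prices the constraint without the scheduler knowing the dynamics in advance.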
