Risk-Aware Energy Scheduling for Edge Computing with Microgrid: A Multi-Agent Deep Reinforcement Learning Approach

Title

Risk-Aware Energy Scheduling for Edge Computing with Microgrid: A Multi-Agent Deep Reinforcement Learning Approach

Authors

Munir, Md. Shirajum; Abedin, Sarder Fakhrul; Tran, Nguyen H.; Han, Zhu; Huh, Eui-Nam; Hong, Choong Seon

Abstract

In recent years, multi-access edge computing (MEC) has become a key enabler for handling the massive expansion of Internet of Things (IoT) applications and services. However, the energy consumption of an MEC network depends on volatile tasks, which induce risk in energy demand estimation. As an energy supplier, a microgrid can facilitate seamless energy supply. However, the risk associated with energy supply also increases because energy generation from renewable and non-renewable sources is unpredictable. In particular, the risk of energy shortfall stems from uncertainties in both energy consumption and generation. In this paper, we study a risk-aware energy scheduling problem for a microgrid-powered MEC network. First, we formulate an optimization problem that considers the conditional value-at-risk (CVaR) measurement for both energy consumption and generation, where the objective is to minimize the expected residual of scheduled energy for the MEC network, and we show that this problem is NP-hard. Second, we analyze the formulated problem using a multi-agent stochastic game that ensures a joint-policy Nash equilibrium, and we show the convergence of the proposed model. Third, we derive the solution by applying a multi-agent deep reinforcement learning (MADRL)-based asynchronous advantage actor-critic (A3C) algorithm with shared neural networks. This method mitigates the curse of dimensionality of the state space and chooses the best policy among the agents for the proposed problem. Finally, the experimental results show that, by considering CVaR, the proposed model achieves a significant performance gain in high-accuracy energy scheduling over both the single-agent and random-agent models.
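The abstract names CVaR as its risk measure but does not give the formulation. As a rough illustration only, the sketch below estimates CVaR from a sample of energy shortfalls (demand minus generation) using the common empirical definition: the mean of the worst (1 - alpha) share of losses. The function name `empirical_cvar`, the confidence level, and the simulated demand and generation figures are all assumptions made for this example and are not taken from the paper.

```python
import math
import random


def empirical_cvar(losses, alpha=0.95):
    """Return (VaR, CVaR) estimated from a sample of losses.

    VaR is taken as the ceil(alpha * n)-th smallest loss. CVaR is the
    mean of that loss and every larger one in the sorted sample.
    """
    if not 0.0 < alpha < 1.0:
        raise ValueError("alpha must lie strictly between 0 and 1")
    ordered = sorted(losses)
    if not ordered:
        raise ValueError("losses must be non-empty")
    k = math.ceil(alpha * len(ordered)) - 1  # index of the empirical quantile
    tail = ordered[k:]                       # the worst-case tail of the sample
    return ordered[k], sum(tail) / len(tail)


if __name__ == "__main__":
    # Hypothetical data: demand and generation in kWh, drawn at random.
    rng = random.Random(0)
    shortfalls = [
        rng.gauss(100.0, 15.0) - rng.gauss(105.0, 25.0)  # demand - generation
        for _ in range(10_000)
    ]
    var, cvar = empirical_cvar(shortfalls, alpha=0.95)
    print(f"VaR(95%) = {var:.2f} kWh, CVaR(95%) = {cvar:.2f} kWh")
```

Because CVaR averages over the tail rather than reading off a single quantile, it is never smaller than the VaR at the same level, which is why it is often preferred when rare but large shortfalls matter.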
