Paper Title
Learning-Based Predictive Control via Real-Time Aggregate Flexibility
Paper Authors
Paper Abstract
Aggregators have emerged as crucial tools for the coordination of distributed, controllable loads. To be used effectively, an aggregator must be able to communicate the available flexibility of the loads it controls, known as the aggregate flexibility, to a system operator. However, most existing aggregate flexibility measures are slow-timescale estimates, and much less attention has been paid to real-time coordination between an aggregator and an operator. In this paper, we consider solving an online optimization in a closed-loop system and present a design of real-time aggregate flexibility feedback, termed the maximum entropy feedback (MEF). In addition to deriving analytic properties of the MEF, we show that, by combining learning and control, it can be approximated using reinforcement learning and used as a penalty term in a novel control algorithm -- the penalized predictive control (PPC), which modifies vanilla model predictive control (MPC). The benefits of our scheme are: (1) Efficient communication. An operator running PPC does not need to know the exact states and constraints of the loads, only the MEF. (2) Fast computation. The PPC formulation often has far fewer variables than an MPC formulation. (3) Lower costs. We show that, under certain regularity assumptions, the PPC is optimal. We illustrate the efficacy of the PPC using a dataset from an adaptive electric vehicle charging network and show that the PPC outperforms classical MPC.
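To make the penalty-based idea concrete, the sketch below shows one illustrative way a log-penalty derived from an MEF-like signal could stand in for the loads' states and constraints in a single predictive-control step. This is a toy, assumption-laden example rather than the paper's formulation: the function ppc_step, the weight beta, and the made-up MEF distribution over candidate aggregate power levels are all hypothetical choices for illustration only.

```python
# Hypothetical sketch of a penalized predictive control (PPC) step.
# Assumptions (not from the paper): the operator has a stage cost over
# candidate aggregate power levels, the aggregator's MEF is a probability
# vector over those same levels, and beta trades off cost against the
# flexibility signal.
import numpy as np

def ppc_step(candidate_powers, stage_cost, mef, beta=1.0, eps=1e-12):
    """Pick the aggregate power minimizing cost minus a log-MEF penalty."""
    penalized = stage_cost(candidate_powers) - beta * np.log(mef + eps)
    return candidate_powers[np.argmin(penalized)]

# Toy usage: track a 5 kW reference over 11 candidate levels, with a
# made-up MEF that favors mid-range power (stands in for learned feedback).
levels = np.linspace(0.0, 10.0, 11)
mef = np.exp(-0.5 * ((levels - 6.0) / 2.0) ** 2)
mef /= mef.sum()
u = ppc_step(levels, lambda u: (u - 5.0) ** 2, mef)
print(u)
```

The point of the sketch is the structure of the decision: unlike a vanilla MPC step, the operator never sees the individual load constraints, only a flexibility signal that penalizes aggregate signals the loads are unlikely to be able to follow.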