Paper Title

Performance-Driven Controller Tuning via Derivative-Free Reinforcement Learning

Paper Authors

Yuheng Lei, Jianyu Chen, Shengbo Eben Li, Sifa Zheng

Paper Abstract

Choosing an appropriate parameter set for the designed controller is critical for the final performance but usually requires a tedious and careful tuning process, which implies a strong need for automatic tuning methods. However, among existing methods, derivative-free ones suffer from poor scalability or low efficiency, while gradient-based ones are often unavailable due to possibly non-differentiable controller structure. To resolve the issues, we tackle the controller tuning problem using a novel derivative-free reinforcement learning (RL) framework, which performs timestep-wise perturbation in parameter space during experience collection and integrates derivative-free policy updates into the advanced actor-critic RL architecture to achieve high versatility and efficiency. To demonstrate the framework's efficacy, we conduct numerical experiments on two concrete examples from autonomous driving, namely, adaptive cruise control with PID controller and trajectory tracking with MPC controller. Experimental results show that the proposed method outperforms popular baselines and highlight its strong potential for controller tuning.
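
The abstract only describes the framework at a high level. As a purely illustrative aid, the sketch below shows the general flavor of derivative-free tuning in parameter space: PID gains for a toy adaptive-cruise-control rollout are randomly perturbed and updated with an evolution-strategies-style rule. This is a minimal sketch under assumed dynamics and cost, not the paper's actor-critic framework; all constants and function names are hypothetical.

```python
# Illustrative sketch only: derivative-free (perturbation-based) tuning of PID
# gains on a toy car-following task. The dynamics, cost, and hyperparameters
# below are hypothetical and NOT taken from the paper.
import numpy as np

def rollout(gains, horizon=200, dt=0.1, target_gap=20.0):
    """Simulate a crude following scenario; return negative tracking cost."""
    kp, ki, kd = gains
    gap, ego_v, lead_v = 30.0, 0.0, 15.0      # initial gap [m], speeds [m/s]
    integral, prev_err, cost = 0.0, 0.0, 0.0
    for _ in range(horizon):
        err = gap - target_gap                # positive error -> speed up
        integral += err * dt
        deriv = (err - prev_err) / dt
        accel = np.clip(kp * err + ki * integral + kd * deriv, -3.0, 2.0)
        ego_v = max(0.0, ego_v + accel * dt)
        gap += (lead_v - ego_v) * dt
        cost += (err ** 2 + 0.1 * accel ** 2) * dt
        prev_err = err
    return -cost                              # higher return = better tuning

def tune_pid(iters=100, pop=16, sigma=0.2, lr=0.05, seed=0):
    """Evolution-strategies-style search over PID gains (no gradients needed)."""
    rng = np.random.default_rng(seed)
    theta = np.array([0.5, 0.0, 0.1])         # initial (kp, ki, kd) guess
    for _ in range(iters):
        noise = rng.standard_normal((pop, 3))
        returns = np.array([rollout(theta + sigma * n) for n in noise])
        advantages = (returns - returns.mean()) / (returns.std() + 1e-8)
        theta = theta + lr / (pop * sigma) * noise.T @ advantages
    return theta

if __name__ == "__main__":
    best = tune_pid()
    print("tuned (kp, ki, kd):", best, "return:", rollout(best))
```

Because only controller rollouts are evaluated (never gradients through the controller), the same perturb-and-rollout loop could in principle be applied to non-differentiable structures such as MPC cost weights, which is the versatility the abstract emphasizes.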
