Robust Action Governor for Uncertain Piecewise Affine Systems with Non-convex Constraints and Safe Reinforcement Learning

Paper title

Robust Action Governor for Uncertain Piecewise Affine Systems with Non-convex Constraints and Safe Reinforcement Learning

Authors

Yutong Li, Nan Li, H. Eric Tseng, Anouck Girard, Dimitar Filev, Ilya Kolmanovsky

Abstract

The action governor is an add-on scheme to a nominal control loop that monitors and adjusts the control actions to enforce safety specifications expressed as pointwise-in-time state and control constraints. In this paper, we introduce the Robust Action Governor (RAG) for systems whose dynamics can be represented using discrete-time Piecewise Affine (PWA) models with both parametric and additive uncertainties and that are subject to non-convex constraints. We develop the theoretical properties of, and computational approaches for, the RAG. We then introduce the use of the RAG for realizing safe Reinforcement Learning (RL), i.e., ensuring all-time constraint satisfaction during the online RL exploration-and-exploitation process. This development enables safe real-time evolution of the control policy and adaptation to changes in the operating environment and system parameters (due to aging, damage, etc.). We illustrate the effectiveness of the RAG in constraint enforcement and in safe RL by applying it to a soft-landing problem for a mass-spring-damper system.
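
The monitor-and-adjust idea described in the abstract can be illustrated with a small toy example. The sketch below is not the paper's algorithm: it assumes a hypothetical one-dimensional piecewise affine model with two modes, an additive disturbance bounded by W_MAX, a safe set that is simply an interval (the paper treats non-convex constraints and parametric uncertainty, both omitted here), and a finite grid of candidate actions. All names and numbers (MODES, W_MAX, SAFE_LO, SAFE_HI, U_GRID, govern) are made up for illustration.

```python
# Toy action governor: pass the nominal action through if it is robustly safe,
# otherwise replace it with the closest candidate action that is.

# Hypothetical 1-D piecewise affine model: x_next = a*x + b*u + w,
# where the mode (a, b) depends on the sign of x.
MODES = [
    {"a": 0.9, "b": 1.0},  # mode used when x < 0
    {"a": 0.5, "b": 1.0},  # mode used when x >= 0
]
W_MAX = 0.12                  # assumed disturbance bound, |w| <= W_MAX
SAFE_LO, SAFE_HI = -1.0, 1.0  # assumed safe set (an interval, for simplicity)
U_GRID = [-1.0 + 0.05 * i for i in range(41)]  # candidate actions in [-1, 1]


def nominal_next(x, u):
    """Disturbance-free successor state under the active mode."""
    mode = MODES[0] if x < 0 else MODES[1]
    return mode["a"] * x + mode["b"] * u


def is_robustly_safe(x, u):
    """True if the successor stays in the safe interval for every |w| <= W_MAX.

    The successor is affine in w and the safe set is an interval, so checking
    the two extreme disturbances is enough in this toy setting.
    """
    nxt = nominal_next(x, u)
    return SAFE_LO <= nxt - W_MAX and nxt + W_MAX <= SAFE_HI


def govern(x, u_nominal):
    """Return the robustly safe grid action closest to the nominal action."""
    safe_actions = [u for u in U_GRID if is_robustly_safe(x, u)]
    if not safe_actions:
        raise ValueError("no robustly safe action on the grid for this state")
    return min(safe_actions, key=lambda u: abs(u - u_nominal))
```

With these assumed numbers, govern(0.8, 1.0) lowers the unsafe nominal action 1.0 to about 0.45, while govern(0.0, 0.2) leaves the already safe action essentially unchanged. In a safe RL setting, the exploratory action proposed by the learning agent would play the role of u_nominal.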
