Safe Model-Based Reinforcement Learning for Systems with Parametric Uncertainties

Paper Title

Safe Model-Based Reinforcement Learning for Systems with Parametric Uncertainties

Authors

Mahmud, S M Nahid; Nivison, Scott A.; Bell, Zachary I.; Kamalapurkar, Rushikesh

Abstract

Reinforcement learning has been established over the past decade as an effective tool for finding optimal control policies for dynamical systems, with recent focus on approaches that guarantee safety during the learning and/or execution phases. In general, safety guarantees are critical in reinforcement learning when the system is safety-critical and/or task restarts are not practically feasible. In optimal control theory, safety requirements are often expressed in terms of state and/or control constraints. In recent years, reinforcement learning approaches that rely on persistent excitation have been combined with a barrier transformation to learn optimal control policies under state constraints. To soften the excitation requirements, model-based reinforcement learning methods that rely on exact model knowledge have also been integrated with the barrier transformation framework. The objective of this paper is to develop a safe reinforcement learning method for deterministic nonlinear systems with parametric uncertainties in the model, so that approximate constrained optimal policies can be learned without relying on stringent excitation conditions. To that end, the paper develops a model-based reinforcement learning technique that combines a novel filtered concurrent learning method with a barrier transformation to simultaneously learn the unknown model parameters and approximate optimal state-constrained control policies for safety-critical systems.
