Paper Title

Learning Provably Stabilizing Neural Controllers for Discrete-Time Stochastic Systems

Paper Authors

Matin Ansaripour, Krishnendu Chatterjee, Thomas A. Henzinger, Mathias Lechner, Đorđe Žikelić

Paper Abstract

We consider the problem of learning control policies in discrete-time stochastic systems which guarantee that the system stabilizes within some specified stabilization region with probability $1$. Our approach is based on the novel notion of stabilizing ranking supermartingales (sRSMs) that we introduce in this work. Our sRSMs overcome the limitation of methods proposed in previous works whose applicability is restricted to systems in which the stabilizing region cannot be left once entered under any control policy. We present a learning procedure that learns a control policy together with an sRSM that formally certifies probability $1$ stability, both learned as neural networks. We show that this procedure can also be adapted to formally verifying that, under a given Lipschitz continuous control policy, the stochastic system stabilizes within some stabilizing region with probability $1$. Our experimental evaluation shows that our learning procedure can successfully learn provably stabilizing policies in practice.
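To give a feel for the kind of learner described in the abstract (a control policy and a certificate trained jointly as neural networks), the sketch below shows a minimal, illustrative training loop. It is not the authors' procedure: the linear dynamics, the stabilization region, the hyperparameters, and the single expected-decrease loss term are hypothetical simplifications, and the formal verifier step that certifies the sRSM conditions over the Lipschitz networks is omitted entirely.

```python
# Illustrative sketch only: jointly train a policy network and a candidate
# certificate network V so that V decreases in expectation outside a
# (hypothetical) stabilization region. This does not implement the paper's
# exact sRSM conditions or its verification step.
import torch
import torch.nn as nn

torch.manual_seed(0)

STATE_DIM, ACTION_DIM = 2, 1
STAB_RADIUS = 0.2      # hypothetical stabilization region: ||x|| <= STAB_RADIUS
EPS_DECREASE = 0.1     # required expected decrease of V outside the region

policy = nn.Sequential(nn.Linear(STATE_DIM, 64), nn.ReLU(), nn.Linear(64, ACTION_DIM))
V = nn.Sequential(nn.Linear(STATE_DIM, 64), nn.ReLU(), nn.Linear(64, 1), nn.Softplus())

def dynamics(x, u, noise):
    # Placeholder discrete-time stochastic dynamics: x' = A x + B u + noise.
    A = torch.tensor([[1.0, 0.1], [0.0, 1.0]])
    B = torch.tensor([[0.0], [0.1]])
    return x @ A.T + u @ B.T + noise

opt = torch.optim.Adam(list(policy.parameters()) + list(V.parameters()), lr=1e-3)

for step in range(2000):
    x = (torch.rand(256, STATE_DIM) - 0.5) * 4.0        # sample states from a box
    noise = 0.01 * torch.randn(256, 8, STATE_DIM)       # noise samples for a Monte-Carlo expectation
    u = policy(x)
    x_next = dynamics(x.unsqueeze(1), u.unsqueeze(1), noise)  # shape (256, 8, STATE_DIM)
    exp_V_next = V(x_next).mean(dim=1)                   # estimate of E[V(x') | x, u]
    outside = (x.norm(dim=1, keepdim=True) > STAB_RADIUS).float()
    # Penalize states outside the region where V fails to decrease in expectation.
    decrease_violation = torch.relu(exp_V_next - V(x) + EPS_DECREASE) * outside
    loss = decrease_violation.mean()
    opt.zero_grad()
    loss.backward()
    opt.step()
```

In the paper's setting, a loop of this kind would be complemented by a verifier that formally checks the sRSM conditions for the learned Lipschitz-continuous networks and returns counterexample states for further training; the sketch above only shows the empirical training side.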
