Safe Learning for Uncertainty-Aware Planning via Interval MDP Abstraction

Paper Title

Safe Learning for Uncertainty-Aware Planning via Interval MDP Abstraction

Authors

Jesse Jiang, Ye Zhao, Samuel Coogan

Abstract

We study the problem of refining satisfiability bounds for partially-known stochastic systems against planning specifications defined using syntactically co-safe Linear Temporal Logic (scLTL). We propose an abstraction-based approach that iteratively generates high-confidence Interval Markov Decision Process (IMDP) abstractions of the system from high-confidence bounds on the unknown component of the dynamics obtained via Gaussian process regression. In particular, we develop a synthesis strategy to sample the unknown dynamics by finding paths which avoid specification-violating states using a product IMDP. We further provide a heuristic to choose among various candidate paths to maximize the information gain. Finally, we propose an iterative algorithm to synthesize a satisfying control policy for the product IMDP system. We demonstrate our work with a case study on mobile robot navigation.
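To make the notion of satisfiability bounds on an IMDP concrete, the following is a minimal sketch of standard interval value iteration for a reachability objective. It is not the authors' algorithm: the data layout, the function names `extreme_expectation` and `reach_bounds`, and the toy transition intervals are all assumptions chosen for illustration. In the paper's setting the intervals would come from Gaussian process confidence bounds and the goal set from accepting states of the product IMDP.

```python
from typing import Dict, Set, Tuple

# Hypothetical layout: imdp[state][action][successor] = (lower, upper)
Interval = Tuple[float, float]
IMDP = Dict[str, Dict[str, Dict[str, Interval]]]


def extreme_expectation(bounds: Dict[str, Interval],
                        values: Dict[str, float],
                        minimize: bool) -> float:
    """Expected value under the worst (or best) distribution within the intervals."""
    # Give every successor its lower bound, then hand the leftover mass
    # to low-value successors first (minimize) or high-value ones (maximize).
    order = sorted(bounds, key=lambda s: values[s], reverse=not minimize)
    probs = {s: bounds[s][0] for s in bounds}
    remaining = 1.0 - sum(probs.values())
    for s in order:
        extra = min(bounds[s][1] - bounds[s][0], remaining)
        probs[s] += extra
        remaining -= extra
    return sum(probs[s] * values[s] for s in bounds)


def reach_bounds(imdp: IMDP, goal: Set[str], iters: int = 100):
    """Lower and upper bounds on the maximal probability of reaching `goal`."""
    lower = {s: (1.0 if s in goal else 0.0) for s in imdp}
    upper = dict(lower)
    for _ in range(iters):
        new_lower, new_upper = dict(lower), dict(upper)
        for s, actions in imdp.items():
            if s in goal or not actions:
                continue  # goal states and action-free sink states keep their value
            # The controller picks the best action; the unknown dynamics
            # resolve adversarially (lower) or favorably (upper).
            new_lower[s] = max(extreme_expectation(b, lower, True)
                               for b in actions.values())
            new_upper[s] = max(extreme_expectation(b, upper, False)
                               for b in actions.values())
        lower, upper = new_lower, new_upper
    return lower, upper


# Toy example with made-up intervals: one decision state, a goal, and a failure sink.
toy: IMDP = {
    "start": {
        "a": {"goal": (0.6, 0.8), "fail": (0.2, 0.4)},
        "b": {"goal": (0.3, 0.9), "fail": (0.1, 0.7)},
    },
    "goal": {},
    "fail": {},
}
lo, hi = reach_bounds(toy, {"goal"})
```

For the toy model, the bounds at `start` come out to roughly 0.6 and 0.9. Sampling the unknown dynamics, as the abstract describes, would tighten the transition intervals and therefore narrow the gap between these two numbers.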
