Paper Title

An Ergodic Measure for Active Learning From Equilibrium

Paper Authors

Ian Abraham, Ahalya Prabhakar, Todd D. Murphey

Paper Abstract

This paper develops KL-Ergodic Exploration from Equilibrium ($\text{KL-E}^3$), a method for robotic systems to integrate stability into actively generating informative measurements through ergodic exploration. Ergodic exploration enables robotic systems to indirectly sample from informative spatial distributions globally, avoiding local optima, and without the need to evaluate the derivatives of the distribution against the robot dynamics. Using hybrid systems theory, we derive a controller that allows a robot to exploit equilibrium policies (i.e., policies that solve a task) while allowing the robot to explore and generate informative data using an ergodic measure that can extend to high-dimensional states. We show that our method is able to maintain Lyapunov attractiveness with respect to the equilibrium task while actively generating data for learning tasks such as Bayesian optimization, model learning, and off-policy reinforcement learning. In each example, we show that our proposed method is capable of generating an informative distribution of data while synthesizing smooth control signals. We illustrate these examples using simulated systems and provide a simplification of our method for real-time online learning in robotic systems.
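For readers unfamiliar with sample-based ergodic objectives, the sketch below illustrates one way a KL-style ergodic measure between a target spatial distribution and a trajectory's time-averaged statistics might be estimated numerically. This is a minimal illustration under stated assumptions, not the paper's implementation: the function name kl_ergodic_measure, the Gaussian-kernel approximation of the time-averaged trajectory statistics, the kernel width sigma, and the uniform importance samples are all choices made here for concreteness.

```python
import numpy as np

def kl_ergodic_measure(trajectory, target_pdf, samples, sigma=0.1):
    """Sketch of a sample-based KL-style ergodic measure (illustrative only).

    trajectory : (T, n) array of visited exploration states x_t
    target_pdf : callable returning the (unnormalized) target density p(s)
    samples    : (N, n) array of points s_i drawn over the exploration space
    sigma      : width of the Gaussian kernel placed at each visited state

    Returns an estimate of D_KL(p || q), where q(s) approximates the
    time-averaged trajectory statistics by a mixture of Gaussian kernels
    centered at the visited states.
    """
    # q(s_i) ~ (1/T) * sum_t exp(-||s_i - x_t||^2 / (2 sigma^2))
    # (unnormalized; a constant normalization only shifts the KL value)
    diffs = samples[:, None, :] - trajectory[None, :, :]      # (N, T, n)
    sq_dist = np.sum(diffs ** 2, axis=-1)                     # (N, T)
    q = np.exp(-sq_dist / (2.0 * sigma ** 2)).mean(axis=1)    # (N,)
    q = np.maximum(q, 1e-12)                                  # numerical floor

    # Evaluate and normalize the target density over the sample points.
    p = np.array([target_pdf(s) for s in samples])
    p = p / (p.sum() + 1e-12)

    # Monte Carlo estimate of sum_i p(s_i) * log(p(s_i) / q(s_i)).
    return float(np.sum(p * (np.log(p + 1e-12) - np.log(q))))
```

As a usage example, a controller could evaluate this measure on candidate trajectory rollouts against a target density (e.g., a model-uncertainty or expected-information-gain map) and prefer rollouts with lower KL, i.e., those whose visitation statistics better match where informative measurements are expected.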
