Real-time Sampling-based Model Predictive Control based on Reverse Kullback-Leibler Divergence and Its Adaptive Acceleration

Paper title

Real-time Sampling-based Model Predictive Control based on Reverse Kullback-Leibler Divergence and Its Adaptive Acceleration

Authors

Taisuke Kobayashi, Kota Fukumoto

Abstract

Sampling-based model predictive control (MPC) can be applied to versatile robotic systems. However, real-time control with it is challenging because its updates are unstable and its convergence is poor. This paper tackles this challenge with a novel derivation from the reverse Kullback-Leibler divergence, which has a mode-seeking behavior and is likely to find one of the sub-optimal solutions early. This derivation yields a weighted maximum likelihood estimation with positive and negative weights, which is solved by the mirror descent (MD) algorithm. While the negative weights eliminate unnecessary actions, they require a practical implementation, based on rejection sampling, that avoids interference between the positive and negative updates. In addition, although the convergence of MD can be accelerated with Nesterov's acceleration method, the method is modified for the proposed MPC with a heuristic step size that adapts to the noise estimated in the update amounts. In real-time simulations, the proposed method solves statistically more tasks than the conventional method and, thanks to the improved acceleration, accomplishes more complex tasks using only a CPU. Its applicability is also demonstrated in variable impedance control of a force-driven mobile robot. https://youtu.be/D8bFMzct1XM
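The mode-seeking behavior that the abstract attributes to the reverse Kullback-Leibler divergence can be illustrated with a toy one-dimensional example. The sketch below is not the paper's algorithm: it assumes a hypothetical bimodal target `p` (two equally weighted Gaussians at -3 and 3) and a single Gaussian `q` with a fixed width, then compares which mean minimizes the forward divergence KL(p || q) and the reverse divergence KL(q || p) by grid search. All names and numbers are illustrative assumptions.

```python
import math

SIGMA = 0.5           # assumed standard deviation shared by p's modes and q
MODES = (-3.0, 3.0)   # assumed locations of the two equally weighted modes


def log_normal(x, mu, sigma=SIGMA):
    """Log-density of a Gaussian N(mu, sigma^2) evaluated at x."""
    return -0.5 * ((x - mu) / sigma) ** 2 - math.log(sigma * math.sqrt(2.0 * math.pi))


def log_target(x):
    """Log-density of the bimodal target p, computed with log-sum-exp."""
    a = log_normal(x, MODES[0]) + math.log(0.5)
    b = log_normal(x, MODES[1]) + math.log(0.5)
    m = max(a, b)
    return m + math.log(math.exp(a - m) + math.exp(b - m))


# Integration grid on [-8, 8] for a simple Riemann-sum approximation.
DX = 0.01
XS = [-8.0 + DX * i for i in range(1601)]
LOG_P = [log_target(x) for x in XS]


def forward_kl(mu):
    """KL(p || q): the expectation is taken under the target p (mass-covering)."""
    return DX * sum(math.exp(lp) * (lp - log_normal(x, mu)) for x, lp in zip(XS, LOG_P))


def reverse_kl(mu):
    """KL(q || p): the expectation is taken under the model q (mode-seeking)."""
    total = 0.0
    for x, lp in zip(XS, LOG_P):
        lq = log_normal(x, mu)
        total += math.exp(lq) * (lq - lp)
    return DX * total


# Candidate means for q, searched exhaustively in steps of 0.1.
CANDIDATES = [-4.0 + 0.1 * k for k in range(81)]
mu_forward = min(CANDIDATES, key=forward_kl)
mu_reverse = min(CANDIDATES, key=reverse_kl)

print(f"forward KL picks mu = {mu_forward:.1f}")
print(f"reverse KL picks mu = {mu_reverse:.1f}")
```

In this toy setting the forward divergence places `q` between the two modes, where the target has almost no mass, while the reverse divergence commits to one of the modes. Which of the two symmetric modes is returned depends on floating-point tie-breaking.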
