Paper title
Real-time Sampling-based Model Predictive Control based on Reverse Kullback-Leibler Divergence and Its Adaptive Acceleration
Paper authors
Paper abstract
Sampling-based model predictive control (MPC) can be applied to a wide variety of robotic systems. However, real-time control with it is challenging because of its unstable updates and poor convergence. This paper tackles this challenge with a novel derivation from the reverse Kullback-Leibler divergence, which has mode-seeking behavior and is therefore likely to find one of the sub-optimal solutions early. This derivation yields a weighted maximum likelihood estimation with positive and negative weights, which is solved by the mirror descent (MD) algorithm. While the negative weights eliminate unnecessary actions, they require a practical implementation, based on rejection sampling, that avoids interference between the positive and negative updates. In addition, although the convergence of MD can be accelerated with Nesterov's acceleration method, that method is modified for the proposed MPC with a heuristic step size that adapts to the noise estimated in the update amounts. In real-time simulations, the proposed method solves statistically more tasks than the conventional method and, owing to the improved acceleration, accomplishes more complex tasks using only a CPU. Its applicability is also demonstrated in variable impedance control of a force-driven mobile robot. https://youtu.be/D8bFMzct1XM
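To make the idea of a weighted maximum likelihood update with positive and negative weights more concrete, the following is a minimal Python sketch. It is not the paper's derivation or implementation. The function name `signed_weight_update`, the mean-cost baseline used to assign signs, the weight normalization, the fixed standard deviation, and the step size are all assumptions chosen for illustration. The update is a plain gradient step on the weighted Gaussian log-likelihood, which is what mirror descent reduces to under a Euclidean distance-generating function. Rejection sampling and Nesterov's acceleration are omitted.

```python
import numpy as np

def signed_weight_update(mean, std, cost_fn, rng, n_samples=256, step=0.3):
    """One illustrative update of a Gaussian sampling mean with signed weights.

    Hypothetical helper: samples with below-average cost get positive weight
    (pull the mean toward them), samples with above-average cost get negative
    weight (push the mean away from them).
    """
    # Draw candidate actions around the current mean.
    noise = rng.standard_normal((n_samples, mean.shape[0]))
    actions = mean + std * noise
    costs = np.array([cost_fn(a) for a in actions])

    # Assumed weighting: signed advantage relative to the mean cost,
    # normalized so that the absolute weights sum to one.
    advantage = -(costs - costs.mean())
    weights = advantage / (np.abs(advantage).sum() + 1e-12)

    # Gradient of sum_i w_i * log N(a_i; mean, std^2) with respect to the mean,
    # with the 1/std^2 factor absorbed into the step size.
    grad = (weights[:, None] * (actions - mean)).sum(axis=0)
    return mean + step * grad

# Usage: minimize a quadratic cost whose optimum is at a = 3.
rng = np.random.default_rng(0)
mean = np.zeros(1)
for _ in range(100):
    mean = signed_weight_update(
        mean, 0.5, lambda a: float(np.sum((a - 3.0) ** 2)), rng
    )
print(mean)
```

In this toy setting the negative weights roughly double the usable signal per batch, since poor samples contribute a repelling term instead of being discarded. Whether that carries over to a full MPC horizon depends on details the abstract does not specify.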