Paper Title
Meta Learning MPC using Finite-Dimensional Gaussian Process Approximations
Paper Authors
Paper Abstract
Data availability has dramatically increased in recent years, driving model-based control methods to exploit learning techniques for improving the system description, and thus control performance. Two key factors that hinder the practical applicability of learning methods in control are their high computational complexity and limited generalization to unseen conditions. Meta-learning is a powerful tool that enables efficient learning across a finite set of related tasks, easing adaptation to new unseen tasks. This paper uses a meta-learning approach for adaptive model predictive control, learning a system model that leverages data from previous related tasks while enabling fast fine-tuning to the current task during closed-loop operation. The dynamics are modeled via Gaussian process regression and, building on the Karhunen-Loève expansion, can be approximately reformulated as a finite linear combination of kernel eigenfunctions. Using data collected over a set of tasks, the eigenfunction hyperparameters are optimized in a meta-training phase by maximizing a variational bound for the log-marginal likelihood. During meta-testing, the eigenfunctions are fixed, so that only the linear parameters are adapted to the new unseen task in an online adaptive fashion via Bayesian linear regression, providing a simple and efficient inference scheme. Simulation results are provided for autonomous racing with miniature race cars adapting to unseen road conditions.
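The online adaptation step described in the abstract, Bayesian linear regression over a fixed set of basis functions, can be illustrated with a minimal sketch. This is not the paper's implementation: the Gaussian-bump `features` function is a hypothetical stand-in for the meta-learned kernel eigenfunctions, and the class name `RecursiveBLR`, the prior variances, the noise level, and the synthetic data are all assumptions chosen for illustration.

```python
import numpy as np


def features(x, centers, width=0.5):
    """Hypothetical fixed basis: Gaussian bumps standing in for learned eigenfunctions."""
    x = np.atleast_1d(x).astype(float)
    return np.exp(-0.5 * ((x[:, None] - centers[None, :]) / width) ** 2)


class RecursiveBLR:
    """Bayesian linear regression y = phi(x)^T w + noise, updated one sample at a time."""

    def __init__(self, prior_var, noise_var):
        prior_var = np.asarray(prior_var, dtype=float)
        self.mean = np.zeros(prior_var.size)   # posterior mean of the weights w
        self.cov = np.diag(prior_var)          # posterior covariance of w
        self.noise_var = float(noise_var)      # assumed known measurement noise variance

    def update(self, phi, y):
        """Rank-one posterior update for a single feature vector phi and target y."""
        cov_phi = self.cov @ phi
        gain = cov_phi / (self.noise_var + phi @ cov_phi)
        self.mean = self.mean + gain * (y - phi @ self.mean)
        self.cov = self.cov - np.outer(gain, cov_phi)

    def predict(self, phi):
        """Predictive mean and variance (including noise) at feature vector phi."""
        return phi @ self.mean, phi @ self.cov @ phi + self.noise_var


# Synthetic "new task": only the linear weights differ, the basis stays fixed.
rng = np.random.default_rng(0)
centers = np.linspace(-2.0, 2.0, 8)
noise_std = 0.05
true_w = rng.normal(size=centers.size)
xs = rng.uniform(-2.0, 2.0, size=200)
Phi = features(xs, centers)
ys = Phi @ true_w + noise_std * rng.normal(size=xs.size)

model = RecursiveBLR(prior_var=np.ones(centers.size), noise_var=noise_std**2)
for phi, y in zip(Phi, ys):
    model.update(phi, y)  # one cheap update per incoming measurement
```

Because the basis is fixed, each update costs only a few matrix-vector products in the number of basis functions, independent of how many samples have been seen. That is the kind of property that makes such a scheme attractive inside a closed-loop controller.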