Duality-Based Stochastic Policy Optimization for Estimation with Unknown Noise Covariances

Paper Title

Duality-Based Stochastic Policy Optimization for Estimation with Unknown Noise Covariances

Authors

Shahriar Talebi, Amirhossein Taghvaei, Mehran Mesbahi

Abstract

Duality of control and estimation allows mapping recent advances in data-guided control to the estimation setup. This paper formalizes and utilizes such a mapping to consider learning the optimal (steady-state) Kalman gain when process and measurement noise statistics are unknown. Specifically, building on the duality between synthesizing optimal control and estimation gains, the filter design problem is formalized as direct policy learning. In this direction, the duality is used to extend existing theoretical guarantees of direct policy updates for the Linear Quadratic Regulator (LQR) to establish global convergence of the Gradient Descent (GD) algorithm for the estimation problem, while addressing subtle differences between the two synthesis problems. Subsequently, a Stochastic Gradient Descent (SGD) approach is adopted to learn the optimal Kalman gain without knowledge of the noise covariances. The results are illustrated via several numerical examples.
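
The duality the abstract relies on can be illustrated with a small numerical sketch. This is not the paper's GD or SGD algorithm: it assumes the noise covariances are known and only shows the textbook correspondence in which the steady-state Kalman (predictor) gain for a pair (A, H) is the transpose of the LQR gain for the dual pair (A^T, H^T). The function names, the fixed iteration count, and the example matrices are all hypothetical choices made for illustration.

```python
import numpy as np


def lqr_riccati(A, B, Q, R, iters=500):
    """Fixed-point iteration of the discrete-time LQR Riccati recursion.

    Returns (P, K) with K = (R + B^T P B)^{-1} B^T P A, i.e. u = -K x.
    The iteration count is an illustrative assumption, not a tuned value.
    """
    P = Q.copy()
    for _ in range(iters):
        K = np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)
        P = Q + A.T @ P @ (A - B @ K)
    K = np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)
    return P, K


def steady_state_kalman_gain(A, H, Q, R, iters=500):
    """Steady-state Kalman (predictor) gain obtained through duality.

    Solves the LQR problem for the dual pair (A^T, H^T) with the noise
    covariances Q, R used as cost weights, then transposes the gain.
    """
    P, K = lqr_riccati(A.T, H.T, Q, R, iters)
    return P, K.T


# Hypothetical two-state example with a scalar measurement.
A = np.array([[0.9, 0.2], [0.0, 0.8]])
H = np.array([[1.0, 0.0]])
Q = 0.1 * np.eye(2)      # process noise covariance (assumed known here)
R = np.array([[0.5]])    # measurement noise covariance (assumed known here)

P, L = steady_state_kalman_gain(A, H, Q, R)
print("Kalman gain L:\n", L)
print("Spectral radius of A - L H:", np.max(np.abs(np.linalg.eigvals(A - L @ H))))
```

A gain computed this way can serve as a reference value when checking a learned gain on synthetic data; the setting described in the abstract differs in that Q and R are unavailable and the gain is learned by stochastic gradient steps instead.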
