Fast Global Convergence for Low-rank Matrix Recovery via Riemannian Gradient Descent with Random Initialization

Paper Title

Fast Global Convergence for Low-rank Matrix Recovery via Riemannian Gradient Descent with Random Initialization

Authors

Hou, Thomas Y., Li, Zhenzhen, Zhang, Ziyun

Abstract

In this paper, we propose a new global analysis framework for a class of low-rank matrix recovery problems on the Riemannian manifold. We analyze the global behavior of Riemannian optimization with random initialization. We use the Riemannian gradient descent algorithm to minimize a least squares loss function, and study the asymptotic behavior as well as the exact convergence rate. We reveal a previously unknown geometric property of the low-rank matrix manifold: the existence of spurious critical points for the simple least squares function on the manifold. We show that, under some assumptions, Riemannian gradient descent starting from a random initialization avoids these spurious critical points with high probability and converges only to the ground truth at a nearly linear convergence rate, i.e., it takes $\mathcal{O}(\log(\frac{1}{\epsilon}) + \log(n))$ iterations to reach an $\epsilon$-accurate solution. We use two applications as examples for our global analysis. The first is a rank-1 matrix recovery problem. The second is a generalization of the Gaussian phase retrieval problem. It satisfies only the weak isometry property, but behaves similarly to the first except for an extra saddle set. Our convergence guarantee is nearly optimal and almost dimension-free, which fully explains the numerical observations. The global analysis can potentially be extended to other data problems with random measurement structures and empirical least squares loss functions.
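
To make the algorithm described in the abstract concrete, here is a minimal sketch of Riemannian gradient descent on the manifold of symmetric rank-1 matrices, started from a random initialization. It is not the authors' implementation and rests on several assumptions. The loss is the fully observed least squares function $\frac{1}{2}\|Z - X\|_F^2$ rather than an empirical loss built from random measurements. The retraction is a truncated eigendecomposition. The step size of 0.5, the scaling of the random initial vector, and all function names are illustrative choices.

```python
import numpy as np


def tangent_projection(u, G):
    """Project a symmetric matrix G onto the tangent space of the
    rank-1 symmetric manifold at Z = lam * u u^T, where u is a unit vector.
    Formula: P_u G + G P_u - P_u G P_u with P_u = u u^T."""
    Gu = G @ u
    uGu = u @ Gu
    return np.outer(u, Gu) + np.outer(Gu, u) - uGu * np.outer(u, u)


def retract_rank1(W):
    """Map a symmetric matrix W back to the manifold by keeping the
    eigenpair of largest magnitude (best rank-1 approximation)."""
    vals, vecs = np.linalg.eigh(W)
    k = np.argmax(np.abs(vals))
    return vals[k], vecs[:, k]


def riemannian_gd_rank1(X, step=0.5, iters=200, seed=0):
    """Riemannian gradient descent for 0.5 * ||Z - X||_F^2 over rank-1
    symmetric Z. Step size and initialization scaling are assumptions."""
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    z0 = rng.standard_normal(n) / np.sqrt(n)  # random initialization
    lam, u = z0 @ z0, z0 / np.linalg.norm(z0)
    errors = []
    for _ in range(iters):
        Z = lam * np.outer(u, u)
        G = Z - X  # Euclidean gradient of the loss at Z
        errors.append(np.linalg.norm(G))  # Frobenius error before the update
        W = Z - step * tangent_projection(u, G)  # step along the Riemannian gradient
        lam, u = retract_rank1(W)  # retract back onto the manifold
    return lam * np.outer(u, u), errors


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    n = 50
    x = rng.standard_normal(n)
    x /= np.linalg.norm(x)
    X = np.outer(x, x)  # rank-1 ground truth
    Z, errors = riemannian_gd_rank1(X)
    print("initial error:", errors[0])
    print("final error:  ", np.linalg.norm(Z - X))
```

In this simplified setting the random starting point is almost orthogonal to the ground truth in high dimension, so the first iterations mainly rotate the iterate toward the ground truth before the error begins to shrink geometrically. The sketch does not reproduce the measurement model or the assumptions under which the abstract's guarantee is stated.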

Paper Title