Paper Title
Local Linear Convergence of Gradient Methods for Subspace Optimization via Strict Complementarity
Paper Authors
Paper Abstract
We consider optimization problems in which the goal is to find a $k$-dimensional subspace of $\mathbb{R}^n$, $k \ll n$, which minimizes a convex and smooth loss. Such problems generalize the fundamental task of principal component analysis (PCA) to include robust and sparse counterparts, and logistic PCA for binary data, among others. This problem can be approached either via nonconvex gradient methods with highly efficient iterations, for which arguing about fast convergence to a global minimizer is difficult, or via a convex relaxation, for which arguing about convergence to a global minimizer is straightforward but the corresponding methods are often inefficient in high dimensions. In this work we bridge these two approaches under a strict complementarity assumption, which in particular implies that the optimal solution to the convex relaxation is unique and is also the optimal solution to the original nonconvex problem. Our main result is a proof that a natural nonconvex gradient method, which is \textit{SVD-free} and requires only a single QR factorization of an $n\times k$ matrix per iteration, converges locally at a linear rate. We also establish linear convergence results for the nonconvex projected gradient method, and for the Frank-Wolfe method when applied to the convex relaxation.
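The abstract describes the main algorithm only at a high level: a nonconvex gradient method over $n\times k$ matrices that avoids SVDs and needs a single QR factorization per iteration. The exact update rule is not stated here, so the following NumPy sketch is only one natural reading of that description, illustrated on plain PCA; the names `qr_retraction` and `subspace_gradient_method`, the chosen loss, the step size, and the iteration count are illustrative assumptions rather than the paper's actual method.

```python
import numpy as np

def qr_retraction(Y):
    """Re-orthonormalize the columns of an n x k matrix with a single thin QR factorization."""
    Q, R = np.linalg.qr(Y)
    s = np.sign(np.diag(R))
    s[s == 0] = 1.0          # guard against zero diagonal entries
    return Q * s             # sign fix keeps the orthonormal factor well defined

def subspace_gradient_method(grad, V0, eta, n_iters):
    """Hypothetical sketch: Euclidean gradient step on the orthonormal factor V,
    followed by a QR re-orthonormalization (no SVD anywhere)."""
    V = V0
    for _ in range(n_iters):
        V = qr_retraction(V - eta * grad(V))
    return V

# Toy instance: plain PCA, i.e. minimize f(V V^T) = -<C, V V^T> over orthonormal V;
# the gradient of this loss with respect to V is -2 C V.
rng = np.random.default_rng(0)
n, k = 200, 5
A = rng.standard_normal((n, n))
C = A @ A.T / n                                   # "data covariance" for the toy problem
V0 = qr_retraction(rng.standard_normal((n, k)))
V = subspace_gradient_method(lambda V: -2.0 * C @ V, V0, eta=0.05, n_iters=300)
```

In this reading, each iteration costs one matrix product and one thin QR factorization of an $n\times k$ matrix, which is the per-iteration cost the abstract attributes to the method.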