Paper Title
How Many Samples is a Good Initial Point Worth in Low-rank Matrix Recovery?
Paper Authors
Paper Abstract
Given a sufficiently large amount of labeled data, the non-convex low-rank matrix recovery problem contains no spurious local minima, so a local optimization algorithm is guaranteed to converge to a global minimum starting from any initial guess. However, the actual amount of data needed by this theoretical guarantee is very pessimistic, as it must prevent spurious local minima from existing anywhere, including at adversarial locations. In contrast, prior work based on good initial guesses has more realistic data requirements, because it allows spurious local minima to exist outside of a neighborhood of the solution. In this paper, we quantify the relationship between the quality of the initial guess and the corresponding reduction in data requirements. Using the restricted isometry constant as a surrogate for sample complexity, we compute a sharp threshold number of samples needed to prevent each specific point on the optimization landscape from becoming a spurious local minimum. Optimizing the threshold over regions of the landscape, we see that for initial points around the ground truth, a linear improvement in the quality of the initial guess amounts to a constant factor improvement in the sample complexity.
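
The setting the abstract describes is the factorized (Burer-Monteiro) formulation of low-rank matrix recovery, where a local method is started from an initial guess at some distance from the ground truth. The sketch below is a minimal illustration of that setup, not the paper's code: it assumes symmetric Gaussian measurements (which satisfy a restricted isometry property with high probability once the number of samples m is large enough) and runs plain gradient descent from a perturbed ground truth; the dimensions, step size, and perturbation level are arbitrary illustrative choices.

```python
import numpy as np

# Minimal sketch of non-convex low-rank matrix recovery in factorized form:
#   minimize over X of (1/2m) * sum_k ( <A_k, X X^T> - b_k )^2
# Assumptions (illustrative, not from the paper): symmetric Gaussian
# measurement matrices A_k, a symmetric rank-r ground truth M* = Z Z^T,
# and plain gradient descent from a perturbed copy of Z.

rng = np.random.default_rng(0)
n, r, m = 20, 2, 600                      # dimension, rank, number of samples

Z_true = rng.standard_normal((n, r))
M_true = Z_true @ Z_true.T                # ground-truth low-rank matrix

# Random symmetric Gaussian measurement matrices and linear observations.
A = rng.standard_normal((m, n, n))
A = (A + A.transpose(0, 2, 1)) / 2
b = np.einsum('kij,ij->k', A, M_true)     # b_k = <A_k, M*>

def loss(X):
    resid = np.einsum('kij,ij->k', A, X @ X.T) - b
    return 0.5 / m * np.sum(resid ** 2)

def grad(X):
    resid = np.einsum('kij,ij->k', A, X @ X.T) - b
    G = np.einsum('k,kij->ij', resid, A) / m
    return 2 * G @ X                      # gradient of the factorized objective

# Initial guess at a controlled distance from the ground truth;
# a smaller perturbation corresponds to a better initial point.
X = Z_true + 0.5 * rng.standard_normal((n, r))

step = 0.1 / np.linalg.norm(M_true, 2)    # heuristic step size
for _ in range(1000):
    X -= step * grad(X)

print("final loss:", loss(X))
print("relative recovery error:",
      np.linalg.norm(X @ X.T - M_true) / np.linalg.norm(M_true))
```

With enough samples m the random measurements concentrate (small restricted isometry constant) and descent from this neighborhood of the ground truth recovers M*; the paper's contribution is to quantify, via a threshold on the restricted isometry constant, how much a better initial point reduces the samples needed for such a guarantee.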