Paper Title
On the Lower Bound of Minimizing Polyak-Łojasiewicz Functions
Paper Authors
Paper Abstract
The Polyak-Łojasiewicz (PL) condition [Polyak, 1963] is weaker than strong convexity but suffices to ensure global convergence of the Gradient Descent algorithm. In this paper, we study the lower bound for algorithms that use first-order oracles to find an approximate optimal solution. We show that any first-order algorithm requires at least $\Omega\left(\frac{L}{\mu}\log\frac{1}{\varepsilon}\right)$ gradient costs to find an $\varepsilon$-approximate optimal solution of a general $L$-smooth function with PL constant $\mu$. This result demonstrates the optimality of the Gradient Descent algorithm for minimizing smooth PL functions, in the sense that there exists a ``hard'' PL function on which no first-order algorithm can be faster than Gradient Descent when numerical constants are ignored. In contrast, it is well known that momentum techniques, e.g. [Nesterov, 2003, Chap. 2], provably accelerate Gradient Descent to ${O}\left(\sqrt{\frac{L}{\hat{\mu}}}\log\frac{1}{\varepsilon}\right)$ gradient costs for functions that are $L$-smooth and $\hat{\mu}$-strongly convex. Therefore, our result separates the hardness of minimizing a smooth PL function from that of minimizing a smooth strongly convex function: the complexity of the former cannot be improved by any polynomial order in general.
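To make the upper-bound side of the claim concrete, recall that a function $f$ satisfies the $\mu$-PL condition when $\|\nabla f(x)\|^2 \ge 2\mu\,(f(x) - f^\*)$, which for $L$-smooth $f$ yields the linear rate $f(x_{k+1}) - f^\* \le (1 - \mu/L)(f(x_k) - f^\*)$ under Gradient Descent with step size $1/L$. The sketch below (not from the paper; a standard illustration using a convex quadratic, which is $\mu$-PL with $\mu = \lambda_{\min}(A)$) checks this contraction numerically:

```python
import numpy as np

# Illustrative sketch: gradient descent on an L-smooth, mu-PL function.
# The quadratic f(x) = 0.5 x^T A x is mu-PL with mu = lambda_min(A),
# L-smooth with L = lambda_max(A), and attains its minimum f* = 0 at x = 0.
L_const, mu = 10.0, 1.0
A = np.diag([mu, L_const])        # eigenvalues set the PL and smoothness constants
f = lambda x: 0.5 * x @ A @ x     # objective; optimal value f* = 0
grad = lambda x: A @ x            # gradient of f

x = np.array([1.0, 1.0])
gaps = [f(x)]                     # optimality gaps f(x_k) - f*
for _ in range(200):
    x = x - (1.0 / L_const) * grad(x)   # standard step size 1/L
    gaps.append(f(x))

# Per-step contraction f(x_{k+1}) - f* <= (1 - mu/L)(f(x_k) - f*),
# consistent with the Theta((L/mu) log(1/eps)) gradient cost in the abstract.
rate = 1.0 - mu / L_const
assert all(g1 <= rate * g0 + 1e-12 for g0, g1 in zip(gaps, gaps[1:]))
```

The lower bound in the paper says this $(1 - \mu/L)$-type geometric rate is essentially the best any first-order method can achieve on general smooth PL functions, whereas under strong convexity the dependence on $L/\hat{\mu}$ improves to a square root.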