Paper Title
Solving Convex Smooth Function Constrained Optimization Is Almost As Easy As Unconstrained Optimization
Paper Authors
Paper Abstract
Consider applying first-order methods to solve the smooth convex constrained optimization problem $\min_{x \in X} F(x)$. For a simple closed convex set $X$ that is easy to project onto, Nesterov proposed the Accelerated Gradient Descent (AGD) method, which solves the constrained problem as efficiently as an unconstrained one in terms of the number of gradient computations of $F$ (i.e., oracle complexity). For a more complicated feasible region $\mathcal{X}$ described by function constraints, i.e., $\mathcal{X} = \{x \in X : g(x) \leq 0\}$, where projection onto $\mathcal{X}$ is not possible, it has been an open question whether the function constrained problem can be solved as efficiently as an unconstrained one in terms of the number of gradient computations of $F$ and $g$. In this paper, we provide an affirmative answer to this question by proposing a single-loop Accelerated Constrained Gradient Descent (ACGD) method. The ACGD method modifies AGD by changing the descent step to a constrained descent step, which adds only a few linear constraints to the prox mapping. It enjoys almost the same oracle complexity as the optimal complexity for minimizing the optimal Lagrangian function, i.e., the Lagrangian function with the multiplier $\lambda$ fixed to the optimal multiplier $\lambda^*$. These upper oracle complexity bounds are shown to be unimprovable under a certain optimality regime via new lower oracle complexity bounds. To enhance its efficiency for large-scale problems with many function constraints, we introduce an ACGD with Sliding (ACGD-S) method, which replaces the possibly computationally demanding constrained descent step with a sequence of basic matrix-vector multiplications. The ACGD-S method shares the same oracle complexity as the ACGD method, and its computational complexity, measured by the number of matrix-vector multiplications, is also unimprovable.
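To make the two objects referenced in the abstract concrete, the following LaTeX sketch may help; it is an illustrative reconstruction, not the paper's exact algorithm. The first display is the fixed-multiplier Lagrangian problem used as the complexity benchmark (assuming vector-valued $g$ and a nonnegative multiplier $\lambda^* \in \mathbb{R}^m_+$). The second shows one plausible form of the constrained descent step, assuming the added linear constraints come from linearizing $g$ at a search point; the search point $\underline{x}_k$, step-size parameter $\eta_k$, and slack level $\delta_k$ are assumed notation introduced here for illustration.

$$
\min_{x \in X} \; \mathcal{L}(x, \lambda^*) := F(x) + \langle \lambda^*, g(x) \rangle
$$

$$
x_k = \operatorname*{arg\,min}_{x \in X} \; \langle \nabla F(\underline{x}_k), x \rangle + \frac{\eta_k}{2} \|x - x_{k-1}\|^2
\quad \text{s.t.} \quad g(\underline{x}_k) + \langle \nabla g(\underline{x}_k), x - \underline{x}_k \rangle \leq \delta_k
$$

Compared with the standard AGD prox mapping (the same objective without the constraint clause), the only modification in this sketch is the handful of linear inequalities, which is consistent with the abstract's statement that the constrained descent step "adds only a few linear constraints to the prox mapping."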