Paper Title

The Wasserstein Proximal Gradient Algorithm

Authors

Adil Salim, Anna Korba, Giulia Luise

Abstract

Wasserstein gradient flows are continuous-time dynamics that define curves of steepest descent to minimize an objective function over the space of probability measures (i.e., the Wasserstein space). This objective is typically a divergence w.r.t. a fixed target distribution. In recent years, these continuous-time dynamics have been used to study the convergence of machine learning algorithms that aim to approximate a probability distribution. However, the discrete-time behavior of these algorithms may differ from the continuous-time dynamics. Moreover, although discretized gradient flows have been proposed in the literature, little is known about their minimization power. In this work, we propose a Forward-Backward (FB) discretization scheme that can tackle the case where the objective function is the sum of a smooth term and a nonsmooth geodesically convex term. Using techniques from convex optimization and optimal transport, we analyze the FB scheme as a minimization algorithm on the Wasserstein space. More precisely, we show under mild assumptions that the FB scheme has convergence guarantees similar to those of the proximal gradient algorithm in Euclidean spaces.
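The FB scheme is the Wasserstein-space analogue of the Euclidean proximal gradient algorithm that the abstract compares against. Below is a minimal sketch of that Euclidean baseline, alternating an explicit gradient (forward) step on a smooth term with a proximal (backward) step on a nonsmooth convex term; the lasso-type objective and the names A, b, and lam are illustrative assumptions, not the paper's Wasserstein-space setting.

```python
import numpy as np

# Euclidean proximal gradient (Forward-Backward) sketch:
# minimize F(x) = f(x) + g(x), with f smooth and g nonsmooth but convex.
# Illustrative choice (not from the paper):
#   f(x) = 0.5 * ||A x - b||^2  -> gradient is A^T (A x - b)
#   g(x) = lam * ||x||_1        -> prox is soft-thresholding

def prox_l1(x, t):
    """Proximal operator of t * ||.||_1 (soft-thresholding)."""
    return np.sign(x) * np.maximum(np.abs(x) - t, 0.0)

def forward_backward(A, b, lam, step, n_iters=500):
    x = np.zeros(A.shape[1])
    for _ in range(n_iters):
        grad = A.T @ (A @ x - b)                  # forward: explicit gradient step on f
        x = prox_l1(x - step * grad, step * lam)  # backward: implicit proximal step on g
    return x

# Usage on synthetic data; step = 1/L, with L the Lipschitz constant of grad f,
# mirroring the kind of smoothness assumption under which such schemes converge.
rng = np.random.default_rng(0)
A = rng.normal(size=(50, 20))
b = rng.normal(size=50)
step = 1.0 / np.linalg.norm(A, 2) ** 2
x_hat = forward_backward(A, b, lam=0.1, step=step)
```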
