Paper Title

Discrete Langevin Sampler via Wasserstein Gradient Flow

Authors

Haoran Sun, Hanjun Dai, Bo Dai, Haomin Zhou, Dale Schuurmans

Abstract

It is known that gradient-based MCMC samplers for continuous spaces, such as Langevin Monte Carlo (LMC), can be derived as particle versions of a gradient flow that minimizes KL divergence on a Wasserstein manifold. The superior efficiency of such samplers has motivated several recent attempts to generalize LMC to discrete spaces. However, a fully principled extension of Langevin dynamics to discrete spaces has yet to be achieved, due to the lack of well-defined gradients in the sample space. In this work, we show how the Wasserstein gradient flow can be generalized naturally to discrete spaces. Given the proposed formulation, we demonstrate how a discrete analogue of Langevin dynamics can subsequently be developed. With this new understanding, we reveal how recent gradient-based samplers in discrete spaces can be obtained as special cases by choosing particular discretizations. More importantly, the framework also allows for the derivation of novel algorithms, one of which, Discrete Langevin Monte Carlo (DLMC), is obtained by a factorized estimate of the transition matrix. The DLMC method admits a convenient parallel implementation and time-uniform sampling that achieves larger jump distances. We demonstrate the advantages of DLMC on various binary and categorical distributions.
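
For readers unfamiliar with the continuous-space starting point mentioned in the abstract, the following is a minimal sketch of the standard unadjusted Langevin update, x' = x + step * grad log p(x) + sqrt(2 * step) * noise. It is not the DLMC algorithm from the paper. The function names, the standard normal target, the step size, and the chain length are all illustrative assumptions.

```python
import math
import random


def langevin_step(x, grad_log_p, step, rng):
    """One unadjusted Langevin update for a 1-D target density p."""
    noise = rng.gauss(0.0, 1.0)
    return x + step * grad_log_p(x) + math.sqrt(2.0 * step) * noise


def run_ula(grad_log_p, x0=0.0, step=0.1, n_steps=50000, seed=0):
    """Run a single chain and return every visited state (no burn-in removed)."""
    rng = random.Random(seed)
    x = x0
    samples = []
    for _ in range(n_steps):
        x = langevin_step(x, grad_log_p, step, rng)
        samples.append(x)
    return samples


def grad_log_std_normal(x):
    """Gradient of log p for the assumed target p = N(0, 1)."""
    return -x


samples = run_ula(grad_log_std_normal)
mean = sum(samples) / len(samples)
var = sum((s - mean) ** 2 for s in samples) / len(samples)
print(f"sample mean={mean:.3f}, sample variance={var:.3f}")
```

Because there is no Metropolis correction, the chain's stationary variance is slightly above the target's 1.0 at this step size. The update also relies on the gradient of log p with respect to x, which is exactly what a discrete sample space lacks and what the paper sets out to address.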
