Paper Title

Compressed Empirical Measures (in finite dimensions)

Author

Grünewälder, Steffen

Abstract

We study approaches for compressing the empirical measure in the context of finite dimensional reproducing kernel Hilbert spaces (RKHSs). In this context, the empirical measure is contained within a natural convex set and can be approximated using convex optimization methods. Such an approximation gives rise to a coreset of data points. A key quantity that controls how large such a coreset has to be is the size of the largest ball around the empirical measure that is contained within the empirical convex set. The bulk of our work is concerned with deriving high probability lower bounds on the size of such a ball under various conditions and in various settings: we show how conditions on the density of the data and the kernel function can be used to infer such lower bounds; we further develop an approach that uses a lower bound on the smallest eigenvalue of a covariance operator to provide lower bounds on the size of such a ball; we extend the approach to approximate covariance operators and we show how it can be used in the context of kernel ridge regression. We also derive compression guarantees when standard algorithms like the conditional gradient method are used and we discuss variations of such algorithms to improve the runtime of these standard algorithms. We conclude with a construction of an infinite dimensional RKHS for which the compression is poor, highlighting some of the difficulties one faces when trying to move to infinite dimensional RKHSs.
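The abstract mentions compression guarantees for standard convex optimization algorithms such as the conditional gradient (Frank-Wolfe) method. As an illustration of the general idea (not the paper's own construction), the following is a minimal sketch of a Frank-Wolfe iteration that approximates the empirical kernel mean embedding by a sparse convex combination of data points, i.e. a coreset. The RBF kernel, the step-size rule, and all parameter values are assumptions for the sketch.

```python
import numpy as np

def rbf_kernel(X, Y, gamma=1.0):
    # Pairwise Gaussian (RBF) kernel matrix between rows of X and rows of Y.
    d2 = ((X[:, None, :] - Y[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * d2)

def frank_wolfe_coreset(X, n_steps=15, gamma=1.0):
    """Frank-Wolfe approximation of the empirical mean embedding
    mu = (1/n) * sum_i k(x_i, .) over the simplex of convex weights.
    Returns the indices of the selected points and their weights."""
    n = X.shape[0]
    K = rbf_kernel(X, X, gamma)        # Gram matrix
    mu = K.mean(axis=1)                # <k(x_j, .), mu> for each j
    w = np.zeros(n)                    # current convex weights
    for t in range(n_steps):
        # Linear minimization oracle over simplex vertices: pick the
        # atom most negatively correlated with the current gradient.
        grad = K @ w - mu              # gradient of 0.5*||mu_w - mu||^2
        j = int(np.argmin(grad))
        step = 2.0 / (t + 2)           # standard Frank-Wolfe step size
        w = (1 - step) * w
        w[j] += step
    idx = np.nonzero(w)[0]
    return idx, w[idx]

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2))
idx, weights = frank_wolfe_coreset(X, n_steps=15)

# RKHS approximation error ||mu - mu_coreset||_H^2 via the Gram matrix.
K = rbf_kernel(X, X)
w_full = np.ones(200) / 200
w = np.zeros(200)
w[idx] = weights
err2 = (w - w_full) @ K @ (w - w_full)
```

Each iteration adds at most one data point to the support of `w`, so after `n_steps` iterations the coreset has at most `n_steps` atoms; the size of the ball around the empirical measure discussed in the abstract governs how fast `err2` can be driven down.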
