Paper Title

Small Transformers Compute Universal Metric Embeddings

Paper Authors

Anastasis Kratsios, Valentin Debarnot, Ivan Dokmanić

Paper Abstract

We study representations of data from an arbitrary metric space $\mathcal{X}$ in the space of univariate Gaussian mixtures with a transport metric (Delon and Desolneux 2020). We derive embedding guarantees for feature maps implemented by small neural networks called \emph{probabilistic transformers}. Our guarantees are of memorization type: we prove that a probabilistic transformer of depth about $n\log(n)$ and width about $n^2$ can bi-Hölder embed any $n$-point dataset from $\mathcal{X}$ with low metric distortion, thus avoiding the curse of dimensionality. We further derive probabilistic bi-Lipschitz guarantees, which trade off the amount of distortion and the probability that a randomly chosen pair of points embeds with that distortion. If $\mathcal{X}$'s geometry is sufficiently regular, we obtain stronger, bi-Lipschitz guarantees for all points in the dataset. As applications, we derive neural embedding guarantees for datasets from Riemannian manifolds, metric trees, and certain types of combinatorial graphs. When instead embedding into multivariate Gaussian mixtures, we show that probabilistic transformers can compute bi-Hölder embeddings with arbitrarily small distortion.
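The target space in the abstract is the set of univariate Gaussian mixtures under the transport metric of Delon and Desolneux (2020), often written MW2. Between two mixtures, this metric solves a small optimal-transport problem over the mixture weights, with the closed-form squared 2-Wasserstein distance between univariate Gaussians, $W_2^2(\mathcal{N}(m_1, s_1^2), \mathcal{N}(m_2, s_2^2)) = (m_1 - m_2)^2 + (s_1 - s_2)^2$, as the ground cost. The sketch below computes this metric with a generic LP solver; it is an illustration of the embedding target space, not code from the paper, and the function names `gaussian_w2_sq` and `mw2` are our own.

```python
import numpy as np
from scipy.optimize import linprog

def gaussian_w2_sq(m1, s1, m2, s2):
    # Closed-form squared 2-Wasserstein distance between univariate Gaussians.
    return (m1 - m2) ** 2 + (s1 - s2) ** 2

def mw2(weights_a, means_a, stds_a, weights_b, means_b, stds_b):
    """MW2-style transport distance between two univariate Gaussian mixtures:
    optimal transport between the mixture weights, with the Gaussian W2^2
    as ground cost (following Delon and Desolneux 2020)."""
    n, m = len(weights_a), len(weights_b)
    # Ground cost: pairwise squared W2 between mixture components.
    C = np.array([[gaussian_w2_sq(means_a[i], stds_a[i], means_b[j], stds_b[j])
                   for j in range(m)] for i in range(n)])
    # Marginal constraints: row sums equal weights_a, column sums equal weights_b.
    A_eq = []
    for i in range(n):
        row = np.zeros(n * m)
        row[i * m:(i + 1) * m] = 1.0
        A_eq.append(row)
    for j in range(m):
        col = np.zeros(n * m)
        col[j::m] = 1.0
        A_eq.append(col)
    b_eq = np.concatenate([weights_a, weights_b])
    res = linprog(C.ravel(), A_eq=np.array(A_eq), b_eq=b_eq, bounds=(0, None))
    return np.sqrt(res.fun)

# Toy check: distance between two 2-component mixtures.
wa, ma, sa = np.array([0.5, 0.5]), np.array([-1.0, 1.0]), np.array([0.3, 0.3])
wb, mb, sb = np.array([0.7, 0.3]), np.array([0.0, 2.0]), np.array([0.5, 0.2])
print(mw2(wa, ma, sa, wb, mb, sb))
```

In the paper's setting, a probabilistic transformer maps each point of the dataset to such a mixture, and the bi-Hölder guarantee bounds how the metric above distorts the original distances in $\mathcal{X}$; the solver here only illustrates the geometry of the codomain.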
