Paper Title
Minimax Optimal Algorithms with Fixed-$k$-Nearest Neighbors
Paper Authors
Paper Abstract
This paper presents how to perform minimax optimal classification, regression, and density estimation based on fixed-$k$ nearest neighbor (NN) searches. We consider a distributed learning scenario in which a massive dataset is split into smaller groups, and the $k$-NNs of a query point are found within each subset of the data. We propose \emph{optimal} rules to aggregate the fixed-$k$-NN information for classification, regression, and density estimation that achieve minimax optimal rates for the respective problems. We show that the distributed algorithm with a fixed $k$ over a sufficiently large number of groups attains a minimax optimal error rate up to a multiplicative logarithmic factor under some regularity conditions. Roughly speaking, distributed $k$-NN rules with $M$ groups have performance comparable to the standard $\Theta(kM)$-NN rules, even for fixed $k$.
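To make the distributed scheme concrete, here is a minimal sketch (for the regression case) of the pipeline the abstract describes: split the data into $M$ groups, find the $k$ nearest neighbors of the query within each group, and aggregate the per-group estimates. The function name `distributed_knn_regression` is illustrative, and plain averaging is used as a placeholder aggregation step; the paper's optimal aggregation rules are not specified in the abstract.

```python
import numpy as np

def distributed_knn_regression(X, y, query, M, k, seed=0):
    """Sketch of a distributed fixed-k-NN regression estimate.

    Splits (X, y) into M groups, computes a k-NN estimate of the
    query's response within each group, and aggregates the M
    per-group estimates. Simple averaging is an assumption here,
    standing in for the paper's optimal aggregation rule.
    """
    rng = np.random.default_rng(seed)
    perm = rng.permutation(len(X))  # randomize before grouping
    group_estimates = []
    for idx in np.array_split(perm, M):
        # Distances from the query to this group's points.
        d = np.linalg.norm(X[idx] - query, axis=1)
        # Indices of the k nearest neighbors within this group.
        nn = idx[np.argsort(d)[:k]]
        # Per-group fixed-k-NN estimate: mean label of the k neighbors.
        group_estimates.append(y[nn].mean())
    # Aggregate across groups (placeholder: simple mean).
    return float(np.mean(group_estimates))

# Example usage on synthetic data: even with k fixed at a small value,
# the estimate pools information from all M groups.
rng = np.random.default_rng(1)
X = rng.uniform(size=(10000, 2))
y = np.sin(2 * np.pi * X[:, 0]) + 0.1 * rng.standard_normal(10000)
print(distributed_knn_regression(X, y, query=np.array([0.3, 0.5]), M=100, k=3))
```

The point of the abstract's claim is that this kind of rule, with $k$ held fixed and $M$ large, can match the error rate of a standard $\Theta(kM)$-NN rule run on the pooled data, up to a logarithmic factor.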