Paper Title
Minimax Optimal Algorithms with Fixed-$k$-Nearest Neighbors
Paper Authors
Paper Abstract
This paper presents how to perform minimax optimal classification, regression, and density estimation based on fixed-$k$ nearest neighbor (NN) searches. We consider a distributed learning scenario in which a massive dataset is split into smaller groups, and the $k$-NNs of a query point are found within each subset of the data. We propose \emph{optimal} rules to aggregate the fixed-$k$-NN information for classification, regression, and density estimation that achieve minimax optimal rates for the respective problems. We show that the distributed algorithm with a fixed $k$ over a sufficiently large number of groups attains a minimax optimal error rate up to a multiplicative logarithmic factor under some regularity conditions. Roughly speaking, distributed $k$-NN rules with $M$ groups have performance comparable to the standard $\Theta(kM)$-NN rules, even for fixed $k$.
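To make the distributed scheme concrete, here is a minimal sketch (for the regression case) of the pipeline the abstract describes: split the data into $M$ groups, find the $k$ nearest neighbors of the query within each group, and aggregate the per-group estimates. The function name `distributed_knn_regression` is illustrative, and plain averaging is used as a placeholder aggregation step; the paper's optimal aggregation rules are not specified in the abstract.

```python
import numpy as np

def distributed_knn_regression(X, y, query, M, k, seed=0):
    """Sketch of a distributed fixed-k-NN regression estimate.

    Splits (X, y) into M groups, computes a k-NN estimate of the
    query's response within each group, and aggregates the M
    per-group estimates. Simple averaging is an assumption here,
    standing in for the paper's optimal aggregation rule.
    """
    rng = np.random.default_rng(seed)
    perm = rng.permutation(len(X))  # randomize before grouping
    group_estimates = []
    for idx in np.array_split(perm, M):
        # Distances from the query to this group's points.
        d = np.linalg.norm(X[idx] - query, axis=1)
        # Indices of the k nearest neighbors within this group.
        nn = idx[np.argsort(d)[:k]]
        # Per-group fixed-k-NN estimate: mean label of the k neighbors.
        group_estimates.append(y[nn].mean())
    # Aggregate across groups (placeholder: simple mean).
    return float(np.mean(group_estimates))

# Example usage on synthetic data: even with k fixed at a small value,
# the estimate pools information from all M groups.
rng = np.random.default_rng(1)
X = rng.uniform(size=(10000, 2))
y = np.sin(2 * np.pi * X[:, 0]) + 0.1 * rng.standard_normal(10000)
print(distributed_knn_regression(X, y, query=np.array([0.3, 0.5]), M=100, k=3))
```

The point of the abstract's claim is that this kind of rule, with $k$ held fixed and $M$ large, can match the error rate of a standard $\Theta(kM)$-NN rule run on the pooled data, up to a logarithmic factor.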