Paper Title
Tight bounds for maximum $\ell_1$-margin classifiers
Paper Authors
Paper Abstract
Popular iterative algorithms such as boosting methods and coordinate descent on linear models converge to the maximum $\ell_1$-margin classifier, a.k.a. the sparse hard-margin SVM, in high-dimensional regimes where the data is linearly separable. Previous works consistently show that many estimators relying on the $\ell_1$-norm achieve improved statistical rates for hard sparse ground truths. We show that, surprisingly, this adaptivity does not apply to the maximum $\ell_1$-margin classifier in a standard discriminative setting. In particular, for the noiseless setting, we prove tight upper and lower bounds for the prediction error that match the existing rates of order $\frac{\|w^*\|_1^{2/3}}{n^{1/3}}$ for general ground truths. To complete the picture, we show that when interpolating noisy observations, the error vanishes at a rate of order $\frac{1}{\sqrt{\log(d/n)}}$. We are therefore the first to show benign overfitting for the maximum $\ell_1$-margin classifier.
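For readers unfamiliar with the estimator, a minimal sketch of the usual definition (the abstract itself does not spell it out, so the notation below is an assumption): the maximum $\ell_1$-margin classifier is the linear predictor that maximizes the worst-case margin over the unit $\ell_1$-ball.

\documentclass{article}
\usepackage{amsmath}
\begin{document}
% Standard definition of the maximum $\ell_1$-margin classifier (sparse hard-margin SVM),
% assuming $n$ linearly separable samples $(x_i, y_i)$ with labels $y_i \in \{-1, +1\}$:
\begin{equation*}
  \hat{w} \in \operatorname*{arg\,max}_{\|w\|_1 \le 1} \; \min_{1 \le i \le n} \, y_i \langle x_i, w \rangle .
\end{equation*}
% Boosting and coordinate descent on separable data are known to converge (in direction)
% to this maximizer; the rates quoted in the abstract bound the prediction error of $\hat{w}$.
\end{document}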