Paper Title
Tight bounds for maximum $\ell_1$-margin classifiers
Paper Authors
Paper Abstract
Popular iterative algorithms such as boosting methods and coordinate descent on linear models converge to the maximum $\ell_1$-margin classifier, a.k.a. the sparse hard-margin SVM, in high-dimensional regimes where the data is linearly separable. Previous works consistently show that many estimators relying on the $\ell_1$-norm achieve improved statistical rates for hard sparse ground truths. We show that, surprisingly, this adaptivity does not apply to the maximum $\ell_1$-margin classifier in a standard discriminative setting. In particular, for the noiseless setting, we prove tight upper and lower bounds for the prediction error that match the existing rates of order $\frac{\|w^*\|_1^{2/3}}{n^{1/3}}$ for general ground truths. To complete the picture, we show that when interpolating noisy observations, the error vanishes at a rate of order $\frac{1}{\sqrt{\log(d/n)}}$. We are therefore the first to show benign overfitting for the maximum $\ell_1$-margin classifier.
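For readers unfamiliar with the estimator, a minimal sketch of the usual definition (the abstract itself does not spell it out, so the notation below is an assumption): the maximum $\ell_1$-margin classifier is the linear predictor that maximizes the worst-case margin over the unit $\ell_1$-ball.

\documentclass{article}
\usepackage{amsmath}
\begin{document}
% Standard definition of the maximum $\ell_1$-margin classifier (sparse hard-margin SVM),
% assuming $n$ linearly separable samples $(x_i, y_i)$ with labels $y_i \in \{-1, +1\}$:
\begin{equation*}
  \hat{w} \in \operatorname*{arg\,max}_{\|w\|_1 \le 1} \; \min_{1 \le i \le n} \, y_i \langle x_i, w \rangle .
\end{equation*}
% Boosting and coordinate descent on separable data are known to converge (in direction)
% to this maximizer; the rates quoted in the abstract bound the prediction error of $\hat{w}$.
\end{document}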