Paper Title


Regularization in High-Dimensional Regression and Classification via Random Matrix Theory

Authors

Lolas, Panagiotis

Abstract


We study general singular value shrinkage estimators in high-dimensional regression and classification, when the number of features and the sample size both grow proportionally to infinity. We allow models with general covariance matrices that include a large class of data generating distributions. As far as the implications of our results are concerned, we find exact asymptotic formulas for both the training and test errors in regression models fitted by gradient descent, which provides theoretical insights for early stopping as a regularization method. In addition, we propose a numerical method based on the empirical spectra of covariance matrices for the optimal eigenvalue shrinkage classifier in linear discriminant analysis. Finally, we derive optimal estimators for the dense mean vectors of high-dimensional distributions. Throughout our analysis we rely on recent advances in random matrix theory and develop further results of independent mathematical interest.
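The abstract's point about early stopping can be illustrated with a minimal numerical sketch (not the paper's code; all names, dimensions, and noise levels below are illustrative assumptions): gradient descent on the least-squares loss in a proportional regime where the feature count p is comparable to the sample size n. The iteration count then acts as a regularization parameter, since each step shrinks the estimate along the singular directions of the design matrix, and the test error can bottom out before convergence while the training error keeps decreasing.

```python
import numpy as np

# Illustrative sketch of early stopping as regularization, assuming a
# Gaussian design with p proportional to n and a dense coefficient vector.
rng = np.random.default_rng(0)
n, p = 200, 100       # sample size and feature count of comparable order
sigma = 0.5           # observation noise level (assumed)

beta = rng.normal(size=p) / np.sqrt(p)   # dense true coefficients
X_train = rng.normal(size=(n, p))
X_test = rng.normal(size=(n, p))
y_train = X_train @ beta + sigma * rng.normal(size=n)
y_test = X_test @ beta + sigma * rng.normal(size=n)

# Step size 1 / sigma_max(X)^2 guarantees monotone decrease of the
# training loss, since the Hessian of the squared loss is X^T X.
lr = 1.0 / np.linalg.norm(X_train, ord=2) ** 2

b = np.zeros(p)
train_err, test_err = [], []
for t in range(500):
    grad = X_train.T @ (X_train @ b - y_train)
    b -= lr * grad
    train_err.append(np.mean((X_train @ b - y_train) ** 2))
    test_err.append(np.mean((X_test @ b - y_test) ** 2))

best_t = int(np.argmin(test_err))
print(f"final training error: {train_err[-1]:.3f}")
print(f"test error minimized at iteration {best_t}: {test_err[best_t]:.3f}")
print(f"test error at final iteration:           {test_err[-1]:.3f}")
```

The gap between the test error at its minimizing iteration and at the final iteration is what stopping early buys; the paper's contribution is an exact asymptotic characterization of both error curves, rather than a simulation like this one.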
