Title
On double-descent in uncertainty quantification in overparametrized models
Authors
Abstract
Uncertainty quantification is a central challenge in reliable and trustworthy machine learning. Naive measures such as last-layer scores are well known to yield overconfident estimates in the context of overparametrized neural networks. Several methods, ranging from temperature scaling to different Bayesian treatments of neural networks, have been proposed to mitigate overconfidence, most often supported by the numerical observation that they yield better-calibrated uncertainty measures. In this work, we provide a sharp comparison between popular uncertainty measures for binary classification in a mathematically tractable model of overparametrized neural networks: the random features model. We discuss a trade-off between classification accuracy and calibration, unveiling a double-descent-like behavior in the calibration curve of optimally regularized estimators as a function of overparametrization. This is in contrast with the empirical Bayes method, which we show to be well calibrated in our setting despite its higher generalization error and overparametrization.
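The setting the abstract describes can be illustrated with a minimal sketch. The following is not the paper's exact setup; it is an assumed toy version of a random features model (fixed random projection plus a nonlinearity, with only the last layer trained) for binary classification, where the mean last-layer confidence can be compared with the test accuracy to probe overconfidence. All dimensions, the `tanh` nonlinearity, the ridge strength `lam`, and the teacher rule are illustrative choices, not taken from the paper.

```python
# Toy random features model for binary classification (illustrative only).
# Overconfidence check: compare mean last-layer confidence with test accuracy.
import numpy as np

rng = np.random.default_rng(0)
d, p, n = 50, 200, 400  # input dim, number of random features, training samples

# Teacher: labels from a fixed linear rule on the inputs (an assumed data model)
w_star = rng.normal(size=d) / np.sqrt(d)
X = rng.normal(size=(n, d))
y = (X @ w_star > 0).astype(float)

# Random features: fixed random first layer, only the last layer is learned
F = rng.normal(size=(p, d)) / np.sqrt(d)
def features(X):
    return np.tanh(X @ F.T)

# Last layer: ridge-regularized logistic regression fit by gradient descent
Z = features(X)
w = np.zeros(p)
lam = 1e-3
for _ in range(2000):
    s = 1.0 / (1.0 + np.exp(-Z @ w))          # predicted probabilities
    grad = Z.T @ (s - y) / n + lam * w        # logistic loss + ridge gradient
    w -= 0.5 * grad

# Evaluate last-layer scores on fresh data
X_test = rng.normal(size=(2000, d))
y_test = (X_test @ w_star > 0).astype(float)
s_test = 1.0 / (1.0 + np.exp(-features(X_test) @ w))
pred = (s_test > 0.5).astype(float)
accuracy = (pred == y_test).mean()
mean_confidence = np.maximum(s_test, 1.0 - s_test).mean()
print(f"accuracy={accuracy:.3f}  mean confidence={mean_confidence:.3f}")
# If the mean confidence exceeds the accuracy, the last-layer scores are
# overconfident; this gap is one of the calibration notions the paper studies.
```

Varying `p` relative to `n` moves the model across the under- and overparametrized regimes, which is the axis along which the abstract's double-descent behavior is described.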