Paper title
Efficient and Differentiable Conformal Prediction with General Function Classes
Authors
Abstract
Quantifying the data uncertainty in learning tasks is often done by learning a prediction interval or prediction set of the label given the input. Two commonly desired properties for learned prediction sets are \emph{valid coverage} and \emph{good efficiency} (such as low length or low cardinality). Conformal prediction is a powerful technique for learning prediction sets with valid coverage, yet by default its conformalization step only learns a single parameter, and does not optimize the efficiency over more expressive function classes. In this paper, we propose a generalization of conformal prediction to multiple learnable parameters, by considering the constrained empirical risk minimization (ERM) problem of finding the most efficient prediction set subject to valid empirical coverage. This meta-algorithm generalizes existing conformal prediction algorithms, and we show that it achieves approximate valid population coverage and near-optimal efficiency within class, whenever the function class in the conformalization step is low-capacity in a certain sense. Next, this ERM problem is challenging to optimize as it involves a non-differentiable coverage constraint. We develop a gradient-based algorithm for it by approximating the original constrained ERM using differentiable surrogate losses and Lagrangians. Experiments show that our algorithm is able to learn valid prediction sets and improve the efficiency significantly over existing approaches in several applications such as prediction intervals with improved length, minimum-volume prediction sets for multi-output regression, and label prediction sets for image classification.
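As a rough illustration of the single-parameter case mentioned in the abstract, the sketch below treats conformalization as a one-dimensional constrained ERM: among intervals [f(x) - t, f(x) + t], it picks the smallest t whose empirical coverage on a calibration set is at least 1 - alpha. The synthetic data, the predictor `pred`, the absolute-residual score, and the function names are assumptions made for illustration and are not taken from the paper. Standard split conformal prediction typically uses a slightly more conservative quantile based on n + 1 rather than n.

```python
import math

import numpy as np


def conformalize_single_parameter(scores, alpha):
    """Return the smallest t such that mean(scores <= t) >= 1 - alpha.

    With interval length 2 * t as the efficiency loss, this is the minimizer
    of the one-parameter constrained ERM (assumed setup, for illustration).
    """
    sorted_scores = np.sort(np.asarray(scores, dtype=float))
    n = sorted_scores.shape[0]
    # Number of calibration points that must be covered.
    k = math.ceil((1 - alpha) * n)
    k = min(max(k, 1), n)
    return float(sorted_scores[k - 1])


def empirical_coverage(scores, t):
    """Fraction of calibration points whose score is at most t."""
    return float(np.mean(np.asarray(scores) <= t))


# Synthetic calibration data (hypothetical, for illustration only).
rng = np.random.default_rng(0)
x = rng.uniform(-1.0, 1.0, size=500)
y = 2.0 * x + rng.normal(scale=0.5, size=500)

pred = 2.0 * x               # stand-in for a pre-trained predictor f(x)
scores = np.abs(y - pred)    # absolute-residual conformity score

alpha = 0.1
t_hat = conformalize_single_parameter(scores, alpha)
coverage = empirical_coverage(scores, t_hat)
print(f"t_hat = {t_hat:.3f}, empirical coverage = {coverage:.3f}")
```

In the multi-parameter setting the abstract describes, the scalar `t` would become a parameter vector and the sorting step would no longer apply; the abstract indicates that a gradient-based method with differentiable surrogate losses and a Lagrangian is used instead, which this sketch does not attempt to reproduce.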