Paper Title

Predictive Inference with Weak Supervision

Authors

Maxime Cauchois, Suyash Gupta, Alnur Ali, John Duchi

Abstract

The expense of acquiring labels in large-scale statistical machine learning makes partially and weakly-labeled data attractive, though it is not always apparent how to leverage such data for model fitting or validation. We present a methodology to bridge the gap between partial supervision and validation, developing a conformal prediction framework to provide valid predictive confidence sets -- sets that cover a true label with a prescribed probability, independent of the underlying distribution -- using weakly labeled data. To do so, we introduce a (necessary) new notion of coverage and predictive validity, then develop several application scenarios, providing efficient algorithms for classification and several large-scale structured prediction problems. We corroborate the hypothesis that the new coverage definition allows for tighter and more informative (but valid) confidence sets through several experiments.
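To make the idea of a confidence set calibrated on weak labels concrete, here is a minimal sketch of split conformal prediction for classification. It is an illustration under stated assumptions, not the authors' algorithm. It assumes that each example carries a set of candidate labels (the weak label), and that a prediction set "covers" when it contains at least one label from that set. The abstract does not spell out the coverage definition, so this reading is an assumption. The helper names (`conformal_threshold`, `weak_scores`, `prediction_sets`) and the synthetic data are hypothetical.

```python
import numpy as np

def conformal_threshold(scores, alpha):
    """Order statistic of rank ceil((n + 1)(1 - alpha)) of the calibration scores."""
    n = len(scores)
    k = int(np.ceil((n + 1) * (1 - alpha)))
    if k > n:
        return np.inf  # too few calibration points: only the trivial set is valid
    return np.sort(scores)[k - 1]

def weak_scores(probs, weak_sets):
    """Assumed weak score: 1 minus the largest predicted probability inside the weak set."""
    return np.array([1.0 - probs[i, list(w)].max() for i, w in enumerate(weak_sets)])

def prediction_sets(probs, threshold):
    """All labels whose score 1 - p(y | x) falls at or below the threshold."""
    return [np.flatnonzero(1.0 - p <= threshold) for p in probs]

# Synthetic data (purely illustrative): model probabilities and weak label sets.
rng = np.random.default_rng(0)
n, num_classes, alpha = 600, 5, 0.1
probs = rng.dirichlet(np.ones(num_classes), size=n)
true = np.array([rng.choice(num_classes, p=p) for p in probs])
# Weak label: the true label plus one random distractor (which may coincide with it).
weak = [{int(y), int(rng.integers(num_classes))} for y in true]

cal, test = slice(0, 300), slice(300, n)
t = conformal_threshold(weak_scores(probs[cal], weak[cal]), alpha)
sets = prediction_sets(probs[test], t)

# A set counts as a hit when it intersects the weak label set.
hits = [bool(set(s.tolist()) & w) for s, w in zip(sets, weak[test])]
weak_coverage = float(np.mean(hits))
print(f"threshold={t:.3f}, empirical weak coverage={weak_coverage:.3f}")
```

With this score, a prediction set intersects the weak label set exactly when the weak score is at most the threshold. The usual exchangeability argument for split conformal prediction then gives coverage of at least 1 - alpha in this weak sense. Only requiring one candidate label to be covered is a looser demand than covering every candidate, which is one way such a coverage notion could yield smaller sets.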
