Pointwise Binary Classification with Pairwise Confidence Comparisons

Paper Title

Pointwise Binary Classification with Pairwise Confidence Comparisons

Authors

Lei Feng, Senlin Shu, Nan Lu, Bo Han, Miao Xu, Gang Niu, Bo An, Masashi Sugiyama

Abstract

To reduce the amount of data needed to train effective binary classifiers, many weakly supervised learning settings have been proposed. Among them, some use pairwise rather than pointwise labels, for cases where pointwise labels are not accessible due to privacy, confidentiality, or security reasons. However, because a pairwise label denotes whether or not two data points share a pointwise label, it cannot be easily collected if either point is equally likely to be positive or negative. Thus, in this paper, we propose a novel setting called pairwise comparison (Pcomp) classification, where we have only pairs of unlabeled data for which we know that one point is more likely to be positive than the other. First, we give a Pcomp data generation process, derive an unbiased risk estimator (URE) with a theoretical guarantee, and further improve the URE using correction functions. Second, we link Pcomp classification to noisy-label learning to develop a progressive URE and improve it by imposing consistency regularization. Finally, we demonstrate the effectiveness of our methods by experiments, which suggests that Pcomp is a valuable and practically useful type of pairwise supervision besides the pairwise label.
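
To make the Pcomp setting concrete, the following is a minimal illustrative sketch of how such comparison pairs could be simulated. It is not the authors' code. It assumes that a pair (x, x') is kept whenever its hidden labels are anything other than (-1, +1), so the first point is at least as likely to be positive as the second. The function name `sample_pcomp_pairs`, the class prior, and the Gaussian class-conditional distributions are all hypothetical choices made for illustration.

```python
import numpy as np


def sample_pcomp_pairs(n_pairs, prior=0.5, seed=0):
    """Simulate Pcomp pairs (hypothetical sketch, not the paper's code).

    Assumption: a pair is kept unless its hidden labels are (-1, +1),
    i.e. the first point is never "less positive" than the second.
    """
    rng = np.random.default_rng(seed)
    pairs, labels = [], []
    while len(pairs) < n_pairs:
        # Draw two hidden pointwise labels independently from the class prior.
        y, y_prime = rng.choice([1, -1], size=2, p=[prior, 1 - prior])
        # Reject the only ordering that contradicts the comparison.
        if y == -1 and y_prime == 1:
            continue
        # Toy class-conditional distributions: 2-D Gaussians centred at +1 / -1.
        x = rng.normal(loc=y, scale=1.0, size=2)
        x_prime = rng.normal(loc=y_prime, scale=1.0, size=2)
        pairs.append((x, x_prime))
        labels.append((int(y), int(y_prime)))
    # The learner would see only `pairs`; `labels` is returned for inspection.
    return np.array(pairs), np.array(labels)


pairs, labels = sample_pcomp_pairs(200)
```

In this sketch the hidden labels are returned only so the rejection rule can be checked; in the setting the abstract describes, the learner observes the unlabeled pairs alone.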
