Paper Title
Class-Imbalanced Semi-Supervised Learning
Paper Authors
Paper Abstract
Semi-Supervised Learning (SSL) has achieved great success in overcoming the difficulty of labeling and in making full use of unlabeled data. However, SSL rests on the restrictive assumption that the numbers of samples in different classes are balanced, and many SSL algorithms show degraded performance on datasets with imbalanced class distributions. In this paper, we introduce the task of class-imbalanced semi-supervised learning (CISSL), which refers to semi-supervised learning with class-imbalanced data. In doing so, we consider class imbalance in both the labeled and unlabeled sets. First, we analyze existing SSL methods in imbalanced environments and examine how class imbalance affects them. Then we propose Suppressed Consistency Loss (SCL), a regularization method robust to class imbalance. Our method shows better performance than conventional methods in the CISSL setting. In particular, the more severe the class imbalance and the smaller the labeled set, the better our method performs.
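The abstract does not spell out how SCL is computed. As a rough illustration only, the sketch below shows one plausible way a consistency term could be suppressed according to labeled-class frequency, written in PyTorch; the weighting scheme, the function and variable names, and the teacher/student formulation are assumptions for illustration, not the authors' implementation.

```python
# Hedged sketch of a "suppressed" consistency loss (assumed scheme, not the
# paper's exact formulation): the per-sample consistency term is down-weighted
# for samples whose pseudo-predicted class is rare in the labeled set.
import torch
import torch.nn.functional as F


def suppressed_consistency_loss(student_logits: torch.Tensor,
                                teacher_logits: torch.Tensor,
                                class_counts: torch.Tensor) -> torch.Tensor:
    """Mean-squared consistency between student and teacher predictions,
    scaled by a per-class suppression weight (illustrative choice)."""
    student_prob = F.softmax(student_logits, dim=1)
    teacher_prob = F.softmax(teacher_logits, dim=1)

    # Pseudo-class of each unlabeled sample, taken from the teacher branch.
    pseudo_class = teacher_prob.argmax(dim=1)

    # Suppression weight in (0, 1]: 1 for the most frequent labeled class,
    # smaller for minority classes (one possible choice, not the paper's).
    weight = (class_counts.float() / class_counts.max())[pseudo_class]

    per_sample = ((student_prob - teacher_prob) ** 2).mean(dim=1)
    return (weight * per_sample).mean()


# Minimal usage example with random tensors.
if __name__ == "__main__":
    torch.manual_seed(0)
    student = torch.randn(8, 10)            # batch of 8 samples, 10 classes
    teacher = torch.randn(8, 10)
    counts = torch.randint(5, 500, (10,))   # labeled samples per class
    print(suppressed_consistency_loss(student, teacher, counts))
```

The intuition behind this kind of weighting is that consistency regularization tends to be less reliable on under-represented classes, so their contribution to the unsupervised loss is reduced; the paper's actual suppression rule should be taken from its method section.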