Paper Title

Are all negatives created equal in contrastive instance discrimination?

Paper Authors

Tiffany Tianhui Cai, Jonathan Frankle, David J. Schwab, Ari S. Morcos

Paper Abstract

Self-supervised learning has recently begun to rival supervised learning on computer vision tasks. Many of the recent approaches have been based on contrastive instance discrimination (CID), in which the network is trained to recognize two augmented versions of the same instance (a query and positive) while discriminating against a pool of other instances (negatives). The learned representation is then used on downstream tasks such as image classification. Using methodology from MoCo v2 (Chen et al., 2020), we divided negatives by their difficulty for a given query and studied which difficulty ranges were most important for learning useful representations. We found a minority of negatives -- the hardest 5% -- were both necessary and sufficient for the downstream task to reach nearly full accuracy. Conversely, the easiest 95% of negatives were unnecessary and insufficient. Moreover, the very hardest 0.1% of negatives were unnecessary and sometimes detrimental. Finally, we studied the properties of negatives that affect their hardness, and found that hard negatives were more semantically similar to the query, and that some negatives were more consistently easy or hard than we would expect by chance. Together, our results indicate that negatives vary in importance and that CID may benefit from more intelligent negative treatment.
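The core methodological step described above — ranking a query's negatives by difficulty and keeping only the hardest few percent — can be sketched as follows. This is a hypothetical illustration, not the authors' code: it assumes hardness is measured as cosine similarity between the query embedding and each negative embedding (more similar = harder), with embeddings drawn at random for the example.

```python
# Hypothetical sketch (not the authors' implementation): rank negatives for a
# query by "hardness" -- cosine similarity to the query -- and keep the hardest 5%.
import numpy as np

def hardest_negatives(query: np.ndarray, negatives: np.ndarray,
                      frac: float = 0.05) -> np.ndarray:
    """Return indices of the hardest `frac` of negatives for `query`.

    query:     (d,) embedding vector
    negatives: (n, d) matrix of negative embeddings
    Hardness here means cosine similarity to the query (higher = harder).
    """
    q = query / np.linalg.norm(query)
    n = negatives / np.linalg.norm(negatives, axis=1, keepdims=True)
    sims = n @ q                               # cosine similarity per negative
    k = max(1, int(round(frac * len(sims))))   # size of the hardest-frac subset
    return np.argsort(sims)[::-1][:k]          # indices, hardest first

# Example: 1000 random negatives in a 128-d embedding space
rng = np.random.default_rng(0)
idx = hardest_negatives(rng.standard_normal(128),
                        rng.standard_normal((1000, 128)))
print(len(idx))  # 50, i.e. the hardest 5% of 1000 negatives
```

In the paper's ablations, training restricted to such a hardest-5% subset reached nearly full downstream accuracy, while the easiest 95% alone did not.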
