Paper Title
Self-supervised Pre-training with Hard Examples Improves Visual Representations
Paper Authors
Paper Abstract
Self-supervised pre-training (SSP) employs random image transformations to generate training data for visual representation learning. In this paper, we first present a modeling framework that unifies existing SSP methods as learning to predict pseudo-labels. Then, we propose new data augmentation methods for generating training examples whose pseudo-labels are harder to predict than those produced by random image transformations. Specifically, we use adversarial training and CutMix to create hard examples (HEXA) that serve as augmented views for MoCo-v2 and DeepCluster-v2, yielding two variants, HEXA_{MoCo} and HEXA_{DCluster}, respectively. In our experiments, we pre-train models on ImageNet and evaluate them on multiple public benchmarks. Our evaluation shows that the two new algorithm variants outperform their original counterparts and achieve new state-of-the-art results on a wide range of tasks where only limited task supervision is available for fine-tuning. These results verify that hard examples are instrumental in improving the generalization of pre-trained models.
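The abstract names CutMix and adversarial training as the two mechanisms for producing hard examples but gives no implementation detail. Below is a minimal PyTorch sketch of both ideas as they might be applied to a pair of augmented views in an SSP pipeline; the function names (`cutmix_views`, `fgsm_hard_view`), the Beta(α, α) patch sizing, and the step size `eps` are illustrative assumptions, not the paper's exact procedure.

```python
import torch


def cutmix_views(view_a: torch.Tensor, view_b: torch.Tensor, alpha: float = 1.0) -> torch.Tensor:
    """Create a harder augmented view by pasting a random patch of view_b
    into view_a (CutMix). Both inputs are (B, C, H, W) batches holding two
    augmented views of the same underlying images.
    """
    # Mixing ratio drawn from Beta(alpha, alpha), as in the CutMix paper.
    lam = torch.distributions.Beta(alpha, alpha).sample().item()
    _, _, h, w = view_a.shape
    # Patch area is proportional to (1 - lam).
    cut_h, cut_w = int(h * (1 - lam) ** 0.5), int(w * (1 - lam) ** 0.5)
    cy, cx = torch.randint(h, (1,)).item(), torch.randint(w, (1,)).item()
    y1, y2 = max(cy - cut_h // 2, 0), min(cy + cut_h // 2, h)
    x1, x2 = max(cx - cut_w // 2, 0), min(cx + cut_w // 2, w)
    mixed = view_a.clone()
    mixed[:, :, y1:y2, x1:x2] = view_b[:, :, y1:y2, x1:x2]
    return mixed


def fgsm_hard_view(view: torch.Tensor, loss: torch.Tensor, eps: float = 2 / 255) -> torch.Tensor:
    """One-step adversarial perturbation of an augmented view along the
    gradient of the SSP loss, making its pseudo-label harder to predict.
    `view` must have requires_grad=True and `loss` must depend on it.
    """
    (grad,) = torch.autograd.grad(loss, view, retain_graph=True)
    # Move in the sign of the gradient and keep pixels in a valid range.
    return (view + eps * grad.sign()).clamp(0, 1).detach()
```

In use, either output would simply replace one of the two standard augmented views fed to the contrastive (MoCo-v2) or clustering (DeepCluster-v2) objective, so the rest of the pre-training loop is unchanged.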