Title

Improving Contrastive Learning on Visually Homogeneous Mars Rover Images

Authors

Ward, Isaac Ronald; Moore, Charles; Pak, Kai; Chen, Jingdao; Goh, Edwin

Abstract

Contrastive learning has recently demonstrated superior performance to supervised learning, despite requiring no training labels. We explore how contrastive learning can be applied to hundreds of thousands of unlabeled Mars terrain images, collected from the Mars rovers Curiosity and Perseverance, and from the Mars Reconnaissance Orbiter. Such methods are appealing since the vast majority of Mars images are unlabeled, as manual annotation is labor intensive and requires extensive domain knowledge. Contrastive learning, however, assumes that any given pair of distinct images contains distinct semantic content. This is an issue for Mars image datasets, as any two Mars images are far more likely to be semantically similar due to the lack of visual diversity on the planet's surface. Assuming that pairs of images will be in visual contrast, when they are in fact not, results in pairs that are falsely considered as negatives, impacting training performance. In this study, we propose two approaches to resolve this: 1) an unsupervised deep clustering step on the Mars datasets, which identifies clusters of images containing similar semantic content and corrects false negative errors during training, and 2) a simple approach which mixes data from different domains to increase the visual diversity of the total training dataset. Both cases reduce the rate of false negative pairs, thus minimizing the rate at which the model is incorrectly penalized during contrastive training. These modified approaches remain fully unsupervised end-to-end. To evaluate their performance, we add a single linear layer trained to generate class predictions based on these contrastively-learned features and demonstrate increased performance compared to supervised models, observing an improvement in classification accuracy of 3.06% using only 10% of the labeled data.
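The false-negative correction in approach (1) can be sketched as follows: given cluster pseudo-labels from an unsupervised clustering step, candidate negatives that share the anchor's cluster are masked out of an NT-Xent-style contrastive loss, so the model is not penalized for keeping semantically similar Mars images close. This is an illustrative NumPy sketch under those assumptions, not the authors' implementation; the function name and the specific loss formulation are hypothetical.

```python
import numpy as np

def nt_xent_with_cluster_mask(z, cluster_ids, temperature=0.5):
    """NT-Xent-style contrastive loss over 2N embeddings, where z[2i] and
    z[2i+1] are two augmented views of the same image. Negatives whose
    cluster pseudo-label matches the anchor's are excluded (hypothetical
    sketch of false-negative masking, not the paper's exact method)."""
    z = z / np.linalg.norm(z, axis=1, keepdims=True)  # cosine-similarity space
    sim = (z @ z.T) / temperature                     # pairwise scaled similarities
    n = z.shape[0]
    losses = []
    for i in range(n):
        pos = i + 1 if i % 2 == 0 else i - 1          # index of the other view
        # Valid negatives: not self, not the positive, and from a different cluster.
        neg_mask = np.ones(n, dtype=bool)
        neg_mask[i] = False
        neg_mask[pos] = False
        neg_mask &= cluster_ids != cluster_ids[i]     # drop likely false negatives
        logits = np.concatenate(([sim[i, pos]], sim[i, neg_mask]))
        # Cross-entropy with the positive at index 0 as the target.
        losses.append(-logits[0] + np.log(np.sum(np.exp(logits))))
    return float(np.mean(losses))
```

With `cluster_ids` set so every image falls in its own cluster, this reduces to the standard loss; coarser clusters remove more candidate negatives, trading gradient signal for fewer false penalties.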
