Paper Title
SSL4EO-S12: A Large-Scale Multi-Modal, Multi-Temporal Dataset for Self-Supervised Learning in Earth Observation
Paper Authors
Paper Abstract
Self-supervised pre-training has the potential to generate expressive representations without human annotation. Most pre-training in Earth observation (EO) is based on ImageNet or medium-sized, labeled remote sensing (RS) datasets. We share an unlabeled RS dataset, SSL4EO-S12 (Self-Supervised Learning for Earth Observation - Sentinel-1/2), which assembles a large-scale, global, multimodal, and multi-seasonal corpus of satellite imagery from the ESA Sentinel-1 & -2 satellite missions. For EO applications, we demonstrate that SSL4EO-S12 supports successful self-supervised pre-training with a set of representative methods: MoCo-v2, DINO, MAE, and data2vec. The resulting models yield downstream performance close to, or surpassing, that of supervised learning. In addition, pre-training on SSL4EO-S12 outperforms pre-training on existing datasets. We make the dataset, related source code, and pre-trained models openly available at https://github.com/zhu-xlab/SSL4EO-S12.
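As a rough illustration of the downstream workflow the abstract describes, the following PyTorch sketch adapts a ResNet-50 backbone to multispectral Sentinel-2 input and loads pre-trained weights for linear probing. This is a minimal sketch under stated assumptions, not the repository's exact API: the checkpoint filename `ssl4eo_s12_moco_rn50.pth`, the 19-class head (e.g., BigEarthNet-style multi-label classification), and the state-dict key handling are hypothetical and chosen for illustration only.

```python
# Minimal sketch: adapt a ResNet-50 to 13-band Sentinel-2 input and
# load self-supervised pre-trained weights for linear probing.
import torch
import torch.nn as nn
from torchvision.models import resnet50

NUM_S2_BANDS = 13   # Sentinel-2 multispectral bands
NUM_CLASSES = 19    # hypothetical downstream label set

# Standard ResNet-50; widen the stem from 3-channel RGB to 13 bands
# and replace the classifier head for the downstream task.
model = resnet50(weights=None)
model.conv1 = nn.Conv2d(NUM_S2_BANDS, 64, kernel_size=7, stride=2,
                        padding=3, bias=False)
model.fc = nn.Linear(model.fc.in_features, NUM_CLASSES)

# Hypothetical checkpoint path; released self-supervised weights often
# need a "module." (DataParallel) prefix stripped from state-dict keys.
state = torch.load("ssl4eo_s12_moco_rn50.pth", map_location="cpu")
state = {k.replace("module.", ""): v for k, v in state.items()}
missing, unexpected = model.load_state_dict(state, strict=False)
print(f"missing keys: {len(missing)}, unexpected keys: {len(unexpected)}")

# Linear probing: freeze the pre-trained backbone, train only the new head.
for name, p in model.named_parameters():
    p.requires_grad = name.startswith("fc.")
```

Linear probing (frozen backbone, trainable head) is the cheaper of the two standard evaluation protocols; full fine-tuning, where all parameters stay trainable, typically yields higher accuracy at higher compute cost.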