Paper title
Towards Sustainable Self-supervised Learning
Authors
Abstract
Although training costs keep rising, most self-supervised learning (SSL) models are repeatedly trained from scratch and then not fully utilized, since only a few SOTA models are employed for downstream tasks. In this work, we explore a sustainable SSL framework that faces two major challenges: i) learning a stronger new SSL model from an existing pretrained SSL model, called the "base" model, in a cost-friendly manner, and ii) making the training of the new model compatible with various base models. We propose a Target-Enhanced Conditional (TEC) scheme, which introduces two components into existing mask-reconstruction based SSL. First, we propose patch-relation enhanced targets, which enhance the targets given by the base model and encourage the new model to learn semantic-relation knowledge from the base model using incomplete inputs. This hardening and target enhancement help the new model surpass the base model, since they enforce additional patch-relation modeling to handle incomplete inputs. Second, we introduce a conditional adapter that adaptively adjusts the new model's predictions to align with the targets of different base models. Extensive experimental results show that our TEC scheme can accelerate learning and also improve SOTA SSL base models, e.g., MAE and iBOT, taking an explorative step towards sustainable SSL.