Model Ratatouille: Recycling Diverse Models for Out-of-Distribution Generalization

Paper title

Model Ratatouille: Recycling Diverse Models for Out-of-Distribution Generalization

Authors

Alexandre Ramé, Kartik Ahuja, Jianyu Zhang, Matthieu Cord, Léon Bottou, David Lopez-Paz

Abstract

Foundation models are redefining how AI systems are built. Practitioners now follow a standard procedure to build their machine learning solutions: from a pre-trained foundation model, they fine-tune the weights on the target task of interest. So, the Internet is swarmed by a handful of foundation models fine-tuned on many diverse tasks: these individual fine-tunings exist in isolation without benefiting from each other. In our opinion, this is a missed opportunity, as these specialized models contain rich and diverse features. In this paper, we thus propose model ratatouille, a new strategy to recycle the multiple fine-tunings of the same foundation model on diverse auxiliary tasks. Specifically, we repurpose these auxiliary weights as initializations for multiple parallel fine-tunings on the target task; then, we average all fine-tuned weights to obtain the final model. This recycling strategy aims at maximizing the diversity in weights by leveraging the diversity in auxiliary tasks. Empirically, it improves the state of the art on the reference DomainBed benchmark for out-of-distribution generalization. Looking forward, this work contributes to the emerging paradigm of updatable machine learning where, akin to open-source software development, the community collaborates to reliably update machine learning models. Our code is released: https://github.com/facebookresearch/ModelRatatouille.
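
The two steps described in the abstract (one target-task fine-tuning per auxiliary initialization, followed by uniform weight averaging) can be sketched as below. This is a minimal illustration, not the authors' released code: the `Weights` format, the function names `average_weights` and `recycle_and_average`, and the `toy_fine_tune` stand-in are all assumptions made for this example.

```python
from typing import Callable, Dict, List

# Hypothetical flat parameter format: parameter name -> list of values.
Weights = Dict[str, List[float]]


def average_weights(models: List[Weights]) -> Weights:
    """Uniformly average parameters that share the same names and lengths."""
    if not models:
        raise ValueError("need at least one set of weights")
    averaged: Weights = {}
    for name in models[0]:
        # Group the i-th value of this parameter across all models.
        columns = zip(*(m[name] for m in models))
        averaged[name] = [sum(col) / len(models) for col in columns]
    return averaged


def recycle_and_average(
    aux_weights: List[Weights],
    fine_tune: Callable[[Weights], Weights],
) -> Weights:
    """Fine-tune once per auxiliary initialization, then average the results."""
    fine_tuned = [fine_tune(w) for w in aux_weights]
    return average_weights(fine_tuned)


def toy_fine_tune(w: Weights) -> Weights:
    """Toy stand-in for target-task training: move each weight halfway to 1.0."""
    return {k: [v + 0.5 * (1.0 - v) for v in vals] for k, vals in w.items()}


# Three sets of "auxiliary" weights, assumed to come from the same base model.
aux = [{"layer": [0.0, 2.0]}, {"layer": [1.0, 4.0]}, {"layer": [2.0, 0.0]}]
final = recycle_and_average(aux, toy_fine_tune)
print(final)  # {'layer': [1.0, 1.5]}
```

In a real setting, `fine_tune` would be an actual training run on the target task and the weights would be the model's tensors; averaging them only makes sense because all fine-tunings start from the same pre-trained foundation model.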
