Paper Title


IDEAL: Query-Efficient Data-Free Learning from Black-box Models

Authors

Jie Zhang, Chen Chen, Lingjuan Lyu

Abstract


Knowledge Distillation (KD) is a typical method for training a lightweight student model with the help of a well-trained teacher model. However, most KD methods require access to either the teacher's training data or model parameters, which is unrealistic. To tackle this problem, recent works study KD under data-free and black-box settings. Nevertheless, these works require a large number of queries to the teacher model, which incurs significant monetary and computational costs. To address these problems, we propose a novel method called query-effIcient Data-free lEarning from blAck-box modeLs (IDEAL), which aims to query-efficiently learn from black-box model APIs to train a good student without any real data. In detail, IDEAL trains the student model in two stages: data generation and model distillation. Note that IDEAL does not require any query in the data generation stage and queries the teacher only once for each sample in the distillation stage. Extensive experiments on various real-world datasets show the effectiveness of the proposed IDEAL. For instance, IDEAL can improve the performance of the best baseline method DFME by 5.83% on the CIFAR10 dataset with only 0.02x the query budget of DFME.
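To make the two-stage pipeline described in the abstract concrete, below is a minimal PyTorch sketch. It is an illustration under stated assumptions, not the authors' implementation: the Generator architecture, the entropy-based generator objective, and the query_teacher helper (a label-only wrapper around the black-box API) are all hypothetical placeholders. What the sketch does reflect is the query pattern stated in the abstract: stage 1 trains the generator with purely local signals (zero teacher queries), and stage 2 queries the teacher exactly once per synthetic sample.

```python
# Illustrative two-stage data-free distillation loop in the spirit of the abstract.
# All names and losses here are assumptions for illustration, not the paper's code.
import torch
import torch.nn as nn
import torch.nn.functional as F


class Generator(nn.Module):
    """Maps random noise to synthetic 32x32 images; trained locally, never queries the teacher."""

    def __init__(self, nz=100, img_size=32, channels=3):
        super().__init__()
        self.init_size = img_size // 4
        self.fc = nn.Linear(nz, 128 * self.init_size ** 2)
        self.net = nn.Sequential(
            nn.BatchNorm2d(128),
            nn.Upsample(scale_factor=2),
            nn.Conv2d(128, 64, 3, padding=1),
            nn.BatchNorm2d(64),
            nn.LeakyReLU(0.2),
            nn.Upsample(scale_factor=2),
            nn.Conv2d(64, channels, 3, padding=1),
            nn.Tanh(),
        )

    def forward(self, z):
        x = self.fc(z).view(z.size(0), 128, self.init_size, self.init_size)
        return self.net(x)


def query_teacher(teacher_api, images):
    """Stand-in for the black-box teacher API: returns hard labels only, no gradients."""
    with torch.no_grad():
        return teacher_api(images).argmax(dim=1)


def train_ideal(student, teacher_api, nz=100, gen_steps=200, distill_steps=200, batch=64):
    gen = Generator(nz)
    g_opt = torch.optim.Adam(gen.parameters(), lr=1e-3)
    s_opt = torch.optim.SGD(student.parameters(), lr=0.1, momentum=0.9)

    # Stage 1: data generation. The generator is optimized against the local
    # student only (here, a placeholder entropy objective that rewards samples
    # the student classifies confidently), so this stage issues zero teacher queries.
    for _ in range(gen_steps):
        z = torch.randn(batch, nz)
        probs = F.softmax(student(gen(z)), dim=1)
        loss_g = -(probs * torch.log(probs + 1e-8)).sum(dim=1).mean()
        g_opt.zero_grad()
        loss_g.backward()
        g_opt.step()

    # Stage 2: model distillation. Each synthetic sample queries the teacher
    # exactly once; the returned hard label supervises the student.
    for _ in range(distill_steps):
        with torch.no_grad():
            fake = gen(torch.randn(batch, nz))
        labels = query_teacher(teacher_api, fake)  # single query per sample
        loss_s = F.cross_entropy(student(fake), labels)
        s_opt.zero_grad()
        loss_s.backward()
        s_opt.step()
    return student
```

A caller would pass any untrained student network and a callable wrapping the deployed teacher API; under this sketch the total number of teacher queries is distill_steps × batch, i.e. one query per generated training sample, which is what makes the approach query-efficient relative to methods that query the teacher repeatedly during data generation.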
