Paper Title
Transfer Learning with Pre-trained Conditional Generative Models
Paper Authors
Paper Abstract
Transfer learning is crucial in training deep neural networks on new target tasks. Current transfer learning methods always assume at least one of (i) source and target task label spaces overlap, (ii) source datasets are available, and (iii) target network architectures are consistent with source ones. However, holding these assumptions is difficult in practical settings because the target task rarely has the same labels as the source task, access to source datasets is restricted due to storage costs and privacy, and the target architecture is often specialized to each task. To transfer source knowledge without these assumptions, we propose a transfer learning method that uses deep generative models and is composed of the following two stages: pseudo pre-training (PP) and pseudo semi-supervised learning (P-SSL). PP trains a target architecture with an artificial dataset synthesized by using conditional source generative models. P-SSL applies SSL algorithms to labeled target data and unlabeled pseudo samples, which are generated by cascading the source classifier and generative models to condition them on target samples. Our experimental results indicate that our method can outperform the baselines of scratch training and knowledge distillation.
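To make the two-stage pipeline concrete, below is a minimal sketch in PyTorch. It is not the authors' implementation: the `source_generator.sample(labels)` API, `source_classifier`, and `ssl_algorithm` names are placeholders assumed for illustration, and the SSL step is left abstract so any off-the-shelf algorithm can be plugged in.

```python
# Minimal sketch of the PP + P-SSL pipeline described in the abstract.
# Assumptions (not from the paper's code): `source_generator` is a pre-trained
# conditional generative model exposing .sample(labels); `source_classifier`
# returns source-label logits; `ssl_algorithm` is any SSL routine that accepts
# (model, labeled_loader, unlabeled_data).

import torch
from torch import nn
from torch.utils.data import DataLoader, TensorDataset


def pseudo_pre_training(target_arch: nn.Module, source_generator,
                        num_source_classes: int, samples_per_class: int = 100,
                        epochs: int = 10) -> nn.Module:
    """Stage 1 (PP): train the target architecture on an artificial dataset
    synthesized by the conditional source generative model."""
    labels = torch.arange(num_source_classes).repeat_interleave(samples_per_class)
    with torch.no_grad():
        fake_images = source_generator.sample(labels)  # assumed generator API
    loader = DataLoader(TensorDataset(fake_images, labels),
                        batch_size=64, shuffle=True)

    optimizer = torch.optim.SGD(target_arch.parameters(), lr=0.1, momentum=0.9)
    criterion = nn.CrossEntropyLoss()
    for _ in range(epochs):
        for x, y in loader:
            optimizer.zero_grad()
            criterion(target_arch(x), y).backward()
            optimizer.step()
    return target_arch


def pseudo_semi_supervised_learning(target_arch, labeled_target_loader,
                                    source_classifier, source_generator,
                                    ssl_algorithm):
    """Stage 2 (P-SSL): cascade the source classifier and generator to turn
    target samples into unlabeled pseudo samples, then run an SSL algorithm."""
    pseudo_batches = []
    with torch.no_grad():
        for x_target, _ in labeled_target_loader:
            # Condition the generator on source labels inferred from target samples.
            soft_labels = source_classifier(x_target).softmax(dim=1)
            sampled_labels = torch.multinomial(soft_labels, num_samples=1).squeeze(1)
            pseudo_batches.append(source_generator.sample(sampled_labels))
    unlabeled_pseudo = torch.cat(pseudo_batches)

    # Labeled target data + unlabeled pseudo samples go into the SSL routine.
    return ssl_algorithm(target_arch, labeled_target_loader, unlabeled_pseudo)
```

In this reading, PP never touches target labels (it only gives the target architecture a useful initialization from synthesized source-like data), while P-SSL supplies the unlabeled half of a standard SSL setup with generator outputs conditioned on the target distribution.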