Paper Title

Is the Meta-Learning Idea Able to Improve the Generalization of Deep Neural Networks on the Standard Supervised Learning?

Paper Authors

Xiang Deng, Zhongfei Zhang

Paper Abstract

Substantial efforts have been made to improve the generalization abilities of deep neural networks (DNNs) in order to obtain better performance without introducing more parameters. On the other hand, meta-learning approaches exhibit powerful generalization on new tasks in few-shot learning. Intuitively, few-shot learning is more challenging than the standard supervised learning, as each target class has only very few or no training samples. The natural question that arises is whether the meta-learning idea can be used to improve the generalization of DNNs on the standard supervised learning. In this paper, we propose a novel meta-learning-based training procedure (MLTP) for DNNs and demonstrate that the meta-learning idea can indeed improve the generalization abilities of DNNs. MLTP simulates the meta-training process by considering a batch of training samples as a task. The key idea is that the gradient descent step for improving the current task performance should also improve the performance on a new task, which is ignored by the current standard procedure for training neural networks. MLTP also benefits from all the existing training techniques such as dropout, weight decay, and batch normalization. We evaluate MLTP by training a variety of small and large neural networks on three benchmark datasets, i.e., CIFAR-10, CIFAR-100, and Tiny ImageNet. The experimental results show consistently improved generalization performance on all the DNNs of different sizes, which verifies the promise of MLTP and demonstrates that the meta-learning idea is indeed able to improve the generalization of DNNs on the standard supervised learning.
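
The abstract describes the MLTP mechanism only at a high level. The sketch below is a minimal, hypothetical PyTorch reading of that idea, not the authors' implementation: a toy linear classifier on synthetic data, where one differentiable gradient step taken on the current batch (the simulated task) is also required to reduce the loss on a freshly drawn batch (the simulated new task). The model, the data generator, and the `inner_lr` and `meta_weight` hyperparameters are illustrative assumptions.

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)

# Toy model: a single linear layer mapping 4 features to 10 classes (assumed).
W = torch.randn(10, 4, requires_grad=True)
b = torch.zeros(10, requires_grad=True)

def forward(x, W, b):
    return x @ W.t() + b

opt = torch.optim.SGD([W, b], lr=0.1)
inner_lr = 0.1      # step size of the simulated task-adaptation step (assumed)
meta_weight = 1.0   # weight of the new-task loss term (assumed)

def synthetic_batch(n=32):
    # Stand-in for a real data loader; labels are random, for illustration only.
    x = torch.randn(n, 4)
    y = torch.randint(0, 10, (n,))
    return x, y

for step in range(100):
    x_cur, y_cur = synthetic_batch()   # current batch, treated as a "task"
    x_new, y_new = synthetic_batch()   # another batch, treated as a "new task"

    # Loss of the current parameters on the current task.
    loss_cur = F.cross_entropy(forward(x_cur, W, b), y_cur)

    # One simulated gradient-descent step on the current task, kept
    # differentiable so the outer update can backpropagate through it.
    gW, gb = torch.autograd.grad(loss_cur, (W, b), create_graph=True)
    W_adapt, b_adapt = W - inner_lr * gW, b - inner_lr * gb

    # The adapted parameters should also perform well on the new task.
    loss_new = F.cross_entropy(forward(x_new, W_adapt, b_adapt), y_new)

    # Combined objective: current-task loss plus the new-task term.
    loss = loss_cur + meta_weight * loss_new
    opt.zero_grad()
    loss.backward()
    opt.step()
```

Under this MAML-style reading, the second loss term differentiates through the simulated update, so standard techniques such as dropout, weight decay, and batch normalization can be layered on top unchanged, consistent with the abstract's claim that MLTP benefits from existing training techniques.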
