Paper Title

Less is More: Proxy Datasets in NAS approaches

Paper Authors

Brian Moser, Federico Raue, Jörn Hees, Andreas Dengel

Paper Abstract

Neural Architecture Search (NAS) defines the design of Neural Networks as a search problem. Unfortunately, NAS is computationally intensive because of various possibilities depending on the number of elements in the design and the possible connections between them. In this work, we extensively analyze the role of the dataset size based on several sampling approaches for reducing the dataset size (unsupervised and supervised cases) as an agnostic approach to reduce search time. We compared these techniques with four common NAS approaches in NAS-Bench-201 in roughly 1,400 experiments on CIFAR-100. One of our surprising findings is that in most cases we can reduce the amount of training data to 25%, consequently reducing search time to 25%, while at the same time maintaining the same accuracy as if training on the full dataset. Additionally, some designs derived from subsets outperform designs derived from the full dataset by up to 22 p.p. in accuracy.
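To illustrate the proxy-dataset idea, the following is a minimal sketch (not the authors' code) of one of the sampling strategies the abstract mentions: drawing a uniformly random 25% subset of CIFAR-100 on which a NAS method would then run its search. It assumes PyTorch and torchvision; the seed, batch size, and variable names are illustrative choices.

```python
# Illustrative sketch: a random 25% proxy subset of CIFAR-100 for NAS.
import torch
from torch.utils.data import Subset, DataLoader
from torchvision import datasets, transforms

full_train = datasets.CIFAR100(
    root="./data", train=True, download=True,
    transform=transforms.ToTensor(),
)

# Sample 25% of the indices uniformly at random (an unsupervised case);
# a supervised variant could instead stratify the draw by class label.
generator = torch.Generator().manual_seed(0)
num_proxy = len(full_train) // 4
indices = torch.randperm(len(full_train), generator=generator)[:num_proxy]
proxy_train = Subset(full_train, indices.tolist())

# The NAS method searches on this smaller proxy set, cutting search time
# roughly in proportion to the subset size.
loader = DataLoader(proxy_train, batch_size=256, shuffle=True)
```

Since the search sees only a quarter of the training data, each candidate architecture evaluation is correspondingly cheaper, which is the source of the 25% search-time figure reported in the abstract.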
