Paper Title

Optimal Transport Kernels for Sequential and Parallel Neural Architecture Search

Authors

Vu Nguyen, Tam Le, Makoto Yamada, Michael A. Osborne

Abstract

Neural architecture search (NAS) automates the design of deep neural networks. One of the main challenges in searching complex and non-continuous architectures is to compare the similarity of networks, which the conventional Euclidean metric may fail to capture. Optimal transport (OT) is resilient to such complex structures because it considers the minimal cost of transporting one network into another. However, OT is generally not negative definite, which may limit its ability to build the positive-definite kernels required in many kernel-dependent frameworks. Building upon tree-Wasserstein (TW), a negative definite variant of OT, we develop a novel discrepancy for neural architectures, and demonstrate it within a Gaussian process surrogate model for the sequential NAS setting. Furthermore, we derive a novel parallel NAS method, using a quality k-determinantal point process on the GP posterior, to select diverse and high-performing architectures from a discrete set of candidates. Empirically, we demonstrate that our TW-based approaches outperform other baselines in both sequential and parallel NAS.
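The core kernel construction can be illustrated with a small sketch. Since a tree-Wasserstein distance is negative definite, exponentiating its negation yields a positive-definite kernel usable in a GP surrogate. The function names, the parent-array tree encoding, and the lengthscale parameter below are illustrative assumptions, not the paper's implementation:

```python
import numpy as np

def tree_wasserstein(weights, parent, mu, nu):
    """Tree-Wasserstein distance between distributions mu, nu over tree nodes.

    parent[i] is the parent of node i (root has parent -1); weights[i] is the
    weight of the edge from i to its parent. The distance is
    sum over edges e of w_e * |mu(subtree below e) - nu(subtree below e)|,
    computed bottom-up in one pass. Assumes parent[i] < i (topological order).
    """
    n = len(parent)
    sub = np.asarray(mu, dtype=float) - np.asarray(nu, dtype=float)
    dist = 0.0
    for i in range(n - 1, -1, -1):  # children before parents
        if parent[i] >= 0:
            dist += weights[i] * abs(sub[i])
            sub[parent[i]] += sub[i]  # push subtree mass difference upward
    return dist

def tw_kernel(weights, parent, dists, lengthscale=1.0):
    """Exponentiated-TW Gram matrix: positive definite because TW is
    negative definite (illustrative kernel form exp(-TW / lengthscale))."""
    m = len(dists)
    K = np.zeros((m, m))
    for a in range(m):
        for b in range(m):
            d = tree_wasserstein(weights, parent, dists[a], dists[b])
            K[a, b] = np.exp(-d / lengthscale)
    return K
```

For example, on a chain tree 0–1–2 with unit edge weights, moving all mass from node 0 to node 2 costs 2, so the kernel value between those two point masses is exp(-2).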
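The parallel-NAS step — selecting a diverse, high-quality batch via a quality-weighted k-DPP — can be sketched with a greedy MAP approximation (exact k-DPP sampling is more involved; this simplification and the name `greedy_kdpp` are assumptions for illustration):

```python
import numpy as np

def greedy_kdpp(K, quality, k):
    """Greedy MAP approximation to a quality-weighted k-DPP.

    The DPP kernel is L_ij = quality[i] * K[i, j] * quality[j], so the
    determinant of a selected submatrix rewards both high predicted quality
    (e.g. a GP acquisition value) and mutual diversity under similarity K.
    Greedily adds the candidate that most increases that determinant.
    """
    quality = np.asarray(quality, dtype=float)
    L = np.outer(quality, quality) * K
    selected = []
    for _ in range(k):
        best, best_det = None, -np.inf
        for i in range(len(quality)):
            if i in selected:
                continue
            idx = selected + [i]
            det = np.linalg.det(L[np.ix_(idx, idx)])
            if det > best_det:
                best, best_det = i, det
        selected.append(best)
    return selected
```

With an identity similarity matrix (all candidates fully dissimilar), the greedy rule reduces to picking the k highest-quality candidates; correlation in K trades quality off against diversity.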
