Paper title
MSR-DARTS: Minimum Stable Rank of Differentiable Architecture Search
Paper authors
Abstract
In neural architecture search (NAS), differentiable architecture search (DARTS) has recently attracted much attention due to its high efficiency. It defines an over-parameterized network with mixed edges, each of which represents all operator candidates, and jointly optimizes the network weights and the architecture in an alternating manner. However, this method finds a model whose weights converge faster than those of the others, and such a fastest-converging model often overfits. Accordingly, the resulting model does not always generalize well. To overcome this problem, we propose minimum stable rank DARTS (MSR-DARTS), a method for finding a model with the best generalization error by replacing architecture optimization with a selection process based on the minimum stable rank criterion. Specifically, a convolution operator is represented by a matrix, and MSR-DARTS selects the operator with the smallest stable rank. We evaluated MSR-DARTS on the CIFAR-10 and ImageNet datasets. It achieves an error rate of 2.54% with 4.0M parameters within 0.3 GPU-days on CIFAR-10, and a top-1 error rate of 23.9% on ImageNet. The official code is available at https://github.com/mtaecchhi/msrdarts.git.
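
To make the selection criterion in the abstract concrete, here is a minimal sketch, not the authors' implementation. It uses the standard definition of the stable rank of a matrix: the squared Frobenius norm divided by the squared spectral norm. The abstract does not say how a convolution operator is turned into a matrix, so the reshape in `kernel_to_matrix` is an assumption, and the function names `stable_rank`, `kernel_to_matrix`, and `select_min_stable_rank` are hypothetical.

```python
import numpy as np


def stable_rank(matrix):
    """Return ||A||_F^2 / ||A||_2^2 for a 2-D array A."""
    frobenius_sq = np.sum(matrix ** 2)
    spectral = np.linalg.norm(matrix, ord=2)  # largest singular value
    return float(frobenius_sq / spectral ** 2)


def kernel_to_matrix(kernel):
    """Flatten a (out_ch, in_ch, kh, kw) kernel to (out_ch, in_ch * kh * kw).

    Assumption: the abstract only says the operator "is represented by a
    matrix"; this reshape is one common choice, not necessarily the paper's.
    """
    return kernel.reshape(kernel.shape[0], -1)


def select_min_stable_rank(candidates):
    """Pick the candidate operator whose kernel matrix has the smallest stable rank.

    candidates: dict mapping a (hypothetical) operator name to its kernel array.
    Returns the chosen name and the dict of all stable ranks.
    """
    ranks = {name: stable_rank(kernel_to_matrix(k)) for name, k in candidates.items()}
    best = min(ranks, key=ranks.get)
    return best, ranks
```

As a sanity check on the definition, an identity matrix of size n has stable rank n, and any rank-one matrix has stable rank 1, so the criterion favors operators whose weight energy is concentrated in a few singular directions.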