Title

Mining Robust Default Configurations for Resource-constrained AutoML

Authors

Moe Kayali, Chi Wang

Abstract

Automatic machine learning (AutoML) is a key enabler of the mass deployment of the next generation of machine learning systems. A key desideratum for future ML systems is the automatic selection of models and hyperparameters. We present a novel method of selecting performant configurations for a given task by performing offline AutoML and mining over a diverse set of tasks. By mining the training tasks, we can select a compact portfolio of configurations that perform well over a wide variety of tasks, as well as learn a strategy to select portfolio configurations for yet-unseen tasks. The algorithm runs in a zero-shot manner, that is, without training any models online except the chosen one. In a compute- or time-constrained setting, this virtually instant selection is highly performant. Further, we show that our approach is effective for warm-starting existing AutoML platforms. In both settings, we demonstrate an improvement on the state of the art by testing over 62 classification and regression datasets. We also demonstrate the utility of recommending data-dependent default configurations that outperform widely used hand-crafted defaults.
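The portfolio mining step the abstract describes can be illustrated with a greedy selection over offline results. This is a minimal sketch, not the paper's implementation: it assumes a precomputed task-by-configuration error matrix from offline AutoML runs, and all names (`err`, `build_portfolio`, `k`) are hypothetical.

```python
# Hypothetical sketch: greedily build a compact portfolio of k configurations
# that performs well across all training tasks. Assumes err[t][c] is the
# (normalized, in [0, 1]) error of configuration c on task t, measured offline.

def build_portfolio(err, k):
    """Greedily pick k configs maximizing total error reduction across tasks."""
    n_tasks = len(err)
    configs = range(len(err[0]))
    portfolio = []
    # Best error achieved on each task by the portfolio so far
    # (1.0 = no useful configuration chosen yet, given normalized errors).
    best = [1.0] * n_tasks

    for _ in range(k):
        def gain(c):
            # Total error reduction across tasks if config c is added.
            return sum(best[t] - min(best[t], err[t][c]) for t in range(n_tasks))

        c = max((c for c in configs if c not in portfolio), key=gain)
        portfolio.append(c)
        best = [min(best[t], err[t][c]) for t in range(n_tasks)]
    return portfolio

# Toy example: 3 training tasks, 4 candidate configurations.
err = [
    [0.10, 0.30, 0.25, 0.40],
    [0.50, 0.05, 0.45, 0.40],
    [0.30, 0.35, 0.33, 0.08],
]
print(build_portfolio(err, 2))  # → [1, 3]
```

In this toy run, configuration 1 is chosen first (it dominates on task 1), then configuration 3 (it covers task 2), so the two-element portfolio keeps every task's error low. At deployment, a learned meta-feature policy, rather than any online training, would pick one portfolio member for an unseen task, which is what makes the selection zero-shot.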
