Paper Title

Efficient End-to-End AutoML via Scalable Search Space Decomposition

Authors

Li, Yang; Shen, Yu; Zhang, Wentao; Zhang, Ce; Cui, Bin

Abstract

End-to-end AutoML, which automatically searches for ML pipelines in a space induced by feature engineering, algorithm/model selection, and hyper-parameter tuning, has attracted intense interest from both academia and industry. Existing AutoML systems, however, suffer from scalability issues when applied to application domains with large, high-dimensional search spaces. We present VolcanoML, a scalable and extensible framework that facilitates systematic exploration of large AutoML search spaces. VolcanoML introduces and implements basic building blocks that decompose a large search space into smaller ones, and allows users to combine these building blocks into an execution plan for the AutoML problem at hand. VolcanoML further supports a Volcano-style execution model (akin to the one supported by modern database systems) to execute the constructed plan. Our evaluation demonstrates that VolcanoML not only raises the level of expressiveness for search space decomposition in AutoML, but also leads to the discovery of decomposition strategies that are significantly more efficient than the ones employed by state-of-the-art AutoML systems such as auto-sklearn. This paper is the extended version of the initial VolcanoML paper that appeared in VLDB 2021.
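
To make the idea of composing building blocks into a pull-based execution plan more concrete, the following is a minimal sketch, not VolcanoML's actual API. The class names `JointBlock` and `ConditioningBlock`, the `do_next` method, the epsilon-greedy child selection, and the two toy objectives are all assumptions introduced here for illustration; the abstract does not specify how blocks choose among sub-spaces.

```python
import random


class JointBlock:
    """Hypothetical leaf block: searches one sub-space directly.

    Plain random search stands in for whatever optimizer a real system uses.
    """

    def __init__(self, space, objective, rng):
        self.space = space          # dict: parameter name -> (low, high)
        self.objective = objective  # callable: config dict -> loss (lower is better)
        self.rng = rng
        self.best_loss = float("inf")
        self.best_config = None

    def do_next(self):
        """Evaluate one sampled configuration and update the incumbent."""
        config = {k: self.rng.uniform(lo, hi) for k, (lo, hi) in self.space.items()}
        loss = self.objective(config)
        if loss < self.best_loss:
            self.best_loss, self.best_config = loss, config
        return self.best_loss


class ConditioningBlock:
    """Hypothetical block that splits the space on one categorical choice.

    Each value of the choice (e.g. an algorithm name) owns a child block.
    Children are picked epsilon-greedily, an assumption made for this sketch.
    """

    def __init__(self, children, rng, epsilon=0.2):
        self.children = children  # dict: choice value -> child block
        self.rng = rng
        self.epsilon = epsilon
        self.best_loss = float("inf")
        self.best_config = None

    def do_next(self):
        """Pull one iteration from a single child, Volcano-iterator style."""
        names = list(self.children)
        untried = [n for n in names if self.children[n].best_config is None]
        if untried:
            name = untried[0]                       # try every child once
        elif self.rng.random() < self.epsilon:
            name = self.rng.choice(names)           # explore
        else:
            name = min(names, key=lambda n: self.children[n].best_loss)  # exploit
        child = self.children[name]
        child.do_next()
        if child.best_loss < self.best_loss:
            self.best_loss = child.best_loss
            self.best_config = {"algorithm": name, **child.best_config}
        return self.best_loss


def run_plan(root, budget):
    """Execute a plan by repeatedly pulling from its root block."""
    for _ in range(budget):
        root.do_next()
    return root.best_config, root.best_loss


def demo(seed=0, budget=200):
    """Build a two-branch plan over toy objectives and run it."""
    rng = random.Random(seed)
    plan = ConditioningBlock(
        {
            "quadratic": JointBlock({"x": (-5, 5)}, lambda c: (c["x"] - 1) ** 2, rng),
            "offset": JointBlock({"x": (-5, 5)}, lambda c: abs(c["x"]) + 3.0, rng),
        },
        rng,
    )
    return run_plan(plan, budget)
```

Calling `demo()` returns the best configuration found and its loss. Because every block exposes the same `do_next` method, blocks can be nested to any depth, and the caller only ever pulls from the root, which is the sense in which the sketch is Volcano-style.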
