Paper Title
Quality Assurance in MLOps Setting: An Industrial Perspective
Paper Authors
Paper Abstract
Today, machine learning (ML) is widely used in industry to provide the core functionality of production systems. In practice, however, it is almost always deployed as part of a larger end-to-end software system that comprises several other components in addition to the ML model. Because of production demand and time constraints, automated software engineering practices are highly applicable. The growing use of automated ML software engineering practices in industries such as manufacturing and utilities requires an automated Quality Assurance (QA) approach as an integral part of ML software. Here, QA helps reduce risk by offering an objective perspective on the software task. Although conventional software engineering has automated tools for QA data analysis for data-driven ML, QA practices for ML in operation (MLOps) are lacking. This paper examines the QA challenges that arise in industrial MLOps and conceptualizes modular strategies for dealing with data integrity and Data Quality (DQ). The paper is accompanied by real industrial use cases from industrial partners. It also presents several open challenges that may serve as a basis for future studies.