Paper Title
Challenges in Migrating Imperative Deep Learning Programs to Graph Execution: An Empirical Study
Paper Authors
Paper Abstract
Efficiency is essential to support responsiveness w.r.t. ever-growing datasets, especially for Deep Learning (DL) systems. DL frameworks have traditionally embraced deferred execution-style DL code that supports symbolic, graph-based Deep Neural Network (DNN) computation. While scalable, such development tends to produce DL code that is error-prone, non-intuitive, and difficult to debug. Consequently, more natural, less error-prone imperative DL frameworks encouraging eager execution have emerged but at the expense of run-time performance. While hybrid approaches aim for the "best of both worlds," the challenges in applying them in the real world are largely unknown. We conduct a data-driven analysis of challenges -- and resultant bugs -- involved in writing reliable yet performant imperative DL code by studying 250 open-source projects, consisting of 19.7 MLOC, along with 470 and 446 manually examined code patches and bug reports, respectively. The results indicate that hybridization: (i) is prone to API misuse, (ii) can result in performance degradation -- the opposite of its intention, and (iii) has limited application due to execution mode incompatibility. We put forth several recommendations, best practices, and anti-patterns for effectively hybridizing imperative DL code, potentially benefiting DL practitioners, API designers, tool developers, and educators.
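Illustrative example (not part of the paper): the "hybridization" the abstract refers to converts imperative (eager) DL code into graph execution. Below is a minimal sketch, assuming TensorFlow's tf.function as one representative hybridization API; the model, optimizer, and data used here are hypothetical placeholders.

    # Minimal sketch of hybridizing imperative DL code (illustrative only).
    # tf.function traces the eager Python function into a callable graph,
    # trading some of the imperative style's flexibility for run-time speed.
    import tensorflow as tf

    @tf.function  # hybridization point: eager code below runs as a graph
    def train_step(model, optimizer, loss_fn, x, y):
        with tf.GradientTape() as tape:
            predictions = model(x, training=True)
            loss = loss_fn(y, predictions)
        gradients = tape.gradient(loss, model.trainable_variables)
        optimizer.apply_gradients(zip(gradients, model.trainable_variables))
        return loss

    # Hypothetical usage with a toy model and random data.
    model = tf.keras.Sequential([tf.keras.layers.Dense(1)])
    optimizer = tf.keras.optimizers.SGD(0.01)
    loss_fn = tf.keras.losses.MeanSquaredError()
    x = tf.random.normal((8, 4))
    y = tf.random.normal((8, 1))
    loss = train_step(model, optimizer, loss_fn, x, y)

As one example of the pitfalls the abstract alludes to, misusing such an API, e.g., repeatedly calling the decorated function with changing plain Python scalars instead of tensors, forces the graph to be retraced and can degrade rather than improve performance.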