Paper title
How sustainable is "common" data science in terms of power consumption?
Paper authors
Paper abstract
Continuous developments in data science have brought an exponential increase in the complexity of machine learning models. At the same time, data scientists have become ubiquitous in the private market, in academic environments, and even among hobbyists. All of these trends are rising steadily and are associated with an increase in power consumption and the accompanying carbon footprint. The growing carbon footprint of large-scale, advanced data science has already received attention, but the footprint of the latter trend has not. This work aims to estimate the contribution of the increasingly popular "common" data science to the global carbon footprint. To this end, the power consumption of several typical common data science tasks is measured and compared to that of large-scale "advanced" data science, common computer-related tasks, and everyday non-computer-related tasks. The comparison is made by converting the measurements to the equivalent unit of "km driven by car". Our main findings are: "common" data science consumes $2.57$ times more power than regular computer usage, but less than some common everyday power-consuming tasks such as lighting or heating; and large-scale data science consumes substantially more power than common data science.
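The abstract does not spell out how measurements are converted to "km driven by car". The following is a minimal sketch of one plausible route, assuming the conversion goes through CO2 emissions: measured power is turned into energy, energy into emissions via a grid carbon intensity, and emissions into the car distance that would emit the same amount. The function names and every numeric input are hypothetical placeholders chosen for easy arithmetic, not values or methods taken from the paper.

```python
def power_to_energy_kwh(avg_power_w: float, duration_h: float) -> float:
    """Energy in kWh for a task drawing avg_power_w watts for duration_h hours."""
    return avg_power_w * duration_h / 1000.0


def energy_to_car_km(energy_kwh: float,
                     grid_g_co2_per_kwh: float,
                     car_g_co2_per_km: float) -> float:
    """Car distance (km) that emits as much CO2 as producing energy_kwh.

    Assumption: the equivalence is made through CO2 emissions. Both
    intensity parameters must be supplied by the caller from a source
    appropriate to their region and vehicle.
    """
    if car_g_co2_per_km <= 0:
        raise ValueError("car_g_co2_per_km must be positive")
    emissions_g = energy_kwh * grid_g_co2_per_kwh
    return emissions_g / car_g_co2_per_km


if __name__ == "__main__":
    # Placeholder inputs only (not measurements from the paper).
    energy = power_to_energy_kwh(avg_power_w=100.0, duration_h=2.0)
    km = energy_to_car_km(energy,
                          grid_g_co2_per_kwh=400.0,
                          car_g_co2_per_km=100.0)
    print(f"{energy:.2f} kWh is equivalent to {km:.2f} km driven by car")
```

With these placeholder inputs, a 100 W task running for 2 hours uses 0.2 kWh, which corresponds to 80 g of CO2 and therefore 0.8 km of driving. Expressing every task in the same distance unit is what makes data science workloads, ordinary computer use, and household activities directly comparable.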