Paper Title
Excel Spreadsheet Analyzer
Paper Authors
Paper Abstract
Spreadsheets are widely used across many fields for large-scale numerical analysis. Although several companies have relied on spreadsheets for decades, data scientists are moving toward scientific programming languages such as Python for data analysis because of their support, community, and vast number of libraries. When a company's spreadsheets are analyzed in Python, however, some information is lost, such as the formulas and dependencies of a cell. We propose a tool that creates an abstract intermediate representation (AIR) of a spreadsheet. This representation facilitates the transfer from spreadsheets to scientific programming languages while preserving the inter-dependency information about the data. In addition, we build a Python library on top of our tool to perform some data analysis in Python.
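To make the idea of an intermediate representation concrete, the following is a minimal sketch of how cell formulas and their dependencies could be kept alongside values. It is not the AIR format proposed in the paper: the `CellNode` class, the `build_representation` and `dependents_of` helpers, and the simplified cell-reference pattern are all hypothetical names and assumptions introduced here for illustration.

```python
import re
from dataclasses import dataclass, field
from typing import Dict, List, Optional

# Assumption: only plain A1-style references (e.g. B2, AA10) are recognized.
# Sheet names, ranges, and $-anchored references are out of scope, and a
# function name that ends in digits would be mistaken for a cell reference.
CELL_REF = re.compile(r"\b([A-Z]{1,3}[1-9][0-9]*)\b")


@dataclass
class CellNode:
    """Hypothetical node type: one cell with its value or formula."""
    address: str
    value: object = None
    formula: Optional[str] = None
    depends_on: List[str] = field(default_factory=list)


def build_representation(cells: Dict[str, object]) -> Dict[str, CellNode]:
    """Map raw cell contents (numbers, text, or '=formula') to nodes."""
    nodes = {}
    for address, raw in cells.items():
        if isinstance(raw, str) and raw.startswith("="):
            # Keep the formula text and record which cells it reads from.
            refs = sorted(set(CELL_REF.findall(raw)))
            nodes[address] = CellNode(address, formula=raw, depends_on=refs)
        else:
            nodes[address] = CellNode(address, value=raw)
    return nodes


def dependents_of(nodes: Dict[str, CellNode], address: str) -> List[str]:
    """Return the cells whose formulas reference the given address."""
    return sorted(a for a, n in nodes.items() if address in n.depends_on)


# Illustrative sheet contents, not data from the paper.
sheet = {"A1": 10, "A2": 32, "A3": "=A1+A2", "B1": "=A3*2"}
nodes = build_representation(sheet)
```

With this structure, `nodes["A3"].depends_on` lists the cells that `A3` reads from, and `dependents_of(nodes, "A3")` lists the cells that would need recomputing if `A3` changed. Loading only the computed values of a spreadsheet would discard both pieces of information.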