Paper Title
AutoMESC: Automatic Framework for Mining and Classifying Ethereum Smart Contract Vulnerabilities and Their Fixes
Paper Authors
Paper Abstract
Due to the risks associated with vulnerabilities in smart contracts, their security has gained significant attention in recent years. However, there is a lack of open datasets on smart contract vulnerabilities and their fixes that would allow for data-driven research. To this end, we propose an automated method for mining and classifying Ethereum smart contract vulnerabilities and their corresponding fixes from GitHub and from the Common Vulnerabilities and Exposures (CVE) records in the National Vulnerability Database. We implemented the proposed method in a fully automated framework, which we call AutoMESC. AutoMESC uses seven of the most well-known smart contract security tools to classify and label the collected vulnerabilities by vulnerability type. Furthermore, it collects metadata that can be used in data-intensive smart contract security research (e.g., vulnerability detection, vulnerability classification, severity prediction, and automated repair). We used AutoMESC to construct a sample dataset and made it publicly available. Currently, the dataset contains 6.7K vulnerability-fix pairs from smart contracts written in Solidity. We assess the quality of the constructed dataset in terms of accuracy, provenance, and relevance, and compare it with existing datasets. AutoMESC is designed to collect data continuously and keep the corresponding dataset up to date with newly discovered smart contract vulnerabilities and their fixes from GitHub and CVE records.