A Comprehensive Survey of Benchmarks for Automated Improvement of Software's Non-Functional Properties

Paper Title

A Comprehensive Survey of Benchmarks for Automated Improvement of Software's Non-Functional Properties

Authors

Aymeric Blot, Justyna Petke

Abstract

Performance is a key quality of modern software. Although recent years have seen a spike in research on automated improvement of software's execution time, energy, memory consumption, etc., there is a noticeable lack of standard benchmarks for such work. It is also unclear how representative such benchmarks are of current software. Furthermore, non-functional properties of software are frequently targeted for improvement one at a time, neglecting potential negative impact on other properties. In order to facilitate more research on automated improvement of non-functional properties of software, we conducted a survey gathering benchmarks used in previous work. We considered 5 major online repositories of software engineering work: ACM Digital Library, IEEE Xplore, Scopus, Google Scholar, and arXiv. We gathered 5000 publications (3749 unique), which were systematically reviewed to identify work that empirically improves non-functional properties of software. We identified 386 relevant papers. We find that execution time is the most frequently targeted property for improvement (in 62% of relevant papers), while multi-objective improvement is rarely considered (5%). Static approaches are prevalent (in 53% of papers), while exploratory approaches (evolutionary in 18% and non-evolutionary in 14% of papers) have become increasingly popular over the last 10 years. Only 40% of the 386 papers describe work that uses benchmark suites rather than single pieces of software; of those suites, SPEC is the most popular (covered in 33 papers). We also provide recommendations for the choice of benchmarks in future work, noting, e.g., the lack of work that covers Python or JavaScript. We provide all programs found in the 386 papers on our dedicated webpage at https://bloa.github.io/nfunc_survey/. We hope that this effort will facilitate more research on the topic of automated improvement of software's non-functional properties.

