AdaptivePaste: Code Adaptation through Learning Semantics-aware Variable Usage Representations

Paper title

AdaptivePaste: Code Adaptation through Learning Semantics-aware Variable Usage Representations

Authors

Xiaoyu Liu, Jinu Jang, Neel Sundaresan, Miltiadis Allamanis, Alexey Svyatkovskiy

Abstract

In software development, it is common for programmers to copy-paste or port code snippets and then adapt them to their use case. This scenario motivates the code adaptation task, a variant of program repair that aims to adapt variable identifiers in a pasted snippet of code to the surrounding, preexisting source code. However, no existing approach has been shown to effectively address this task. In this paper, we introduce AdaptivePaste, a learning-based approach to source code adaptation, based on transformers and a dedicated dataflow-aware deobfuscation pre-training task to learn meaningful representations of variable usage patterns. We evaluate AdaptivePaste on a dataset of code snippets in Python. Results suggest that our model can learn to adapt source code with 79.8% accuracy. To evaluate how valuable AdaptivePaste is in practice, we perform a user study with 10 Python developers on a hundred real-world copy-paste instances. The results show that AdaptivePaste reduces the dwell time to nearly half the time it takes for manual code adaptation, and helps to avoid bugs. In addition, we use the participant feedback to identify potential avenues for improving AdaptivePaste.
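
To make the code adaptation task concrete, the sketch below renames the variables of a pasted snippet so that they match names assumed to exist in the surrounding code. It is only an illustration of the task's input and output, not the AdaptivePaste model: the identifier mapping is written by hand here, whereas the paper's approach learns to predict it. The names `RenameVariables`, `adapt_snippet`, `scores`, and `mean_score` are hypothetical.

```python
import ast


class RenameVariables(ast.NodeTransformer):
    """Rename identifiers in a snippet according to a fixed mapping."""

    def __init__(self, mapping):
        self.mapping = mapping

    def visit_Name(self, node):
        # Swap the identifier only if the mapping provides a substitute.
        node.id = self.mapping.get(node.id, node.id)
        return node


def adapt_snippet(snippet, mapping):
    """Return the snippet with its variable names rewritten per `mapping`."""
    tree = ast.parse(snippet)
    tree = RenameVariables(mapping).visit(tree)
    return ast.unparse(tree)  # requires Python 3.9 or later


# Pasted snippet uses `values` and `total`; the surrounding code is
# assumed (for illustration) to use `scores` and `mean_score` instead.
pasted = "total = sum(values) / len(values)"
adapted = adapt_snippet(pasted, {"values": "scores", "total": "mean_score"})
print(adapted)
```

In this toy setting the hard part, deciding which surrounding identifier each pasted variable should map to, is supplied manually; that decision is what a learned model would have to make.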
