Paper Title
How Readable is Model-generated Code? Examining Readability and Visual Inspection of GitHub Copilot
Authors
Abstract
Background: Recent advancements in large language models have motivated the practical use of such models in code generation and program synthesis. However, little is known about the effects of such tools on code readability and visual attention in practice. Objective: In this paper, we focus on GitHub Copilot to address the issues of readability and visual inspection of model-generated code. Readability and low complexity are vital aspects of good source code, and visual inspection of generated code is important in light of automation bias. Method: Through a human experiment (n=21), we compare model-generated code to code written entirely by human programmers. We use a combination of static code analysis and human annotators to assess code readability, and we use eye tracking to assess the visual inspection of code. Results: Our results suggest that model-generated code is comparable in complexity and readability to code written by human pair programmers. At the same time, eye-tracking data suggest, at a statistically significant level, that programmers direct less visual attention to model-generated code. Conclusion: Our findings highlight that reading code is more important than ever, and programmers should beware of complacency and automation bias with model-generated code.