Paper Title

SE Factual Knowledge in Frozen Giant Code Model: A Study on FQN and its Retrieval

Paper Authors

Qing Huang, Dianshu Liao, Zhenchang Xing, Zhiqiang Yuan, Qinghua Lu, Xiwei Xu, Jiaxing Lu

Paper Abstract

Pre-trained giant code models (PCMs) are starting to enter developers' daily practice. Understanding what types of software knowledge are packed into PCMs, and how much, is the foundation for incorporating PCMs into software engineering (SE) tasks and fully releasing their potential. In this work, we conduct the first systematic study of the SE factual knowledge in the state-of-the-art PCM Copilot, focusing on APIs' Fully Qualified Names (FQNs), the fundamental knowledge for effective code analysis, search, and reuse. Driven by FQNs' data distribution properties, we design a novel lightweight in-context learning method on Copilot for FQN inference, which requires neither code compilation, as traditional methods do, nor gradient updates, as recent FQN prompt-tuning does. We systematically experiment with five in-context learning design factors to identify the best in-context learning configuration that developers can adopt in practice. With this best configuration, we investigate the effects of the number of example prompts and of FQN data properties on Copilot's FQN inference capability. Our results confirm that Copilot stores diverse FQN knowledge and can be applied to FQN inference thanks to its high inference accuracy and non-reliance on code analysis. Based on our experience interacting with Copilot, we discuss various opportunities to improve human-Copilot interaction in the FQN inference task.
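
To make the idea concrete: an FQN resolves a simple API name such as ZipFile to its fully qualified form java.util.zip.ZipFile. Below is a minimal, hypothetical sketch of how such a few-shot in-context-learning prompt could be assembled for a code model. The prompt layout, the annotation format, and the helper build_fqn_prompt are illustrative assumptions, not the paper's exact design (the paper experiments with five design factors whose best configuration may differ):

```python
# Illustrative sketch (assumed format, not the paper's exact prompt):
# build a few-shot prompt that asks a code model to resolve a simple
# API name in a code line to its fully qualified name (FQN).

# Each example pairs a code line with an FQN annotation the model can
# imitate. The Java FQNs here are standard-library facts.
FEW_SHOT_EXAMPLES = [
    ("ZipFile zf = new ZipFile(path);", "ZipFile -> java.util.zip.ZipFile"),
    ("Pattern p = Pattern.compile(regex);", "Pattern -> java.util.regex.Pattern"),
    ("List<String> xs = new ArrayList<>();", "ArrayList -> java.util.ArrayList"),
]

def build_fqn_prompt(query_line: str, query_name: str) -> str:
    """Assemble an in-context-learning prompt: a few annotated examples
    followed by the query whose FQN annotation is left for the model
    to complete."""
    parts = []
    for code, annotation in FEW_SHOT_EXAMPLES:
        parts.append(f"// {annotation}\n{code}\n")
    # Leave the query's annotation unfinished so the model fills in the FQN.
    parts.append(f"{query_line}\n// {query_name} -> ")
    return "\n".join(parts)

if __name__ == "__main__":
    prompt = build_fqn_prompt(
        "BufferedReader br = new BufferedReader(fr);", "BufferedReader"
    )
    print(prompt)  # The model is expected to complete: java.io.BufferedReader
```

The point of this style of prompting, as the abstract notes, is that it needs no compilable project context and no parameter updates: the FQN knowledge is retrieved from the frozen model purely by conditioning it on a handful of annotated examples.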
