BugListener: Identifying and Synthesizing Bug Reports from Collaborative Live Chats

Paper Title

BugListener: Identifying and Synthesizing Bug Reports from Collaborative Live Chats

Authors

Shi, Lin; Mu, Fangwen; Zhang, Yumin; Yang, Ye; Chen, Junjie; Chen, Xiao; Jiang, Hanzhi; Jiang, Ziyou; Wang, Qing

Abstract

In community-based software development, developers frequently rely on live chat to discuss emergent bugs/errors they encounter in daily development tasks. However, it remains a challenging task to accurately record such knowledge due to the noisy nature of interleaved dialogs in live chat data. In this paper, we first formulate the task of identifying and synthesizing bug reports from community live chats, and propose a novel approach, named BugListener, to address the challenges. Specifically, BugListener automates three sub-tasks: 1) disentangle the dialogs from massive chat logs by using a feed-forward neural network; 2) identify the bug-report dialogs among the separated dialogs by modeling each original dialog as a graph-structured dialog and leveraging a graph neural network to learn the contextual information; 3) synthesize the bug reports by utilizing the TextCNN model and a transfer learning network to classify the sentences into three groups: observed behaviors (OB), expected behaviors (EB), and steps to reproduce the bug (SR). BugListener is evaluated on six open source projects. The results show that: for bug report identification, BugListener achieves an average F1 of 74.21%, improving the best baseline by 10.37%; and for the bug report synthesis task, BugListener could classify the OB, EB, and SR sentences with F1 scores of 67.37%, 87.14%, and 65.03%, improving the best baselines by 7.21%, 7.38%, and 5.30%, respectively. A human evaluation also confirms the effectiveness of BugListener in generating relevant and accurate bug reports. These results demonstrate the significant potential of applying BugListener in community-based software development to promote bug discovery and quality improvement.
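
To make the data flow of the three sub-tasks concrete, here is a minimal Python sketch. It is not the authors' implementation: the feed-forward disentanglement model, the graph neural network, and the TextCNN classifier are all replaced by trivial stand-ins. The `Utterance` type, the pre-assigned `dialog_id`, the `CUES` keyword table, and every function name are assumptions made purely for illustration.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Utterance:
    speaker: str
    text: str
    dialog_id: int  # assumption: already assigned by a disentanglement step


# Hypothetical cue phrases standing in for the learned sentence classifier.
CUES = {
    "OB": ("error", "fails", "crash", "exception"),
    "EB": ("should", "expected", "supposed to"),
    "SR": ("when i", "steps", "run ", "click"),
}


def group_dialogs(utterances: List[Utterance]) -> Dict[int, List[Utterance]]:
    """Sub-task 1 (stand-in): separate interleaved utterances into dialogs."""
    dialogs: Dict[int, List[Utterance]] = {}
    for utterance in utterances:
        dialogs.setdefault(utterance.dialog_id, []).append(utterance)
    return dialogs


def label_sentence(text: str) -> Optional[str]:
    """Return 'OB', 'EB', or 'SR' for the first matching cue, else None."""
    lowered = text.lower()
    for label, cues in CUES.items():
        if any(cue in lowered for cue in cues):
            return label
    return None


def is_bug_dialog(dialog: List[Utterance]) -> bool:
    """Sub-task 2 (stand-in): flag dialogs that mention an observed behavior."""
    return any(label_sentence(u.text) == "OB" for u in dialog)


def synthesize(dialog: List[Utterance]) -> Dict[str, List[str]]:
    """Sub-task 3 (stand-in): collect sentences into OB / EB / SR groups."""
    report: Dict[str, List[str]] = {"OB": [], "EB": [], "SR": []}
    for utterance in dialog:
        label = label_sentence(utterance.text)
        if label is not None:
            report[label].append(utterance.text)
    return report


# Two interleaved conversations in one chat log (made-up sample data).
chat = [
    Utterance("alice", "When I call save() twice the app hangs", 0),
    Utterance("bob", "Anyone know a good logging library?", 1),
    Utterance("alice", "It throws a timeout exception", 0),
    Utterance("carol", "It should return immediately", 0),
    Utterance("dave", "Try loguru", 1),
]

dialogs = group_dialogs(chat)
reports = {i: synthesize(d) for i, d in dialogs.items() if is_bug_dialog(d)}
```

In this sketch only dialog 0 is kept as a bug-report dialog, and its sentences are sorted into the three groups named in the abstract. The keyword table is deliberately naive; the abstract's point is that learned models handle this classification far better than surface cues.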
