Machine Translation-Based Bug Localization Technique for Bridging Lexical Gap
Context: The challenge of locating bugs, especially in large-scale software systems, has led to the development of bug localization techniques. However, the lexical mismatch between bug reports and source code degrades the performance of existing information retrieval or machine learning-based approaches. Objective: To bridge the lexical gap and improve the effectiveness of localizing buggy files by leveraging semantic information extracted from bug reports and source code. Method: We present BugTranslator, a novel deep learning-based machine translation technique composed of an attention-based recurrent neural network (RNN) Encoder-Decoder with long short-term memory cells. One RNN encodes bug reports into several context vectors that are decoded by another RNN into code tokens of buggy files. The technique studies and exploits the relevance between the semantic information extracted from bug reports and source files. Results: The experimental results show that BugTranslator outperforms a current state-of-the-art word embedding technique on three open-source projects, achieving higher MAP and MRR. The results also show that BugTranslator ranks actual buggy files in second or third place on average.
Conclusion: BugTranslator treats bug reports and source code as different symbolic classes and then extracts deep semantic similarity and relevance between bug reports and the corresponding buggy files to bridge the lexical gap at its source, thereby further improving the performance of bug localization.
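The abstract reports results in terms of MAP (mean average precision) and MRR (mean reciprocal rank). As a rough illustration of how these two ranking metrics are commonly computed for bug localization, here is a minimal Python sketch. It is not the authors' evaluation code: the function names, the file names, and the sample rankings are hypothetical, and average precision is assumed to be normalized by the total number of buggy files for each report, which is one common convention.

```python
from typing import List, Set, Tuple

# One evaluation item: (files ranked by the localizer, files actually buggy).
# This pairing is a hypothetical data layout chosen for the sketch.
Result = Tuple[List[str], Set[str]]


def reciprocal_rank(ranked_files: List[str], buggy_files: Set[str]) -> float:
    """Return 1/rank of the first truly buggy file, or 0.0 if none is ranked."""
    for rank, path in enumerate(ranked_files, start=1):
        if path in buggy_files:
            return 1.0 / rank
    return 0.0


def average_precision(ranked_files: List[str], buggy_files: Set[str]) -> float:
    """Average the precision at each rank where a buggy file appears.

    Assumption: the sum is divided by the total number of buggy files.
    """
    if not buggy_files:
        return 0.0
    hits = 0
    precision_sum = 0.0
    for rank, path in enumerate(ranked_files, start=1):
        if path in buggy_files:
            hits += 1
            precision_sum += hits / rank
    return precision_sum / len(buggy_files)


def mean_reciprocal_rank(results: List[Result]) -> float:
    """MRR: mean of the reciprocal ranks over all bug reports."""
    return sum(reciprocal_rank(r, b) for r, b in results) / len(results)


def mean_average_precision(results: List[Result]) -> float:
    """MAP: mean of the average precisions over all bug reports."""
    return sum(average_precision(r, b) for r, b in results) / len(results)


# Hypothetical rankings for two bug reports (made-up file names).
sample_results: List[Result] = [
    (["A.java", "B.java", "C.java"], {"B.java"}),
    (["D.java", "E.java", "F.java"], {"D.java", "F.java"}),
]

mrr = mean_reciprocal_rank(sample_results)
map_score = mean_average_precision(sample_results)
print(f"MRR = {mrr:.3f}, MAP = {map_score:.3f}")
```

In this made-up example the first report has its buggy file at rank 2 and the second has buggy files at ranks 1 and 3. Both metrics increase as buggy files move toward the top of the ranking, which is why higher MAP and MRR indicate better localization.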
Tue 12 Nov. Displayed time zone: Tijuana, Baja California
10:40 - 12:20 | AI and SE | Research Papers / Journal First Presentations / Demonstrations at Cortez 2&3 | Chair(s): Kaiyuan Wang (Google, Inc.)
10:40 (20m, Talk) | Assessing the Generalizability of code2vec Token Embeddings | Research Papers | Hong Jin Kang (School of Information Systems, Singapore Management University); Tegawendé F. Bissyandé (SnT, University of Luxembourg); David Lo (Singapore Management University)
11:00 (20m, Talk) | Multi-Modal Attention Network Learning for Semantic Source Code Retrieval | Research Papers | Yao Wan (Zhejiang University); Jingdong Shu (Zhejiang University); Yulei Sui (University of Technology Sydney, Australia); Guandong Xu (University of Technology, Sydney); Zhou Zhao (Zhejiang University); Jian Wu (Zhejiang University); Philip Yu (University of Illinois at Chicago)
11:20 (20m, Talk) | Experience Paper: Search-based Testing in Automated Driving Control Applications (ACM SIGSOFT Distinguished Paper Award) | Research Papers | Christoph Gladisch (Corporate Research, Robert Bosch GmbH); Thomas Heinz (Corporate Research, Robert Bosch GmbH); Christian Heinzemann (Corporate Research, Robert Bosch GmbH); Jens Oehlerking (Corporate Research, Robert Bosch GmbH); Anne von Vietinghoff (Corporate Research, Robert Bosch GmbH); Tim Pfitzer (Robert Bosch Automotive Steering GmbH)
11:40 (20m, Talk) | Machine Translation-Based Bug Localization Technique for Bridging Lexical Gap | Journal First Presentations | Yan Xiao (Department of Computer Science, City University of Hong Kong); Jacky Keung (Department of Computer Science, City University of Hong Kong); Kwabena E. Bennin (Blekinge Institute of Technology, SERL Sweden); Qing Mi (Department of Computer Science, City University of Hong Kong)
12:00 (10m, Talk) | AutoFocus: Interpreting Attention-based Neural Networks by Code Perturbation | Research Papers | Nghi D. Q. Bui (Singapore Management University, Singapore); Yijun Yu (The Open University, UK); Lingxiao Jiang (Singapore Management University)
12:10 (10m, Demonstration) | A Quantitative Analysis Framework for Recurrent Neural Network | Demonstrations | Xiaoning Du (Nanyang Technological University); Xiaofei Xie (Nanyang Technological University); Yi Li (Nanyang Technological University); Lei Ma (Kyushu University); Yang Liu (Nanyang Technological University, Singapore); Jianjun Zhao (Kyushu University)