Multi-Modal Attention Network Learning for Semantic Source Code Retrieval
Code retrieval techniques and tools play a key role in helping software developers retrieve existing code fragments from open-source repositories given a user query (e.g., a short natural-language description of the functionality of the desired code snippet). Despite existing efforts to improve the effectiveness of code retrieval, two main issues still hinder the accurate retrieval of satisfactory code fragments from large-scale repositories in response to complicated queries. First, existing approaches consider only shallow features of source code, such as method names and code tokens, and ignore structured features such as abstract syntax trees (ASTs) and control-flow graphs (CFGs), which carry rich and well-defined semantics. Second, although deep-learning-based approaches represent source code well, they lack explainability, making it hard to interpret the retrieval results and almost impossible to understand which features of the source code contribute most to the final results. To tackle these two issues, this paper proposes MMAN, a novel Multi-Modal Attention Network for semantic source code retrieval. A comprehensive multi-modal representation is developed to capture both unstructured and structured features of source code, with an LSTM for the sequential tokens of the code, a Tree-LSTM for its AST, and a GGNN (Gated Graph Neural Network) for its CFG. Furthermore, a multi-modal attention fusion layer assigns weights to the different parts of each modality of source code and then integrates them into a single hybrid representation. Comprehensive experiments and analysis on a large-scale real-world dataset show that our proposed model can accurately retrieve code snippets and outperforms the state-of-the-art methods.
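To make the attention-fusion step concrete, the sketch below shows how per-element attention weights within each modality (code tokens, AST nodes, CFG nodes) can be computed and combined into one hybrid code vector. This is a minimal PyTorch illustration of the general technique, not the authors' implementation: the Tree-LSTM and GGNN encoders are abstracted as precomputed node embeddings, and all module names, dimensions, and the fusion layer are assumptions made for exposition.

```python
# Minimal sketch of a multi-modal attention fusion layer in the spirit of
# MMAN. All names and dimensions are illustrative assumptions, not the
# authors' implementation.
import torch
import torch.nn as nn
import torch.nn.functional as F

class AttentionPool(nn.Module):
    """Scores each element of one modality and returns a weighted sum."""
    def __init__(self, dim):
        super().__init__()
        self.score = nn.Linear(dim, 1, bias=False)

    def forward(self, h):                             # h: (batch, n_elems, dim)
        alpha = F.softmax(self.score(h), dim=1)       # per-element weights
        return (alpha * h).sum(dim=1), alpha          # (batch, dim), weights

class MultiModalFusion(nn.Module):
    """Fuses token, AST, and CFG representations into one code vector."""
    def __init__(self, vocab_size, dim=128):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, dim)
        self.token_lstm = nn.LSTM(dim, dim, batch_first=True)  # token encoder
        # Stand-ins: the paper uses a Tree-LSTM (AST) and a GGNN (CFG); here
        # their node states are assumed to be precomputed tensors.
        self.pools = nn.ModuleDict({m: AttentionPool(dim)
                                    for m in ("tok", "ast", "cfg")})
        self.fuse = nn.Linear(3 * dim, dim)

    def forward(self, tokens, ast_nodes, cfg_nodes):
        tok_h, _ = self.token_lstm(self.embed(tokens))
        parts, weights = [], {}
        for name, h in (("tok", tok_h), ("ast", ast_nodes), ("cfg", cfg_nodes)):
            v, a = self.pools[name](h)
            parts.append(v)
            weights[name] = a        # attention weights support explainability
        return torch.tanh(self.fuse(torch.cat(parts, dim=-1))), weights

# Usage with random stand-in inputs for a batch of 2 code snippets.
model = MultiModalFusion(vocab_size=1000)
code_vec, attn = model(torch.randint(0, 1000, (2, 20)),
                       torch.randn(2, 15, 128), torch.randn(2, 10, 128))
print(code_vec.shape)  # torch.Size([2, 128])
```

Exposing the per-modality attention weights is what gives such a model its explainability: for a given query, one can inspect which tokens, AST nodes, or CFG nodes received the most weight in the fused representation.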
Tue 12 Nov (displayed time zone: Tijuana, Baja California)
10:40 - 12:20 | AI and SE | Research Papers / Journal First Presentations / Demonstrations | Cortez 2&3 | Chair(s): Kaiyuan Wang (Google, Inc.)
10:40 (20m) Talk | Assessing the Generalizability of code2vec Token Embeddings (Research Papers) | Hong Jin Kang (School of Information Systems, Singapore Management University), Tegawendé F. Bissyandé (SnT, University of Luxembourg), David Lo (Singapore Management University) | Pre-print
11:00 (20m) Talk | Multi-Modal Attention Network Learning for Semantic Source Code Retrieval (Research Papers) | Yao Wan (Zhejiang University), Jingdong Shu (Zhejiang University), Yulei Sui (University of Technology Sydney, Australia), Guandong Xu (University of Technology Sydney), Zhou Zhao (Zhejiang University), Jian Wu (Zhejiang University), Philip S. Yu (University of Illinois at Chicago)
11:20 (20m) Talk | Experience Paper: Search-based Testing in Automated Driving Control Applications (Research Papers; ACM SIGSOFT Distinguished Paper Award) | Christoph Gladisch (Corporate Research, Robert Bosch GmbH), Thomas Heinz (Corporate Research, Robert Bosch GmbH), Christian Heinzemann (Corporate Research, Robert Bosch GmbH), Jens Oehlerking (Corporate Research, Robert Bosch GmbH), Anne von Vietinghoff (Corporate Research, Robert Bosch GmbH), Tim Pfitzer (Robert Bosch Automotive Steering GmbH)
11:40 (20m) Talk | Machine Translation-Based Bug Localization Technique for Bridging Lexical Gap (Journal First Presentations) | Yan Xiao (Department of Computer Science, City University of Hong Kong), Jacky Keung (Department of Computer Science, City University of Hong Kong), Kwabena E. Bennin (Blekinge Institute of Technology, SERL Sweden), Qing Mi (Department of Computer Science, City University of Hong Kong) | Link to publication
12:00 (10m) Talk | AutoFocus: Interpreting Attention-based Neural Networks by Code Perturbation (Research Papers) | Nghi D. Q. Bui (Singapore Management University, Singapore), Yijun Yu (The Open University, UK), Lingxiao Jiang (Singapore Management University) | Pre-print
12:10 (10m) Demonstration | A Quantitative Analysis Framework for Recurrent Neural Network (Demonstrations) | Xiaoning Du (Nanyang Technological University), Xiaofei Xie (Nanyang Technological University), Yi Li (Nanyang Technological University), Lei Ma (Kyushu University), Yang Liu (Nanyang Technological University, Singapore), Jianjun Zhao (Kyushu University)