Machine Learning Based Automated Method Name Recommendation: How Far Are We (ASE 2019 - Research Papers)

Who

Lin Jiang, Hui Liu, He Jiang

Track

ASE 2019 Research Papers

Time Zone

The program is currently displayed in (GMT-08:00) Tijuana, Baja California.

Use conference time zone: (GMT-08:00) Tijuana, Baja CaliforniaSelect other time zone

The GMT offsets shown reflect the offsets at the moment of the conference.

Time Band

By setting a time band, the program will dim events that are outside this time window. This is useful for (virtual) conferences with a continuous program (with repeated sessions).
The time band will also limit the events that are included in the personal iCalendar subscription service.

Display full programSpecify a time band

Save

When

Wed 13 Nov 2019 16:20 - 16:40 at Cortez 2&3 - API and Renaming Chair(s): Massimiliano Di Penta

Abstract

High quality method names are critical for the readability and maintainability of programs. However, constructing concise and consistent method names is often challenging, especially for inexperienced developers. To this end, advanced machine learning techniques have been recently leveraged to recommend method names automatically for given method bodies/implementation. Recent large-scale evaluations also suggest that such approaches are accurate. However, little is known about where and why such approaches work or don’t work. To figure out the state of the art as well as the rationale for the success/failure, in this paper we conduct an empirical study on the state-of-the-art approach code2vec. We assess code2vec on a new dataset with more realistic settings. Our evaluation results suggest that although switching to new dataset does not significantly influence the performance, more realistic settings do significantly reduce the performance of code2vec. Further analysis on the successfully recommended method names also reveals the following findings: 1) around half (48.3%) of the accepted recommendations are made on getter/setter methods; 2) a large portion (19.2%) of the successfully recommended method names could be copied from the given bodies. To further validate its usefulness, we ask developers to manually score the difficulty in naming methods they developed. Code2vec is then applied to such manually scored methods to evaluate how often it works in need. Our evaluation results suggest that code2vec rarely works when it is really needed. Finally, to intuitively reveal the state of the art and to investigate the possibility of designing simple and straightforward alternative approaches, we propose a heuristics based approach to recommending method names. Evaluation results on large-scale dataset suggest that this simple heuristics-based approach significantly outperforms the state-of-the-art machine learning based approach, improving precision and recall by 65.25% and 22.45%, respectively. The comparison suggests that machine learning based recommendation of method names still has a long way to go.

Link to Publication

https://liuhuigmail.github.io/2019ASE.pdf

Link to Preprint

https://liuhuigmail.github.io/2019ASE.pdf

Lin Jiang

beijing university of posts and telecommunication

Hui Liu

Beijing Institute of Technology

China

He Jiang

School of Software, Dalian University of Technology