Machine Learning Based Automated Method Name Recommendation: How Far Are We
High-quality method names are critical for the readability and maintainability of programs. However, constructing concise and consistent method names is often challenging, especially for inexperienced developers. To this end, advanced machine learning techniques have recently been leveraged to recommend method names automatically for given method bodies (implementations). Recent large-scale evaluations also suggest that such approaches are accurate. However, little is known about where and why such approaches work or do not work. To figure out the state of the art as well as the rationale for the success or failure, in this paper we conduct an empirical study on the state-of-the-art approach, code2vec. We assess code2vec on a new dataset with more realistic settings. Our evaluation results suggest that although switching to a new dataset does not significantly influence the performance, more realistic settings do significantly reduce the performance of code2vec. Further analysis of the successfully recommended method names also reveals the following findings: 1) around half (48.3%) of the accepted recommendations are made on getter/setter methods; 2) a large portion (19.2%) of the successfully recommended method names could be copied from the given bodies. To further validate its usefulness, we ask developers to manually score the difficulty of naming the methods they developed. Code2vec is then applied to these manually scored methods to evaluate how often it works when it is needed. Our evaluation results suggest that code2vec rarely works when it is really needed. Finally, to intuitively reveal the state of the art and to investigate the possibility of designing simple and straightforward alternative approaches, we propose a heuristics-based approach to recommending method names. Evaluation results on a large-scale dataset suggest that this simple heuristics-based approach significantly outperforms the state-of-the-art machine learning based approach, improving precision and recall by 65.25% and 22.45%, respectively. The comparison suggests that machine learning based recommendation of method names still has a long way to go.
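To make the two findings above more concrete (getter/setter methods, and names that can be copied from the body), here is a minimal sketch of what a heuristics-based recommender might look like. It is an illustration only: the abstract does not describe the paper's actual heuristics, so the function `recommend_name`, the three regular expressions, and the rule order are all assumptions introduced here.

```python
import re
from typing import Optional

# Hypothetical patterns for single-statement Java method bodies.
# These are illustrative assumptions, not the heuristics used in the paper.
_GETTER = re.compile(r"^return\s+(?:this\.)?([A-Za-z_]\w*)\s*;$")
_SETTER = re.compile(r"^(?:this\.)?([A-Za-z_]\w*)\s*=\s*([A-Za-z_]\w*)\s*;$")
_DELEGATE = re.compile(
    r"^(?:return\s+)?(?:[A-Za-z_]\w*\.)*([A-Za-z_]\w*)\s*\(.*\)\s*;$"
)
# Words that can follow "return" but are not field names.
_RESERVED = {"this", "super", "null", "true", "false"}


def _capitalize(name: str) -> str:
    """Upper-case the first character, leaving the rest untouched."""
    return name[:1].upper() + name[1:]


def recommend_name(body: str) -> Optional[str]:
    """Suggest a method name for a Java method body, or None if no rule fires."""
    # Drop the outer braces and collapse whitespace into single spaces.
    statement = " ".join(body.strip().strip("{}").split())

    getter = _GETTER.match(statement)
    if getter and getter.group(1) not in _RESERVED:
        return "get" + _capitalize(getter.group(1))

    setter = _SETTER.match(statement)
    if setter:
        return "set" + _capitalize(setter.group(1))

    # Body is a single call: copy the callee's name from the body.
    delegate = _DELEGATE.match(statement)
    if delegate:
        return delegate.group(1)

    return None


if __name__ == "__main__":
    for sample in (
        "{ return this.name; }",
        "{ this.name = name; }",
        "{ return helper.computeTotal(items); }",
        "{ int x = 1; return x + 1; }",
    ):
        print(sample, "->", recommend_name(sample))
```

The sketch only handles bodies with a single statement and returns `None` otherwise, which mirrors the abstract's point that trivial methods are the easy cases; anything beyond that would need the heuristics described in the full paper.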
Wed 13 Nov (displayed time zone: Tijuana, Baja California)
16:00 - 17:40 | API and Renaming | Research Papers / Journal First Presentations | at Cortez 2&3 | Chair(s): Massimiliano Di Penta (University of Sannio)
16:00 | 20m | Talk | CodeKernel: A Graph Kernel based Approach to the Selection of API Usage Examples | Research Papers | Xiaodong Gu (The Hong Kong University of Science and Technology), Hongyu Zhang (The University of Newcastle), Sunghun Kim (Hong Kong University of Science and Technology)
16:20 | 20m | Talk | Machine Learning Based Automated Method Name Recommendation: How Far Are We | Research Papers | Lin Jiang (Beijing University of Posts and Telecommunications), Hui Liu (Beijing Institute of Technology), He Jiang (School of Software, Dalian University of Technology)
16:40 | 20m | Talk | MARBLE: Mining for Boilerplate Code to Identify API Usability Problems | Research Papers | Daye Nam (Carnegie Mellon University), Amber Horvath (Carnegie Mellon University), Andrew Macvean (Google, Inc.), Brad A. Myers (Carnegie Mellon University), Bogdan Vasilescu (Carnegie Mellon University)
17:00 | 20m | Talk | DIRE: A Neural Approach to Decompiled Identifier Renaming | Research Papers | Jeremy Lacomis (Carnegie Mellon University), Pengcheng Yin (Carnegie Mellon University), Edward J. Schwartz (Carnegie Mellon University Software Engineering Institute), Miltiadis Allamanis (Microsoft Research, Cambridge), Claire Le Goues (Carnegie Mellon University), Graham Neubig (Carnegie Mellon University), Bogdan Vasilescu (Carnegie Mellon University)
17:20 | 20m | Talk | Automatic Detection and Update Suggestion for Outdated API Names in Documentation | Journal First Presentations | Seonah Lee (Gyeongsang National University), Rongxin Wu (Department of Computer Science and Engineering, The Hong Kong University of Science and Technology), Shing-Chi Cheung (Department of Computer Science and Engineering, The Hong Kong University of Science and Technology), Sungwon Kang (Korea Advanced Institute of Science and Technology)