Balancing the trade-off between accuracy and interpretability in software defect prediction
Context: Classification techniques of supervised machine learning have been successfully applied to various domains of practice. When building a predictive model, there are two important criteria: predictive accuracy and interpretability, which generally have a trade-off relationship. In particular, interpretability should be accorded greater emphasis in domains where the incorporation of expert knowledge into a predictive model is required.
Objective: The aim of this research is to propose a new classification model, called superposed naive Bayes (SNB), which transforms a naive Bayes ensemble into a simple naive Bayes model by linear approximation.
Method: To evaluate the predictive accuracy and interpretability of the proposed method, we conducted a comparative study using well-known classification techniques such as rule-based learners, decision trees, regression models, support vector machines, neural networks, Bayesian learners, and ensemble learners, over 13 real-world public datasets.
Results: A trade-off analysis between the accuracy and interpretability of different classification techniques was performed with a scatter plot comparing relative ranks of accuracy with those of interpretability. The experimental results show that the proposed method (SNB) can produce a balanced output that satisfies both accuracy and interpretability criteria.
Conclusions: SNB offers a comprehensible predictive model based on a simple and transparent model structure, which can provide an effective way of balancing the trade-off between accuracy and interpretability.
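The idea stated in the abstract, collapsing a naive Bayes ensemble into a single naive Bayes model, can be illustrated with a minimal sketch. This is not the authors' SNB algorithm, whose details are not given here. It assumes categorical (here binary) features, a bagged ensemble, and combination by averaging log-odds; in that simplified setting the single model reproduces the ensemble score exactly, so no approximation step is needed. All function names and the toy data are hypothetical.

```python
import math
import random


def fit_nb(rows, labels, n_values=2, alpha=1.0):
    """Fit a categorical naive Bayes model in log-odds form.

    Returns (bias, weights), where weights[j][v] is the smoothed
    log-likelihood ratio for feature j taking value v.
    """
    n_features = len(rows[0])
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    # Smoothed prior log-odds of the defective class (label 1).
    bias = math.log((n_pos + alpha) / (n_neg + alpha))
    weights = []
    for j in range(n_features):
        table = []
        for v in range(n_values):
            pos = sum(1 for r, y in zip(rows, labels) if y == 1 and r[j] == v)
            neg = sum(1 for r, y in zip(rows, labels) if y == 0 and r[j] == v)
            p1 = (pos + alpha) / (n_pos + alpha * n_values)
            p0 = (neg + alpha) / (n_neg + alpha * n_values)
            table.append(math.log(p1 / p0))
        weights.append(table)
    return bias, weights


def log_odds(model, row):
    """Score one row: prior log-odds plus one weight per feature value."""
    bias, weights = model
    return bias + sum(weights[j][v] for j, v in enumerate(row))


def bagged_ensemble(rows, labels, n_models=10, seed=0):
    """Train naive Bayes models on bootstrap resamples of the data."""
    rng = random.Random(seed)
    n = len(rows)
    models = []
    for _ in range(n_models):
        idx = [rng.randrange(n) for _ in range(n)]
        models.append(fit_nb([rows[i] for i in idx], [labels[i] for i in idx]))
    return models


def superpose(models):
    """Average the members' biases and weight tables into one model."""
    k = len(models)
    bias = sum(b for b, _ in models) / k
    n_features = len(models[0][1])
    n_values = len(models[0][1][0])
    weights = [
        [sum(m[1][j][v] for m in models) / k for v in range(n_values)]
        for j in range(n_features)
    ]
    return bias, weights


# Hypothetical toy data: three binary module metrics, label 1 = defective.
rows = [(1, 1, 0), (1, 0, 1), (0, 0, 0), (0, 1, 0), (1, 1, 1), (0, 0, 1)]
labels = [1, 1, 0, 0, 1, 0]

models = bagged_ensemble(rows, labels)
snb = superpose(models)
```

The resulting `snb` is a single table of per-feature weights, which is the kind of structure a domain expert can inspect or adjust directly, while `log_odds(snb, row)` matches the averaged score of the ensemble members for any row.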
Attached file: Balancing the Trade-off between Accuracy and Interpretability in Software Defect Prediction (ASE2019_20191112a.pdf, 927 KiB)
Wed 13 Nov
Displayed time zone: Tijuana, Baja California
16:00 - 17:40 | Prediction | Research Papers / Journal First Presentations | at Cortez 1 | Chair(s): Xin Xia (Monash University)
16:00 | 20m | Talk | Predicting Licenses for Changed Source Code | Research Papers | Xiaoyu Liu (Department of Computer Science and Engineering, Southern Methodist University); Liguo Huang (Dept. of Computer Science, Southern Methodist University, Dallas, TX, 75205); Jidong Ge (State Key Laboratory for Novel Software and Technology, Nanjing University); Vincent Ng (Human Language Technology Research Institute, University of Texas at Dallas, Richardson, TX 75083-0688)
16:20 | 20m | Talk | Empirical evaluation of the impact of class overlap on software defect prediction | Research Papers | Lina Gong (China University of Mining and Technology); Shujuan Jiang (China University of Mining and Technology); Rongcun Wang (China University of Mining and Technology); Li Jiang (China University of Mining and Technology)
16:40 | 20m | Talk | Combining Program Analysis and Statistical Language Model for Code Statement Completion | Research Papers | Son Nguyen (The University of Texas at Dallas); Tien N. Nguyen (University of Texas at Dallas); Yi Li (New Jersey Institute of Technology, USA); Shaohua Wang (New Jersey Institute of Technology, USA)
17:00 | 20m | Talk | Balancing the trade-off between accuracy and interpretability in software defect prediction | Journal First Presentations | Toshiki Mori (Corporate Software Engineering & Technology Center, Toshiba Corporation); Naoshi Uchihira (School of Knowledge Science, Japan Advanced Institute of Science and Technology (JAIST))
17:20 | 20m | Talk | Fine-grained just-in-time defect prediction | Journal First Presentations | Luca Pascarella (Delft University of Technology); Fabio Palomba (Department of Informatics, University of Zurich); Alberto Bacchelli (University of Zurich)