Combining Spectrum-Based Fault Localization and Statistical Debugging: An Empirical Study
Program debugging is a time-consuming task, and researchers have proposed different kinds of automatic fault localization techniques to mitigate the burden of manual debugging. Among these techniques, two popular families are spectrum-based fault localization (SBFL) and statistical debugging (SD), both localizing faults by collecting statistical information at runtime. Though the ideas are similar, the two families have been developed independently and their combinations have not been systematically explored.
In this paper, we perform a systematic empirical study on the combination of SBFL and SD. We first build a unified model of the two techniques, and then systematically explore four types of variations: different predicates, different risk evaluation formulas, different granularities of data collection, and different methods of combining suspiciousness scores.
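To make the notion of a risk evaluation formula concrete, the sketch below scores program elements with Ochiai, a widely used SBFL formula. The abstract does not say which formulas the study evaluates, so Ochiai is an illustrative choice here; the element names and coverage counts are hypothetical.

```python
import math

def ochiai(failed_cover, passed_cover, total_failed):
    """Ochiai suspiciousness: ef / sqrt(total_failed * (ef + ep)).

    failed_cover (ef): failing tests that cover the element
    passed_cover (ep): passing tests that cover the element
    total_failed:      all failing tests in the suite
    """
    denom = math.sqrt(total_failed * (failed_cover + passed_cover))
    return failed_cover / denom if denom > 0 else 0.0

# Hypothetical spectrum: element -> (covering failed tests, covering passed tests).
# An element may be a statement (SBFL) or a predicate such as a branch condition (SD).
spectrum = {
    "stmt_10": (2, 0),
    "stmt_11": (2, 3),
    "branch_x_gt_0": (1, 4),
}
TOTAL_FAILED = 2

scores = {name: ochiai(ef, ep, TOTAL_FAILED) for name, (ef, ep) in spectrum.items()}
ranking = sorted(scores, key=scores.get, reverse=True)
print(ranking)
```

Elements covered mostly by failing tests rise to the top of the ranking; the same formula can be applied to statements or predicates, which is the kind of shared structure a unified model can exploit.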
Our study leads to several findings. First, most of the effectiveness of the combined approach is contributed by a simple type of predicate: branch conditions. Second, the risk evaluation formulas of SBFL significantly outperform those of SD. Third, fine-grained data collection significantly outperforms coarse-grained data collection with little extra execution overhead. Fourth, a linear combination of SBFL and SD predicates outperforms both individual approaches.
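The fourth finding can be illustrated with a minimal sketch of a linear score combination. The weight `alpha`, the function name `combine`, the element names, and the score values are all assumptions made for illustration; the abstract does not specify the weighting or how predicate scores are mapped to program elements.

```python
def combine(sbfl_scores, sd_scores, alpha=0.5):
    """Linearly combine two suspiciousness maps (hypothetical helper).

    Elements missing from one map are treated as having score 0.0 there.
    alpha weights the SBFL score; (1 - alpha) weights the SD score.
    """
    elements = set(sbfl_scores) | set(sd_scores)
    return {
        e: alpha * sbfl_scores.get(e, 0.0) + (1 - alpha) * sd_scores.get(e, 0.0)
        for e in elements
    }

# Hypothetical per-element scores from each technique.
sbfl = {"line_10": 0.8, "line_11": 0.6}
sd = {"line_11": 0.9, "line_12": 0.4}

combined = combine(sbfl, sd, alpha=0.5)
ranking = sorted(combined, key=combined.get, reverse=True)
print(ranking)
```

In this toy data, `line_11` is ranked first because both techniques consider it suspicious, even though neither technique alone gives it an overwhelming score.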
Based on our empirical study, we propose a new fault localization approach, PREDFL (Predicate-based Fault Localization), which uses the best configuration for each dimension under the unified model. We then explore its complementarity to existing techniques by integrating PREDFL with a state-of-the-art fault localization framework. The experimental results show that PREDFL can further improve the effectiveness of state-of-the-art fault localization techniques. More concretely, integrating PREDFL yields an improvement of up to 20.8% with respect to the faults successfully located at Top-1, which shows that PREDFL complements existing techniques.
Wed 13 Nov (displayed time zone: Tijuana, Baja California)
13:40 - 15:20 | Systems and Localization | Industry Showcase / Research Papers / Demonstrations | Cortez 2&3 | Chair(s): Tegawendé F. Bissyandé (SnT, University of Luxembourg)
13:40 | 20m Talk | Combining Spectrum-Based Fault Localization and Statistical Debugging: An Empirical Study | Research Papers | Jiajun Jiang (Peking University), Ran Wang (Peking University), Yingfei Xiong (Peking University), Xiangping Chen (Sun Yat-sen University), Lu Zhang (Peking University)
14:00 | 20m Talk | SCMiner: Localizing System-Level Concurrency Faults from Large System Call Traces | Research Papers | Tarannum Shaila Zaman (University of Kentucky), Xue Han (University of Kentucky), Tingting Yu (University of Kentucky)
14:20 | 20m Talk | Root Cause Localization for Unreproducible Builds via Causality Analysis over System Call Tracing | Research Papers | Zhilei Ren (Dalian University of Technology), Changlin Liu (Case Western Reserve University), Xusheng Xiao (Case Western Reserve University), He Jiang (School of Software, Dalian University of Technology), Tao Xie (Peking University)
14:40 | 20m Talk | PTracer: A Linux Kernel Patch Trace Bot | Industry Showcase
15:00 | 10m Demonstration | Pangolin: An SFL-based Toolset for Feature Localization | Demonstrations | Bruno Miguel Sotto-Mayor de Castro Machado (IST, University of Lisbon), Alexandre Perez (Palo Alto Research Center), Rui Abreu (Instituto Superior Técnico, U. Lisboa & INESC-ID)
15:10 | 10m Demonstration | SiMPOSE - Configurable N-Way Program Merging Strategies for Superimposition-based Analysis of Variant-Rich Software | Demonstrations | Dennis Reuling (Software Engineering Group, University of Siegen), Udo Kelter (Software Engineering Group, University of Siegen), Sebastian Ruland (TU Darmstadt, Real-time Systems Lab), Malte Lochau (TU Darmstadt)