Should We Add Repair Time to an Unfixed Bug? An Exploratory Study of Automated Program Repair on 2980 Small-Scale Programs
The goal of automated program repair is to automate patch generation for buggy programs and thereby reduce the manual effort required of developers. A generate-and-validate method, such as GenProg, is a typical kind of repair method that continuously generates potential patches and then validates them against a given test suite. Such a method can accumulate patches as its execution time increases. However, how many buggy programs can be newly patched as the time increases? In this paper, we conduct an exploratory study of repairing 2980 small-scale buggy programs from the CODEFLAWS benchmark with three repair methods: GENPROG, SPR, and PROPHET. The aim of this study is to understand the execution time of repair methods by investigating four research questions. Experimental results show that the time of patch generation correlates with the number of executable lines of code and with the cyclomatic complexity. That is, a complex program is difficult to repair. This motivates us to explore a new repair method that can weaken this correlation with the lines of code and the complexity. We design VANFIX, a simple and effective repair method for small-scale C programs. VANFIX leverages the probability of exploring the search space to search a variable neighborhood for potential patches, rather than patching suspicious statements one by one. The comparison among repair methods shows that VANFIX can generate patches for 653 buggy programs, of which 408 are correctly patched. As a result, VANFIX achieves 24% to 30% better precision than GENPROG, SPR, and PROPHET.
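To make the generate-and-validate idea and the role of the time budget concrete, the following is a minimal Python sketch. It is not the implementation of GENPROG, SPR, PROPHET, or VANFIX. The toy program, the patch space `CANDIDATES`, the test suite `TESTS`, and the helper names are all hypothetical and chosen only for illustration. The `precision` helper assumes that precision means correctly patched programs divided by all patched programs, which matches the 408 of 653 figures reported above.

```python
import operator
import time

# Hypothetical test suite for a function that should return the larger of two ints.
TESTS = [((1, 2), 2), ((5, 3), 5), ((4, 4), 4)]

# Hypothetical patch space: candidate operators for the buggy condition.
CANDIDATES = [operator.lt, operator.le, operator.eq, operator.gt, operator.ge]


def make_program(cmp):
    """Build a program variant whose condition is replaced by `cmp`."""
    def program(a, b):
        return a if cmp(a, b) else b
    return program


def passes_all(program, tests):
    """Validate a candidate: it must pass every test in the suite."""
    return all(program(*args) == expected for args, expected in tests)


def generate_and_validate(candidates, tests, budget_seconds):
    """Try candidates in order until the time budget is exhausted.

    Returns the names of all candidates that pass the whole test suite.
    """
    deadline = time.monotonic() + budget_seconds
    plausible = []
    for cmp in candidates:
        if time.monotonic() >= deadline:
            break  # out of repair time: stop generating patches
        if passes_all(make_program(cmp), tests):
            plausible.append(cmp.__name__)
    return plausible


def precision(correct, patched):
    """Assumed definition: fraction of patched programs that are correct."""
    return correct / patched


if __name__ == "__main__":
    print(generate_and_validate(CANDIDATES, TESTS, budget_seconds=5.0))
    print(f"{precision(408, 653):.1%}")
```

In this sketch a larger budget can only add patches to the result, which mirrors the question of how many buggy programs are newly patched as repair time grows. A candidate that passes the test suite is only plausible; whether it is correct has to be judged separately, which is why the number of patched programs and the number of correctly patched programs differ.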