ASE 2019
Sun 10 - Fri 15 November 2019 San Diego, California, United States
Wed 13 Nov 2019 11:00 - 11:20 at Cortez 1 - Testing and Program Analysis Chair(s): Jun Sun

The regular expression (regex) practices of software engineers affect the maintainability, correctness, and security of their software applications. Empirical research has described characteristics like the distribution of regex feature usage, the structural complexity of regexes, and worst-case regex match behaviors. But researchers have not critically examined the methodology they follow to extract regexes, and findings to date are typically generalized from regexes written in only 1–2 programming languages. This is an incomplete foundation.

Generalizing existing research depends on validating two hypotheses: (1) Various regex extraction methodologies yield similar results, and (2) Regex characteristics are similar across programming languages. To test these hypotheses, we defined eight regex metrics to capture the dimensions of regex representation, string language diversity, and worst-case match complexity. We report that the two competing regex extraction methodologies yield comparable corpuses, suggesting that simpler regex extraction techniques will still yield sound corpuses. But in comparing regexes across programming languages, we found significant differences in some characteristics by programming language. Our findings have bearing on future empirical methodology, as the programming language should be considered, and generalizability will not be assured. Our measurements on a corpus of 537,806 regexes can guide data-driven designs of a new generation of regex tools and regex engines.

J. Davis's slides for "Testing Regex Generalizability and its Implications" (DavisMoyerKazerouniLee-RegexGeneralizability-ASE19-slides.pptx), 6.11 MiB

Wed 13 Nov

Displayed time zone: Tijuana, Baja California

10:40 - 12:20
Testing and Program Analysis - Research Papers / Demonstrations at Cortez 1
Chair(s): Jun Sun Singapore Management University, Singapore
10:40
20m
Talk
Regexes are Hard: Decision-making, Difficulties, and Risks in Programming Regular Expressions (ACM SIGSOFT Distinguished Paper Award)
Research Papers
Louis G. Michael IV Virginia Tech, James Donohue University of Bradford, James C. Davis Virginia Tech, USA, Dongyoon Lee Stony Brook University, Francisco Servant Virginia Tech
Pre-print File Attached
11:00
20m
Talk
Testing Regex Generalizability And Its Implications: A Large-Scale Many-Language Measurement Study
Research Papers
James C. Davis Virginia Tech, USA, Daniel Moyer Virginia Tech, Ayaan M. Kazerouni Virginia Tech, Dongyoon Lee Stony Brook University
Pre-print File Attached
11:20
20m
Talk
Accurate String Constraints Solution Counting with Weighted Automata
Research Papers
Elena Sherman Boise State University, Andrew Harris Boise State University
11:40
20m
Talk
Subformula Caching for Model Counting and Quantitative Program Analysis
Research Papers
William Eiers University of California at Santa Barbara, USA, Seemanta Saha University of California Santa Barbara, Tegan Brennan University of California, Santa Barbara, Tevfik Bultan University of California, Santa Barbara
12:00
10m
Demonstration
SPrinter: A Static Checker for Finding Smart Pointer Errors in C++ Programs
Demonstrations
Xutong Ma Institute of Software, Chinese Academy of Sciences, Jiwei Yan Institute of Software, Chinese Academy of Sciences, Yaqi Li Institute of Software, Chinese Academy of Sciences, Jun Yan Institute of Software, Chinese Academy of Sciences, Jian Zhang Institute of Software, Chinese Academy of Sciences
12:10
10m
Demonstration
FPChecker: Detecting Floating-Point Exceptions in GPU Applications
Demonstrations
Ignacio Laguna Lawrence Livermore National Laboratory