How Well Does Test Case Prioritization Integrate with Statistical Fault Localization?

Information and Software Technology 54 (7): 739-758 (2012)

How Well Does Test Case Prioritization Integrate with Statistical Fault Localization? ¹

Bo Jiang ², Zhenyu Zhang ³, W.K. Chan ⁴, T.H. Tse ⁵, and Tsong Yueh Chen ⁶

[paper from ScienceDirect | technical report TR-2012-02]

ABSTRACT

Context: Effective test case prioritization shortens the time to detect failures, and yet the use of fewer test cases may compromise the effectiveness of subsequent fault localization.
Objective: The paper aims at finding whether several previously identified effectiveness factors of test case prioritization techniques, namely strategy, coverage granularity, and time cost, have observable consequences on the effectiveness of statistical fault localization techniques.
Method: This paper uses a controlled experiment to examine these factors. The experiment includes 16 test case prioritization techniques and 4 statistical fault localization techniques using the Siemens suite of programs as well as grep, gzip, sed, and flex as subjects. The experiment studies the effects of the percentage of code examined to locate faults from these benchmark subjects after a given number of failures have been observed.
Result: We find that if testers have a budgetary concern on the number of test cases for regression testing, the use of test case prioritization can save up to 40% of test case executions for commit builds without significantly affecting the effectiveness of fault localization. A statistical fault localization technique using a smaller fraction of a prioritized test suite is found to compromise its effectiveness seriously. Despite the presence of some variations, the inclusion of more failed test cases will generally improve the fault localization effectiveness during the integration process. Interestingly, during the variation periods, adding more failed test cases actually deteriorates the fault localization effectiveness. In terms of strategies, Random is found to be the most effective, followed by the ART and Additional strategies, while the Total strategy is the least effective. We do not observe sufficient empirical evidence to conclude that using different coverage granularity levels has different overall effects.
Conclusion: The paper empirically identifies that the strategy and time cost of test case prioritization techniques are key factors affecting the effectiveness of statistical fault localization, while coverage granularity is not a significant factor. It also identifies a mid-range deterioration in fault localization effectiveness when adding more test cases to facilitate debugging.
Keywords: Software process integration; continuous integration; test case prioritization; statistical fault localization; adaptive random testing; coverage

1. This research is supported in part by the General Research Fund of the Research Grants Council of Hong Kong (project no. 717308), a grant of the Natural Science Foundation of China (project no. 61003027), a strategy research grant of City University of Hong Kong (project no. 7008039), and a discovery grant of the Australian Research Council (project no. DP120104773).

2. School of Computer Science and Engineering, Beihang University, Beijing, China.

3. State Key Laboratory of Computer Science, Institute of Software, Chinese Academy of Sciences, Beijing, China.

4. (Corresponding author.) Department of Computer Science, City University of Hong Kong, Tat Chee Avenue, Hong Kong.

5. Department of Computer Science, The University of Hong Kong, Pokfulam, Hong Kong.

6. Centre for Software Analysis and Testing, Swinburne University of Technology, Melbourne, Australia.
