Fast test suite-driven model-based fault localisation with application to pinpointing defects in student programs

Software and Systems Modeling, Jul 2017

Fault localisation, i.e. the identification of program locations that cause errors, takes significant effort and cost. We describe a fast model-based fault localisation algorithm that, given a test suite, uses symbolic execution methods to fully automatically identify a small subset of program locations where genuine program repairs exist. Our algorithm iterates over failing test cases and collects locations where an assignment change can repair exhibited faulty behaviour. Our main contribution is an improved search through the test suite, reducing the effort for the symbolic execution of the models and leading to speed-ups of more than two orders of magnitude over the previously published implementation by Griesmayer et al. We implemented our algorithm for C programs, using the KLEE symbolic execution engine, and demonstrate its effectiveness on the Siemens TCAS variants. Its performance is in line with recent alternative model-based fault localisation techniques, but it narrows the location set further without rejecting any genuine repair locations where faults can be fixed by changing a single assignment. We also show how our tool can be used in an educational context to improve self-guided learning and accelerate assessment. We apply our algorithm to a large selection of actual student coursework submissions, providing precise localisation within a sub-second response time. We show this using small test suites, already provided in the coursework management system, and on expanded test suites, demonstrating the scalability of our approach. We also show that compliance with test suites does not reliably grade a class of “almost-correct” submissions, which our tool highlights as being close to the correct answer. Finally, we present an extension of our tool that carries our fast localisation results over to a selection of student submissions that contain two faults.

Geoff Birch, Bernd Fischer, Michael Poppleton

University of Southampton, Southampton SO17 1BJ, UK
Stellenbosch University, Matieland 7602, South Africa

Communicated by Prof. Alfonso Pierantonio, Jasmin Blanchette, Francis Bordeleau, Nikolai Kosmatov, Gabi Taentzer, and Manuel Wimmer

Keywords: Automated debugging; Model-based fault localisation; Symbolic execution; Automated assessment

1 Introduction

Fault localisation, i.e. the identification of program locations that can cause erroneous state transitions that eventually lead to observed program failures, is a critical component of the debugging cycle. Since it puts a significant time [47,50] and expertise burden [1,66] on programmers, a variety of different automated fault localisation methods have been proposed [12,14,23,25,26,34,55,56,58]. We describe a fast model-based fault localisation algorithm that, given a test suite, uses symbolic execution methods to fully automatically identify a small subset of program locations within which (under a single-fault assumption) a genuine program repair exists. Our main contribution is an improved search through the test suite that drastically reduces the effort for the symbolic execution of the models.

Model-based fault localisation [54] (sometimes also called model-based debugging [15]) is the application of model-based diagnosis methods [18] to programs. It involves three main steps: (i) the construction of a logical model from the original program; (ii) the symbolic analysis of this model; and (iii) mapping any faults found in the model back to program locations.

One popular approach to model-based fault localisation is to transform the program so that a symbolic program verification tool can be used for all three steps. For example, Griesmayer et al. describe a method [23] in which the model (in the form of a logical satisfiability problem) is derived by running the CBMC model checker over a transformed program and then analysed by means of the model checker’s integrated SAT solver. The transformation “inverts” the program’s specification (cf. Sect. 2, producing failures where the original program would complete and blocking paths where the original program would fail), and replaces each original assignment by a conditional assignment with either the original value or an unconstrained symbolic value, depending on the value of a toggle variable. The actual localisation can then be reduced to extracting the possible values of the toggle variable from the satisfying assignments that the SAT solver returns.

However, the technique initially described by Griesmayer et al. requires detailed specifications to achieve acceptable precision: the weaker the specification, the more program locations are flagged as potential faults. Such detailed specifications rarely exist in practice. What do commonly exist, though, are extensive unit test suites, in particular in the (...truncated)
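The toggle-variable idea described in the introduction can be illustrated with a small Python sketch. It is not the authors' tool and uses neither KLEE nor CBMC: the toy `program`, its three numbered locations, the finite `DOMAIN` that stands in for an unconstrained symbolic value, and the helpers `repair_locations` and `localise` are all hypothetical names chosen for illustration. Intersecting the per-test results is likewise an assumption of this sketch, motivated by the single-fault assumption, rather than a description of the paper's search strategy.

```python
LOCATIONS = (1, 2, 3)      # hypothetical assignment locations in the toy program
DOMAIN = range(-10, 11)    # small finite stand-in for an unconstrained symbolic value


def program(a, b, toggle=None, free=None):
    """Toy program; intended result is max(a, b) + 1. Location 2 holds the fault."""
    def assign(loc, original):
        # Conditional assignment: keep the original value unless this location is toggled.
        return free if loc == toggle else original

    if a > b:
        m = assign(1, a)
    else:
        m = assign(2, a)   # fault: should assign b
    return assign(3, m + 1)


def repair_locations(test):
    """Locations where some replacement value makes this single test pass."""
    a, b, expected = test
    return {loc for loc in LOCATIONS
            if any(program(a, b, loc, v) == expected for v in DOMAIN)}


def localise(tests):
    """Candidate repair locations that work for every failing test."""
    failing = [t for t in tests if program(t[0], t[1]) != t[2]]
    if not failing:
        return set()
    # Assumption of this sketch: with a single fault, a genuine repair
    # location must be able to repair each failing test.
    return set.intersection(*(repair_locations(t) for t in failing))


# Each test is (a, b, expected result).
TESTS = [(3, 1, 4), (1, 3, 4), (2, 5, 6)]
print(sorted(localise(TESTS)))
```

On this toy input the sketch reports locations 2 and 3. Location 1 is ruled out because it is never executed by a failing test, while location 3 stays in the set because it assigns the output directly, so a single replacement value can always satisfy an individual test there.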


This is a preview of a remote PDF: https://link.springer.com/content/pdf/10.1007%2Fs10270-017-0612-y.pdf

Geoff Birch, Bernd Fischer, Michael Poppleton. Fast test suite-driven model-based fault localisation with application to pinpointing defects in student programs, Software and Systems Modeling, 2017, pp. 1-27, DOI: 10.1007/s10270-017-0612-y