Return to search

Use of program and data-specific heuristics for automatic software test data generation

The application of heuristic search techniques, such as genetic algorithms, to the problem of automatically generating software test data has been a growing interest for many researchers in recent years. The problem tackled by this thesis is the development of heuristics for test data search for a class of test data generation problems that could not be solved prior to the work done in this thesis because of a lack of an informative cost function. Prior to this thesis, work in applying search techniques to structural test data generation was largely limited to numeric test data and in particular, this left open the problem of generating string test data. Some potential string cost functions and corresponding search operators are presented in this thesis. For string equality, an adaptation of the binary Hamming distance is considered, together with two new string specific match cost functions. New cost functions for string ordering are also defined. For string equality, a version of the edit distance cost function with fine-grained costs based on the difference in character ordinal values was found to be the most effective in an empirical study. A second problem tackled in this thesis is the problem of generating test data for programs whose coverage criterion cost function is locally constant. This arises because the computation produced by many programs leads to a loss of information. The use of flag variables, for example, can lead to information loss. Consequently, conventional instrumentation added to a program receives constant or almost constant input and hence the search receives very little guidance and will often fail to find test data. The approach adopted in this thesis is to exploit the structure and behaviour of the computation from the input values to the test goal, the usual instrumentation point. The new technique depends on introducing program data-state scarcity as an additional search goal. The search is guided by a new fitness function made up of two parts, one depending on the branch distance of the test goal, the other depending on the diversity of the data-states produced during execution of the program under test. In addition to the program data-state, the program operations, in the form of the program-specific operations, can be used to aid the generation of test data. The program-specific operators is demonstrated for strings and an empirical investigation showed a fivefold increase in performance. This technique can also be generalised to other data types. An empirical investigation of the use of program-specific search operators combined with a data-state scarcity search for flag problems showed a threefold increase in performance.

Identiferoai:union.ndltd.org:bl.uk/oai:ethos.bl.uk:441758
Date January 2007
CreatorsAlshraideh, Mohammad
ContributorsBottaci, Leonardo
PublisherUniversity of Hull
Source SetsEthos UK
Detected LanguageEnglish
TypeElectronic Thesis or Dissertation
Sourcehttp://hydra.hull.ac.uk/resources/hull:12387

Page generated in 0.0026 seconds