Providing statistical inference to case-based software effort estimation

This thesis proposes a novel approach, called Analogy-X, to extend and improve the classical data-intensive analogy approach for software effort estimation. The Analogy-X approach combines the notion of distance matrix correlation found in the ecology literature with statistical analysis techniques to provide useful inferential statistics to support analogy-based systems. Data-intensive analogy for software effort estimation has been proposed as a viable alternative to other prediction methods such as linear regression. In many cases, researchers have found that analogy outperformed algorithmic methods. However, the overall performance of analogy depends on the quality of the dataset, the relevance of its project cases to the target project, and the feature subset selected in the analogy-based model. Unfortunately, there is no mechanism to assess the appropriateness of analogy for a specific dataset; in most cases analogy will continue to execute regardless of the dataset quality.

The Analogy-X approach is a set of procedures that utilize the principles of the Mantel randomization test to provide inferential statistics to analogy. Inspired by the Mantel correlation randomization test commonly used in ecology and psychology, Analogy-X uses the strength of correlation between the distance matrix of project features and the distance matrix of known effort values of the dataset to assess the suitability of the dataset for analogy, to identify the most appropriate feature subset, and to remove any atypical project cases from the dataset. The empirical studies show that Analogy-X is capable of:
-- Detecting extremely outlying project cases that would ultimately distort prediction outcomes, using a sensitivity analysis strategy.
-- Detecting relevant project features that are useful for identifying potential source analogues, in a stepwise fashion similar to that of stepwise regression.
-- Identifying whether an analogy-based approach is appropriate for the dataset.

Analogy-X is thus a robust solution that provides a sound statistical basis for analogy. It removes the need for any form of heuristic search and greatly improves the algorithmic performance of analogy. The studies also show that the Analogy-X approach is capable of removing the performance bottlenecks in data-intensive analogy. The overall results obtained also suggest that a fully automated data-intensive analogy for software effort estimation can be implemented using the Analogy-X approach, and that it is indeed an effective front end to analogy-based systems. The contribution of this work is significant since it provides an approach that will have a major impact on the evolution of data-intensive analogy-based and case-based reasoning systems.
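
The core idea described in the abstract, correlating a feature distance matrix with an effort distance matrix and judging the result by randomization, can be illustrated with a generic Mantel test. The following is a minimal sketch under stated assumptions, not the thesis's actual Analogy-X procedure: the function names (`distance_matrix`, `mantel_test`), the use of Euclidean distance, the one-sided p-value, and the toy project data are all hypothetical choices made for illustration.

```python
import numpy as np

def distance_matrix(x):
    """Pairwise Euclidean distances between the rows of x (assumed metric)."""
    x = np.asarray(x, dtype=float)
    if x.ndim == 1:
        x = x[:, None]  # treat a 1-D vector as one feature per project
    diff = x[:, None, :] - x[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))

def mantel_test(features, effort, permutations=999, seed=0):
    """Generic Mantel randomization test (illustrative, not Analogy-X itself).

    Returns the Pearson correlation between the two distance matrices and a
    one-sided p-value estimated by permuting the project labels of the
    effort distance matrix.
    """
    rng = np.random.default_rng(seed)
    d_feat = distance_matrix(features)
    d_eff = distance_matrix(effort)
    upper = np.triu_indices(len(d_feat), k=1)  # each project pair once
    observed = np.corrcoef(d_feat[upper], d_eff[upper])[0, 1]

    hits = 0
    for _ in range(permutations):
        perm = rng.permutation(len(d_eff))
        permuted = d_eff[np.ix_(perm, perm)]  # permute rows and columns together
        r = np.corrcoef(d_feat[upper], permuted[upper])[0, 1]
        if r >= observed:
            hits += 1
    # Count the observed arrangement itself as one of the permutations.
    p_value = (hits + 1) / (permutations + 1)
    return observed, p_value

# Toy data (hypothetical): effort grows with project size, plus a small wobble.
size = np.arange(1, 21, dtype=float)
effort = 3.0 * size + np.where(np.arange(20) % 2 == 0, 1.0, -1.0)
r, p = mantel_test(size, effort)
print(f"Mantel r = {r:.3f}, p = {p:.3f}")
```

In this sketch, a high correlation with a small p-value would suggest that projects close together in feature space also have similar effort, which is the condition under which analogy is expected to work. Rerunning the test on different feature subsets, or with individual projects left out, gives a rough picture of the feature selection and outlier screening the abstract describes.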

http://handle.unsw.edu.au/1959.4/40700

Software engineering.

Software engineering -- Mathematics.

Identifier	oai:union.ndltd.org:ADTP/215520
Date	January 2007
Creators	Keung, Wai, Computer Science & Engineering, Faculty of Engineering, UNSW
Source Sets	Australasian Digital Theses Program
Language	English
Detected Language	English
Rights	http://unsworks.unsw.edu.au/copyright