1 |
An Effort Prediction Framework for Software Defect Correction
Hassouna, Alaa 27 August 2008
Developers apply changes and updates to software systems to adapt to emerging
environments and address new requirements. In turn, these changes introduce
additional software defects, usually caused by our inability to comprehend the full
scope of the modified code. As a result, software practitioners have developed tools
to aid in the detection and prediction of imminent software defects, in addition to
the effort required to correct them. Although software development effort prediction
has been in use for many years, research into defect-correction effort prediction is
relatively new. The increasing complexity, integration and ubiquitous nature of
current software systems have sparked renewed interest in this field. Effort prediction
now plays a critical role in the planning activities of managers. Accurate predictions
help corporations budget, plan and distribute available resources effectively and
efficiently. In particular, early defect-correction effort predictions could be used by
testers to set schedules, and by managers to plan costs and provide earlier feedback
to customers about future releases.
In this work, we address the problem of predicting the effort needed to resolve a
software defect. More specifically, our study is concerned with defects or issues that
are reported on an Issue Tracking System or any other defect repository. Current
approaches use one prediction method or technique to produce effort predictions.
This approach usually suffers from the weaknesses of the chosen prediction method,
and consequently the accuracy of the predictions is affected. To address this problem,
we present a composite prediction framework. Rather than using one prediction
approach for all defects, we propose the use of multiple integrated methods
which complement the weaknesses of one another. Our framework is divided into
two sub-categories, Similarity-Score Dependent and Similarity-Score Independent.
The Similarity-Score Dependent method utilizes the power of Case-Based Reasoning,
also known as Instance-Based Reasoning, to compute predictions. It relies on
matching target issues to similar historical cases, then combines their known effort
for an informed estimate. On the other hand, the Similarity-Score Independent
method makes use of other defect-related information with some statistical manipulation
to produce the required estimate. To measure similarity between defects,
some method of distance calculation must be used. In some cases, this method
might produce misleading results due to observed inconsistencies in history, and
the fact that current similarity-scoring techniques cannot account for all the variability
in the data. In this case, the Similarity-Score Independent method can be
used to estimate the effort, where the effect of such inconsistencies can be reduced.
We have performed a number of experimental studies on the proposed framework
to assess the effectiveness of the presented techniques. We extracted the data sets
from an operational Issue Tracking System in order to test the validity of the model
on real project data. These studies involved the development of multiple tools in
both the Java programming language and PHP, each for a certain stage of data
analysis and manipulation. The results show that our proposed approach produces
significant improvements when compared to current methods.
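
The two-path idea in this abstract can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and is not taken from the thesis: the numeric feature vectors, the Euclidean distance, the `1 / (1 + distance)` similarity score, the `min_similarity` threshold, and the use of a plain median as the stand-in for the Similarity-Score Independent estimate.

```python
import math


def similarity(a, b):
    """Similarity in (0, 1] derived from Euclidean distance (assumed scoring)."""
    dist = math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))
    return 1.0 / (1.0 + dist)


def predict_effort(target, history, k=3, min_similarity=0.5):
    """Estimate effort for `target` from `history`.

    `history` is a non-empty list of (features, effort_hours) pairs; this
    layout is hypothetical. Returns (estimate, path_used).
    """
    scored = sorted(((similarity(target, f), e) for f, e in history), reverse=True)
    neighbours = [(s, e) for s, e in scored[:k] if s >= min_similarity]
    if neighbours:
        # Similarity-score dependent path: similarity-weighted mean of the
        # known efforts of the closest historical cases.
        total = sum(s for s, _ in neighbours)
        return sum(s * e for s, e in neighbours) / total, "dependent"
    # Similarity-score independent path: no case is similar enough, so fall
    # back to a statistic over all history (median used as a stand-in).
    efforts = sorted(e for _, e in history)
    mid = len(efforts) // 2
    if len(efforts) % 2:
        return efforts[mid], "independent"
    return (efforts[mid - 1] + efforts[mid]) / 2, "independent"


# Toy, made-up history: (feature vector, effort in hours).
history = [((1, 1), 2.0), ((1, 2), 4.0), ((10, 10), 40.0)]
print(predict_effort((1, 1), history))
print(predict_effort((100, 100), history))
```

The threshold is what lets the second path take over when similarity scores are unreliable; how the thesis actually decides between the two paths is not stated in the abstract.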
|
3 |
Object-oriented software development effort prediction using design patterns from object interaction analysis
Adekile, Olusegun 15 May 2009
Software project management is arguably the most important activity in modern
software development projects. In the absence of realistic and objective management, the
software development process cannot be managed in an effective way. Software
development effort estimation is one of the most challenging and researched problems in
project management. With the advent of object-oriented development, there have been
studies to transpose some of the existing effort estimation methodologies to the new
development paradigm. However, no holistic approach to estimation exists that
allows an initial estimate produced in the requirements gathering phase to be
refined through to the design phase. A SysML point methodology
is proposed that is based on a common, structured and comprehensive modeling
language (OMG SysML) and that factors the models corresponding to the primary phases
of object-oriented development into an effort estimate. This dissertation
presents a Function Point-like approach, named Pattern Point, which was conceived to
estimate the size of object-oriented products using the design patterns found in object
interaction modeling from the late OO analysis phase. In particular, two measures are
proposed (PP1 and PP2) and theoretically validated, showing that they satisfy well-known
properties necessary for size measures.
An initial empirical validation is performed that is meant to assess the usefulness
and effectiveness of the proposed measures in predicting the development effort of
object-oriented systems. Moreover, a comparative analysis is carried out, taking into
account several other size measures. The experimental results show that the Pattern Point
measure can be effectively used during the OOA phase to predict the effort values with a
high degree of confidence. The PP2 metric yielded the best results with an aggregate
PRED (0.25) = 0.874.
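
For readers unfamiliar with the PRED figure quoted above, the sketch below uses the usual definition in the effort estimation literature: PRED(l) is the fraction of projects whose magnitude of relative error (MRE) is at most l. The function names and the sample numbers are made up for illustration and are not data from the dissertation.

```python
def mre(actual, predicted):
    """Magnitude of relative error for a single project."""
    return abs(actual - predicted) / actual


def pred(actuals, predictions, level=0.25):
    """Fraction of predictions whose MRE is at most `level`."""
    hits = sum(1 for a, p in zip(actuals, predictions) if mre(a, p) <= level)
    return hits / len(actuals)


# Made-up efforts: the MREs are 0.10, 0.30, 0.10 and 0.00.
actuals = [100, 200, 50, 80]
predictions = [110, 260, 45, 80]
print(pred(actuals, predictions))  # 0.75
```

Read this way, PRED(0.25) = 0.874 means that about 87% of the estimates fell within 25% of the actual effort.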
|
4 |
Assessment of data-driven Bayesian networks in software effort prediction
Tierno, Ivan Alexandre Paiz January 2013
Software prediction is a difficult but important task that can aid managers in decision making, potentially saving time and resources and improving software quality, among other benefits. One approach to this task is the application of machine learning techniques. One such technique is Bayesian Networks, which have been promoted for software project management because of their special features. However, the pre-processing procedures related to their application remain mostly neglected in this field. In this context, this study presents an assessment of automatic Bayesian Networks (i.e., Bayesian Networks based solely on data) on three public data sets and discusses data pre-processing procedures and the validation approach. We compared automatic Bayesian Networks against mean and median baseline models and against ordinary least squares regression with a logarithmic transformation, which a recent comprehensive study deemed a top performer with regard to accuracy. The results, obtained through careful validation procedures, indicate that automatic Bayesian Networks can be competitive against other techniques but still need improvements to catch up with linear regression models in accuracy. Some current limitations of Bayesian Networks are highlighted and possible improvements are discussed. Furthermore, this study provides some guidelines on the exploration of data. These guidelines can be useful for any Bayesian Networks that use data for model learning. Finally, this study also confirms the potential benefits of feature selection in software effort prediction.
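
The comparison models named in this abstract (mean and median baselines, and ordinary least squares on log-transformed data) can be sketched in a few lines. This is an illustration under stated assumptions, not the study's setup: it uses a single hypothetical predictor (`size`), synthetic data generated from an exact power law, mean MRE as the accuracy measure, and it scores on the training data for brevity, whereas the study stresses careful validation.

```python
import math
import statistics


def fit_log_ols(sizes, efforts):
    """Fit log(effort) = a + b * log(size) by least squares; return a predictor."""
    xs = [math.log(s) for s in sizes]
    ys = [math.log(e) for e in efforts]
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum(
        (x - mx) ** 2 for x in xs
    )
    intercept = my - slope * mx
    # Back-transform from log space to obtain an effort prediction.
    return lambda size: math.exp(intercept + slope * math.log(size))


def mean_mre(model, sizes, efforts):
    """Mean magnitude of relative error of `model` over the given projects."""
    return statistics.fmean(abs(e - model(s)) / e for s, e in zip(sizes, efforts))


# Synthetic data (not from any public data set): effort = 3 * size ** 1.1.
sizes = [10, 20, 40, 80]
efforts = [3 * s ** 1.1 for s in sizes]

mean_effort = statistics.fmean(efforts)
median_effort = statistics.median(efforts)

print("mean baseline  :", mean_mre(lambda s: mean_effort, sizes, efforts))
print("median baseline:", mean_mre(lambda s: median_effort, sizes, efforts))
print("log-OLS        :", mean_mre(fit_log_ols(sizes, efforts), sizes, efforts))
```

Because the toy data follow a power law exactly, the log-linear fit is near perfect here; on real project data the gap between the baselines and the regression would be far less clear-cut.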
|