Return to search

Aspect Mining Using Self-Organizing Maps With Method Level Dynamic Software Metrics as Input Vectors

As the size and sophistication of modern software system increases, so is the need for high quality software that is easy to scale and maintain. Software systems evolve and undergo change over time. Unstructured software updates and refinements can lead to code scattering and tangling which are the main symptoms of crosscutting concerns. Presence of crosscutting concerns in a software system can lead to bloated and inefficient software system that is difficult to evolve, hard to analyze, difficult to reuse and costly to maintain.
A crosscutting concern in a software system represents implementation of a unique idea or a functionality that could not be properly encapsulated. The presence of crosscutting concerns in software systems is attributed to the limitations of programming languages and structural degradation associated with repeated change. No matter how well large-scale software applications are decomposed, some ideas are difficult to modularize and encapsulate resulting in crosscutting concerns scattered across the entire software system where code implementing one concept is tangled and mixed with code implementing unrelated concept.
Aspect Mining is a reverse software engineering exploration technique that is concerned with the development of concepts, principles, methods and tools supporting the identification and extraction of re-factorable aspect candidates in legacy software systems. The main goal of Aspect Mining is to help software engineers and developers to locate and identify crosscutting concerns and portions of the software that may need refactoring, with the aim of improving the quality, scalability and maintainability and evolution of software system.
The aspect mining approach presented in this dissertation involved three-phases. In the first phase, selected large-scale legacy benchmark test programs were dynamically traced and investigated. Metrics representing interaction between code fragments were derived from the collected data. In the second phase, the formulated dynamic metrics were then submitted as input to Self Organizing Maps (SOMs) for clustering. In the third phase, clusters produced by the SOM were then mapped against the benchmark test program in order to identify code scattering and tangling symptoms. Crosscutting concerns were identified and candidate aspect seeds mined.
Overall, the methodology used in this dissertation is found to perform as well as and no worse than other existing Aspect Mining methodologies. In other cases, the methodology used in this dissertation was found to have outperformed some of the existing Aspect mining methods that use the same set of benchmark test programs. With regards to Aspect Mining precision as it relates to LDA, 100% precision was attained, and with respect to JHD 51% precision was attained by this dissertation methodology, which is the same as attained by existing Aspect mining methods. Lessons learned from the results of experiments carried out in this dissertation have shown that even highly structured software systems that are based on best practice software design principles are laden with code repetitions and presence of crosscutting concerns.
One of the major contributions of this dissertation is the presentation of a new unsupervised Aspect Mining approach that minimizes human interaction, where hidden software features can be identified and inferences about the general structure of software system can be made, thereby addressing one of the drawbacks in currently existing dynamic Aspect Mining methodologies. The strength of the Aspect Mining approach presented in this dissertation is that the input metrics required to represent software code fragments can easily be derived from other viable software metric formulations without complex formalisms.
Other contributions made by this dissertation include the presentation of a good and viable software visualization technique that can be used for software visualization, exploration, and study and understanding of internal structure and behavioral nature of large-scale software systems. Areas that may need further study include the need for determining the optimal number of vector components that may be required to effectively represent extractible software components. Other issues worth considering include the establishment of a set of datasets derived from popularly known test benchmarks that can be used as a common standard for comparison, evaluation and validation of newly introduced Aspect Mining techniques.

Identiferoai:union.ndltd.org:nova.edu/oai:nsuworks.nova.edu:gscis_etd-1225
Date01 January 2009
CreatorsMaisikeli, Sayyed Garba
PublisherNSUWorks
Source SetsNova Southeastern University
Detected LanguageEnglish
Typetext
Formatapplication/pdf
SourceCEC Theses and Dissertations

Page generated in 0.0021 seconds