
Mining Aspects through Cluster Analysis Using Support Vector Machines and Genetic Algorithms

The main purpose of object-oriented programming is to use encapsulation to reduce the amount of coupling within each object. However, object-oriented programming has some weaknesses in this area. To address this shortcoming, researchers have proposed an approach known as aspect-oriented programming (AOP). AOP is intended to reduce the amount of tangled code within an application by grouping similar functions into an aspect. To demonstrate the benefits of AOP, it is necessary to extract aspect candidates from existing object-oriented applications.
Many different approaches have been proposed to accomplish this task. One such approach uses vector-based clustering to identify possible aspect candidates. In this study, two different types of vectors are applied to two different vector-based clustering techniques. Each method in a software system S is represented by a d-dimensional vector. These vectors take into account the fan-in values of the methods as well as the number of calls made to individual methods within the classes in software system S. A semi-supervised clustering approach known as Support Vector Clustering is then applied to the vectors. In addition, an improved K-means clustering approach based on Genetic Algorithms is applied to the same vectors. The results obtained from these two approaches are then evaluated using standard metrics for aspect mining.
In addition to introducing two new clustering-based approaches to aspect mining, this research investigates the effectiveness of the metrics currently used in aspect mining to evaluate a given vector-based approach. Many of these metrics are singleton metrics: they evaluate a given approach by taking into account only one aspect of a clustering technique. This study introduces two different sets of metrics that combine these singleton measures. The iDIV metric combines the Diversity of a partition (DIV), the Intra-cluster distance of a partition (IntraD), and the percentage of the number of methods analyzed (PAM) to measure the overall effectiveness of the diversity of the partitions. The iDISP metric combines the Dispersion of crosscutting concerns (DISP) with the Inter-cluster distance of a partition (InterD) and the PAM value to measure the quality of the clusters formed by a given method. Lastly, the oDIV and oDISP metrics take into account the complexity of the algorithms in relation to the DIV and DISP values.
By comparing the obtained values for each of the approaches, this study is able to identify the best performing method as it pertains to these metrics.
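
As a rough illustration of the vector-based idea described in the abstract, the sketch below builds one vector per called method from a small, made-up call list and groups the vectors with a plain K-means loop. Everything here is an assumption made for illustration: the call data, the names `CALLS`, `build_vectors`, and `kmeans`, the vector layout (fan-in followed by per-class call counts), and the definition of fan-in as the number of distinct calling methods. Plain K-means is swapped in for simplicity; it is neither the Support Vector Clustering nor the Genetic Algorithm based K-means evaluated in the thesis.

```python
from collections import defaultdict
from math import dist

# Hypothetical call records: (calling class, calling method, called method).
CALLS = [
    ("Account", "deposit", "log"),
    ("Account", "withdraw", "log"),
    ("Account", "withdraw", "check_balance"),
    ("Transfer", "run", "log"),
    ("Transfer", "run", "withdraw"),
    ("Transfer", "run", "deposit"),
    ("Report", "render", "log"),
]


def build_vectors(calls):
    """Map each called method to [fan-in, calls from class 1, calls from class 2, ...].

    Fan-in is assumed here to be the number of distinct calling methods.
    Classes are ordered alphabetically so the vector layout is stable.
    """
    classes = sorted({cls for cls, _, _ in calls})
    callers = defaultdict(set)                          # callee -> distinct callers
    per_class = defaultdict(lambda: defaultdict(int))   # callee -> class -> call count
    for cls, method, callee in calls:
        callers[callee].add((cls, method))
        per_class[callee][cls] += 1
    return {
        callee: [float(len(callers[callee]))]
        + [float(per_class[callee][cls]) for cls in classes]
        for callee in sorted(callers)
    }


def kmeans(points, k, iterations=20):
    """Plain K-means with Euclidean distance; the first k points seed the centroids."""
    centroids = [list(p) for p in points[:k]]
    assignment = [0] * len(points)
    for _ in range(iterations):
        # Assign every point to its nearest centroid.
        assignment = [
            min(range(k), key=lambda j: dist(p, centroids[j])) for p in points
        ]
        # Move each centroid to the mean of its members (unchanged if it has none).
        for j in range(k):
            members = [p for p, a in zip(points, assignment) if a == j]
            if members:
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    return assignment


vectors = build_vectors(CALLS)
names = list(vectors)
labels = kmeans([vectors[name] for name in names], k=2)
for name, label in zip(names, labels):
    print(label, name, vectors[name])
```

In this toy data the `log` method is called from every class, so its vector sits far from the others and it ends up alone in its own cluster, which is the kind of signal a vector-based approach would flag as a possible crosscutting concern.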

Identifier: oai:union.ndltd.org:nova.edu/oai:nsuworks.nova.edu:gscis_etd-1169
Date: 01 January 2013
Creators: Hacoupian, Yourik
Publisher: NSUWorks
Source Sets: Nova Southeastern University
Detected Language: English
Type: text
Format: application/pdf
Source: CEC Theses and Dissertations
