1. Enhanced Method Call Tree for Comprehensive Detection of Symptoms of Cross Cutting Concerns
Mir, Saleem Obaidullah, 01 January 2016
Aspect-oriented programming languages provide an enhanced composition mechanism between functional subunits compared to earlier non-aspect-oriented languages. For this reason, the refactoring process requires a new approach to the analysis of existing code, one that focuses on how functions cut across one another. Aspect mining is the process of studying an existing program to find these crosscutting functions, or concerns, so that they may be implemented using the new aspect-oriented constructs, thus reducing the complexity of the existing code. One approach to detecting these crosscutting concerns generates a method call tree that outlines the method calls made within the existing code. The call tree is then examined for recurring patterns of methods that can be symptoms of crosscutting concerns. The conducted research focused on enhancing this approach to detect and quantify crosscutting concerns that result from code tangling as well as code scattering. The research also demonstrates how this aspect mining approach can overcome the difficulties in detection caused by variations in coding structure introduced over time.
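A minimal sketch of the recurring-pattern idea: if the same ordered sequence of calls shows up in many unrelated callers, that repetition is a symptom of a scattered concern. The call lists and method names below are invented for illustration, and real mining works on full call trees rather than flat call lists.

```python
from collections import defaultdict

# Hypothetical call lists: caller -> ordered list of callees.
CALLS = {
    "Account.open":  ["Logger.log", "Db.begin", "Db.commit"],
    "Account.close": ["Logger.log", "Db.begin", "Db.commit"],
    "Report.render": ["Logger.log", "Fmt.header"],
    "Report.export": ["Fmt.header", "Io.write"],
}

def recurring_patterns(calls, min_support=2):
    """Count consecutive call pairs; a pair repeated across several
    callers is a symptom of scattered crosscutting behaviour."""
    support = defaultdict(set)
    for caller, callees in calls.items():
        for a, b in zip(callees, callees[1:]):
            support[(a, b)].add(caller)
    return {p: sorted(s) for p, s in support.items() if len(s) >= min_support}

print(recurring_patterns(CALLS))
# The pair (Logger.log, Db.begin) recurring in both Account methods
# flags the logging/transaction combination as an aspect candidate.
```

A pair seen only once, such as (Logger.log, Fmt.header), is filtered out by the support threshold.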
2. Aspect Mining Using Self-Organizing Maps With Method Level Dynamic Software Metrics as Input Vectors
Maisikeli, Sayyed Garba, 01 January 2009
As the size and sophistication of modern software systems increase, so does the need for high-quality software that is easy to scale and maintain. Software systems evolve and undergo change over time. Unstructured software updates and refinements can lead to code scattering and tangling, which are the main symptoms of crosscutting concerns. The presence of crosscutting concerns can lead to a bloated and inefficient software system that is difficult to evolve, hard to analyze, difficult to reuse, and costly to maintain.
A crosscutting concern in a software system represents the implementation of an idea or functionality that could not be properly encapsulated. The presence of crosscutting concerns in software systems is attributed to the limitations of programming languages and to the structural degradation associated with repeated change. No matter how well a large-scale software application is decomposed, some ideas are difficult to modularize and encapsulate, resulting in crosscutting concerns scattered across the entire system, where code implementing one concept is tangled and mixed with code implementing unrelated concepts.
Aspect mining is a reverse-engineering exploration technique concerned with the development of concepts, principles, methods, and tools supporting the identification and extraction of refactorable aspect candidates in legacy software systems. Its main goal is to help software engineers and developers locate crosscutting concerns and portions of the software that may need refactoring, with the aim of improving the quality, scalability, maintainability, and evolution of the software system.
The aspect mining approach presented in this dissertation involved three phases. In the first phase, selected large-scale legacy benchmark test programs were dynamically traced and investigated, and metrics representing interaction between code fragments were derived from the collected data. In the second phase, the formulated dynamic metrics were submitted as input to Self-Organizing Maps (SOMs) for clustering. In the third phase, the clusters produced by the SOM were mapped against the benchmark test program in order to identify code scattering and tangling symptoms. Crosscutting concerns were identified and candidate aspect seeds mined.
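The SOM phase can be illustrated with a toy, pure-Python sketch. This is not the dissertation's implementation: a real study would use a 2-D map, a shrinking neighbourhood, and richer dynamic metrics; the metric vectors, grid size, and learning schedule here are illustrative assumptions.

```python
import math
import random

def train_som(vectors, grid=3, epochs=200, seed=1):
    """Train a tiny 1-D self-organizing map on method-metric vectors:
    each unit's weight vector drifts toward the inputs it wins."""
    rng = random.Random(seed)
    dim = len(vectors[0])
    units = [[rng.random() for _ in range(dim)] for _ in range(grid)]
    for t in range(epochs):
        lr = 0.5 * (1 - t / epochs)  # decaying learning rate
        v = rng.choice(vectors)
        # best-matching unit = unit with the closest weight vector
        bmu = min(range(grid),
                  key=lambda i: sum((u - x) ** 2 for u, x in zip(units[i], v)))
        for i in range(grid):
            h = math.exp(-abs(i - bmu))  # neighbourhood falloff on the grid
            units[i] = [u + lr * h * (x - u) for u, x in zip(units[i], v)]
    return units

def assign(vectors, units):
    """Map each vector to the index of its best-matching unit."""
    return [min(range(len(units)),
                key=lambda i: sum((u - x) ** 2 for u, x in zip(units[i], v)))
            for v in vectors]

# Invented per-method dynamic metrics, e.g. normalized call frequency
# and normalized number of distinct callers observed in a trace.
metrics = [[0.10, 0.10], [0.12, 0.08], [0.90, 0.90], [0.88, 0.92]]
units = train_som(metrics)
labels = assign(metrics, units)
print(labels)
```

Methods whose traced behaviour is similar land on the same unit; in the mining step, such clusters are inspected for scattering and tangling symptoms.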
Overall, the methodology used in this dissertation was found to perform as well as, and no worse than, other existing aspect mining methodologies, and in some cases it outperformed existing aspect mining methods that use the same set of benchmark test programs. With regard to aspect mining precision, 100% was attained on LDA and 51% on JHD, the same as attained by existing aspect mining methods. Lessons learned from the experiments carried out in this dissertation show that even highly structured software systems based on best-practice design principles are laden with code repetition and crosscutting concerns.
One of the major contributions of this dissertation is a new unsupervised aspect mining approach that minimizes human interaction, in which hidden software features can be identified and inferences made about the general structure of a software system, thereby addressing one of the drawbacks of existing dynamic aspect mining methodologies. The strength of this approach is that the input metrics required to represent software code fragments can easily be derived from other viable software metric formulations without complex formalisms.
Other contributions include a viable technique for the visualization, exploration, and study of the internal structure and behavioral nature of large-scale software systems. Areas that may need further study include determining the optimal number of vector components required to effectively represent extractable software components. Another issue worth considering is the establishment of a set of datasets, derived from popular test benchmarks, that can serve as a common standard for the comparison, evaluation, and validation of newly introduced aspect mining techniques.
3. Aspect Mining of COVID-19 Outbreak with SVM and Naive Bayes Techniques
Komara, Akhilandeswari, January 2021
The outbreak of COVID-19 is one of the major pandemics the world has faced, and the World Health Organization (WHO) declared it the deadliest virus outbreak in recent times. Due to its incubation period, predicting or identifying infected patients became difficult, and thus the impact was felt on a large scale. Most countries have been affected by the coronavirus since December 2019, and the spread continues. Despite the preventive measures promoted in various media, speculation and rumors about the outbreak remain at their peak, particularly on social media platforms such as Facebook and Twitter. Millions of posts and tweets are published through various apps, making the accuracy of news unpredictable and increasing panic among people. To overcome these issues, posts or tweets should be clearly classified to assess the accuracy of the news, which can be done using the basic sentiment analysis techniques of data science and machine learning. In this project, Twitter is considered as the social media platform, and millions of tweets are analyzed for aspect mining and categorized into positive, negative, and neutral using NLP techniques. A model is developed using the SVM and Naive Bayes approaches of machine learning.
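The classification step can be sketched with a tiny multinomial Naive Bayes classifier. This is a hedged, self-contained illustration: the tweets and labels are invented, and a real pipeline would use scikit-learn's `MultinomialNB` or `LinearSVC` with proper tokenization and a neutral class.

```python
import math
from collections import Counter, defaultdict

class NaiveBayes:
    """Multinomial Naive Bayes with add-one smoothing (toy version)."""
    def fit(self, texts, labels):
        self.counts = defaultdict(Counter)  # label -> word frequencies
        self.prior = Counter(labels)        # label -> document count
        for text, y in zip(texts, labels):
            self.counts[y].update(text.lower().split())
        self.vocab = {w for c in self.counts.values() for w in c}
        return self

    def predict(self, text):
        def score(y):
            total = sum(self.counts[y].values())
            s = math.log(self.prior[y] / sum(self.prior.values()))
            for w in text.lower().split():
                # add-one smoothing keeps unseen words from zeroing the score
                s += math.log((self.counts[y][w] + 1) / (total + len(self.vocab)))
            return s
        return max(self.prior, key=score)

# Invented training tweets for the sketch.
tweets = ["vaccine rollout is great news", "great recovery numbers today",
          "lockdown again terrible news", "terrible surge in cases"]
labels = ["positive", "positive", "negative", "negative"]
model = NaiveBayes().fit(tweets, labels)
print(model.predict("great vaccine news"))   # → positive
```

An SVM would replace the per-class log-probability score with a learned separating hyperplane over the same word-count vectors.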
4. Aspect Mining Using Multiobjective Genetic Clustering Algorithms
Bethelmy, David G., 01 January 2016
In legacy software, non-functional concerns tend to cut across the system and manifest themselves as tangled or scattered code. If these crosscutting concerns could be modularized and the system refactored, then the system would become easier to understand, modify, and maintain. Modularized crosscutting concerns are known as aspects and the process of identifying aspect candidates in legacy software is called aspect mining.
One of the techniques used in aspect mining is clustering and there are many clustering algorithms. Current aspect mining clustering algorithms attempt to form clusters by optimizing one objective function. However, the objective function to be optimized tends to bias the formation of clusters towards the data model implicitly defined by that function. One solution is to use algorithms that try to optimize more than one objective function. These multiobjective algorithms have been used successfully in data mining but, as far as this author knows, have not been applied to aspect mining.
This study investigated the feasibility of using multiobjective evolutionary algorithms, in particular multiobjective genetic algorithms, in aspect mining. The study utilized an existing multiobjective genetic algorithm, MOCK, which had already been tested against several popular single-objective clustering algorithms and shown to be, on average, as good as, and sometimes better than, those algorithms. Since some of those data mining algorithms have counterparts in aspect mining, it was reasonable to assume that MOCK would perform at least as well in an aspect mining context.
Since MOCK's objective functions were not directly trying to optimize aspect mining metrics, the study also implemented another multiobjective genetic algorithm, AMMOC, based on MOCK but tailored to optimize those metrics. The reasoning hinged on the fact that, since the goal was to determine if a clustering method resulted in optimizing these quality metrics, it made sense to attempt to optimize these functions directly instead of a posteriori.
This study determined that these multiobjective algorithms performed at least as well as two popular aspect mining algorithms, k-means and hierarchical agglomerative clustering. As a result, the study contributed to the theoretical body of knowledge in the field of aspect mining and provided a practical tool for the field.
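The core of any multiobjective approach is the Pareto dominance relation over candidate solutions: instead of one best clustering, the search keeps every clustering that no other clustering beats on all objectives. A small sketch follows; the two objective values per clustering are made up, and MOCK itself optimizes cluster compactness (overall deviation) and connectivity rather than these numbers.

```python
def dominates(a, b):
    """a dominates b if it is no worse on every objective and strictly
    better on at least one (both objectives to be minimized here)."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))

def pareto_front(solutions):
    """Keep the solutions that no other solution dominates."""
    return [s for s in solutions
            if not any(dominates(o, s) for o in solutions if o is not s)]

# Hypothetical (compactness, connectivity) scores for four clusterings.
candidates = [(0.9, 0.2), (0.5, 0.5), (0.2, 0.9), (0.8, 0.8)]
print(pareto_front(candidates))   # → [(0.9, 0.2), (0.5, 0.5), (0.2, 0.9)]
```

The genetic algorithm evolves a population toward this front; a solution such as (0.8, 0.8), beaten on both objectives by (0.5, 0.5), is discarded.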
5. Aspect Mining Using Model-Based Clustering
Rand McFadden, Renata, 01 January 2011
Legacy systems contain critical and complex business code that has been in use for a long time. This code is difficult to understand, maintain, and evolve, in large part due to crosscutting concerns: software system features, such as persistence, logging, and error handling, whose implementation is spread across multiple modules. Aspect-oriented techniques separate crosscutting concerns from the base code, using separate modules called aspects, and thus simplify the legacy code. Aspect mining techniques identify aspect candidates so that the legacy code can be refactored into aspects.
This study investigated an automated aspect mining method in which a vector-space model clustering approach was used with model-based clustering. The vector-space model clustering approach has been researched for aspect mining using a number of different heuristic clustering methods, producing mixed results. Prior to this study, the model had not been researched with model-based algorithms, even though they have grown in popularity because they lend themselves to statistical analysis and show results that are as good as or better than those of heuristic clustering methods.
This study investigated the effectiveness of model-based clustering for identifying aspects when compared against heuristic methods, such as k-means clustering and agglomerative hierarchical clustering, using six different vector-space models. The results indicated that model-based clustering can, in fact, be more effective than heuristic methods and shows good promise for aspect mining. In general, the model-based algorithms were better at keeping the methods of a concern from spreading across multiple clusters, but weaker at keeping multiple concerns out of the same cluster. Model-based algorithms were also significantly better at partitioning the data such that, given an ordered list of clusters, fewer clusters and methods would need to be analyzed to find all the concerns. In addition, model-based algorithms automatically determined the optimal number of clusters, a great advantage over heuristic algorithms. Lastly, the study found that the new vector-space models performed better, relative to aspect mining, than previously defined vector-space models.
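A hedged sketch of the "automatically determined number of clusters" point: model-based methods score each candidate model with a criterion such as BIC, which balances fit against complexity, and pick the cluster count that minimizes it. The toy below fits 1-D k-centre models and counts roughly three parameters per component (mean, variance, mixing weight), as a Gaussian mixture would; the data and the parameter counting are illustrative, and real work would use a mixture-model library.

```python
import math

def kmeans1d(xs, k, iters=25):
    """Toy deterministic 1-D k-means used only to fit candidate models."""
    s = sorted(xs)
    # seed centres at evenly spaced quantiles so runs are repeatable
    centers = [s[(2 * i + 1) * len(s) // (2 * k)] for i in range(k)]
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for x in xs:
            groups[min(range(k), key=lambda i: abs(x - centers[i]))].append(x)
        centers = [sum(g) / len(g) if g else c for g, c in zip(groups, centers)]
    return centers, groups

def bic(xs, k):
    """Schwarz criterion: residual fit plus a complexity penalty of
    roughly three parameters per component."""
    centers, groups = kmeans1d(xs, k)
    n = len(xs)
    rss = sum((x - c) ** 2 for c, g in zip(centers, groups) for x in g)
    return n * math.log(rss / n + 1e-12) + 3 * k * math.log(n)

# Three well-separated groups of invented method-vector scores.
data = [0.0, 0.5, 1.0, 1.5, 2.0,
        10.0, 10.5, 11.0, 11.5, 12.0,
        20.0, 20.5, 21.0, 21.5, 22.0]
best_k = min(range(1, 6), key=lambda k: bic(data, k))
print(best_k)
```

Too few components leaves a large residual; too many pays more in penalty than it gains in fit, so the criterion settles on the natural grouping.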
6. Mining Aspects through Cluster Analysis Using Support Vector Machines and Genetic Algorithms
Hacoupian, Yourik, 01 January 2013
The main purpose of object-oriented programming is to use encapsulation to reduce the amount of coupling within each object. However, object-oriented programming has some weaknesses in this area. To address this shortcoming, researchers have proposed an approach known as aspect-oriented programming (AOP). AOP is intended to reduce the amount of tangled code within an application by grouping similar functions into an aspect. To demonstrate the power of AOP, it is necessary to extract aspect candidates from current object-oriented applications.
Many different approaches have been proposed to accomplish this task. One such approach utilizes vector-based clustering to identify possible aspect candidates. In this study, two different types of vectors are applied to two different vector-based clustering techniques. Each method in a software system S is represented by a d-dimensional vector. These vectors take into account the fan-in values of the methods as well as the number of calls made to individual methods within the classes of S. A semi-supervised clustering approach known as Support Vector Clustering is then applied to the vectors, along with an improved k-means clustering approach based on genetic algorithms. The results obtained from these two approaches are then evaluated using standard metrics for aspect mining.
In addition to introducing two new clustering-based approaches to aspect mining, this research investigates the effectiveness of the currently known metrics used to evaluate a given vector-based approach. Many of the metrics currently used for aspect mining evaluation are singleton metrics, which evaluate a given approach by taking into account only one aspect of a clustering technique. This study introduces two different sets of metrics by combining these singleton measures. The iDIV metric combines the diversity of a partition (DIV), the intra-cluster distance of a partition (IntraD), and the percentage of the number of methods analyzed (PAM) to measure the overall effectiveness of the diversity of the partitions, while the iDISP metric combines the dispersion of crosscutting concerns (DISP) with the inter-cluster distance of a partition (InterD) and the PAM values to measure the quality of the clusters formed by a given method. Lastly, the oDIV and oDISP metrics take into account the complexity of the algorithms in relation to the DIV and DISP values.
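How such d-dimensional method vectors might be assembled from a call graph can be sketched as follows. The call graph, class names, and the particular three components are invented for illustration; the dissertation's actual vectors combine fan-in values and per-class call counts.

```python
from collections import defaultdict

# Hypothetical call graph: caller method -> callees, one entry per call.
CALL_GRAPH = {
    "A.save":   ["Log.write", "Db.exec"],
    "A.load":   ["Log.write", "Db.exec"],
    "B.render": ["Log.write"],
    "B.paint":  [],
}

def method_vectors(calls):
    """Represent each called method by a small vector:
    [fan-in, total calls received, distinct calling classes]."""
    fan_in = defaultdict(set)    # method -> distinct caller methods
    received = defaultdict(int)  # method -> total call count
    classes = defaultdict(set)   # method -> distinct caller classes
    for caller, callees in calls.items():
        for m in callees:
            fan_in[m].add(caller)
            received[m] += 1
            classes[m].add(caller.split(".")[0])
    return {m: [len(fan_in[m]), received[m], len(classes[m])] for m in fan_in}

print(method_vectors(CALL_GRAPH))
# Log.write, called from every class, gets the largest vector -
# exactly the high-fan-in profile that clustering flags as an aspect seed.
```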
By comparing the obtained values for each of the approaches, this study is able to identify the best performing method as it pertains to these metrics.
7. AspectAssay: A Technique for Expanding the Pool of Available Aspect Mining Test Data Using Concern Seeding
Moore, David Gerald, 01 January 2013
Aspect-oriented software design (AOSD) enables better and more complete separation of concerns in software-intensive systems. By extracting aspect code and relegating crosscutting functionality to aspects, software engineers can improve the maintainability of their code by reducing code tangling and the coupling of concerns. Further, the number of software defects has been shown to correlate with the number of non-encapsulated nonfunctional crosscutting concerns in a system.
Aspect mining uses data mining techniques to identify existing aspects in legacy code. Unfortunately, there is a lack of suitably documented test data for aspect mining research, and none of it is fully representative of large-scale legacy systems.
Using a new technique called concern seeding, based on the decades-old concept of error seeding, a tool called AspectAssay (akin to the radioimmunoassay test in medicine) was developed. The concern seeding technique allows researchers to seed existing legacy code with nonfunctional crosscutting concerns of known type, location, and quantity, thus greatly increasing the pool of available test data for aspect mining research.
Nine seeding test cases were run on a medium-sized codebase using the AspectAssay tool. Each test case seeded a different concern type (data validation, tracing, and observer) and attempted to achieve target values for each of three metrics: 0.95 degree of scattering across methods (DOSM), 0.95 degree of scattering across classes (DOSC), and 10 concern instances. The results were manually verified for their accuracy in producing concerns with known properties (i.e., type, location, quantity, and scattering). The resulting code compiled without errors and was functionally identical to the original. The achieved metrics averaged better than 99.9% of their target values.
Following the small tests, each of the three previously mentioned concern types was seeded with a wide range of target metric values on each of two codebases, one medium-sized and one large. The tool targeted DOSM and DOSC values in the range 0.01 to 1.00 and attempted to reach target numbers of concern instances from 1 to 100. Each of these 1,800 test cases was attempted ten times (18,000 total trials). Where mathematically feasible (as permitted by the scattering formulas), the tests tended to produce code that closely matched the target metric values.
Each trial's result was expressed as a percentage of its target value. There were 903 test cases that averaged at least 0.90 of their targets. For each test case's ten trials, the standard deviation of those trials' percentages of their targets was calculated; the average standard deviation across all trials was 0.0169. For the 808 seed attempts that averaged at least 0.95 of their targets, the average standard deviation across the ten trials for a particular target was only 0.0022. The tight grouping of trials within their test cases suggests high repeatability for the AspectAssay technique and tool.
The concern seeding technique opens the door for expansion of aspect mining research. Until now, such research has focused on small, well-documented legacy programs. Concern seeding has proved viable for producing code that is functionally identical to the original and contains concerns with known properties. The process is repeatable and precise across multiple seeding attempts, and also accurate for many ranges of target metric values.
Just as error seeding is useful for estimating indigenous errors in programs, concern seeding could prove useful for estimating indigenous nonfunctional crosscutting concerns, thus introducing a new method for evaluating the performance of aspect mining algorithms.
8. Going Deeper with Images and Natural Language
Ma, Yufeng, 29 March 2019
One aim in the area of artificial intelligence (AI) is to develop a smart agent with high intelligence that is able to perceive and understand the complex visual environment around us. More ambitiously, it should be able to interact with us about its surroundings in natural language. Thanks to the progress made in deep learning, we have seen huge breakthroughs towards this goal over the last few years. The developments have been extremely rapid in visual recognition, where machines can now categorize images into multiple classes and detect various objects within an image with an ability that is competitive with, or even surpasses, that of humans. Meanwhile, we have witnessed similar strides in natural language processing (NLP): computers can now perform text classification, machine translation, and similar tasks almost perfectly. However, despite much inspiring progress, most of these achievements remain within one domain and do not handle inter-domain situations. The interaction between the visual and textual areas is still quite limited, although there has been progress in image captioning, visual question answering, and related tasks.
In this dissertation, we design models and algorithms that build in-depth connections between images and natural language, helping us to better understand their inner structures. First, we study how to make machines generate image descriptions that are indistinguishable from ones expressed by humans, which as a result also achieve better quantitative evaluation performance. Second, we devise a novel algorithm for measuring review congruence, which takes an image and review text as input and quantifies the relevance of each sentence to the image; the whole model is trained without any supervised ground-truth labels. Finally, we propose a brand-new AI task called Image Aspect Mining, to detect visual aspects in images and identify aspect-level ratings within the review context.
On the theoretical side, this research contributes to multiple research areas in Computer Vision (CV), Natural Language Processing (NLP), interactions between CV and NLP, and Deep Learning. Regarding impact, these techniques will benefit related users such as the visually impaired, customers reading reviews, merchants, and AI researchers in general. / Doctor of Philosophy