Global ETD Search

1	AN EMPIRICAL STUDY FOR THE IMPACT OF MAINTENANCE ACTIVITIES IN CLONE EVOLUTION MARKS, LIONEL 26 November 2009 Code clones are duplicated code fragments that are copied to re-use functionality and speed up development. However, due to the duplicate nature of code clones, inconsistent updates can lead to bugs in the software system. Existing research investigates the inconsistent updates through analysis of the updates to code clones and the bug fixes used to fix the inconsistent updates. We extend the work by investigating other factors that affect clone evolution, such as the number of developers. On two levels of analysis, the method and clone class level, we conduct an empirical study on clone evolution. We analyze the factors affecting bug fixes and co-change (i.e. updating cloned methods at the same time) using our new metrics. Our metrics are related to the developers, code complexity, and stages of development. We use these metrics to find ways to improve the maintenance of cloned code. We discover that one way to improve maintenance of code clones is to decrease code complexity. We find that increased code complexity leads to a decrease in co-change, which can lead to bugs in the software. We perform our study on 6 applications. To maximize the number of clones detected, we use two existing code clone detection tools: SimScan and Simian. SimScan was used to find clones in 5 of the applications due to its versatility in finding code clones. Simian was used to detect clones due to its reliability in finding code clones regardless of language or compilation problems. To analyze and determine the significance of the metrics, we use the R Statistical Toolkit. / Thesis (Master, Computing) -- Queen's University, 2009-11-25 14:18:05.884 Clone Detection Clone Evolution
2	EMPIRICAL STUDIES OF CLONE MUTATION AND CLONE MIGRATION IN CLONE GENEALOGIES Xie, Shuai Jr 03 September 2013 Duplications and changes made on code segments by developers form code clones. Cloned code segments are exactly the same or have a particular similarity. A set of cloned code segments that have the same similarity with each other form a clone group. A clone genealogy contains several clone groups in different revisions and time periods. Based on different textual similarities, there are three clone types, i.e., Type-1, Type-2, and Type-3. Clone mutation refers to changes of clone type during clone evolution. Clone migration is the moving of a cloned code segment to another location in the software system. In this thesis, we build clone genealogies from clone groups in two empirical studies. We conduct two studies on clone migration and clone mutation in clone genealogies. We use three large open source software systems in both studies. In the first study, we investigate whether the fault-proneness of clone genealogies is affected by different patterns of clone mutation and different evolution patterns of distances among clones in clone groups. We conclude that clone groups mutated between Type-1 and Type-2 and between Type-1 and Type-3 clones have a higher risk of faults. We find that modifying the location of a clone increases its risk of faults. In the second study, we study whether the fault-proneness of migrated clones is affected by clone mutation with different changes of clone type. We examine whether the length of the time interval between clone migration and the last change of the cloned code has an impact on the faultiness of migrated clones. Our results show that clone migration associated with clone mutation is more fault-prone than clone migration without clone mutation. We find that a longer time interval between clone migration and the last change makes the migrated clones more fault-prone. 
/ Thesis (Master, Electrical & Computer Engineering) -- Queen's University, 2013-09-01 22:10:47.925 Clone Migration Clone Mutation Clone Genealogy
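The three clone types mentioned in the abstract above can be illustrated with a small sketch. It assumes the definitions commonly used in the clone literature (Type-1: identical apart from layout; Type-2: identical after renaming identifiers and literals; Type-3: similar with statements added, removed, or changed). The tokenizer, the `clone_type` function, and the 0.7 similarity threshold are hypothetical choices for illustration, not the tooling used in the thesis, and comments are not handled.

```python
import difflib
import keyword
import re

# Hypothetical, very rough tokenizer: identifiers, integers, or single symbols.
TOKEN = re.compile(r"[A-Za-z_]\w*|\d+|\S")


def tokens(code):
    """Split a fragment into tokens, discarding whitespace."""
    return TOKEN.findall(code)


def normalized(code):
    """Replace identifiers and integer literals with placeholders."""
    out = []
    for tok in tokens(code):
        if re.fullmatch(r"[A-Za-z_]\w*", tok) and not keyword.iskeyword(tok):
            out.append("ID")
        elif tok.isdigit():
            out.append("LIT")
        else:
            out.append(tok)
    return out


def clone_type(a, b, threshold=0.7):
    """Classify a pair of fragments; the threshold is an assumed value."""
    if tokens(a) == tokens(b):
        return "Type-1"  # identical up to whitespace
    na, nb = normalized(a), normalized(b)
    if na == nb:
        return "Type-2"  # identical after renaming
    if difflib.SequenceMatcher(None, na, nb).ratio() >= threshold:
        return "Type-3"  # similar, with added or removed tokens
    return None  # not considered a clone pair
```

Under this sketch, a clone mutation would show up as a clone pair whose `clone_type` result differs between two revisions.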
3	Dealing with clones in software : a practical approach from detection towards management 2014 February 1900 Despite the fact that duplicated fragments of code, also called code clones, are considered one of the prominent code smells that may exist in software, cloning is widely practiced in industrial development. The larger the system, the more people are involved in its development and the more parts are developed by different teams, which increases the possibility of having cloned code in the system. While there are particular benefits of code cloning in software development, research shows that it might be a source of various troubles in evolving software. Therefore, investigating and understanding clones in a software system is important to manage the clones efficiently. However, when the system is fairly large, it is challenging to identify and manage those clones properly. Among the various types of clones that may exist in software, research shows that detection of near-miss clones, where there might be minor to significant differences (e.g., renaming of identifiers and additions/deletions/modifications of statements) among the cloned fragments, is costly in terms of time and memory. Thus, there is a great demand for state-of-the-art technologies in dealing with clones in software. Over the years, several tools have been developed to detect and visualize exact and similar clones. However, the tools are usually standalone and do not integrate well with a software developer's workflow. In this thesis, first, a study is presented on the effectiveness of a fingerprint-based data similarity measurement technique named 'simhash' in detecting clones in a large-scale code base. Based on the positive outcome of the study, a time-efficient detection approach is proposed to find exact and near-miss clones in software, especially in large-scale software systems. 
The novel detection approach has been made available as a highly configurable and fully fledged standalone clone detection tool named 'SimCad', which can be configured for detection of clones in both source code and non-source code based data. Second, we show a robust use of the clone detection approach studied earlier by assembling its detection service as a portable library named 'SimLib'. This library can provide tightly coupled (integrated) clone detection functionality to other applications, as opposed to the loosely coupled service provided by a typical standalone tool. Because it is highly configurable and easily extensible, this library allows the user to customize its clone detection process for detecting clones in data having diverse characteristics. We performed a user study to get feedback on installation and use of the 'SimLib' API (Application Programming Interface) and to uncover its potential use as a third-party clone detection library. Third, we investigated what tools and techniques are currently in use to detect and manage clones and understand their evolution. The goal was to find how those tools and techniques can be made available in a developer's own software development platform for convenient identification, tracking and management of clones in the software. Based on that, we developed a clone-aware software development platform named 'SimEclipse' to promote the practical use of code clone research and to provide better support for clone management in software. Finally, we evaluated 'SimEclipse' by conducting a user study on its effectiveness, usability and information management. We believe that both researchers and developers would benefit from using these tools in different aspects of code clone research and in managing cloned code in software systems. Software Clone Clone Detection Integrated Clone Management Clone Management Plugin
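As a rough illustration of the fingerprint idea behind 'simhash' mentioned in the abstract above, the following minimal sketch computes a 64-bit fingerprint per fragment and compares fingerprints by Hamming distance. It is not the SimCad or SimLib implementation: the word-level tokenization, the use of MD5 as the per-token hash, and the function names are all assumptions made for this example.

```python
import hashlib
import re


def simhash(text, bits=64):
    """Return a simhash fingerprint of `text` (bits must be <= 64 here)."""
    weights = [0] * bits
    for tok in re.findall(r"\w+", text):
        # Assumed per-token hash: the first 8 bytes of MD5 as an integer.
        h = int.from_bytes(hashlib.md5(tok.encode("utf-8")).digest()[:8], "big")
        for i in range(bits):
            weights[i] += 1 if (h >> i) & 1 else -1
    # A fingerprint bit is set when set bits outvote cleared bits at that position.
    fingerprint = 0
    for i, w in enumerate(weights):
        if w > 0:
            fingerprint |= 1 << i
    return fingerprint


def hamming_distance(a, b):
    """Number of bit positions in which two fingerprints differ."""
    return bin(a ^ b).count("1")
```

Fragments whose fingerprints lie within a small Hamming distance of each other would be treated as clone candidates; the distance cut-off is a tuning parameter that this sketch leaves open.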
4	Toward an Understanding of Software Code Cloning as a Development Practice Kapser, Cory 18 September 2009 Code cloning is the practice of duplicating existing source code for use elsewhere within a software system. Within the research community, conventional wisdom has asserted that code cloning is generally a bad practice, and that code clones should be removed or refactored where possible. While there is significant anecdotal evidence that code cloning can lead to a variety of maintenance headaches --- such as code bloat, duplication of bugs, and inconsistent bug fixing --- there has been little empirical study on the frequency, severity, and costs of code cloning with respect to software maintenance. This dissertation seeks to improve our understanding of code cloning as a common development practice through the study of several widely adopted, medium-sized open source software systems. We have explored the motivations behind the use of code cloning as a development practice by addressing several fundamental questions: For what reasons do developers choose to clone code? Are there distinct identifiable patterns of cloning? What are the possible short- and long-term risks of cloning? What management strategies are appropriate for the maintenance and evolution of clones? When is the "cure" (refactoring) likely to cause more harm than the "disease" (cloning)? There are three major research contributions of this dissertation. First, we propose a set of requirements for an effective clone analysis tool based on our experiences in clone analysis of large software systems. These requirements are demonstrated in an example implementation which we used to perform the case studies prior to and included in this thesis. Second, we present an annotated catalogue of common code cloning patterns that we observed in our studies. 
Third, we present an empirical study of the relative frequencies and likely harmfulness of instances of these cloning patterns as observed in two medium-sized open source software systems, the Apache web server and the Gnumeric spreadsheet application. In summary, it appears that code cloning is often used as a principled engineering technique for a variety of reasons, and that as many as 71% of the clones in our study could be considered to have a positive impact on the maintainability of the software system. These results suggest that the conventional wisdom that code clones are generally harmful to the quality of a software system has been proven wrong. code clone clone detection clone analysis code duplication Computer Science
6	Detection and Analysis of Near-Miss Software Clones Roy, CHANCHAL 31 August 2009 Software clones are considered harmful in software maintenance and evolution. However, despite a decade of active research, there is a marked lack of work in the detection and analysis of near-miss software clones, those where minor to extensive modifications have been made to the copied fragments. In this thesis, we advance the state-of-the-art in clone detection and analysis in several ways. First, we develop a hybrid clone detection method, called NICAD, that can detect both exact and near-miss clones with high precision and recall and with reasonable performance. Second, in order to address the decade of vagueness in clone definition, we propose an editing taxonomy for clone creation that models developers' editing activities in the copy/pasted code in a top-down fashion. NICAD is designed to address the different types of clones in the editing taxonomy. Third, we have conducted a scenario-based qualitative comparison and evaluation of all of the currently available clone detection techniques and tools in the context of a unified conceptual framework. Using the results of this study one can more easily choose the right tools to meet the requirements and constraints of any particular application, and can identify opportunities for hybridizing different techniques. The hybrid architecture of NICAD was derived from this study. Fourth, in order to evaluate and compare the available tools in a realistic setting and to avoid the challenges and huge manual effort in validating candidate clones, we have developed a mutation-based framework that automatically and efficiently measures (and compares) the recall and precision of clone detection tools for different fine-grained clone types of the proposed editing taxonomy. 
We have evaluated NICAD using this framework and found that it is capable of detecting different types of clones with high precision and recall. Finally, we have conducted a large scale empirical study of cloning in open source systems, both to evaluate NICAD and to study the cloning characteristics of these systems in several different dimensions. The study has demonstrated that NICAD is capable of accurately finding both exact and near-miss function clones even in large systems and different languages, and that there seem to be a large number of clones in those systems. / Thesis (Ph.D, Computing) -- Queen's University, 2009-08-31 14:05:30.233 Software Clone Clone Detection and Analysis Software Maintenance
8	The ahrA-encoded L-asparaginase from Aspergillus nidulans Aullibux, Nadeen January 1999 No description available. 572.8 Gene; Clone; Assay
9	Visualizing and Understanding Code Duplication in Large Software Systems Jiang, Zhen Ming 15 December 2006 Code duplication, or code cloning, is a common phenomenon in the development of large software systems. Developers have a love-hate relationship with cloning. On one hand, cloning speeds up the development process. On the other hand, clone management is a challenging task as software evolves. Cloning has commonly been considered undesirable for software maintenance, and several research efforts have been devoted to automatically detecting clones and eliminating them aggressively. However, there is little empirical work done to analyze the consequences of cloning with respect to software quality. Recent studies show that cloning is not necessarily undesirable. Cloning can be used to minimize risks and there are cases where cloning is used as a design technique. In this thesis, three visualization techniques are proposed to aid researchers in analyzing cloning when studying large software systems. All of the visualizations abstract and display cloning information at the subsystem level but with different emphases. At the subsystem level, clones can be classified as external clones and internal clones. Internal clones refer to code duplicates that reside in the same subsystem, whereas external clones are clones that are spread across different subsystems. Software architecture quality attributes such as cohesion and coupling are introduced to contribute to the study of cloning at the architecture level. The Clone Cohesion and Coupling (CCC) Graph and the Clone System Hierarchy (CSH) Graph display the cloning information for one single release. In particular, the CCC Graph highlights the amount of internal and external cloning for each subsystem, whereas the CSH Graph focuses more on the details of the spread of cloning. Finally, the Clone System Evolution (CSE) Graph shows the evolution of cloning over a period of time. 
Visualization Clone Computer Science
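The subsystem-level classification described in the abstract above can be sketched as a simple count over clone pairs, taking an internal clone pair to be one whose two fragments lie in the same subsystem and an external pair to be one that spans two subsystems. The sketch assumes, purely for illustration, that a subsystem is the top-level directory of a file path and that clone pairs arrive as pairs of paths; neither assumption comes from the thesis.

```python
from collections import Counter


def subsystem(path):
    """Assumed mapping: the top-level directory names the subsystem."""
    return path.split("/", 1)[0]


def classify_pairs(pairs):
    """Count internal and external clone pairs per subsystem."""
    internal = Counter()
    external = Counter()
    for a, b in pairs:
        sa, sb = subsystem(a), subsystem(b)
        if sa == sb:
            internal[sa] += 1  # both fragments live in one subsystem
        else:
            external[sa] += 1  # the pair spans two subsystems;
            external[sb] += 1  # count it once for each of them
    return internal, external


# Hypothetical clone pairs, for illustration only.
pairs = [
    ("net/http.c", "net/ftp.c"),
    ("net/http.c", "ui/form.c"),
    ("ui/form.c", "ui/menu.c"),
]
internal, external = classify_pairs(pairs)
```

Per-subsystem counts of this kind are the sort of data a cohesion-and-coupling view of cloning could be drawn from.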
