Global ETD Search

11	Code Clone Discovery Based on Concolic Analysis Krutz, Daniel Edward 01 January 2013 (has links) Software is often large, complicated and expensive to build and maintain. Redundant code can make these applications even more costly and difficult to maintain. Duplicated code is often introduced into these systems for a variety of reasons. Some of which include developer churn, deficient developer application comprehension and lack of adherence to proper development practices. Code redundancy has several adverse effects on a software application including an increased size of the codebase and inconsistent developer changes due to elevated program comprehension needs. A code clone is defined as multiple code fragments that produce similar results when given the same input. There are generally four types of clones that are recognized. They range from simple type-1 and 2 clones, to the more complicated type-3 and 4 clones. Numerous clone detection mechanisms are able to identify the simpler types of code clone candidates, but far fewer claim the ability to find the more difficult type-3 clones. Before CCCD, MeCC and FCD were the only clone detection techniques capable of finding type-4 clones. A drawback of MeCC is the excessive time required to detect clones and the likely exploration of an unreasonably large number of possible paths. FCD requires extensive amounts of random data and a significant period of time in order to discover clones. This dissertation presents a new process for discovering code clones known as Concolic Code Clone Discovery (CCCD). This technique discovers code clone candidates based on the functionality of the application, not its syntactical nature. This means that things like naming conventions and comments in the source code have no effect on the proposed clone detection process. CCCD finds clones by first performing concolic analysis on the targeted source code. Concolic analysis combines concrete and symbolic execution in order to traverse all possible paths of the targeted program. These paths are represented by the generated concolic output. A diff tool is then used to determine if the concolic output for a method is identical to the output produced for another method. Duplicated output is indicative of a code clone. CCCD was validated against several open source applications along with clones of all four types as defined by previous research. The results demonstrate that CCCD was able to detect all types of clone candidates with a high level of accuracy. In the future, CCCD will be used to examine how software developers work with type-3 and type-4 clones. CCCD will also be applied to various areas of security research, including intrusion detection mechanisms. Clone Detection Concolic Analysis Path Analysis Software Engineering Computer Sciences
12	Detecting Test Clones with Static Analysis Jain, Divam January 2013 (has links) Large-scale software systems often have correspondingly complicated test suites, which are diffi cult for developers to construct and maintain. As systems evolve, engineers must update their test suite along with changes in the source code. Tests created by duplicating and modifying previously existing tests (clones) can complicate this task. Several testing technologies have been proposed to mitigate cloning in tests, including parametrized unit tests and test theories. However, detecting opportunities to improve existing test suites is labour intensive. This thesis presents a novel technique for etecting similar tests based on type hierarchies and method calls in test code. Using this technique, we can track variable history and detect test clones based on test assertion similarity. The thesis further includes results from our empirical study of 10 benchmark systems using this technique which suggest that test clone detection by our technique will aid test de-duplication eff orts in industrial systems. software engineering program slicing test optimization clone detection Computer Science
13	Toward an Understanding of Software Code Cloning as a Development Practice Kapser, Cory 18 September 2009 (has links) Code cloning is the practice of duplicating existing source code for use elsewhere within a software system. Within the research community, conventional wisdom has asserted that code cloning is generally a bad practice, and that code clones should be removed or refactored where possible. While there is significant anecdotal evidence that code cloning can lead to a variety of maintenance headaches --- such as code bloat, duplication of bugs, and inconsistent bug fixing --- there has been little empirical study on the frequency, severity, and costs of code cloning with respect to software maintenance. This dissertation seeks to improve our understanding of code cloning as a common development practice through the study of several widely adopted, medium-sized open source software systems. We have explored the motivations behind the use of code cloning as a development practice by addressing several fundamental questions: For what reasons do developers choose to clone code? Are there distinct identifiable patterns of cloning? What are the possible short- and long-term term risks of cloning? What management strategies are appropriate for the maintenance and evolution of clones? When is the ``cure'' (refactoring) likely to cause more harm than the ``disease'' (cloning)? There are three major research contributions of this dissertation. First, we propose a set of requirements for an effective clone analysis tool based on our experiences in clone analysis of large software systems. These requirements are demonstrated in an example implementation which we used to perform the case studies prior to and included in this thesis. Second, we present an annotated catalogue of common code cloning patterns that we observed in our studies. Third, we present an empirical study of the relative frequencies and likely harmfulness of instances of these cloning patterns as observed in two medium-sized open source software systems, the Apache web server and the Gnumeric spreadsheet application. In summary, it appears that code cloning is often used as a principled engineering technique for a variety of reasons, and that as many as 71% of the clones in our study could be considered to have a positive impact on the maintainability of the software system. These results suggest that the conventional wisdom that code clones are generally harmful to the quality of a software system has been proven wrong. code clone clone detection clone analysis code duplication Computer Science
14	Toward an Understanding of Software Code Cloning as a Development Practice Kapser, Cory 18 September 2009 (has links) Code cloning is the practice of duplicating existing source code for use elsewhere within a software system. Within the research community, conventional wisdom has asserted that code cloning is generally a bad practice, and that code clones should be removed or refactored where possible. While there is significant anecdotal evidence that code cloning can lead to a variety of maintenance headaches --- such as code bloat, duplication of bugs, and inconsistent bug fixing --- there has been little empirical study on the frequency, severity, and costs of code cloning with respect to software maintenance. This dissertation seeks to improve our understanding of code cloning as a common development practice through the study of several widely adopted, medium-sized open source software systems. We have explored the motivations behind the use of code cloning as a development practice by addressing several fundamental questions: For what reasons do developers choose to clone code? Are there distinct identifiable patterns of cloning? What are the possible short- and long-term term risks of cloning? What management strategies are appropriate for the maintenance and evolution of clones? When is the ``cure'' (refactoring) likely to cause more harm than the ``disease'' (cloning)? There are three major research contributions of this dissertation. First, we propose a set of requirements for an effective clone analysis tool based on our experiences in clone analysis of large software systems. These requirements are demonstrated in an example implementation which we used to perform the case studies prior to and included in this thesis. Second, we present an annotated catalogue of common code cloning patterns that we observed in our studies. Third, we present an empirical study of the relative frequencies and likely harmfulness of instances of these cloning patterns as observed in two medium-sized open source software systems, the Apache web server and the Gnumeric spreadsheet application. In summary, it appears that code cloning is often used as a principled engineering technique for a variety of reasons, and that as many as 71% of the clones in our study could be considered to have a positive impact on the maintainability of the software system. These results suggest that the conventional wisdom that code clones are generally harmful to the quality of a software system has been proven wrong. code clone clone detection clone analysis code duplication Computer Science
15	Dealing with clones in software : a practical approach from detection towards management 2014 February 1900 (has links) Despite the fact that duplicated fragments of code also called code clones are considered one of the prominent code smells that may exist in software, cloning is widely practiced in industrial development. The larger the system, the more people involved in its development and the more parts developed by different teams result in an increased possibility of having cloned code in the system. While there are particular benefits of code cloning in software development, research shows that it might be a source of various troubles in evolving software. Therefore, investigating and understanding clones in a software system is important to manage the clones efficiently. However, when the system is fairly large, it is challenging to identify and manage those clones properly. Among the various types of clones that may exist in software, research shows detection of near-miss clones where there might be minor to significant differences (e.g., renaming of identifiers and additions/deletions/modifications of statements) among the cloned fragments is costly in terms of time and memory. Thus, there is a great demand of state-of-the-art technologies in dealing with clones in software. Over the years, several tools have been developed to detect and visualize exact and similar clones. However, usually the tools are standalone and do not integrate well with a software developer's workflow. In this thesis, first, a study is presented on the effectiveness of a fingerprint based data similarity measurement technique named 'simhash' in detecting clones in large scale code-base. Based on the positive outcome of the study, a time efficient detection approach is proposed to find exact and near-miss clones in software, especially in large scale software systems. The novel detection approach has been made available as a highly configurable and fully fledged standalone clone detection tool named 'SimCad', which can be configured for detection of clones in both source code and non-source code based data. Second, we show a robust use of the clone detection approach studied earlier by assembling its detection service as a portable library named 'SimLib'. This library can provide tightly coupled (integrated) clone detection functionality to other applications as opposed to loosely coupled service provided by a typical standalone tool. Because of being highly configurable and easily extensible, this library allows the user to customize its clone detection process for detecting clones in data having diverse characteristics. We performed a user study to get some feedback on installation and use of the 'SimLib' API (Application Programming Interface) and to uncover its potential use as a third-party clone detection library. Third, we investigated on what tools and techniques are currently in use to detect and manage clones and understand their evolution. The goal was to find how those tools and techniques can be made available to a developer's own software development platform for convenient identification, tracking and management of clones in the software. Based on that, we developed a clone-aware software development platform named 'SimEclipse' to promote the practical use of code clone research and to provide better support for clone management in software. Finally, we evaluated 'SimEclipse' by conducting a user study on its effectiveness, usability and information management. We believe that both researchers and developers would enjoy and utilize the benefit of using these tools in different aspect of code clone research and manage cloned code in software systems. Software Clone Clone Detection Integrated Clone Management Clone Management Plugin
16	Detecting Test Clones with Static Analysis Jain, Divam January 2013 (has links) Large-scale software systems often have correspondingly complicated test suites, which are diffi cult for developers to construct and maintain. As systems evolve, engineers must update their test suite along with changes in the source code. Tests created by duplicating and modifying previously existing tests (clones) can complicate this task. Several testing technologies have been proposed to mitigate cloning in tests, including parametrized unit tests and test theories. However, detecting opportunities to improve existing test suites is labour intensive. This thesis presents a novel technique for etecting similar tests based on type hierarchies and method calls in test code. Using this technique, we can track variable history and detect test clones based on test assertion similarity. The thesis further includes results from our empirical study of 10 benchmark systems using this technique which suggest that test clone detection by our technique will aid test de-duplication eff orts in industrial systems. software engineering program slicing test optimization clone detection Computer Science
17	BUSINESS PROCESS RECOVERY USING UI DESIGN PATTERNS AND CLONE DETECTION IN BUSINESS PROCESSES Guo, JIN 28 October 2008 (has links) A business application automates a collection of business processes. A business process describes how a set of logically related tasks are executed, ordered and managed by following business rules to achieve business objectives. An “online book purchase” business process contains several tasks such as buying a book, ordering a book, and sending out promotions. In this ever changing business environment, both of business applications and business processes are modified to accommodate changed business requirements and improve the performance of the organization. These continuous modifications introduce problems in the following two aspects: 1) Business process definitions are rarely updated to reflect the current business processes deployed in business applications. 2) Business processes may be cloned (e.g., copied and slightly modified) to handle special circumstances or promotions. Identifying these clones and removing them help improve the efficiency of an organization. However, business processes are defined with textual languages that cannot be automatically understood. To maintain business process definitions up to date, we present our techniques that automatically recover business processes from UIs of business applications and identify clones in the recovered business processes. We leverage UI design patterns, which present the best practices of UI designs, to capture business processes from UIs. To refine the recovered business processes and mark the functionally equivalent tasks, we use existing code clone detection tools, such as CCFinder and CloneDR, to detect clones in business applications, and lift clones from code level to business process level. The effectiveness of our techniques is demonstrated through a case study on 15 large open source business applications. / Thesis (Master, Computing) -- Queen's University, 2008-10-28 11:06:31.41 buisness processes clone detection business applications UI design patterns
18	How Reliable is the Crowdsourced Knowledge of Security Implementation? Chen, Mengsu 12 1900 (has links) The successful crowdsourcing model and gamification design of Stack Overflow (SO) Q&A platform have attracted many programmers to ask and answer technical questions, regardless of their level of expertise. Researchers have recently found evidence of security vulnerable code snippets being possibly copied from SO to production software. This inspired us to study how reliable is SO in providing secure coding suggestions. In this project, we automatically extracted answer posts related to Java security APIs from the entire SO site. Then based on the known misuses of these APIs, we manually labeled each extracted code snippets as secure or insecure. In total, we extracted 953 groups of code snippets in terms of their similarity detected by clone detection tools, which corresponds to 785 secure answer posts and 644 insecure answer posts. Compared with secure answers, counter-intuitively, insecure answers has higher view counts (36,508 vs. 18,713), higher score (14 vs. 5), more duplicates (3.8 vs. 3.0) on average. We also found that 34% of answers provided by the so-called trusted users who have administrative privileges are insecure. Our finding reveals that there are comparable numbers of secure and insecure answers. Users cannot rely on community feedback to differentiate secure answers from insecure answers either. Therefore, solutions need to be developed beyond the current mechanism of SO or on the utilization of SO in security-sensitive software development. / Master of Science / Stack Overflow (SO), the most popular question and answer platform for programmers today, has accumulated and continues accumulating tremendous question and answer posts since its launch a decade ago. Contributed by numerous users all over the world, these posts are a type of crowdsourced knowledge. In the past few years, they have been the main reference source for software developers. Studies have shown that code snippets in answer posts are copied into production software. This is a dangerous sign because the code snippets contributed by SO users are not guaranteed to be secure implementations of critical functions, such as transferring sensitive information on the internet. In this project, we conducted a comprehensive study on answer posts related to Java security APIs. By labeling code snippets as secure or insecure, contrasting their distributions over associated attributes such as post score and user reputation, we found that there are a significant number of insecure answers (644 insecure vs 785 secure in our study) on Stack Overflow. Our statistical analysis also revealed the infeasibility of differentiating between secure and insecure posts leveraging the current community feedback system (eg. voting) of Stack Overflow. Stack Overflow crowdsourced knowledge social dynamics security implementation clone detection
19	JClone: Syntax tree based clone detection for Java Bahtiyar, Muhammed Yasin January 2010 (has links) <p>An unavoidable amount of money is spent on maintaining existing software systems today. Software maintenance cost generally higher than development cost of the system therefore lowering maintenance cost is highly appreciated in software industry.</p><p>A significant part of maintenance activities is related to repeating the investigation of problems and applying repeated solutions several times. A software system may contain a common bug in several different places and it might take extra effort and time to fix all existences of this bug. This operation commonly increases the cost of Software Maintenance Activities.</p><p>Detecting duplicate code fragments can significantly decrease the time and effort therefore the maintenance cost. Clone code detection can be achieved via analyzing the source code of given software system. An abstract syntax tree based clone detector for java systems is designed and implemented through this study.</p><p>This master thesis examines a software engineering process to create an abstract syntax tree based clone detector for the projects implemented in Java programming language.</p> clone detection abstract syntax tree java eclipse software maintenance Software engineering Programvaruteknik
20	JClone: Syntax tree based clone detection for Java Bahtiyar, Muhammed Yasin January 2010 (has links) An unavoidable amount of money is spent on maintaining existing software systems today. Software maintenance cost generally higher than development cost of the system therefore lowering maintenance cost is highly appreciated in software industry. A significant part of maintenance activities is related to repeating the investigation of problems and applying repeated solutions several times. A software system may contain a common bug in several different places and it might take extra effort and time to fix all existences of this bug. This operation commonly increases the cost of Software Maintenance Activities. Detecting duplicate code fragments can significantly decrease the time and effort therefore the maintenance cost. Clone code detection can be achieved via analyzing the source code of given software system. An abstract syntax tree based clone detector for java systems is designed and implemented through this study. This master thesis examines a software engineering process to create an abstract syntax tree based clone detector for the projects implemented in Java programming language. clone detection abstract syntax tree java eclipse software maintenance Software Engineering Programvaruteknik

Search results