71

Using Machine Learning Techniques to Improve Static Code Analysis Tools Usefulness

Alikhashashneh, Enas A. 08 1900 (has links)
Indiana University-Purdue University Indianapolis (IUPUI) / This dissertation proposes an approach that uses Machine Learning (ML) techniques to reduce the cost of manually inspecting the false positive warnings reported by Static Code Analysis (SCA) tools. The proposed approach neither assumes a particular SCA tool nor depends on the programming language used to write the target source code or application. To reduce the number of false positive warnings, we first evaluated a number of SCA tools in terms of software engineering metrics using a synthetic source code benchmark, the Juliet test suite. From this evaluation, we concluded that SCA tools report many false positive warnings that require manual inspection. We then generated a number of datasets from source code that forced the SCA tool to produce true positive, false positive, or false negative warnings. These datasets were used to train four ML classifiers to classify the warnings collected from the synthetic source code. From the experimental results, we observed that the classifier built using the Random Forest (RF) technique outperformed the others. Lastly, using this classifier and an instance-based transfer learning technique, we ranked warnings aggregated from various open-source software projects. The experimental results show that the proposed approach outperformed a random ranking algorithm and was highly correlated with the ranked list generated by the optimal ranking algorithm.
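As a rough illustration of the classification step described in this abstract, the sketch below trains a Random Forest on warning-level features and ranks warnings by their predicted probability of being true positives. The feature names, file name, and data layout are illustrative assumptions, not taken from the dissertation.

```python
# Hypothetical sketch: classify SCA warnings as true/false positives with a
# Random Forest. Feature names and CSV layout are assumptions for illustration.
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import classification_report

# Each row is one SCA warning; "label" is 1 for true positive, 0 for false positive.
warnings_df = pd.read_csv("juliet_warnings.csv")  # assumed file
features = ["severity", "cwe_id", "function_length", "cyclomatic_complexity"]

X_train, X_test, y_train, y_test = train_test_split(
    warnings_df[features], warnings_df["label"], test_size=0.3, random_state=42
)

clf = RandomForestClassifier(n_estimators=100, random_state=42)
clf.fit(X_train, y_train)

# Rank unseen warnings by predicted probability of being a true positive,
# so the most actionable warnings are inspected first.
probs = clf.predict_proba(X_test)[:, 1]
print(classification_report(y_test, clf.predict(X_test)))
```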
72

Improving Security Through Egalitarian Binary Recompilation

Williams-King, David Christopher January 2021 (has links)
In this thesis, we try to bridge the gap between the program transformations that are possible at source level and those possible at binary level. While binaries are typically seen as opaque artifacts, our binary recompiler Egalito (ASPLOS 2020) enables users to parse and modify stripped binaries on existing systems. Our technique of binary recompilation is not robust to errors in disassembly, but given an accurate analysis, it provides near-zero transformation overhead. We wrote several demonstration security tools with Egalito, including code randomization, control-flow integrity, retpoline insertion, and a fuzzing backend. We also wrote Nibbler (ACSAC 2019, DTRAP 2020), which detects unused code and removes it. Many of these features, including Nibbler, can be combined with other defenses, resulting in multiplicatively stronger or more effective hardening. Enabled by our recompiler, an overriding theme of this thesis is our focus on deployable software transformation. Egalito has been tested by collaborators across tens of thousands of Debian programs and libraries. We coined the term egalitarian in the context of binary security: simply put, an egalitarian analysis or security mechanism is one that can operate on itself (and is usually more deployable as a result). As one demonstration of this idea, we created a strong, deployable defense against code reuse attacks. Shuffler (OSDI 2016) randomizes function addresses, moving functions periodically every few milliseconds. This makes an attacker's job extremely difficult, especially if they are located across a network (which adds round-trip ping time) -- JIT-ROP attacks take 2.3 to 378 seconds to complete. Shuffler is egalitarian and defends its own code and target code simultaneously; Shuffler actually shuffles itself. We hope our deployable, egalitarian binary defenses will allow others to improve upon the state of the art and will paint binaries as far more malleable than they have been in the past.
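A back-of-the-envelope sketch of why continuous re-randomization frustrates JIT-ROP-style attacks, using an assumed 50 ms shuffle period (the abstract only says "every few milliseconds") together with the attack durations quoted above; this is illustrative arithmetic, not Shuffler's implementation.

```python
# Illustrative arithmetic only; the 50 ms shuffle period is an assumption.
shuffle_period_s = 0.050          # assumed re-randomization interval
fastest_attack_s = 2.3            # fastest JIT-ROP completion time quoted above
slowest_attack_s = 378.0          # slowest JIT-ROP completion time quoted above

# Number of times the code layout changes before the attack could finish:
print(fastest_attack_s / shuffle_period_s)   # 46 shuffles during the fastest attack
print(slowest_attack_s / shuffle_period_s)   # 7560 shuffles during the slowest attack
# Any addresses the attacker leaks become stale long before the exploit completes.
```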
73

A Compiler-based Framework For Automatic Extraction Of Program Skeletons For Exascale Hardware/software Co-design

Dakshinamurthy, Amruth Rudraiah 01 January 2013 (has links)
The design of high-performance computing architectures requires performance analysis of large-scale parallel applications to derive various parameters concerning hardware design and software development. Performance analysis and benchmarking of an application can be done in several ways with varying degrees of fidelity. One of the most cost-effective ways is a coarse-grained study of large-scale parallel applications through the use of program skeletons. The "program skeleton" we discuss in this work is an abstracted program, derived from a larger program, from which source code deemed irrelevant to the study has been removed. Extracting such a skeleton from a large-scale parallel program by hand requires a substantial amount of manual effort and often introduces human errors. In this work, we develop a semi-automatic approach for extracting program skeletons based on compiler program analysis, which reduces this cost and eliminates errors inherent in manual approaches. Our skeleton generation approach is built on the extensible, open-source ROSE compiler infrastructure, which allows us to perform flow and dependency analysis on large programs in order to determine which code can be removed to generate a skeleton. We demonstrate the correctness of our skeleton extraction process by comparing details from communication traces, and we show the performance speedup of using skeletons by running simulations in the SST/macro simulator.
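The skeleton idea can be pictured as a source-to-source filter that keeps communication calls and the statements they depend on while discarding the rest. The sketch below mimics that selection on a toy statement list; the statement representation, tags, and MPI-centric criterion are illustrative assumptions, not the ROSE-based analysis itself.

```python
# Toy sketch of skeleton extraction: keep statements that are communication
# calls or that the communication calls depend on; drop everything else.
# The statement representation and dependency map are assumptions.
statements = [
    {"id": 1, "code": "n = compute_local_size(rank)",   "kind": "compute"},
    {"id": 2, "code": "heavy_physics_kernel(grid)",     "kind": "compute"},
    {"id": 3, "code": "MPI_Allreduce(&n, &total, ...)", "kind": "comm"},
    {"id": 4, "code": "write_checkpoint(grid)",         "kind": "io"},
]
# statement id -> ids it depends on (e.g., derived from dataflow analysis)
depends_on = {3: {1}}

def skeleton(stmts, deps):
    keep = {s["id"] for s in stmts if s["kind"] == "comm"}
    # transitively keep everything a kept statement depends on
    frontier = set(keep)
    while frontier:
        frontier = {d for s in frontier for d in deps.get(s, set())} - keep
        keep |= frontier
    return [s["code"] for s in stmts if s["id"] in keep]

print(skeleton(statements, depends_on))
# ['n = compute_local_size(rank)', 'MPI_Allreduce(&n, &total, ...)']
```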
74

A GitHub-Based Voice Assistant for Software Developers and Teams

Sereesathien, Siriwan 01 June 2021 (has links) (PDF)
Software developers and teams typically rely on source code and task management tools for their projects. They tend to depend on different platforms such as GitHub, Azure DevOps, Bitbucket, and GitLab for task tracking, feature tracking, and bug tracking while developing and maintaining their software repositories. Individually, developers may lose concentration when they have to navigate through numerous screens across various platforms to perform daily tasks. Additionally, during in-person meetings, teams are often separated from their machines and have to rely on pure recollection of the tasks and issues related to their work. This can delay the decision-making process and take away valuable focus hours from developers. Although there is usually one person with a laptop to guide the meeting and access the source code management tools, this can take a lot of time, since that person is not familiar with every developer's individual work. Therefore, a new tool is needed to improve the productivity of individuals and of team meetings. In this work, we continued the work on Robin, a voice assistant built to answer questions regarding GitHub issues and source code management. Robin can answer questions as well as complete actions on behalf of the developer. This thesis presents Robin's abilities, architecture, and implementation while also examining its usability through a user study. Our study suggests that some people love the idea of having a conversational agent for software development; however, much more research and iteration are needed before Robin provides the user experience we imagined. In this thesis, we set the foundation for this idea and report the lessons we learned.
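A minimal sketch of the kind of back-end call such an assistant might make once it has recognized an intent like "what open issues are assigned to me?"; the repository, user, and token handling are placeholders, and this is not Robin's actual implementation.

```python
# Hypothetical sketch: fetch open GitHub issues assigned to a user, the sort of
# query a voice assistant could answer after intent recognition.
# Repository, username, and token are placeholders.
import os
import requests

def open_issues_assigned_to(owner, repo, assignee, token):
    url = f"https://api.github.com/repos/{owner}/{repo}/issues"
    params = {"state": "open", "assignee": assignee}
    headers = {"Authorization": f"token {token}"}
    resp = requests.get(url, params=params, headers=headers, timeout=10)
    resp.raise_for_status()
    # Pull requests also appear in the issues endpoint; filter them out.
    return [i["title"] for i in resp.json() if "pull_request" not in i]

titles = open_issues_assigned_to("someorg", "somerepo", "somedev",
                                 os.environ.get("GITHUB_TOKEN", ""))
print("You have", len(titles), "open issues:", "; ".join(titles))
```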
75

Automatic Generation and Assessment of Source-code Method Summaries

Abid, Nahla Jamal 24 April 2017 (has links)
No description available.
76

Automated Source Code Structure Feedback Using srcML and RelaxNG

Sedgwick, Brandon M. 19 September 2013 (has links)
No description available.
77

Source Code Readability : A study on type-declaration and programming knowledge

Lennartsson, Caesar January 2022 (has links)
The readability of source code is essential for software maintenance. Since maintenance is an ongoing process, estimated to account for 70 percent of the total cost of the software development life cycle, it cannot be deprioritized. The readability of source code is likely to affect program comprehension, which may help or hinder the maintenance of the software. How different code features and functions affect the readability of source code has previously been investigated, and readability metrics have been developed. This project was initiated because of the lack of research on how programming knowledge, and statically versus dynamically typed programming languages, affect the readability of source code. A survey was conducted with 21 computer science students of varying programming knowledge, each rating eight code snippets, for a total of 168 ratings. The results showed that the type of programming language (statically versus dynamically typed) could improve the readability of source code. The results also showed that programming knowledge does not correlate with the ability to read source code.
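As an illustration of the kind of analysis behind the knowledge-versus-readability finding, the sketch below computes a rank correlation between participants' programming knowledge and their mean readability ratings; the data values are invented for the example and are not from the study.

```python
# Illustrative only: invented data, not the study's. Tests whether
# programming knowledge correlates with readability ratings.
from scipy.stats import spearmanr

# One entry per participant: years of programming knowledge, and the mean
# readability rating that participant gave over the eight snippets (1-5 scale).
knowledge_years = [1, 1, 2, 2, 3, 3, 4, 5, 5, 6]
mean_ratings    = [3.1, 3.6, 2.9, 3.4, 3.2, 3.5, 3.0, 3.3, 3.6, 3.1]

rho, p_value = spearmanr(knowledge_years, mean_ratings)
print(f"Spearman rho = {rho:.2f}, p = {p_value:.3f}")
# A small rho with a large p-value would be consistent with the finding that
# programming knowledge and readability ratings are not correlated.
```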
78

Towards Measuring & Improving Source Code Quality

Iftikhar, Umar January 2024 (has links)
Context: Software quality has a multi-faceted description encompassing several quality attributes. Central to our efforts to enhance software quality is improving the quality of the source code, since poor source code quality affects the quality of the delivered product. Empirical studies have investigated how to improve source code quality and how to quantify the improvement. However, the reported evidence linking internal code structure information and the quality attributes observed by users is varied and, at times, conflicting. Furthermore, there is a need for research into improving source code quality by understanding trends in feedback from code review comments. Objective: This thesis contributes towards improving source code quality and synthesizes metrics to measure improvement in source code quality. Our objectives are 1) to synthesize evidence of links between source code metrics and external quality attributes, and to identify source code metrics, and 2) to identify areas in which to improve source code quality by detecting recurring code quality issues through the analysis of code review comments. Method: We conducted a tertiary study to achieve the first objective, and an archival analysis and a case study to investigate the second. Results: To quantify source code quality improvement, we report a comprehensive catalog of source code metrics and a small set of source code metrics consistently linked with maintainability, reliability, and security. To improve source code quality using the analysis of code review comments, the methodology we explored improves on the state of the art with interesting results. Conclusions: The thesis provides a promising way to analyze themes in code review comments. Researchers can use the source code metrics provided to estimate these quality attributes reliably. In future work, we aim to derive a software improvement checklist based on the analysis of trends in code review comments.
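To make the code-review-comment analysis concrete, here is a minimal sketch that buckets review comments into recurring quality themes by keyword matching; the themes, keywords, and comments are illustrative assumptions, not the thesis's actual method.

```python
# Minimal sketch: bucket code review comments into recurring quality themes
# by keyword matching. Themes, keywords, and comments are invented examples.
from collections import Counter

theme_keywords = {
    "naming":         ["rename", "name", "identifier"],
    "error handling": ["exception", "null", "error", "catch"],
    "duplication":    ["duplicate", "copy", "extract"],
    "testing":        ["test", "coverage", "mock"],
}

review_comments = [
    "Please rename this variable, the name is misleading",
    "This can throw a null pointer exception, add a check",
    "Duplicate logic with FooService, extract a helper",
    "Missing unit test for the error path",
]

counts = Counter()
for comment in review_comments:
    lowered = comment.lower()
    for theme, words in theme_keywords.items():
        if any(w in lowered for w in words):
            counts[theme] += 1

# Frequently recurring themes point at areas where source code quality
# could be improved.
for theme, n in counts.most_common():
    print(f"{theme}: {n}")
```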
79

Enhancing Fault Localization with Cost Awareness

Nachimuthu Nallasamy, Kanagaraj 24 June 2019 (has links)
Debugging is a challenging and time-consuming process in the software life cycle. The focus of this thesis is to improve the accuracy of existing fault localization (FL) techniques. We experimented with several source-code line-level features, such as line commit size, line recency, and line length, to arrive at a new fault localization technique. Based on our experiments, we propose a novel enhanced cost-aware fault localization (ECFL) technique that combines line length with selected existing baseline fault localization techniques. ECFL improves the accuracy of DStar (Baseline 1), CombineFastestFL (Baseline 2), and CombineFL (Baseline 3) by locating 81%, 58%, and 30% more real faults, respectively, in the Top-1 evaluation metric. In comparison with the baseline techniques, ECFL requires marginal additional time (on average, 5 seconds per bug) and data while providing a significant improvement in accuracy. The source code line features also improve the baseline fault localization techniques when a "learning to rank" SVM machine learning approach is used to combine the features. We also provide an infrastructure to facilitate future research on combining new source code line features with other fault localization techniques. / Master of Science / Software debugging involves locating and fixing faults (or bugs) in software. It is a challenging and time-consuming process in the software life cycle. Fault localization (FL) techniques help software developers locate faults by providing a ranked set of program elements. The focus of this thesis is to improve the accuracy of existing fault localization techniques. We experimented with several source-code line-level features, such as line commit size, line recency, and line length, to arrive at a new fault localization technique. Based on our experiments, we propose a novel enhanced cost-aware fault localization (ECFL) technique that combines line length with selected existing baseline fault localization techniques. ECFL improves the accuracy of DStar (Baseline 1), CombineFastestFL (Baseline 2), and CombineFL (Baseline 3) by locating 81%, 58%, and 30% more real faults, respectively, in the Top-1 evaluation metric. In comparison with the baseline techniques, ECFL requires marginal additional time (on average, 5 seconds per bug) and data while providing a significant improvement in accuracy. The source code line features also improve the baseline fault localization techniques when a machine learning approach is used to combine the features. We also provide an infrastructure to facilitate future research on combining new source code line features with other fault localization techniques.
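A rough sketch of the combination idea: compute a spectrum-based suspiciousness score (DStar is one of the baselines named above) and blend it with a normalized line-length feature to rank lines for inspection. The blending weight and the coverage data are illustrative assumptions, not the ECFL formulation from the thesis.

```python
# Illustrative sketch, not the ECFL formula: combine a DStar suspiciousness
# score with a normalized line-length feature to rank suspicious lines.
def dstar(ef, ep, nf, star=2):
    """DStar suspiciousness: ef**star / (ep + (nf - ef)).
    ef/ep = failing/passing tests covering the line, nf = total failing tests."""
    denom = ep + (nf - ef)
    return float("inf") if denom == 0 else (ef ** star) / denom

# Hypothetical coverage data: line -> (ef, ep), plus each line's length.
coverage = {10: (3, 1), 11: (1, 5), 12: (3, 0)}
line_length = {10: 80, 11: 15, 12: 40}
total_failing = 3

max_len = max(line_length.values())
alpha = 0.8  # assumed weight on the spectrum score
scores = {}
for line, (ef, ep) in coverage.items():
    spectrum = dstar(ef, ep, total_failing)
    length_feature = line_length[line] / max_len   # longer lines ranked higher
    scores[line] = spectrum if spectrum == float("inf") else (
        alpha * spectrum + (1 - alpha) * length_feature)

ranked = sorted(scores, key=scores.get, reverse=True)
print("Inspect lines in this order:", ranked)
```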
80

Interactive Query Formulation for Source Code Analysis and Comprehension (Formulation interactive des requêtes pour l’analyse et la compréhension du code source)

Jridi, Jamel Eddine 11 1900 (has links)
We propose an approach based on interactive query formulation to support program analysis and comprehension tasks. In our approach, an analyst uses a set of basic filters (linguistic, structural, quantitative, and user selection) to define complex queries. These queries are built through an interactive, iterative process in which basic filters are selected and executed, and their results are visualized, modified, and combined using predefined operators. We evaluated our querying approach by implementing recent state-of-the-art contributions on design defect detection and feature location in code. Our results show that, in addition to being generic, our approach helps in implementing existing solutions provided by fully automated tools.
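A small sketch of the filter-composition idea described above: basic filters are plain predicates over code entities, and predefined operators combine them interactively. The entity model, fields, and example query are illustrative assumptions, not the thesis's tooling.

```python
# Illustrative sketch: composable "basic filters" over code entities,
# combined with predefined operators (AND / OR). Entity fields are assumptions.
classes = [
    {"name": "OrderManager",  "methods": 42, "loc": 1800, "comment": "handles orders"},
    {"name": "PriceUtils",    "methods": 6,  "loc": 120,  "comment": "price helpers"},
    {"name": "ReportBuilder", "methods": 35, "loc": 1500, "comment": "builds reports"},
]

# Basic filters: linguistic and quantitative predicates.
linguistic   = lambda kw: (lambda c: kw in c["name"].lower() or kw in c["comment"])
quantitative = lambda field, threshold: (lambda c: c[field] > threshold)

# Predefined operators to combine filters.
AND = lambda f, g: (lambda c: f(c) and g(c))
OR  = lambda f, g: (lambda c: f(c) or g(c))

# Example interactive query: large classes (possible design defects),
# or anything related to orders.
query = AND(quantitative("methods", 20), quantitative("loc", 1000))
query = OR(query, linguistic("order"))

print([c["name"] for c in classes if query(c)])
# ['OrderManager', 'ReportBuilder']
```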
