Spelling suggestions: "subject:"6program repair"" "subject:"6program depair""
Combining data structure repair and program repairMalik, Muhammad Zubair 19 September 2014 (has links)
Bugs in code continue to pose a fundamental problem for software reliability and cause expensive failures. The process of removing known bugs is termed debugging, which is a classic methodology commonly performed before code is deployed. Traditionally, debugging is tedious, often requiring much manual effort. A more recent technique that complements debugging is data structure repair, which handles bugs that make it to deployed systems and lead to erroneous behavior at runtime by modifying erroneous program states to recover from errors. While data structure repair presents a promising basis for dealing with bugs at runtime, it remains computationally expensive. Our thesis is that debugging and data structure repair can be integrated to provide the basis of an effective approach for removing bugs before code is deployed and handling them after it is deployed. We present a bi-directional integration where ideas at the basis of data structure repair assist in automating debugging and vice versa. Our key insight is two-fold: (1)a repair action performed to mutate an erroneous object field value to repair it can be abstracted into a program statement that performs that update correctly; and (2)repair actions that are performed repeatedly to fix the same error can be memoized and re-used. We design, develop, and evaluate two techniques that embody our insight. One, we present an automated debugging technique that leverages a systematic constraint-based data structure repair technique developed in previous work and provides suggestions on how to fix a faulty program. Two, we present repair abstractions that are based on the same central ideas as in our automated debugging technique and memoize how an erroneous state was repaired, which enables prioritizing and re-using repair actions when the same error occurs again. The focus of our work is programs that operate on structurally complex data, e.g., heap-allocated data structures that have complex structural integrity constraints, such as acyclicity. Checking such constraints plays a central role in the techniques that lay at the foundation of our work. These techniques require the user to provide the constraints, which poses a burden on the user. To facilitate the use of constraint-based techniques, we present a third technique to check constraint violations at runtime using graph spectra, which have been studied extensively by mathematicians to capture properties of graphs. We view the heap of an object-oriented program as an edge-labeled graph, which allows us to apply results from graph spectra theory. Experimental results show the effectiveness of using graph spectra as a basis of capturing structural properties of a class of commonly used data structures. / text
Program reliability through algorithmic design and analysisSamanta, Roopsha 10 February 2014 (has links)
Software systems are ubiquitous in today's world and yet, remain vulnerable to the fallibility of human programmers as well as the unpredictability of their operating environments. The overarching goal of this dissertation is to develop algorithms to enable automated and efficient design and analysis of reliable programs. In the first and second parts of this dissertation, we focus on the development of programs that are free from programming errors. The intent is not to eliminate the human programmer, but instead to complement his or her expertise, with sound and efficient computational techniques, when possible. To this end, we make contributions in two specific domains. Program debugging --- the process of fault localization and error elimination from a program found to be incorrect --- typically relies on expert human intuition and experience, and is often a lengthy, expensive part of the program development cycle. In the first part of the dissertation, we target automated debugging of sequential programs. A broad and informal statement of the (automated) program debugging problem is to suitably modify an erroneous program, say P, to obtain a correct program, say P'. This problem is undecidable in general; it is hard to formalize; moreover, it is particularly challenging to assimilate and mechanize the customized, expert programmer intuition involved in the choices made in manual program debugging. Our first contribution in this domain is a methodical formalization of the program debugging problem, that enables automation, while incorporating expert programmer intuition and intent. Our second contribution is a solution framework that can debug infinite-state, imperative, sequential programs written in higher-level programming languages such as C. Boolean programs, which are smaller, finite-state abstractions of infinite-state or large, finite-state programs, have been found to be tractable for program verification. In this dissertation, we utilize Boolean programs for program debugging. Our solution framework involves two main steps: (a) automated debugging of a Boolean program, corresponding to an erroneous program P, and (b) translation of the corrected Boolean program into a correct program P'. Shared-memory concurrent programs are notoriously difficult to write, verify and debug; this makes them excellent targets for automated program completion, in particular, for synthesis of synchronization code. Extant work in this domain has focused on either propositional temporal logic specifications with simplistic models of concurrent programs, or more refined program models with the specifications limited to just safety properties. Moreover, there has been limited effort in developing adaptable and fully-automatic synthesis frameworks that are capable of generating synchronization at different levels of abstraction and granularity. In the second part of this dissertation, we present a framework for synthesis of synchronization for shared-memory concurrent programs with respect to temporal logic specifications. In particular, given a concurrent program composed of synchronization-free processes, and a temporal logic specification describing their expected concurrent behaviour, we generate synchronized processes such that the resulting concurrent program satisfies the specification. We provide the ability to synthesize readily-implementable synchronization code based on lower-level primitives such as locks and condition variables. We enable synchronization synthesis of finite-state concurrent programs composed of processes that may have local and shared variables, may be straight-line or branching programs, may be ongoing or terminating, and may have program-initialized or user-initialized variables. We also facilitate expression of safety and liveness properties over both control and data variables by proposing an extension of propositional computation tree logic. Most program analyses, verification, debugging and synthesis methodologies target traditional correctness properties such as safety and liveness. These techniques typically do not provide a quantitative measure of the sensitivity of a computational system's behaviour to unpredictability in the operating environment. We propose that the core property of interest in reasoning in the presence of such uncertainty is robustness --- small perturbations to the operating environment do not change the system's observable behavior substantially. In well-established areas such as control theory, robustness has always been a fundamental concern; however, the techniques and results therein are not directly applicable to computational systems with large amounts of discretized, discontinuous behavior. Hence, robustness analysis of software programs used in heterogeneous settings necessitates development of new theoretical frameworks and algorithms. In the third part of this dissertation, we target robustness analysis of two important classes of discrete systems --- string transducers and networked systems of Mealy machines. For each system, we formally define robustness of the system with respect to a specific source of uncertainty. In particular, we analyze the behaviour of transducers in the presence of input perturbations, and the behaviour of networked systems in the presence of channel perturbations. Our overall approach is automata-theoretic, and necessitates the use of specialized distance-tracking automata for tracking various distance metrics between two strings. We present constructions for such automata and use them to develop decision procedures based on reducing the problem of robustness verification of our systems to the problem of checking the emptiness of certain automata. Thus, the system under consideration is robust if and only if the languages of particular automata are empty. / text
Quantitative Verification and Synthesis / Vérification et synthèse quantitativeVon Essen, Christian 28 April 2014 (has links)
Cette thèse contribue à l'étude théorique et a l'application de la vérification et de la synthèse quantitative. Nous étudions les stratégies qui optimisent la fraction de deux récompenses des MDPs. L'objectif est la synthèse de régulateurs efficaces dans des environnements probabilistes. Premièrement nous montrons que les stratégies déterministes et sans mémoire sont suffisants. Sur la base de ces résultats, nous proposons trois algorithmes pour traiter des modèles explicitement encodées. Notre évaluation de ces algorithmes montre que l'un de ces derniers est plus rapide que les autres. Par la suite nous proposons et mettons en place une variante symbolique basé sur les diagrammes de décision binaire.Deuxièmement, nous étudions le problème de réparation des programmes d'un point de vue quantitatif. Cela conduit à une reformulation de la réparation d'un log: que seules les exécutions fautives du programme soient modifiées. Nous étudions les limites de cette approche et montrons comment nous pouvons assouplir cette nouvelle exigence. Nous concevons et mettons en œuvre un algorithme pour trouver automatiquement des réparations, et montrons qu'il améliore les modifications apportées aux programmes. Troisièmement, nous étudions une nouvelle approche au framework pour la vérification et synthèse quantitative. La vérification et la synthèse fonctionnent en tandem pour analyser la qualité d'un contrôleur en ce qui concerne, par exemple , de robustesse contre des erreurs de modélisation. Nous considérons également la possibilité d'approximer la courbure de Pareto, qui appataît de la combinaison du modèle avec de multiples récompenses. Cela nous permet à la fois d'étudier les compromis inhérents au système et de choisir une configuration adéquate. Nous appliquons notre framework aux plusieurs études de cas. La majorité de l'étude de cas est concernée par un système anti-collision embarqué (ACAS X). Nous utilisons notre framework pour aider à analyser l'espace de conception du système et de valider le contrôleur en cours d'investigation par la FAA. En particulier, nous contribuons l'analyse par PCTL et stochastic model checking. / This thesis contributes to the theoretical study and application of quantitative verification and synthesis. We first study strategies that optimize the ratio of two rewards in MDPs. The goal is the synthesis of efficient controllers in probabilistic environments. We prove that deterministic and memoryless strategies are sufficient. Based on these results we suggest 3 algorithms to treat explicitly encoded models. Our evaluation of these algorithms shows that one of these is clearly faster than the others. To extend its scope, we propose and implement a symbolic variant based on binary decision diagrams, and show that it cope with millions of states. Second, we study the problem of program repair from a quantitative perspective. This leads to a reformulation of program repair with the requirement that only faulty runs of the program be changed. We study the limitations of this approach and show how we can relax the new requirement. We devise and implement an algorithm to automatically find repairs, and show that it improves the changes made to programs.Third, we study a novel approach to a quantitative verification and synthesis framework. In this, verification and synthesis work in tandem to analyze the quality of a controller with respect to, e.g., robustness against modeling errors. We also include the possibility to approximate the Pareto curve that emerges from combining the model with multiple rewards. This allows us to both study the trade-offs inherent in the system and choose a configuration to our liking. We apply our framework to several case studies. The major case study is concerned with the currently proposed next generation airborne collision avoidance system (ACAS X). We use our framework to help analyze the design space of the system and to validate the controller as currently under investigation by the FAA. In particular, we contribute analysis via PCTL and stochastic model checking to add to the confidence in the controller.
Autonomic test case generation of failing code using AOPMurguia, Giovanni 02 September 2020 (has links)
As software systems have grown in size and complexity, the costs of maintaining such systems increases steadily. In the early 2000's, IBM launched the autonomic computing initiative to mitigate this problem by injecting feedback control mechanisms into software systems to enable them to observe their health and self-heal without human intervention and thereby cope with certain changes in their requirements and environments. Self-healing is one of several fundamental challenges addressed and includes software systems that are able to recover from failure conditions. There has been considerable research on software architectures with feedback loops that allow a multi-component system to adjust certain parameters automatically in response to changes in its environment. However, modifying the components' source code in response to failures remains an open and formidable challenge. Automatic program repair techniques aim to create and apply source code patches autonomously. These techniques have evolved over the years to take advantage of advancements in programming languages, such as reflection. However, these techniques require mechanisms to evaluate if a candidate patch solves the failure condition. Some rely on test cases that capture the context under which the program failed---the patch applied can then be considered as a successful patch if the test result changes from failing to passing. Although test cases are an effective mechanism to govern the applicability of potential patches, the automatic generation of test cases for a given scenario has not received much attention. ReCrash represents the only known implementation to generate test cases automatically with promising results through the use of low-level instrumentation libraries. The work reported in this thesis aims to explore this area further and under a different light. It proposes the use of Aspect-Oriented Programming (AOP)---and in particular of AspectJ---as a higher-level paradigm to express the code elements on which monitoring actions can be interleaved with the source code, to create a representation of the context at the most relevant moments of the execution, so that if the code fails, the contextual representation is retained and used at a later time to automatically write a test case. By doing this, the author intends to contribute to fill the gap that prevents the use of automatic program repair techniques in a self-healing architecture. The prototype implementation engineered as part of this research was evaluated along three dimensions: memory usage, execution time and binary size. The evaluation results suggest that (1) AspectJ introduces significant overhead with respect to execution time, (2) the implementation algorithm causes a tremendous strain on garbage collection, and (3) AspectJ incorporates tens of additional lines of code, which account for a mean size increase to every binary file of a factor of ten compared to the original size. The comparative analysis with ReCrash shows that the algorithm and data structures developed in this thesis produce more thorough test cases than ReCrash. Most notably, the solution presented here mitigates ReCrash's current inability to reproduce environment-specific failure conditions derived from on-demand instantiation. This work can potentially be extended to apply in less-intrusive frameworks that operate at the same level as AOP to address the shortcomings identified in this analysis. / Graduate
Runtime Verification and Debugging of Concurrent SoftwareZhang, Lu 29 July 2016 (has links)
Our reliance on software has been growing fast over the past decades as the pervasive use of computer and software penetrated not only our daily life but also many critical applications. As the computational power of multi-core processors and other parallel hardware keeps increasing, concurrent software that exploit these parallel computing hardware become crucial for achieving high performance. However, developing correct and efficient concurrent software is a difficult task for programmers due to the inherent nondeterminism in their executions. As a result, concurrency related software bugs are among the most troublesome in practice and have caused severe problems in recent years. In this dissertation, I propose a series of new and fully automated methods for verifying and debugging concurrent software. They cover the detection, prevention, classification, and repair of some important types of bugs in the implementation of concurrent data structures and client-side web applications. These methods can be adopted at various stages of the software development life cycle, to help programmers write concurrent software correctly as well as efficiently. / Ph. D.
Sequence to Sequence Machine Learning for Automatic Program RepairSvensson, Niclas, Vrabac, Damir January 2019 (has links)
Most of previous program repair approaches are only able to generate fixes for one-line bugs, including machine learning based approaches. This work aims to reveal whether such a system with the state of the art technique is able to make useful predictions while being fed by whole source files. To verify whether multi-line bugs can be fixed using a state of the art solution a system has been created, using already existing Neural Machine Translation tools and data gathered from GitHub. The result of the finished system shows however, that the method used in this thesis is not sufficient to get satisfying results. No bug has successfully been corrected by the system. Although the results are poor there are still unexplored approaches to the project that possibly could improve the performance of the system. One way being narrowing down the input data to method level of source code instead of file level.
Java Syntax Error Repair Using RoBERTaXiang, Ziyi January 2022 (has links)
Deep learning has achieved promising results for automatic program repair (APR).In this paper, we revisit this topic and propose an end-to-end approach Classfix tocorrect java syntax errors. Classfix uses the RoBERTa classification model to localizethe error, and uses the RoBERTa encoder-decoder model to repair the located buggyline. Our work introduces a new localization method that enables us to fix a programwith an arbitrary length. Our approach categorises errors into symbol errors and worderrors. We conduct a large scale experiment to evaluate Classfix and the result showsClassfix is able to repair 75.5% symbol errors and 64.3% word errors. In addition,Classfix achieves 97% and 84.7% accuracy in locating symbol errors and word errors,respectively. / Deep learning har uppnått lovande resultat för automatisk programreparation (APR).I den här uppsatsen återkommer vi till det här ämnet och använder Classfix för attkorrigera javasyntaxfel. Classfix använder en RoBERTa-classification model för attlokalisera felet och en RoBERTa-encoder-decoder model för att reparera buggar.Vårt arbete introducerar en ny lokaliseringsmetod som gör att vi kan fixa programav godtycklig längd. Studien kategoriserar fel i symbolfel och ordfel. Vi genomförstorskaliga experiment för att utvärdera Classfix. Resultatet visar att Classfix kan fixa75.5% av symbolfel och 64.3% av ordfel. Dessutom uppnår Classfix 97% och 84,7% noggrannhet när det gäller att lokalisera symbolfel respektive ordfel.
Addressing the Rare Word Problem in Source Code ModellingIvstam, Linn January 2020 (has links)
The field of automatic program repair has adapteddeep learning techniques. Sequence to sequence neural networkshave successfully been applied in neural machine translation(NMT). This can also be applied to automatic program repair,attempting to translate buggy source code into fixed sourcecode. However, the frequent occurrence of user-defined variablesmakes the rare word problem a significant issue. Techniquesused in NMT to handle the rare word problem specifically bytepairing encoding (BPE) and the copy mechanism were appliedand evaluated on source code. The results showed that whenobserving the exact sequence match of the predicted output andtarget output, techniques were not an improvement. However,when observing correct syntax techniques outperformed theoriginal model without any techniques applied. To be able tosee an improvement in exact sequence match there should be agreater variety of sequence length and vocabulary size also, moreextensive hyperparameter tuning should be performed. / Inom området för automatisk mjukvarureparation har djupinlärningstekniker implementerats. Neurala nätverk av typen sekvens till sekvens har blivit framgångsrikt applicerade inom neural maskinöversättning av mänskliga språk. Dessa neurala nätverk kan också appliceras inom automatisk mjukvarureparation genom att översätta källkod innehållande buggar till en lagad kod utan buggar. Den frekventa användningen av användardefinierade variabler gör att ”the rare word problem” är en signifikant svaghet. Tekniker som används i neural maskinöversättning, ”byte pairing encoding” (BPE) och ”the copy mechanism” har applicerats och utvärderats på källkod. Resultaten visar att då modellens förutsagda utdata jämförs med det förväntat utdata visar teknikerna ingen förbättring. Dock hanterar nätverk med tekniker applicerade syntax för programmeringsspråket c avsevärt bättre. / Kandidatexjobb i elektroteknik 2020, KTH, Stockholm
Uma proposta de representação e operadores genéticos para algoritmos evolucionários aplicados no reparo automatizado de software / A proposed representation and genetic operators for evolutionary algorithms applied in automated software repairOliveira, Vinícius Paulo Lopes de 14 August 2017 (has links)
Submitted by JÚLIO HEBER SILVA (email@example.com) on 2017-09-13T17:19:44Z No. of bitstreams: 2 Dissertação - Vinícius Paulo Lopes de Oliveira - 2017.pdf: 2066886 bytes, checksum: c610d8e21e23795d1cea6eeca17b5e5e (MD5) license_rdf: 0 bytes, checksum: d41d8cd98f00b204e9800998ecf8427e (MD5) / Approved for entry into archive by Luciana Ferreira (firstname.lastname@example.org) on 2017-09-19T13:58:48Z (GMT) No. of bitstreams: 2 Dissertação - Vinícius Paulo Lopes de Oliveira - 2017.pdf: 2066886 bytes, checksum: c610d8e21e23795d1cea6eeca17b5e5e (MD5) license_rdf: 0 bytes, checksum: d41d8cd98f00b204e9800998ecf8427e (MD5) / Made available in DSpace on 2017-09-19T13:58:48Z (GMT). No. of bitstreams: 2 Dissertação - Vinícius Paulo Lopes de Oliveira - 2017.pdf: 2066886 bytes, checksum: c610d8e21e23795d1cea6eeca17b5e5e (MD5) license_rdf: 0 bytes, checksum: d41d8cd98f00b204e9800998ecf8427e (MD5) Previous issue date: 2017-08-14 / Coordenação de Aperfeiçoamento de Pessoal de Nível Superior - CAPES / Maintenance and software repair are responsible for most of the cost of a software in the course of its life. Software repair through genetic evolution may repair errors and improve software, reducing its high cost. GenProg is a technique that uses this approach and through patches evolution it is capable to fix errors in large and small softwares. A patch composed by low-granularity operations compromise the manipulation of these operations. These operations consist of three subspaces: operation, location of application of the operation and what the operation will apply at the location of the fault (operator, fault and fix, respectively). The recombination and mutation operators applied to a low granulation representation limits the ability of the technique to navigate in search space efficiently. It is proposed the reformulation of the representation, in order to allow greater search capability. Theoretical analysis of the representation showed that the new representation has a greater locality than the original one. Through experimentation, validation and genotypic analysis it is shown that the proposed changes have led to a better performance with respect to the original operators and parameters in terms of efficiency, in the first experiments the operator UnifSingle with memorization was 48.88% more effective than the Original operator and then the operator OPSingle_V2 was 26% more effective than the operator UnifSingle with memorization. Some characteristics of these cross-operators were observed through a genotype distance analysis and their influence on the automatic software reapair problem. The proposed mutation operator shown superior results if compared to original. Combination between operator UniSingle with memorization showed the best efficacy among all combinations of operators and parameters (28.29% superior to the best result of the original GenProg). / Manutenção e reparo de software é responsável pela maior parte do custo de um software no decorrer de sua vida. O reparo de software por meio de evolução genética pode reparar erros e/ou melhorar softwares, diminuindo seu alto custo. GenProg é uma técnica em desenvolvimento que utiliza esta abordagem e por meio de evolução de patches é capaz de reparar erros em grandes e pequenos softwares. Um patch é composto por operações de edições de baixa granularidade o que compromete a separação e edição dessas operações. Essas operações são formadas por três subespaços: operação, local da aplicação da operação e o que a operação irá aplicar no local da falha (operator, fault, fix, respectivamente). Os operadores de recombinação e mutação aplicados às representações de baixa granularidade limita a habilidade da técnica de navegar no espaço de busca de forma eficiente. É proposto neste estudo, a reformulação da representação, do operador de cruzamento e mutação a fim de permitir uma maior capacidade de busca. Análises teóricas da representação demonstraram que a nova representação possui localidade maior que a original. Por meio de experimentações, validações e análises genotípicas é mostrado que as mudanças propostas levaram a uma melhoria em relação aos operadores e parâmetros originais em termos de eficácia, sendo que nos experimentos iniciais o operador UnifSingle com memorização apresentou eficácia 45,88% superior ao melhor caso do operador Original e em seguida o operador posteriormente proposto OPSingle_V2 apresentou eficácia 26% superior ao UnifSingle com memorização. Foram observadas algumas características desses operadores de cruzamento por meio de uma análise por distância genotípica e suas influências no problema de reparo automatizado de software. O operador de mutação proposto apresentou resultados superiores ao operador de mutação original e combinado com operador UnifSingle com memorização, apresentou a melhor eficácia entre todas as combinações de operadores e parâmetros.
Exploring the Usage of Neural Networks for Repairing Static Analysis Warnings / Utforsking av användningen av neurala nätverk för att reparera varningar för statisk analysLohse, Vincent Paul January 2021 (has links)
C# provides static analysis libraries for template-based code analysis and code fixing. These libraries have been used by the open-source community to generate numerous NuGet packages for different use-cases. However, due to the unstructured vastness of these packages, it is difficult to find the ones required for a project and creating new analyzers and fixers take time and effort to create. Therefore, this thesis proposes a neural network, which firstly imitates existing fixers and secondly extrapolates to fixes of unseen diagnostics. To do so, the state-of-the-art of static analysis NuGet packages is examined and further used to generate a dataset with diagnostics and corresponding code fixes for 24,622 data points. Since many C# fixers apply formatting changes, all formatting is preserved in the dataset. Furthermore, since the fixers also apply identifier changes, the tokenization of the dataset is varied between splitting identifiers by camelcase and preserving them. The neural network uses a sequence-to-sequence learning approach with the Transformer model and takes file context, diagnostic message and location as input and predicts a diff as output. It is capable of imitating 46.3% of the fixes, normalized by diagnostic type, and for data points with unseen diagnostics, it is able to extrapolate to 11.9% of normalized data points. For both experiments, splitting identifiers by camelcase produces the best results. Lastly, it is found that a higher proportion of formatting tokens in input has minimal positive impact on prediction success rates, whereas the proportion of formatting in output has no impact on success rates. / C# tillhandahåller statiska analysbibliotek för mallbaserad kodanalys och kodfixering. Dessa bibliotek har använts av open source-gemenskapen för att generera många NuGet-paket för olika användningsfall. Men på grund av mängden av dessa paket är det svårt att hitta de som krävs för ett projekt och att skapa nya analysatorer och fixare tar tid och ansträngning att skapa. Därför föreslår denna avhandling ett neuralt nätverk, som för det första imiterar befintliga korrigeringar och för det andra extrapolerar till korrigeringar av osynlig diagnostik. För att göra det har det senaste inom statisk analys NuGetpaketen undersökts och vidare använts för att generera en datauppsättning med diagnostik och motsvarande kodfixar för 24 622 datapunkter. Eftersom många C# fixers tillämpar formateringsändringar, bevaras all formatering i datasetet. Dessutom, eftersom fixarna också tillämpar identifieringsändringar, varieras tokeniseringen av datamängden mellan att dela upp identifierare efter camelcase och att bevara dem. Det neurala nätverket använder en sekvenstill- sekvens-inlärningsmetod med Transformer-modellen och tar filkontext, diagnostiskt meddelande och plats som indata och förutsäger en skillnad som utdata. Den kan imitera 46,3% av korrigeringarna, normaliserade efter diagnostisk typ, och för datapunkter med osynlig diagnostik kan den extrapolera till 11,9% av normaliserade datapunkter. För båda experimenten ger uppdelning av identifierare efter camelcase de bästa resultaten. Slutligen har det visat sig att en högre andel formateringstokens i indata har minimal positiv inverkan på åndelen korrekta förutsägelser, medan andelen formatering i utdata inte har någon inverkan på åndelen korrekta förutsägelser.
Page generated in 0.0964 seconds