1 |
Automated support for reproducing and debugging field failures Jin, Wei 21 September 2015 (has links)
As confirmed by a recent survey conducted among developers of the Apache, Eclipse, and Mozilla projects, two extremely challenging tasks during maintenance are reproducing and debugging field failures--failures that occur on user machines after release. In my PhD study, I have developed several techniques to address and mitigate the problems of reproducing and debugging field failures. In this defense, I will present an overview of my work and describe in detail four different techniques: BugRedux, F3, Clause Weighting (CW), and On-demand Formula Computation (OFC). BugRedux is a general technique for reproducing field failures that collects dynamic data about failing executions in the field and uses this data to synthesize executions that mimic the observed field failures. F3 leverages the executions generated by BugRedux to perform automated debugging using a set of suitably optimized fault-localization techniques. OFC and CW improve the overall effectiveness and efficiency of state-of-the-art formula-based debugging. In addition to these techniques, I will also present an empirical evaluation of the techniques on a set of real-world programs and field failures. The results of the evaluation are promising in that, for all the failures considered, my approach was able to (1) synthesize failing executions that mimicked the observed field failures, (2) synthesize passing executions similar to the failing ones, and (3) use the synthesized executions successfully to perform fault localization with accurate results.
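As a rough, hypothetical illustration of the kind of lightweight dynamic data such an approach might collect in the field (BugRedux's actual instrumentation and data kinds are not reproduced here), the sketch below records a bounded call sequence and dumps it when a failure is observed; all class and method names are invented.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Illustrative sketch only: the kind of lightweight execution data (here, a
// bounded call sequence) a field-failure reproduction tool might record.
// Class and method names are hypothetical, not BugRedux's actual API.
public class CallSequenceRecorder {
    private static final int MAX_EVENTS = 1024;          // keep the field-collected trace small
    private static final Deque<String> events = new ArrayDeque<>();

    public static void enter(String methodSignature) {   // called on method entry, e.g., via instrumentation
        if (events.size() == MAX_EVENTS) {
            events.removeFirst();                         // drop the oldest event to bound overhead
        }
        events.addLast(methodSignature);
    }

    public static void dumpOnFailure() {                  // invoked from a crash handler
        System.err.println("Recorded call sequence (most recent last):");
        events.forEach(System.err::println);
    }

    public static void main(String[] args) {              // tiny usage example
        enter("App.start()");
        enter("Parser.parse(String)");
        dumpOnFailure();
    }
}
```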
|
2 |
Unifying regression testing with mutation testing Zhang, Lingming 07 July 2014 (has links)
Software testing is the most commonly used methodology for validating the quality of software systems. Conceptually, testing is simple, but in practice, given the huge (practically infinite) space of inputs to test against, it requires solving a number of challenging problems, including evaluating and reusing tests efficiently and effectively as software evolves. While software testing research has seen much progress in recent years, many crucial bugs still evade state-of-the-art approaches, cause significant monetary losses, and are sometimes responsible for loss of life. My thesis is that a unified, bi-dimensional, change-driven methodology can form the basis of novel techniques and tools that can make testing significantly more effective and efficient, and allow us to find more bugs at a reduced cost. We propose a novel unification of the following two dimensions of change: (1) real manual changes made by programmers, e.g., as commonly used to support more effective and efficient regression testing techniques; and (2) mechanically introduced changes to code or specifications, e.g., as originally conceived in mutation testing for evaluating the quality of test suites. We believe such unification can lay the foundation of a scalable and highly effective methodology for testing and maintaining real software systems. The primary contribution of my thesis is two-fold. One, it introduces new techniques to address central problems in both regression testing (e.g., test prioritization) and mutation testing (e.g., selective mutation testing). Two, it introduces a new methodology that uses the foundations of regression testing to speed up mutation testing, and also uses the foundations of mutation testing to help with the fault localization problem raised in regression testing. The central ideas are embodied in a suite of prototype tools. Rigorous experimental evaluation is used to validate the efficacy of the proposed techniques on a variety of real-world Java programs.
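To make the second dimension of change concrete, the sketch below shows a single mechanically introduced mutant (a relational-operator change) and a boundary input that kills it; the example is invented and is not taken from the thesis.

```java
// Illustrative sketch of a "mechanically introduced change":
// a single relational-operator mutant and a test input that kills it.
public class MutationExample {
    static boolean isAdult(int age)       { return age >= 18; }  // original code
    static boolean isAdultMutant(int age) { return age >  18; }  // mutant: >= replaced by >

    public static void main(String[] args) {
        int input = 18;                                           // boundary input
        boolean killed = isAdult(input) != isAdultMutant(input);
        System.out.println("Mutant killed by age=18: " + killed); // true: a suite with this input detects the mutant
    }
}
```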
|
3 |
On The Impact Of Distinct Metrics For Fault Localization In Automated Program Repair Mazur, Marek Marcin January 2023 (has links)
Automatic Program Repair (APR) is a rapidly growing field of computer science that aims to reduce the time and cost of debugging code and to improve its efficiency. Fault localization (FL) is a critical component of the APR workflow and has a real impact on the success of an APR procedure. The fault localization step produces a list of potentially faulty code elements by calculating their level of suspiciousness, i.e., the likelihood of being faulty, and a great variety of metrics can be used in this calculation. In this thesis, we examine, through a controlled experiment, the effectiveness of ASTOR, a Java APR framework, with chosen FL metrics for calculating suspiciousness. ASTOR is tested against the Defects4J dataset, a benchmark for APR evaluation containing bugs from open-source projects. The most significant difference between ASTOR executions was observed for the bug Math 82, where the gap between the fastest and slowest execution was 553.23 s (the slowest execution took 571.31 s, i.e., +3060% relative to the fastest execution of 18.08 s). The experiment also showed that the mean execution time for the cases in which a plausible patch was found could differ from metric to metric.
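For context, two classic spectrum-based suspiciousness formulas are shown below (Tarantula and Ochiai) as representative examples of the kind of FL metric such an experiment can plug in; they are not necessarily the specific metrics compared in the thesis.

```latex
% e_f, e_p: failing/passing tests that execute the element;
% n_f, n_p: failing/passing tests that do not execute it.
\mathrm{Tarantula}(s) = \frac{\dfrac{e_f}{e_f+n_f}}{\dfrac{e_f}{e_f+n_f} + \dfrac{e_p}{e_p+n_p}},
\qquad
\mathrm{Ochiai}(s) = \frac{e_f}{\sqrt{(e_f+n_f)\,(e_f+e_p)}}
```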
|
4 |
Enhancing Fault Localization with Cost Awareness Nachimuthu Nallasamy, Kanagaraj 24 June 2019 (has links)
Debugging is a challenging and time-consuming process in the software life-cycle. The focus of the thesis is to improve the accuracy of existing fault localization (FL) techniques. We experimented with several source code line-level features, such as line commit size, line recency, and line length, to arrive at a new fault localization technique. Based on our experiments, we propose a novel enhanced cost-aware fault localization (ECFL) technique by combining line length with selected existing baseline fault localization techniques. ECFL improves the accuracy of DStar (Baseline 1), CombineFastestFL (Baseline 2), and CombineFL (Baseline 3) by locating 81%, 58%, and 30% more real faults, respectively, in the Top-1 evaluation metric. In comparison with the baseline techniques, ECFL requires marginal additional time (on average, 5 seconds per bug) and data while providing a significant improvement in accuracy. The source code line features also improve the baseline fault localization techniques when a "learning to rank" SVM machine learning approach is used to combine the features. We also provide an infrastructure to facilitate future research on combining new source code line features with other fault localization techniques. / Master of Science / Software debugging involves locating and fixing faults (or bugs) in software. It is a challenging and time-consuming process in the software life-cycle. Fault localization (FL) techniques help software developers locate faults by providing a ranked set of program elements. The focus of the thesis is to improve the accuracy of existing fault localization techniques. We experimented with several source code line-level features, such as line commit size, line recency, and line length, to arrive at a new fault localization technique. Based on our experiments, we propose a novel enhanced cost-aware fault localization (ECFL) technique by combining line length with selected existing baseline fault localization techniques. ECFL improves the accuracy of DStar (Baseline 1), CombineFastestFL (Baseline 2), and CombineFL (Baseline 3) by locating 81%, 58%, and 30% more real faults, respectively, in the Top-1 evaluation metric. In comparison with the baseline techniques, ECFL requires marginal additional time (on average, 5 seconds per bug) and data while providing a significant improvement in accuracy. The source code line features also improve the baseline fault localization techniques when a machine learning approach is used to combine the features. We also provide an infrastructure to facilitate future research on combining new source code line features with other fault localization techniques.
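The abstract does not spell out how the combination is performed, so the weighted sum below is purely a hypothetical illustration of blending a baseline suspiciousness score (e.g., from DStar) with a normalized line-length feature; the linear form and the weights are assumptions, not ECFL's actual model.

```java
// Hypothetical illustration of blending a baseline FL score with a line-length
// feature. The weights and the linear combination are invented for illustration
// and are not taken from the ECFL technique described above.
public class CostAwareScore {
    static double combined(double baselineSuspiciousness, int lineLength, int maxLineLength) {
        double lengthFeature = (double) lineLength / maxLineLength;  // normalize to [0, 1]
        double wBaseline = 0.8, wLength = 0.2;                       // invented weights
        return wBaseline * baselineSuspiciousness + wLength * lengthFeature;
    }

    public static void main(String[] args) {
        // A long line vs. a short line, both with the same baseline score of 0.6.
        System.out.println(combined(0.6, 120, 160));
        System.out.println(combined(0.6, 15, 160));
    }
}
```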
|
5 |
Fault Diagnosis in Distributed Simulation Systems over Wide Area Networks using Active Probing / Feldiagnostik i Distribuerade Simulationssystem över Wide Area Networks med Active Probing Andersson, Filip January 2016 (has links)
The domain of distributed simulation is growing rapidly. This growth leads to larger and more complex supporting network architectures with high requirements on availability and reliability. For this purpose, efficient fault monitoring is required. This work is an attempt to evaluate the viability of an active probing approach for a distributed simulation system in a wide area network setting. In addition, some effort was directed towards building the probing software with future extensions in mind. The active probing approach was implemented and tested against certain performance requirements in a simulated environment. It was concluded that the approach is viable for detecting the health of the network components. However, additional research is required to draw a conclusion about its viability in more complicated scenarios that depend on more than the responsiveness of the nodes. The extensibility of the implemented software was evaluated with the QMOOD metric, and the software was not deemed particularly extensible.
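As a minimal, generic sketch of the active-probing idea (sending probes to known nodes and flagging the ones that do not respond), the snippet below checks reachability of a few placeholder hosts. It does not reproduce the thesis's probe-selection or diagnosis logic, and the host names and timeout are invented.

```java
import java.net.InetAddress;
import java.util.List;

// Minimal active-probing health check: probe each known node and flag
// unresponsive ones. Host names are placeholders for illustration only.
public class ActiveProber {
    public static void main(String[] args) {
        List<String> nodes = List.of("sim-node-1.example.org", "sim-node-2.example.org");
        int timeoutMs = 2000;
        for (String node : nodes) {
            boolean healthy;
            try {
                healthy = InetAddress.getByName(node).isReachable(timeoutMs); // active probe
            } catch (Exception e) {
                healthy = false;                                              // unresolved or unreachable
            }
            System.out.println(node + " -> " + (healthy ? "OK" : "SUSPECTED FAULTY"));
        }
    }
}
```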
|
6 |
A Debugging Supported Automated Assessment System for Novice Programming Fong, Chao-Chun 29 August 2010 (has links)
Novice programmers find it difficult to debug on their own because they lack prior knowledge. If we want to help them, we first need to be able to check the correctness of a novice's program, and whenever an error is found, we can provide suggestions to assist them in debugging.
We use a concolic testing algorithm to automatically generate test inputs. Test input generation in concolic testing is directed by negating path conditions and is carried out by solving path constraints. By using concolic testing, we are able to explore as many branches as possible.
Once an error is found, we try to locate it for novice programmers. We propose a new method called concolic debugging, whose idea comes from concolic testing. The concolic debugging algorithm starts from a given failed test and tries to locate the faulty block by negating and backtracking the path conditions of the failed test.
We use concolic testing to improve the assessment style of the automated assessment system. 86.67% of our sample programs are successfully assessed by the concolic testing algorithm on our new automated assessment system. We also found that our concolic debugging is more stable and accurate at fault localization than spectrum-based fault localization.
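To make the negate-and-backtrack idea above concrete, here is a toy, self-contained sketch of concolic-style input generation: the path condition of a concrete run is recorded as branch predicates, the last predicate is negated, and a new input satisfying the modified path is searched for. A real concolic engine uses symbolic execution and a constraint solver rather than the brute-force search used here; the subject program is invented.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.IntPredicate;

// Toy concolic-style sketch: record the branch predicates of a concrete run,
// negate the last one, and find an input that follows the new path.
public class ConcolicSketch {
    // Invented program under test: branches on an integer input x.
    static int subject(int x, List<IntPredicate> pathCondition) {
        IntPredicate b1 = v -> v > 10;
        pathCondition.add(b1.test(x) ? b1 : b1.negate());   // record the branch actually taken
        if (b1.test(x)) {
            IntPredicate b2 = v -> v % 2 == 0;
            pathCondition.add(b2.test(x) ? b2 : b2.negate());
            return b2.test(x) ? 1 : 2;
        }
        return 3;
    }

    public static void main(String[] args) {
        List<IntPredicate> pc = new ArrayList<>();
        subject(4, pc);                                      // concrete seed input
        // Negate the last recorded branch condition to target an unexplored path.
        List<IntPredicate> target = new ArrayList<>(pc.subList(0, pc.size() - 1));
        target.add(pc.get(pc.size() - 1).negate());
        for (int candidate = -100; candidate <= 100; candidate++) {  // stand-in for a constraint solver
            final int c = candidate;
            if (target.stream().allMatch(p -> p.test(c))) {
                System.out.println("New test input exercising a different branch: " + c);
                break;
            }
        }
    }
}
```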
|
7 |
A mixed approach to spectrum-based fault localization using information theoretic foundations Roychowdhury, Shounak 18 February 2014 (has links)
Fault localization, i.e., locating faults in code, such as faulty statements or expressions that are responsible for observed failures, is traditionally a manual, laborious, and tedious task. Recent years have seen much progress in automated techniques for fault localization. A particularly promising approach is to utilize program execution spectra to analyze passing and failing runs and compute how likely each statement is to be faulty. Techniques based on this approach have so far largely focused on either statistical analysis or similarity-based measures, which have a natural application in evaluating such runs. However, in spite of some initial success, current techniques lack the effectiveness to localize faults with a high degree of confidence in real applications.
Our thesis is that information theoretic feature selection can provide a basis for novel techniques that mix coverage of different program elements to improve the effectiveness of fault localization using program spectra. Our basic insight is that each additional failing or passing run can increase the information diversity with respect to the program elements, which can help localize faults in code. For example, the statements with maximum feature diversity information can point to the most suspicious lines of code. This dissertation presents a new fault localization approach that embodies our insight and introduces Bernoulli divergence for feature selection, using it as the foundation for two novel techniques: (1) mixing of branch and statement coverage information; and (2) varying of feature granularity from function-level to statement-level. An experimental evaluation using a suite of subject programs commonly used in the evaluation of fault localization techniques shows that our approach provides an effective basis for fault localization.
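As a generic illustration of information-theoretic feature selection over program spectra (not the Bernoulli divergence measure developed in this dissertation), the sketch below treats a statement's coverage across runs as a feature and scores it by its mutual information with the pass/fail outcome; the coverage data is invented.

```java
// Generic information-theoretic feature selection for spectra: score a statement
// by the mutual information between its coverage and the run outcome.
public class CoverageMutualInformation {
    static double log2(double x) { return Math.log(x) / Math.log(2); }

    // covered[i]: statement executed in run i; failed[i]: run i's outcome.
    static double mutualInformation(boolean[] covered, boolean[] failed) {
        int n = covered.length;
        double[][] joint = new double[2][2];
        for (int i = 0; i < n; i++) joint[covered[i] ? 1 : 0][failed[i] ? 1 : 0] += 1.0 / n;
        double mi = 0;
        for (int c = 0; c < 2; c++) {
            for (int f = 0; f < 2; f++) {
                double pc = joint[c][0] + joint[c][1];  // marginal P(coverage = c)
                double pf = joint[0][f] + joint[1][f];  // marginal P(outcome = f)
                if (joint[c][f] > 0) mi += joint[c][f] * log2(joint[c][f] / (pc * pf));
            }
        }
        return mi;
    }

    public static void main(String[] args) {
        boolean[] coveredByRun = {true, true, true, false, false};  // invented coverage per run
        boolean[] runFailed    = {true, true, false, false, false}; // invented pass/fail labels
        System.out.printf("Mutual information: %.3f bits%n", mutualInformation(coveredByRun, runFailed));
    }
}
```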
|
8 |
Fault detection and precedent-free localization in thermal-fluid systems Carpenter, Katherine Patricia 16 February 2011 (has links)
This thesis presents a method for fault detection and precedent-free isolation for two types of channel flow systems, which were modeled with the finite element method. Unlike previous fault detection methods, this method requires no a priori knowledge or training pertaining to any particular fault. The basis for anomaly detection was the model of normal behavior obtained using the recently introduced Growing Structure Multiple Model System (GSMMS). Anomalous behavior is then detected as statistically significant departures of the current modeling residuals from the modeling residuals corresponding to normal system behavior. Distributed anomaly detection, facilitated by multiple anomaly detectors monitoring various parts of the thermal-fluid system, enabled localization of anomalous partitions of the system without the need to train classifiers to recognize an underlying fault.
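A minimal sketch of the residual-based detection step, assuming residuals from a model of normal behavior are already available: flag a monitored window whose residuals depart from the baseline residual distribution by a statistically significant margin. The GSMMS model itself is not reproduced, and the data and threshold below are illustrative only.

```java
// Residual-based anomaly detection sketch: compare the mean residual of the
// current window against the residual distribution learned under normal behavior.
public class ResidualAnomalyDetector {
    static double mean(double[] xs) {
        double s = 0; for (double x : xs) s += x; return s / xs.length;
    }
    static double stdDev(double[] xs, double m) {
        double s = 0; for (double x : xs) s += (x - m) * (x - m);
        return Math.sqrt(s / (xs.length - 1));
    }

    public static void main(String[] args) {
        double[] baselineResiduals = {0.02, -0.01, 0.03, 0.00, -0.02, 0.01, 0.02, -0.03}; // normal behavior
        double[] currentWindow     = {0.09, 0.11, 0.10, 0.12};   // residuals from the monitored partition
        double m = mean(baselineResiduals), sd = stdDev(baselineResiduals, m);
        double z = (mean(currentWindow) - m) / (sd / Math.sqrt(currentWindow.length));
        boolean anomalous = Math.abs(z) > 3.0;                   // illustrative significance threshold
        System.out.printf("z = %.2f, anomalous = %b%n", z, anomalous);
    }
}
```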
|
9 |
Methods and measures for statistical fault localisation Landsberg, David January 2016 (has links)
Fault localisation is the process of finding the causes of a given error, and is one of the most costly elements of software development. One of the most efficient approaches to fault localisation appeals to statistical methods. These methods are characterised by their ability to estimate how faulty a program artefact is as a function of statistical information about a given program and test suite. However, the major problem facing statistical approaches is their effectiveness -- particularly with respect to finding single (or multiple) faults in large programs typical of the real world. A solution to this problem hinges on discovering new formal properties of faulty programs and developing scalable statistical techniques which exploit them. In this thesis I address this by identifying new properties of faulty programs, developing the formal frameworks and methods which are formally proven to exploit them, and demonstrating that many of our new techniques substantially and statistically significantly outperform competing algorithms at given fault localisation tasks (using p = 0.01) on what is (to our knowledge) one of the largest-scale sets of experiments in fault localisation to date. This research is thus designed to corroborate the following thesis statement: That the new algorithms presented in this thesis are effective and efficient at software fault localisation and outperform state-of-the-art statistical techniques at a range of fault localisation tasks. In more detail, the major thesis contributions are as follows:
1. We perform a thorough investigation into the existing framework of spectrum-based fault localisation (sbfl), which currently stands at the cutting edge of statistical fault localisation. To improve on the effectiveness of sbfl, our first contribution is to introduce and motivate many new statistical measures which can be used within this framework. First, we show that many are well motivated for the task of sbfl. Second, we formally prove equivalence properties of large classes of measures. Third, we show that many of the measures perform competitively with the existing measures in experimentation -- in particular, our new measure m9185 outperforms all existing measures on average in terms of effectiveness and, along with Kulczynski2, is in a class of measures which statistically significantly outperforms all other measures at finding a single fault in a program (p = 0.01).
2. Having investigated sbfl, our second contribution is to motivate, introduce, and formally develop a new formal framework which we call probabilistic fault localisation (pfl). pfl is similar to sbfl insofar as it can leverage any suspiciousness measure, and is designed to directly estimate the probability that a given program artefact is faulty. First, we formally prove that pfl is theoretically superior to sbfl insofar as it satisfies and exploits a number of desirable formal properties which sbfl does not. Second, we experimentally show that pfl methods (namely, our measure pfl-ppv) substantially and statistically significantly outperform the best-performing sbfl measures at finding a fault in large multiple-fault programs (p = 0.01). Furthermore, we show that for many of our benchmarks it is theoretically impossible to design strictly rational sbfl measures which outperform given pfl techniques.
3. Having addressed the problem of localising a single fault in a program, we address the problem of localising multiple faults. Accordingly, our third major contribution is the introduction and motivation of a new algorithm MOpt(g) which optimises any ranking-based method g (such as pfl/sbfl/Barinel) for the task of multiple fault localisation. First, we prove that MOpt(g) formally satisfies and exploits a newly identified formal property of multiple fault optimality. Secondly, we experimentally show that there are values for g such that MOpt(g) substantially and statistically significantly outperforms given ranking-based fault localisation methods at the task of finding multiple faults (p = 0.01).
4. Having developed methods for localising faults as a function of a given test suite, we finally address the problem of optimising test suites for the purposes of fault localisation. Accordingly, we first present an algorithm which leverages model checkers to improve a given test suite by making it satisfy a property of single bug optimality. Second, we experimentally show that on small benchmarks single bug optimal test suites can be generated (from scratch) efficiently when the algorithm is used in conjunction with the cbmc model checker, and that the test suite generated can be used effectively for fault localisation.
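For reference, the Kulczynski2 measure mentioned above is commonly defined in the sbfl literature over the usual spectrum counts as follows (standard definition, not quoted from the thesis):

```latex
% e_f, e_p: failing/passing tests covering the element; n_f: failing tests not covering it.
\mathrm{Kulczynski2}(s) = \frac{1}{2}\left(\frac{e_f}{e_f+n_f} + \frac{e_f}{e_f+e_p}\right)
```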
|
10 |
On the use of control- and data-flow in fault localization / Sobre o uso de fluxo de controle e de dados para a localização de defeitos Ribeiro, Henrique Lemos 19 August 2016 (has links)
Testing and debugging are key tasks during the development cycle; however, they are among the most expensive activities of the development process. To improve the productivity of developers during the debugging process, various fault localization techniques have been proposed, with Spectrum-based Fault Localization (SFL), also known as Coverage-based Fault Localization (CBFL), being one of the most promising. SFL techniques pinpoint program elements (e.g., statements, branches, and definition-use associations), sorting them by their suspiciousness. Heuristics are used to rank the most suspicious program elements, which are then mapped into lines to be inspected by developers. Although data-flow spectra (definition-use associations) have been shown to perform better than control-flow spectra (statements and branches) at locating the bug site, the high overhead of collecting data-flow spectra has prevented their use on industry-level code. A data-flow coverage tool was recently implemented, presenting on average 38% run-time overhead for large programs. Such a fairly modest overhead motivates the study of SFL techniques using data-flow information in programs similar to those developed in industry. To achieve this goal, we implemented Jaguar (JAva coveraGe faUlt locAlization Ranking), a tool that employs control-flow and data-flow coverage in SFL techniques. The effectiveness and efficiency of both coverages are compared using 173 faulty versions of programs with sizes varying from 10 to 96 KLOC. Ten known SFL heuristics are used to rank the most suspicious lines. The results show that the behavior of the heuristics is similar for both control- and data-flow coverage: Kulczynski2 and McCon perform better for small numbers of inspected lines (from 5 to 30), while Ochiai performs better when more lines are inspected (30 to 100). The comparison between control- and data-flow coverage shows that data-flow locates more defects in the range of 10 to 50 inspected lines, being up to 22% more effective. Moreover, in the range of 20 to 100 lines, data-flow ranks the bug better than control-flow with statistical significance. However, data-flow is still more expensive than control-flow: it takes from 23% to 245% longer to obtain the most suspicious lines; on average, data-flow is 129% more costly. Therefore, our results suggest that data-flow is more effective in locating faults because it tracks more relationships during program execution. On the other hand, SFL techniques supported by data-flow coverage need to be improved for practical use in industrial settings.
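To illustrate the difference between the two kinds of spectra compared above, the invented method below shows what a definition-use association (dua) captures that plain statement coverage does not.

```java
// Statement coverage records which lines executed; data-flow coverage records
// definition-use associations (duas). This invented method illustrates the
// difference in granularity between the two spectra.
public class SpectraExample {
    static int scale(int x, boolean doubleIt) {
        int factor = 1;            // definition of `factor` (d1)
        if (doubleIt) {
            factor = 2;            // definition of `factor` (d2)
        }
        return x * factor;         // use of `factor`: duas (d1, use) and (d2, use)
    }

    public static void main(String[] args) {
        // scale(3, false) covers every statement except `factor = 2`, but exercises
        // only the dua (d1, use); a run with doubleIt = true is needed to exercise
        // (d2, use) -- a distinction statement coverage alone does not capture.
        System.out.println(scale(3, false) + " " + scale(3, true));
    }
}
```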
|