21 |
A software testing framework for context-aware applications in pervasive computing / Lu, Heng, 陸恒 January 2008 (has links)
published_or_final_version / Computer Science / Doctoral / Doctor of Philosophy
|
22 |
Making Software More Reliable by Uncovering Hidden Dependencies / Bell, Jonathan Schaffer January 2016 (has links)
As software grows in size and complexity, it also becomes more interdependent. Multiple internal components often share state and data. Whether these dependencies are intentional or not, we have found that their mismanagement often poses several challenges to testing. This thesis seeks to make it easier to create reliable software by making testing more efficient and more effective through explicit knowledge of these hidden dependencies.
The first problem that this thesis addresses, reducing testing time, directly impacts the day-to-day work of every software developer. The frequency with which code can be built (compiled, tested, and packaged) directly impacts the productivity of developers: longer build times mean a longer wait before determining if a change to the application being built was successful. We have discovered that in the case of some languages, such as Java, the vast majority of build time is spent running tests. Therefore, it is especially important to focus on approaches to accelerating testing, while simultaneously making sure that we do not inadvertently cause tests to fail erratically (i.e. become flaky).
Typical techniques for accelerating tests (like running only a subset of them, or running them in parallel) often can't be applied soundly, since there may be hidden dependencies between tests. While we might think that each test should be independent (i.e. that a test's outcome isn't influenced by the execution of another test), we and others have found many examples in real software projects where tests truly have these dependencies: some tests require others to run first, or else their outcome will change. Previous work has shown that these dependencies are often complicated, unintentional, and hidden from developers. We have built several systems, VMVM and ElectricTest, that detect different sorts of dependencies between tests and use that information to soundly reduce testing time by several orders of magnitude.
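As a hypothetical illustration (not drawn from the projects studied in this thesis), the following JUnit sketch shows how shared static state can create exactly this kind of hidden dependency: one test silently relies on another having run first.

```java
import org.junit.Test;
import static org.junit.Assert.*;

// Hypothetical example: the outcome of testReadConfig depends on whether
// testLoadDefaults ran first in the same JVM and populated the shared cache.
public class ConfigTest {
    // Mutable static state shared (often unintentionally) between tests.
    static java.util.Map<String, String> cache = new java.util.HashMap<>();

    @Test
    public void testLoadDefaults() {
        cache.put("timeout", "30");               // side effect that outlives this test
        assertEquals("30", cache.get("timeout"));
    }

    @Test
    public void testReadConfig() {
        // Passes only if testLoadDefaults already executed; fails when run
        // alone, in parallel, or in a different order.
        assertEquals("30", cache.get("timeout"));
    }
}
```

Run in isolation or in a different order, the second test fails, which is the kind of flakiness that dependency-aware tools such as VMVM and ElectricTest are designed to prevent or detect.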
In our first approach, Unit Test Virtualization, we reduce the overhead of isolating each unit test with a lightweight, virtualization-like container, preventing these dependencies from manifesting. Our realization of Unit Test Virtualization for Java, VMVM, eliminates the need to run each test in its own process, reducing test suite execution time by an average of 62% in our evaluation (compared to execution time when running each test in its own process).
However, not all test suites isolate their tests: in some, dependencies are allowed to occur between tests. In these cases, common test acceleration techniques such as test selection or test parallelization are unsound in the absence of dependency information. When dependencies go unnoticed, tests can unexpectedly fail when executed out of order, causing unreliable builds. Our second approach, ElectricTest, soundly identifies data dependencies between test cases, allowing for sound test acceleration.
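As an illustrative sketch of why dependency information enables sound acceleration (this is not ElectricTest's algorithm; the scheduling rule and the assumption that dependencies form simple chains are mine), a parallel schedule can keep each dependency chain on one worker, in its original order, so no test ever runs before a test it depends on:

```java
import java.util.*;

// Illustrative dependency-aware scheduling sketch. Assumption: each test has
// at most one direct prerequisite, so dependencies form chains.
public class DependencyAwareScheduler {
    // prereq.get(t) = the single test that must run before t (if any).
    static List<List<String>> schedule(List<String> tests, Map<String, String> prereq) {
        List<List<String>> workers = new ArrayList<>();
        Map<String, Integer> assigned = new HashMap<>();   // test -> worker index
        for (String t : tests) {                           // iterate in original suite order
            Integer w = prereq.containsKey(t) ? assigned.get(prereq.get(t)) : null;
            if (w == null) { workers.add(new ArrayList<>()); w = workers.size() - 1; }
            workers.get(w).add(t);                         // co-locate with its prerequisite
            assigned.put(t, w);
        }
        return workers;                                    // each inner list runs sequentially
    }

    public static void main(String[] args) {
        Map<String, String> deps = Map.of("testB", "testA");
        System.out.println(schedule(List.of("testA", "testB", "testC"), deps));
        // [[testA, testB], [testC]] -- testB stays after testA on the same worker
    }
}
```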
To enable broader use of general dependency information for testing and other analyses, we created Phosphor, the first and only portable and performant dynamic taint tracking system for the JVM. Dynamic taint tracking is a form of data flow analysis that applies labels to variables and tracks all other variables derived from those tagged variables, propagating the tags. Taint tracking has many applications to software engineering and software testing, and in addition to our own work, researchers across the world are using Phosphor to build their own systems. Towards making testing more effective, we also created Pebbles, which makes it easy for developers to specify data-related test oracles on mobile devices by thinking in terms of high-level objects such as emails, notes, or pictures.
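To make the label-propagation idea concrete, here is a minimal, self-contained sketch of dynamic taint tracking. It is purely illustrative and is not Phosphor's API: Phosphor instruments JVM bytecode so that ordinary values carry tags transparently, with no wrapper class like the one below.

```java
import java.util.HashSet;
import java.util.Set;

// Conceptual sketch of dynamic taint propagation (illustrative only).
public class Tainted {
    final int value;
    final Set<String> labels;            // taint tags attached to this value

    Tainted(int value, Set<String> labels) {
        this.value = value;
        this.labels = labels;
    }

    static Tainted source(int value, String label) {
        Set<String> l = new HashSet<>();
        l.add(label);
        return new Tainted(value, l);    // a taint source introduces a label
    }

    // Any value derived from tainted operands inherits the union of their labels.
    static Tainted add(Tainted a, Tainted b) {
        Set<String> l = new HashSet<>(a.labels);
        l.addAll(b.labels);
        return new Tainted(a.value + b.value, l);
    }

    public static void main(String[] args) {
        Tainted userInput = source(7, "user-input");
        Tainted constant = new Tainted(3, new HashSet<>());
        Tainted derived = add(userInput, constant);
        System.out.println(derived.value + " " + derived.labels); // 10 [user-input]
    }
}
```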
|
23 |
Compiler-assisted Adaptive Software Testing / Petsios, Theofilos January 2018 (has links)
Modern software is becoming increasingly complex and is plagued with vulnerabilities that are constantly exploited by attackers. The vast numbers of bugs found in security-critical systems and the diversity of errors presented in commercial off-the-shelf software require effective, scalable testing frameworks. Unfortunately, the current testing ecosystem is heavily fragmented, with the majority of toolchains targeting limited classes of errors and applications without offering provably strong guarantees. With software codebases continuously becoming more diverse and complex, the large-scale deployment of monolithic, non-adaptive analysis engines is likely to increase the aforementioned fragmentation. Instead, modern software testing requires adaptive, hybrid techniques that target errors selectively. This dissertation argues that adopting context-aware analyses will enable us to set the foundations for retargetable testing frameworks while further increasing the accuracy and extensibility of existing toolchains. To this end, we initially examine how compiler analyses can become context-aware, prioritizing certain errors over others of the same type. As a use case of our proposed approach, we extend a state-of-the-art compiler's integer error detection pipeline to suppress reports of benign errors by up to 89% in real-world workloads, while still reporting serious errors. Subsequently, we demonstrate how compiler-based instrumentation can be utilized by feedback-driven evolutionary fuzzers to provide multifaceted analyses targeting broader classes of bugs. In this direction, we present differential diversity (δ-diversity), propose a generic methodology for offering state-aware guidance in feedback-driven frameworks, and demonstrate how to retrofit state-of-the-art fuzzers to target broader classes of errors. We provide two such prototype implementations: NEZHA, the first generic differential fuzzer capable of handling logic bugs, and SlowFuzz, the first generic fuzzer targeting complexity vulnerabilities. We applied both prototypes to production software and demonstrated their effectiveness: NEZHA discovered hundreds of logic discrepancies across a wide variety of applications (SSL/TLS libraries, parsers, etc.), while SlowFuzz successfully generated inputs triggering slowdowns in complex, real-world software, including zip parsers, regular expression libraries, and hash table implementations.
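As a rough sketch of the differential-testing idea underlying NEZHA (the harness below is hypothetical and omits NEZHA's coverage- and δ-diversity-guided input generation), two implementations of the same specification are run on the same input and any disagreement is reported as a candidate logic bug:

```java
import java.util.function.Function;

// Hypothetical differential-testing harness: run the same input through two
// implementations of the same specification and flag any disagreement.
public class DifferentialHarness {
    enum Verdict { ACCEPT, REJECT, ERROR }   // placeholder verdicts from two validators

    static boolean disagree(byte[] input,
                            Function<byte[], Verdict> implA,
                            Function<byte[], Verdict> implB) {
        Verdict a = implA.apply(input);
        Verdict b = implB.apply(input);
        if (a != b) {
            System.out.println("Discrepancy: implA=" + a + " implB=" + b
                    + " on input of length " + input.length);
        }
        return a != b;                        // a candidate logic bug to triage
    }

    public static void main(String[] args) {
        // Two toy "validators" that disagree on empty inputs.
        Function<byte[], Verdict> implA = in -> in.length == 0 ? Verdict.REJECT : Verdict.ACCEPT;
        Function<byte[], Verdict> implB = in -> Verdict.ACCEPT;
        disagree(new byte[0], implA, implB);  // prints a discrepancy
    }
}
```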
|
24 |
Coverage-based testing strategies and reliability modeling for fault-tolerant software systems. / CUHK electronic theses & dissertations collection / January 2006 (has links)
Software permeates our modern society, and its complexity and criticality are ever increasing. Thus the need to tolerate software faults, particularly in critical applications, is evident. While fault-tolerant software is seen as a necessity, it also remains a controversial technique, and there is a lack of conclusive assessment about its effectiveness. / This thesis aims at providing a quantitative assessment scheme for a comprehensive evaluation of fault-tolerant software, including reliability model comparisons and trade-off studies with software testing techniques. First of all, we propose a comprehensive procedure for assessing fault-tolerant software for software reliability engineering, which is composed of four tasks: modeling, experimentation, evaluation and economics. Our ultimate objective is to construct a systematic approach to predicting the achievable reliability based on the software architecture and testing evidence, through an investigation of testing and modeling techniques for fault-tolerant software. / Motivated by the lack of real-world project data for investigating software testing and fault tolerance techniques together, we conduct a real-world project and engage multiple programming teams to independently develop program versions based on an industry-scale avionics application. Detailed experiments are conducted to study the nature, source, type, detectability, and effect of faults uncovered in the program versions, and to learn the relationship among these faults and the correlation of their resulting failures. Coverage-based testing as well as mutation testing techniques are adopted to reproduce mutants with real faults, which facilitates investigation of the effectiveness of data flow coverage, mutation coverage, and fault coverage for design diversity. / Then, based on the preliminary experimental data, further experimentation and detailed analyses of the correlations among these faults and their relation to the resulting failures are performed. The results are further applied to current reliability modeling techniques for fault-tolerant software to examine their effectiveness and accuracy. / Next, we investigate the effect of code coverage on fault detection, which is the underlying intuition of coverage-based testing strategies. From our experimental data, we find that code coverage is a moderate indicator of fault-detection capability over the whole test set, but its effect varies under different testing profiles: the correlation between the two measures is high with exceptional test cases, but weak in normal testing. Moreover, our study shows that code coverage can be used as a good filter to reduce the size of the effective test set, although this is more evident for exceptional test cases. / Furthermore, to investigate some "variants" as well as "invariants" of fault-tolerant software, we perform an empirical investigation evaluating reliability features through a comprehensive comparison between two projects: our project and the NASA 4-University project. Based on the same specification for program development, these two projects exhibit some common as well as distinct features. The testing results of two comprehensive operational testing procedures involving hundreds of thousands of test cases are collected and compared. Similar as well as dissimilar faults are observed and analyzed, indicating common problems related to the same application in both projects. The small number of coincident failures in the two projects nevertheless provides supportive evidence for N-version programming, while the observed reliability improvement reflects trends in software development over the past twenty years. / Finally, we formulate the relationship between code coverage and fault detection. Although our two current models take simple mathematical forms, they can predict the percentage of faults detected from the code coverage achieved by a given test set. We further incorporate such formulations into traditional reliability growth models, not only for fault-tolerant software but also for general software systems. Our empirical evaluations show that our new reliability model can achieve more accurate reliability assessment than the traditional Non-homogeneous Poisson process model.
Cai Xia. / "September 2006." / Adviser: Rung Tsong Michael Lyu. / Source: Dissertation Abstracts International, Volume: 68-03, Section: B, page: 1715. / Thesis (Ph.D.)--Chinese University of Hong Kong, 2006. / Includes bibliographical references (p. 165-181). / Electronic reproduction. Hong Kong : Chinese University of Hong Kong, [2012] System requirements: Adobe Acrobat Reader. Available via World Wide Web. / Electronic reproduction. [Ann Arbor, MI] : ProQuest Information and Learning, [200-] System requirements: Adobe Acrobat Reader. Available via World Wide Web. / Abstracts in English and Chinese. / School code: 1307.
|
25 |
On adaptive random testing / Kuo, Fei-Ching, n/a January 2006 (has links)
Adaptive random testing (ART) has been proposed as an enhancement to random testing for situations where failure-causing inputs are clustered together. The basic idea of ART is to evenly spread test cases throughout the input domain. It has been shown by simulations and empirical analysis that ART frequently outperforms random testing. However, there are some outstanding issues on the cost-effectiveness and practicality of ART, which are the main foci of this thesis.
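As a minimal sketch of how "evenly spreading" test cases can be operationalized, the following snippet implements one well-known ART variant, fixed-size-candidate-set ART (FSCS-ART), on a one-dimensional numeric input domain. The thesis studies ART in far more general settings; the domain, candidate-set size, and distance metric here are illustrative assumptions.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

// FSCS-ART sketch: each new test case is the random candidate farthest from
// all previously executed test cases, which spreads tests over the domain.
public class FscsArt {
    static double nextTestCase(List<Double> executed, Random rnd, int k) {
        if (executed.isEmpty()) return rnd.nextDouble();    // first test is purely random
        double best = 0, bestDist = -1;
        for (int i = 0; i < k; i++) {                       // k random candidates
            double c = rnd.nextDouble();
            double dNearest = Double.MAX_VALUE;
            for (double e : executed)                       // distance to nearest executed test
                dNearest = Math.min(dNearest, Math.abs(c - e));
            if (dNearest > bestDist) { bestDist = dNearest; best = c; }
        }
        return best;                                        // farthest-from-executed candidate
    }

    public static void main(String[] args) {
        Random rnd = new Random(42);
        List<Double> executed = new ArrayList<>();
        for (int i = 0; i < 5; i++) {
            double t = nextTestCase(executed, rnd, 10);
            executed.add(t);
            System.out.printf("test %d: %.3f%n", i + 1, t);
        }
    }
}
```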
Firstly, this thesis examines the basic factors that have an impact on the fault-detection effectiveness of adaptive random testing, and identifies favourable and unfavourable conditions for ART. Our study concludes that favourable conditions for ART occur more frequently than unfavourable conditions. Secondly, since all previous studies allow duplicate test cases, there has been a concern whether adaptive random testing performs better than random testing because ART uses fewer duplicate test cases. This thesis confirms that it is the even spread, rather than less duplication of test cases, which makes ART perform better than random testing. Given that the even spread is the main pillar of the success of ART, an investigation has been conducted to study the relevance and appropriateness of several existing metrics of even spreading. Thirdly, the practicality of ART has been challenged for nonnumeric or high dimensional input domains. This thesis provides solutions that address these concerns. Finally, a new problem solving technique, namely mirroring, has been developed. The integration of mirroring with adaptive random testing has been empirically shown to significantly increase the cost-effectiveness of ART.
In summary, this thesis significantly contributes to both the foundation and the practical applications of adaptive random testing.
|
26 |
Empirical study - pairwise prediction of fault based on coverage / Shamasunder, Shalini 14 June 2012 (has links)
Researchers and engineers in the field of software testing have valued coverage as a testing metric for decades. Various empirical results have shown that as coverage increases, the ability of a test suite to detect faults also increases. As a result, numerous coverage techniques have been introduced. Which coverage criteria correlate better with fault detection? Which coverage criteria, on the other hand, have lower correlation with fault detection? In other words, does it make more sense to achieve a higher percentage of one kind of coverage (c1) than of another (c2) to gain a good fault-detection rate? Do the popular block and branch coverage criteria perform better, or does path coverage outperform them? Answering these questions will help future engineers and researchers generate more efficient test suites and gain a better metric of measurement. It also helps with test suite minimization. This thesis studies the relationship between coverage and mutant kill-rates over large, randomly generated test suites for statement, branch, predicate, and path coverage of two realistic programs to answer the above open questions. The experiments both confirm conventional wisdom about these coverage criteria and contain a few surprises. / Graduation date: 2013
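As an illustrative sketch of the kind of analysis such a study rests on (the numbers below are made up; the thesis's actual subjects, suites, and statistics differ), the snippet computes Pearson's correlation between per-suite coverage and per-suite mutant kill rate:

```java
// Hypothetical data: how strongly a coverage criterion correlates with mutant
// kill rate across randomly generated suites, measured with Pearson's r.
public class CoverageKillCorrelation {
    static double pearson(double[] x, double[] y) {
        int n = x.length;
        double sx = 0, sy = 0, sxx = 0, syy = 0, sxy = 0;
        for (int i = 0; i < n; i++) {
            sx += x[i]; sy += y[i];
            sxx += x[i] * x[i]; syy += y[i] * y[i]; sxy += x[i] * y[i];
        }
        double cov = sxy - sx * sy / n;                 // co-deviation
        double vx = sxx - sx * sx / n, vy = syy - sy * sy / n;
        return cov / Math.sqrt(vx * vy);
    }

    public static void main(String[] args) {
        double[] branchCoverage = {0.42, 0.55, 0.61, 0.70, 0.83};  // per-suite coverage (hypothetical)
        double[] killRate       = {0.30, 0.41, 0.47, 0.52, 0.66};  // per-suite kill rate (hypothetical)
        System.out.printf("Pearson r = %.3f%n", pearson(branchCoverage, killRate));
    }
}
```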
|
27 |
Streamlined and prioritized hierarchical relations: a technique for improving the effectiveness of the classification-tree methodology / Kwok, Wing-hong., 郭永康. January 2001 (has links)
published_or_final_version / abstract / toc / Computer Science and Information Systems / Master / Master of Philosophy
|
28 |
A formal specification-based approach to object-oriented software testing at the class level / 徐志農, Xu, Zhinong. January 1997 (has links)
published_or_final_version / Computer Science / Doctoral / Doctor of Philosophy
|
29 |
Combining over- and under-approximating program analyses for automatic software testing / Csallner, Christoph. January 2008 (has links)
Thesis (Ph.D.)--Computing, Georgia Institute of Technology, 2009. / Committee Chair: Smaragdakis, Yannis; Committee Member: Dwyer, Matthew; Committee Member: Orso, Alessandro; Committee Member: Pande, Santosh; Committee Member: Rugaber, Spencer.
|
30 |
Alinhamento de sequências na avaliação de resultados de teste de robustez / Sequence alignment algorithms applied in the evaluation of robustness testing results / Lemos, Gizelle Sandrini, 1975- 12 November 2012 (has links)
Advisor: Eliane Martins / Thesis (doctoral) - Universidade Estadual de Campinas, Instituto de Computação / Made available in DSpace on 2018-08-22T06:26:11Z (GMT). No. of bitstreams: 1
Lemos_GizelleSandrini_D.pdf: 2897622 bytes, checksum: 93eb35a9b69a8e36e90d0399422e6520 (MD5)
Previous issue date: 2013 / Abstract: Robustness, which is the ability of a system to work properly in unexpected situations, is an important characteristic, especially for critical systems. A commonly used technique for robustness testing is to inject faults during the system's execution and to observe its behavior. A frequent problem during such tests is determining an oracle, i.e., a mechanism that decides whether the system behavior is acceptable or not. Oracles such as the golden-run comparison - system execution without injection of faults - consider every behavior that differs from the golden run to be an error in the system under test (SUT). For example, the activation of exception handlers in the presence of faults could be flagged as an error. Searching for safety properties is also used as an oracle and can reveal the presence of non-robustness in the SUT, but events in the SUT execution that are only semantically similar to those in the property are not taken into account (unless they have been explicitly defined in the property).
The objective of this work is to develop oracles specific to robustness testing in order to reduce the deficiencies of the current oracles. The main difference between our solutions and the existing approaches is the type of algorithm used to compare sequences: we adopt sequence alignment algorithms commonly applied in bioinformatics. These algorithms perform inexact matching, allowing some variation between the compared sequences. The first approach is based on the traditional golden-run comparison, but applies global sequence alignment to compare traces collected during fault injection against traces collected without fault injection. This allows traces that differ only slightly from the golden run to still be classified as robust, and also makes the approach usable as an oracle for non-deterministic systems, which the traditional comparison does not support. The second approach compares patterns derived from safety properties against traces collected during robustness testing; differently from the first approach, it uses a local sequence alignment algorithm to search for matching subsequences. Besides the advantages of inexact matching, these algorithms use a scoring system based on information obtained from the SUT specification to guide the alignment of the sequences. We show the results of applying both approaches in case studies / Doctorate / Computer Science / Doctor in Computer Science
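As a minimal sketch of the global-alignment comparison the first oracle builds on (Needleman-Wunsch-style scoring over event traces; the match, mismatch, and gap scores below are placeholder assumptions, whereas the thesis derives its scoring from the SUT specification):

```java
// Global sequence alignment (Needleman-Wunsch style) over two event traces:
// a high score means the observed trace stays close to the golden run even if
// extra events, such as exception-handler activations, appear in it.
public class TraceAlignment {
    static int globalAlignmentScore(String[] golden, String[] observed,
                                    int match, int mismatch, int gap) {
        int n = golden.length, m = observed.length;
        int[][] s = new int[n + 1][m + 1];
        for (int i = 1; i <= n; i++) s[i][0] = s[i - 1][0] + gap;   // gaps in observed
        for (int j = 1; j <= m; j++) s[0][j] = s[0][j - 1] + gap;   // gaps in golden
        for (int i = 1; i <= n; i++) {
            for (int j = 1; j <= m; j++) {
                int diag = s[i - 1][j - 1]
                        + (golden[i - 1].equals(observed[j - 1]) ? match : mismatch);
                s[i][j] = Math.max(diag, Math.max(s[i - 1][j] + gap, s[i][j - 1] + gap));
            }
        }
        return s[n][m];
    }

    public static void main(String[] args) {
        String[] golden   = {"open", "read", "write", "close"};
        String[] observed = {"open", "read", "handleFault", "write", "close"};
        // Placeholder scores: match +2, mismatch -1, gap -1.
        System.out.println(globalAlignmentScore(golden, observed, 2, -1, -1));
    }
}
```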
|