81

Instrumentation Analysis: An Automated Method for Producing Numeric Abstractions of Heap-Manipulating Programs

Magill, Stephen 29 November 2010 (has links)
A number of questions regarding programs involving heap-based data structures can be phrased as questions about numeric properties of those structures. A data structure traversal might terminate if the length of some path is eventually zero, or a function to remove n elements from a collection may only be safe if the collection has size at least n. In this thesis, we develop proof methods for reasoning about the connection between heap-manipulating programs and numeric programs. In addition, we develop an automatic method for producing numeric abstractions of heap-manipulating programs. These numeric abstractions are expressed as simple imperative programs over integer variables and have the feature that if a property holds of the numeric program, then it also holds of the original, heap-manipulating program. This is true for both safety and liveness. The abstraction procedure makes use of a shape analysis based on separation logic and has support for user-defined inductive data structures. We also discuss a number of applications of this technique. Numeric abstractions, once obtained, can be analyzed with a variety of existing verification tools: termination provers can be used to reason about termination of the numeric abstraction, and thus of the original program; safety checkers can be used to reason about assertion safety; and bound inference tools can be used to obtain bounds on the values of program variables. With small changes to the program source, bounds analysis also allows the computation of symbolic bounds on memory use and computational complexity.
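To make the abstraction concrete, the following minimal sketch (hypothetical code, not taken from the thesis or its tool) shows a linked-list traversal and the kind of integer program a shape analysis could emit for it, where the list variable is abstracted by the length of the path from it to nil.

```python
# Hypothetical illustration of a numeric abstraction: the heap program
# walks a linked list; the abstraction tracks only the length n of the
# path from `cur` to nil, as an ordinary integer program.

class Node:
    def __init__(self, next=None):
        self.next = next

def traverse(head):
    # Heap-manipulating original: terminates iff the list is acyclic.
    cur = head
    while cur is not None:
        cur = cur.next              # shortens the path to nil

def traverse_abstract(n):
    # Numeric abstraction: n models the length of the path to nil.
    assert n >= 0
    while n > 0:                    # `cur is not None`  ~  n > 0
        n -= 1                      # `cur = cur.next`   ~  n := n - 1
    # A termination prover can show n decreases and is bounded below,
    # so the loop terminates, and hence so does the original traversal.
```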
82

Hermes: A Targeted Fuzz Testing Framework

Shortt, Caleb James 12 March 2015 (has links)
The use of security assurance cases (security cases) to provide evidence-based assurance of security properties in software is a young field in software engineering. A security case uses evidence to argue that a particular claim is true. For example, the highest-level claim may be that a given system is sufficiently secure, and it would include sub-claims that break that general claim down into more granular, tangible items, such as evidence or further claims. Random negative testing (fuzz testing) is used as evidence to support security cases and the assurance they provide. Due to resource constraints, many current approaches apply fuzz testing to a target system for a fixed amount of time, which may leave entire sections of code untouched [60]. These results may be used as evidence in a security case, but their quality varies with controllable variables, such as time, and uncontrollable variables, such as the random paths chosen by the fuzz testing engine. This thesis presents Hermes, a proof-of-concept fuzz testing framework that provides improved evidence for security cases by automatically targeting problem sections in software and selectively fuzz testing them in a repeatable and timely manner. In our experiments, Hermes produced results with target code coverage comparable to a full, exhaustive fuzz test run while significantly reducing the test execution time associated with an exhaustive run. These results provide a targeted piece of evidence for security cases which can be audited and refined for further assurance. Hermes' design allows it to be easily attached to continuous integration frameworks, where it can be executed alongside the other frameworks in a given test suite.
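As a rough illustration of the targeting idea (hypothetical code, not part of Hermes; the function names and parameters are assumptions), a targeted fuzzer restricts random input generation to previously flagged problem functions and fixes the random seed so the evidence is repeatable:

```python
# Minimal sketch of targeted fuzzing in the spirit of Hermes
# (hypothetical code, not the actual framework): rather than fuzzing
# the whole program for a fixed time budget, fuzz only the functions
# a prior analysis flagged as problem sections, with a fixed seed so
# the resulting evidence is repeatable.

import random

def fuzz_target(func, seed, iterations=1000):
    random.seed(seed)                 # fixed seed => repeatable runs
    failures = []
    for _ in range(iterations):
        data = bytes(random.randrange(256)
                     for _ in range(random.randrange(1, 64)))
        try:
            func(data)
        except Exception as exc:      # an uncaught exception ~ a crash
            failures.append((data, repr(exc)))
    return failures

def targeted_fuzz(targets, seed=42):
    # `targets` is the list of flagged functions; each is fuzzed in
    # isolation instead of exercising the entire program at random.
    return {f.__name__: fuzz_target(f, seed) for f in targets}
```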
83

Context-sensitive analysis of x86 obfuscated executables

Boccardo, Davidson Rodrigo [UNESP] 09 October 2009 (has links) (PDF)
Code obfuscation aims to hinder the detection of a program's intrinsic properties by altering its syntax while preserving its semantics. Software developers use code obfuscation to defend their programs against intellectual-property attacks and to increase the security of their code. Malicious programmers, on the other hand, typically obfuscate their code to hide malicious behavior and to evade detection by anti-virus scanners. We introduce a method for context-sensitive analysis of binaries that may have obfuscated procedure call and return operations. These binaries may use direct stack operators instead of the native call and ret instructions to achieve equivalent behavior. Since definitions of context-sensitivity and algorithms for context-sensitive analysis have thus far been based on the specific semantics associated with procedure call and return operations, classic interprocedural analyses cannot be used reliably to analyze programs in which these operations cannot be discerned. A new notion of context-sensitivity is introduced that is based on the state of the stack at any instruction. While changes in calling context are associated with transfer of control, and hence can be reasoned about in terms of paths in an interprocedural control flow graph (ICFG), the same is not true for changes in stack context. An abstract interpretation based framework is developed to reason about stack context and to derive analogues of call-strings based methods for context-sensitive analysis using stack context. The analysis requires knowledge of how the stack, or rather the stack pointer, is represented, and of the operators that manipulate the stack pointer. The method presented is used to create a context-sensitive version of Venable et al.'s algorithm for detecting obfuscated calls. Experimental results show that the context-sensitive version of the algorithm generates more precise results and is also computationally more efficient than its context-insensitive counterpart.
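The obfuscation in question can be pictured as replacing `call f` with an explicit push of the return address followed by a jump. The toy interpreter below (an illustrative sketch, not the thesis's analysis) identifies the context of each instruction by the stack state rather than by call/return instructions:

```python
# Illustrative sketch (not the thesis's analysis): `call f` can be
# obfuscated as an explicit push of the return address followed by a
# jump,
#     push ret_addr
#     jmp  f
# so no call/ret instruction remains for a call-strings analysis to
# key on.  This toy interpreter instead takes the stack state itself
# as the context of each instruction.

def run(program, entry=0):
    stack, pc = [], entry
    while pc is not None:
        op, arg = program[pc]
        context = tuple(stack)            # stack-state as the context
        print(f"pc={pc:<2} context={context}")
        if op == "push":
            stack.append(arg); pc += 1
        elif op == "jmp":
            pc = arg
        elif op == "ret":
            pc = stack.pop()              # return driven by the stack
        elif op == "halt":
            pc = None

# An obfuscated call to pc=4: push the return address 2, then jump.
prog = {0: ("push", 2), 1: ("jmp", 4), 2: ("halt", None), 4: ("ret", None)}
run(prog)
```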
84

Static analysis of semantic web queries with ShEx schema constraints

Abbas, Abdullah 06 November 2017 (has links)
Data structured according to the Resource Description Framework (RDF) is increasingly available in large volumes. This leads to a major need, and research interest, in novel methods for query analysis and compilation that make the most of RDF data extraction. SPARQL is the widely used and well-supported standard query language for RDF data. In parallel with query language evolutions, schema languages for expressing constraints on RDF datasets have also evolved. Shape Expressions (ShEx) are increasingly used to validate RDF data and to communicate expected graph patterns. Schemas in general are important for static analysis tasks such as query optimisation and containment. Our purpose is to investigate the means and methodologies for SPARQL query static analysis and optimisation in the presence of ShEx schema constraints. Our contribution is divided into two main parts. In the first part we consider the problem of SPARQL query containment in the presence of ShEx constraints. We propose a sound and complete procedure for the problem of containment with ShEx, covering several SPARQL fragments. In particular, our procedure handles OPTIONAL query patterns, which turn out to be an important feature to study together with schemas. We provide complexity bounds for the containment problem with respect to the language fragments considered. We also propose an alternative method for SPARQL query containment with ShEx by reduction to First Order Logic satisfiability, which allows a larger SPARQL fragment to be considered than with the first method. This is the first work addressing SPARQL query containment in the presence of ShEx constraints. In the second part of our contribution we propose an analysis method to optimise the evaluation of conjunctive SPARQL queries on RDF graphs by taking advantage of ShEx constraints. The optimisation is based on computing and assigning ranks to query triple patterns, dictating their order of execution. The presence of intermediate joins between the query triple patterns is the reason why ordering is important for increasing efficiency. We define a set of well-formed ShEx schemas that possess interesting characteristics for SPARQL query optimisation. We then develop our optimisation method by exploiting information extracted from a ShEx schema. We finally report on evaluation results showing the advantages of applying our optimisation on top of an existing state-of-the-art query evaluation system.
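A minimal sketch of the ranking idea follows (hypothetical code; the predicate cardinalities and the ranking rule are assumptions, since the thesis derives its ranks from ShEx schema information): order the triple patterns of a conjunctive query so the most selective ones execute first, keeping intermediate joins small.

```python
# Hypothetical sketch of the ranking idea: the predicate cardinalities
# and the ranking rule below are assumptions (the thesis derives ranks
# from ShEx schema information).  Triple patterns with bound terms and
# rare predicates are executed first to keep intermediate joins small.

def rank(pattern, cardinality):
    s, p, o = pattern
    bound = sum(not t.startswith("?") for t in (s, o))
    # Prefer more bound terms, then rarer predicates.
    return (-bound, cardinality.get(p, float("inf")))

query = [("?x", "ex:worksFor", "?y"),
         ("?x", "ex:name", '"Alice"'),
         ("?y", "ex:locatedIn", "ex:Grenoble")]

# Assumed predicate cardinalities for the dataset at hand.
cardinality = {"ex:worksFor": 10_000, "ex:name": 50_000, "ex:locatedIn": 300}

plan = sorted(query, key=lambda tp: rank(tp, cardinality))
for tp in plan:
    print(tp)   # locatedIn, then name, then worksFor
```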
85

Static and Dynamic Analyses of a Long-Span Cable-Stayed Bridge with Corroded Cables

Xiang, Yang 19 November 2018 (has links)
This study investigates the static and dynamic behavior of the Stonecutters Cable-Stayed Bridge, the third-longest cable-stayed bridge in the world, when a group of 8 cables is damaged by corrosion. In this thesis, a reduction in cable cross-section area is used to represent the corrosion level. The corroded cables are divided into 2 groups of 4 cables, arranged in symmetric and asymmetric distributions on the two decks. A finite element model of the bridge was employed to perform the analysis, and was validated by comparing numerical results to published results for the bridge. The model was then subjected to gravity load only, to assess the effects of the corrosion level, the distance between the corroded cables, and the distribution of the damaged cables on deck deflection and on the change in cable stresses. In the dynamic analysis, the natural frequencies and mode shapes were compared with a reference case with no corrosion-damaged cables. A recorded wind load was applied to the deck to investigate the time-history response of the mid-span in the horizontal, vertical, and torsional directions, and of the tower top in the horizontal direction. Moreover, frequency analysis was performed on the time-history response, and coupled motions at certain frequencies were observed. Numerical results such as those presented in this thesis can complement information gathered from non-destructive testing in detecting corrosion-damaged stay cables in cable-stayed bridges.
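The corrosion model reduces the cable cross-section, which for a given tension raises the axial stress sigma = T / A. A small sketch with assumed numbers (not values from the thesis):

```python
# Assumed numbers, for illustration only: modeling corrosion as a loss
# of cross-sectional area A raises the axial stress sigma = T / A in a
# cable carrying the same tension T.

def cable_stress(tension_kN, area_mm2):
    return tension_kN * 1e3 / area_mm2        # MPa

T = 5000.0      # assumed cable tension, kN
A0 = 8000.0     # assumed intact cross-section, mm^2

for loss in (0.0, 0.1, 0.2, 0.3):             # fraction of area lost
    sigma = cable_stress(T, A0 * (1.0 - loss))
    print(f"{loss:4.0%} area loss -> {sigma:7.1f} MPa")
```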
86

Numerical study for the determination of wind-induced pressures on circular-section steel towers

Carrera, Fernando Henrique [UNESP] 13 August 2007 (has links) (PDF)
This work aims to obtain numerically the wind-induced pressure distributions, and the corresponding external pressure coefficients, on towers of circular cross-section. The pressure distributions on the towers are determined through numerical simulation with the ANSYS 9.0 software, taking fluid-structure interaction into account. For the simulation, the tower geometry was modeled in three dimensions, with the surrounding air as the fluid. The pressure distributions were determined for height-to-diameter ratios (h/d) less than or equal to 10. The numerical results obtained with ANSYS are then compared with the values given by the Brazilian standard NBR 6123:1988, in order to assess the viability of using numerical simulation to obtain pressure distributions for other structures.
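For the quantities involved, NBR 6123 relates a characteristic wind speed Vk to the dynamic pressure q = 0.613 Vk² (q in N/m², Vk in m/s), and an external pressure coefficient Cpe converts it to a surface pressure. A short sketch with assumed speed and Cpe values (not ones from the thesis):

```python
# Sketch of the quantities compared against NBR 6123: the standard's
# dynamic pressure is q = 0.613 * Vk**2 (q in N/m^2, Vk in m/s), and an
# external pressure coefficient Cpe maps it to a surface pressure.
# The speed and Cpe below are assumed values, not results from the
# thesis.

def dynamic_pressure(Vk):
    return 0.613 * Vk ** 2          # N/m^2, per NBR 6123

def surface_pressure(Vk, cpe):
    return cpe * dynamic_pressure(Vk)

Vk = 40.0                           # assumed characteristic speed, m/s
print(surface_pressure(Vk, cpe=-0.8))   # suction on the side wall, N/m^2
```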
87

Static Analysis of Semantic Web Queries

Chekol, Melisachew Wudage 19 December 2012 (has links)
Query containment is a well-studied problem spanning several decades of research. It is generally defined as the problem of determining whether the result of one query is included in the result of another query for any given dataset, and it has major applications in query optimization and knowledge base verification. The main objective of this thesis is to provide sound and complete procedures to determine containment of SPARQL queries under expressive description logic axioms, and to support these theoretical results by experimentation. To date, query containment has been addressed using different techniques: containment mappings, canonical databases, automata-theoretic techniques, and reduction to the validity problem in logic. In this thesis, we use the latter technique to address containment using an expressive logic called the mu-calculus. RDF graphs are encoded as transition systems, and queries and schema axioms are encoded as mu-calculus formulae; query containment can thereby be reduced to a validity test in the logic. The focus of this thesis is to identify the various fragments of SPARQL (and PSPARQL) and description logic schema languages for which containment is decidable, and to provide theoretically and experimentally proven procedures to check containment of those decidable fragments. Last but not least, this thesis proposes a benchmark for containment solvers, which is used to test and compare the current state-of-the-art containment solvers.
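The containment definition itself can be illustrated with a brute-force check over toy datasets (a sketch of the definition only; the thesis instead reduces the problem to a mu-calculus validity test):

```python
# Toy illustration of the containment definition only (the thesis
# instead encodes queries as mu-calculus formulae and tests validity):
# Q1 is contained in Q2 iff, over every dataset, each answer to Q1 is
# also an answer to Q2.  Queries here are plain predicates on triples.

from itertools import combinations

def answers(query, graph):
    return {t for t in graph if query(t)}

def contained(q1, q2, datasets):
    # Sound only for the finite family of datasets enumerated here.
    return all(answers(q1, g) <= answers(q2, g) for g in datasets)

triples = [("a", "knows", "b"), ("a", "type", "Person"), ("b", "knows", "a")]
datasets = [set(c) for n in range(len(triples) + 1)
            for c in combinations(triples, n)]

q1 = lambda t: t[1] == "knows" and t[0] == "a"   # a knows ?x
q2 = lambda t: t[1] == "knows"                   # ?x knows ?y
print(contained(q1, q2, datasets))               # True
print(contained(q2, q1, datasets))               # False
```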
88

Towards efficient and secure shared memory applications

Sifakis, Emmanuel 06 May 2013 (has links)
The spread of multi-core and multi-processor platforms across all areas of computing has made shared-memory parallel programming mainstream. Yet the fundamental problems of exploiting parallelism efficiently and correctly have not been fully addressed. Moreover, the execution model of these platforms (notably the relaxed memory models they implement) introduces new challenges for static and dynamic program analysis. In this work we address (1) the optimization of pessimistic implementations of critical sections and (2) dynamic information flow analysis for parallel executions of multi-threaded programs. Critical sections are excerpts of code that must appear to execute atomically. Their pessimistic implementation relies on synchronization mechanisms, such as mutexes, which are acquired and released at the beginning and end of the critical section respectively. We present a general algorithm for the acquisition and release of synchronization mechanisms and define on top of it several policies aiming to reduce contention by minimizing the time for which synchronization mechanisms are held. We demonstrate the correctness of these policies (i.e., they preserve atomicity and guarantee deadlock freedom) and evaluate them experimentally. The second issue tackled is dynamic information flow analysis of parallel executions. Precisely tracking the information flow of a parallel execution is infeasible due to non-deterministic accesses to shared memory. Most existing solutions to this problem enforce a serial execution of the target application; this yields an explicit serialization of memory accesses, but it incurs an execution-time overhead and eliminates the effects of relaxed memory models. In contrast, the technique we propose predicts the plausible serializations of a parallel execution with respect to the memory model. We applied this approach in the context of taint analysis, a dynamic information flow analysis widely used in vulnerability detection. To improve the precision of the taint analysis we further take into account the semantics of synchronization mechanisms such as mutexes, which restricts the predicted serializations accordingly. The proposed solutions have been implemented in proof-of-concept tools, which allowed them to be evaluated on hand-crafted examples.
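One such policy can be sketched as early lock release (an assumed illustration, not the thesis's exact algorithm): hold the mutex only across the accesses to shared data, not for unrelated trailing work, which is safe provided that work touches no protected state.

```python
# Assumed illustration, not the thesis's exact policies: an "early
# release" discipline holds the mutex only across the accesses to the
# protected data, shortening possession time.  It preserves atomicity
# only if the trailing work touches no protected state.

import threading

shared = {"counter": 0}
lock = threading.Lock()

def pessimistic(unrelated_work):
    with lock:                       # lock held for the whole section
        shared["counter"] += 1
        unrelated_work()             # lock still held here

def early_release(unrelated_work):
    with lock:                       # lock held only across the access
        shared["counter"] += 1
    unrelated_work()                 # runs outside the lock
```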
89

Modeling and Pattern Matching Security Properties with Dependence Graphs

Fåk, Pia January 2005 (has links)
With an increasing number of computers connected to the Internet, the number of malicious attacks on computer systems also rises. The key to any successful attack on an information system is finding a weak spot in the victim system, and some types of software bugs can constitute such weak spots. This thesis presents and evaluates a technique for statically detecting such security-related bugs. It models both the analyzed program and different types of security bugs with dependence graphs; errors are detected by searching the program graph model for subgraphs matching the security bug models. The technique has been implemented in a prototype tool called GraphMatch. Its accuracy and performance have been measured by analyzing open source application code for missing input validation vulnerabilities. The test results show that the accuracy obtained so far is low, and the complexity of the algorithms currently used causes analysis times of several hours even for fairly small projects. Further research is needed to determine whether the performance and accuracy can be improved.
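The detection idea can be sketched as subgraph matching on a dependence graph (illustrative code using networkx; GraphMatch's actual program and bug models are more elaborate, and the node and edge labels below are assumptions):

```python
# Illustrative sketch using networkx (GraphMatch's program and bug
# models are more elaborate): the program is a dependence graph, the
# bug model a small graph, and a match of the model as a subgraph
# signals a potential vulnerability.

import networkx as nx
from networkx.algorithms import isomorphism

# Program dependence graph: user input flows to a sink with no
# validation node on the path.
pdg = nx.DiGraph()
pdg.add_edge("read_input", "build_query", kind="data")
pdg.add_edge("build_query", "exec_sql", kind="data")

# Bug model: a source connected to a sink by a data dependence.
bug = nx.DiGraph()
bug.add_edge("source", "sink", kind="data")

matcher = isomorphism.DiGraphMatcher(
    pdg, bug, edge_match=lambda a, b: a["kind"] == b["kind"])
print(matcher.subgraph_is_isomorphic())   # True -> potential bug found
```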
90

Ranking source code static analysis warnings for continuous monitoring of free/libre/open source software repositories

Athos Coimbra Ribeiro 22 June 2018 (has links)
While there is a wide variety of both open source and proprietary source code static analyzers available in the market, each of them usually performs better on a small set of problems, making it hard to choose one single tool to rely on when examining a program. Combining the analyses of different tools may reduce the number of false negatives, but yields a corresponding increase in the number of false positives (which is already high for many tools). An interesting solution, then, is to filter these results to identify the issues least likely to be false positives. This work presents kiskadee, a system to support the use of static analysis during software development by providing carefully ranked static analysis reports. First, it runs multiple static analyzers on the source code. Then, using a classification model, the potential bugs detected by the static analyzers are ranked by importance, with critical flaws ranked first and potential false positives ranked last. To train kiskadee's classification model, we post-analyze the reports generated by three tools on synthetic test cases provided by the US National Institute of Standards and Technology. To make our technique as general as possible, we limit our data to the reports themselves, excluding other information such as change histories or code metrics. The features extracted from these reports are used to train a set of decision trees using AdaBoost to create a stronger classifier, achieving 0.8 classification accuracy (the combined false positive rate of the tools used was 0.61). Finally, we use this classifier to rank static analyzer alarms based on the probability that a given alarm is an actual bug. Our experimental results show that, on average, when inspecting warnings ranked by kiskadee, one encounters 5.2 times fewer false positives before each bug than when using a randomly sorted warning list.
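The ranking step can be sketched with scikit-learn (the feature columns and training data below are assumed placeholders, not kiskadee's actual features): train AdaBoost over shallow decision trees on labeled warnings, then sort new warnings by the predicted probability of being a true positive.

```python
# Sketch of the ranking step with scikit-learn; the feature columns
# and training data are assumed placeholders, not kiskadee's actual
# features.

import numpy as np
from sklearn.ensemble import AdaBoostClassifier
from sklearn.tree import DecisionTreeClassifier

# One row per warning, e.g. (tool id, warning category id, number of
# tools that flagged the same location); label 1 = real bug.
X = np.array([[0, 3, 2], [1, 3, 1], [2, 1, 3], [0, 1, 1],
              [1, 2, 3], [2, 2, 1], [0, 2, 2], [1, 1, 1]])
y = np.array([1, 0, 1, 0, 1, 0, 1, 0])

model = AdaBoostClassifier(
    estimator=DecisionTreeClassifier(max_depth=1), n_estimators=50)
model.fit(X, y)

# Rank unseen warnings by the predicted probability of being a bug.
new_warnings = np.array([[2, 3, 3], [1, 1, 1], [0, 2, 2]])
p_bug = model.predict_proba(new_warnings)[:, 1]
ranked = new_warnings[np.argsort(-p_bug)]   # most likely bugs first
print(ranked)
```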
