Detecting bugs plays a significant role in software development. Bugs may lead to unexpected behaviors. An attacker can gain control over a system by exploiting its bugs. Usually, an attack can be triggered by user's input. Unchecked user input can cause serious problems in a program. In order to prevent this situation, user's input must be checked carefully before it can be used. To provide the information of where user's input can affect a program, the taint dataflow analysis is being considered. In this thesis, we introduce a concurrent solution to perform static taint dataflow analysis. The goal is to find the statements of the program dependent on user input and inform the developers to validate those. We provides a method for the static concurrent taint dataflow analysis based on sequential static taint dataflow analysis. Static dataflow analysis is time consuming. This research addresses the challenge of efficiently analyzing the dataflow. Our experimental shows that our concurrent taint dataflow analysis improves the speed of analyzing complex programs.
01 January 2017
The major goal of this dissertation is to enhance software security by provably correct enforcement of in-depth policies. In-depth security policies allude to heterogeneous specification of security strategies that are required to be followed before and after sensitive operations. Prospective security is the enforcement of security, or detection of security violations before the execution of sensitive operations, e.g., in authorization, authentication and information flow. Retrospective security refers to security checks after the execution of sensitive operations, which is accomplished through accountability and deterrence. Retrospective security frameworks are built upon auditing in order to provide sufficient evidence to hold users accountable for their actions and potentially support other remediation actions. Correctness and efficiency of audit logs play significant roles in reaching the accountability goals that are required by retrospective, and consequently, in-depth security policies. This dissertation addresses correct audit logging in a formal framework. Leveraging retrospective controls beside the existing prospective measures enhances security in numerous applications. This dissertation focuses on two major application spaces for in-depth enforcement. The first is to enhance prospective security through surveillance and accountability. For example, authorization mechanisms could be improved by guaranteed retrospective checks in environments where there is a high cost of access denial, e.g., healthcare systems. The second application space is the amelioration of potentially flawed prospective measures through retrospective checks. For instance, erroneous implementations of input sanitization methods expose vulnerabilities in taint analysis tools that enforce direct flow of data integrity policies. In this regard, we propose an in-depth enforcement framework to mitigate such problems. We also propose a general semantic notion of explicit flow of information integrity in a high-level language with sanitization. This dissertation studies the ways by which prospective and retrospective security could be enforced uniformly in a provably correct manner to handle security challenges in legacy systems. Provable correctness of our results relies on the formal Programming Languages-based approach that we have taken in order to provide software security assurance. Moreover, this dissertation includes the implementation of such in-depth enforcement mechanisms for a medical records web application.
Automatic malware analysis is an essential part of today's computer security practices. Nearly one million malware samples were delivered to the analysts on a daily basis on year 2014 alone while the number of samples submitted for analysis increases almost exponentially each year. Given the size of the threat we are facing today and the amount of malicious codes emerging every day, the ability to automatically analyze unknown and unwanted software is critically important more than ever. On the other hand, malware writers adapt their malicious codes to new security measurements to protect them from being exposed and detected. This is usually achieved by employing obfuscation techniques that complicate the reverse engineering and analysis of the code by adding lots of unnecessary and irrelevant computations. Most of the malicious samples found in the wild are obfuscated and equipped with complicated anti-analysis defenses intended to hide the malicious intent of the malware by defeating the analysis and/or increasing the analysis time. Deobfuscation (reversing the obfuscation) requires automatic techniques to extract the original logic embedded in the obfuscated code for further analysis. Presumably the deobfuscated code requires less analysis time and is easier to analyze compared to the obfuscated one. Previous approaches in this regard target specific types of obfuscations by making strong assumptions about the underlying protection scheme leaving opportunities for the adversaries to attack. This work addresses this limitation by proposing new program analysis techniques that are effective against code obfuscations while being generic by minimizing the assumptions about the underlying code. We found that standard program analysis techniques, including well-known data and control flow analyses and/or symbolic execution, suffer from imprecision due to the obfuscation and show how to mitigate this loss of precision. Using more precise program analysis techniques, we propose a deobfuscation technique that is successful in reversing the complex obfuscation techniques such as virtualization-obfuscation and/or Return-Oriented Programming (ROP).
In recent years there has been an increasing interest in dynamic taint tracing of compiled software as a powerful analysis method for security and other purposes. Most existing approaches are highly application specific and tends to sacrifice precision in favor of performance. In this thesis project a generic taint tracing tool has been developed that can deliver high precision taint information. By allowing an arbitrary number of taint labels to be stored for every tainted byte, accurate taint propagation can be achieved for values that are derived from multiple input bytes. The tool has been developed for x86 Linux systems using the dynamic binary instrumentation framework Valgrind. The basic theory of taint tracing and multi-label taint propagation is discussed, as well as the main concepts of implementing a taint tracing tool using dynamic binary instrumentation. The impact of multi-label taint propagation on performance and precision is evaluated. While multi-label taint propagation has a considerable impact on performance, experiments carried out using the tool show that large amounts of taint information is lost with approximate methods using only one label per tainted byte.
A Security Related and Evidence-Based Holistic Ranking and Composition Framework for Distributed ServicesChowdhury, Nahida Sultana 05 1900 (has links)
Indiana University-Purdue University Indianapolis (IUPUI) / The number of smart mobile devices has grown at a significant rate in recent years. This growth has resulted in an exponential number of publicly available mobile Apps. To help the selection of suitable Apps, from various offered choices, the App distribution platforms generally rank/recommend Apps based on average star ratings, the number of installs, and associated reviews ― all the external factors of an App. However, these ranking schemes typically tend to ignore critical internal factors (e.g., bugs, security vulnerabilities, and data leaks) of the Apps. The AppStores need to incorporate a holistic methodology that includes internal and external factors to assign a level of trust to Apps. The inclusion of the internal factors will describe associated potential security risks. This issue is even more crucial with newly available Apps, for which either user reviews are sparse, or the number of installs is still insignificant. In such a scenario, users may fail to estimate the potential risks associated with installing Apps that exist in an AppStore. This dissertation proposes a security-related and evidence-based ranking framework, called SERS (Security-related and Evidence-based Ranking Scheme) to compare similar Apps. The trust associated with an App is calculated using both internal and external factors (i.e., security flaws and user reviews) following an evidence-based approach and applying subjective logic principles. The SERS is formalized and further enhanced in the second part of this dissertation, resulting in its enhanced version, called as E-SERS (Enhanced SERS). These enhancements include an ability to integrate any number of sources that can generate evidence for an App and consider the temporal aspect and reputation of evidence sources. Both SERS and E-SERS are evaluated using publicly accessible Apps from the Google PlayStore and the rankings generated by them are compared with prevalent ranking techniques such as the average star ratings and the Google PlayStore Rankings. The experimental results indicate that E-SERS provides a comprehensive and holistic view of an App when compared with prevalent alternatives. E-SERS is also successful in identifying malicious Apps where other ranking schemes failed to address such vulnerabilities. In the third part of this dissertation, the E-SERS framework is used to propose a trust-aware composition model at two different granularities. This model uses the trust score computed by E-SERS, along with the probability of an App belonging to the malicious category, as the desired attributes for selecting a composition as the two granularities. Finally, the trust-aware composition model is evaluated with the average star rating parameter and the trust score. A holistic approach, as proposed by E-SERS, to computer a trust score will benefit all kinds of Apps including newly published Apps that follow proper security measures but initially struggle in the AppStore rankings due to a lack of a large number of reviews and ratings. Hence, E-SERS will be helpful both to the developers and users. In addition, the composition model that uses such a holistic trust score will enable system integrators to create trust-aware distributed systems for their specific needs.
Kelkar, Soham P.
No description available.
Most of the people in the industrial world are using several web applications every day. Many of those web applications contain vulnerabilities that can allow attackers to steal sensitive data from the web application's users. One way to detect these vulnerabilities is to have a penetration tester examine the web application. A common way to train penetration testers to find vulnerabilities is to challenge them with realistic web applications that contain vulnerabilities. The penetration tester's assignment is to try to locate and exploit the vulnerabilities in the web application. Training on the same web application twice will not provide any new challenges to the penetration tester, because the penetration tester already knows how to exploit all the vulnerabilities in the web application. Therefore, a vast number of web applications and variants of web applications are needed to train on. This thesis describes a tool designed and developed to automatically generate vulnerable web applications. First a web application is prepared, so that the tool can generate a vulnerable version of the web application. The tool injects Cross Site Scripting (XSS) and Cross Site Request Forgery (CSRF) vulnerabilities in prepared web applications. Different variations of the same vulnerability can also be injected, so that different methods are needed to exploit the vulnerability depending on the variation. A purpose of the tool is that it should generate web applications which shall be used to train penetration testers, and some of the vulnerabilities the tool can inject, cannot be detected by current free web application vulnerability scanners, and would thus need to be detected by a penetration tester. To inject the vulnerabilities, the tool uses abstract syntax trees and taint analysis to detect where vulnerabilities can be injected in the prepared web applications. Tests confirm that web application vulnerability scanners cannot find all the vulnerabilities on the web applications which have been generated by the tool.
Recherche de vulnérabilités logicielles par combinaison d'analyses de code binaire et de frelatage (Fuzzing) / Software vulnerability research combining fuzz testing and binary code analysisBekrar, Sofia 10 October 2013 (has links)
Le frelatage (ou fuzzing) est l'une des approches les plus efficaces pour la détection de vulnérabilités dans les logiciels de tailles importantes et dont le code source n'est pas disponible. Malgré une utilisation très répandue dans l'industrie, les techniques de frelatage "classique" peuvent avoir des résultats assez limités, et pas toujours probants. Ceci est dû notamment à une faible couverture des programmes testés, ce qui entraîne une augmentation du nombre de faux-négatifs; et un manque de connaissances sur le fonctionnement interne de la cible, ce qui limite la qualité des entrées générées. Nous présentons dans ce travail une approche automatique de recherche de vulnérabilités logicielles par des processus de test combinant analyses avancées de code binaire et frelatage. Cette approche comprend : une technique de minimisation de suite de tests, pour optimiser le rapport entre la quantité de code testé et le temps d'exécution ; une technique d'analyse de couverture optimisée et rapide, pour évaluer l'efficacité du frelatage ; une technique d'analyse statique, pour localiser les séquences de codes potentiellement sensibles par rapport à des patrons de vulnérabilités; une technique dynamique d'analyse de teinte, pour identifier avec précision les zones de l'entrée qui peuvent être à l'origine de déclenchements de vulnérabilités; et finalement une technique évolutionniste de génération de test qui s'appuie sur les résultats des autres analyses, afin d'affiner les critères de décision et d'améliorer la qualité des entrées produites. Cette approche a été mise en œuvre à travers une chaîne d'outils intégrés et évalués sur de nombreuses études de cas fournies par l'entreprise. Les résultats obtenus montrent son efficacité dans la détection automatique de vulnérabilités affectant des applications majeures et sans accès au code source. / Fuzz testing (a.k.a. fuzzing) is one of the most effective approaches for discovering security vulnerabilities in large and closed-source software. Despite their wide use in the software industry, traditional fuzzing techniques suffer from a poor coverage, which results in a large number of false negatives. The other common drawback is the lack of knowledge about the application internals. This limits their ability to generate high quality inputs. Thus such techniques have limited fault detection capabilities. We present an automated smart fuzzing approach which combines advanced binary code analysis techniques. Our approach has five components. A test suite reduction technique, to optimize the ratio between the amount of covered code and the execution time. A fast and optimized code coverage measurement technique, to evaluate the fuzzing effectiveness. A static analysis technique, to locate potentially sensitive sequences of code with respect to vulnerability patterns. An origin-aware dynamic taint analysis technique, to precisely identify the input fields that may trigger potential vulnerabilities. Finally, an evolutionary based test generation technique, to produce relevant inputs. We implemented our approach as an integrated tool chain, and we evaluated it on numerous industrial case studies. The obtained results demonstrate its effectiveness in automatically discovering zero-day vulnerabilities in major closed-source applications. Our approach is relevant to both defensive and offensive security purposes.
Analyse du flot de contrôle multivariante : application à la détection de comportements des programmes / Multivariant control flow analysis : application to behavior detection in programsLaouadi, Rabah 14 December 2016 (has links)
Sans exécuter une application, est-il possible de prévoir quelle est la méthode cible d’un site d’appel ? Est-il possible de savoir quels sont les types et les valeurs qu’une expression peut contenir ? Est-il possible de déterminer de manière exhaustive l’ensemble de comportements qu’une application peut effectuer ? Dans les trois cas, la réponse est oui, à condition d’accepter une certaine approximation. Il existe une classe d’algorithmes − peu connus à l’extérieur du cercle académique − qui analysent et simulent un programme pour calculer de manière conservatrice l’ensemble des informations qui peuvent être véhiculées dans une expression.Dans cette thèse, nous présentons ces algorithmes appelés CFAs (acronyme de Control Flow Analysis), plus précisément l’algorithme multivariant k-l-CFA. Nous combinons l’algorithme k-l-CFA avec l’analyse de taches (taint analysis),qui consiste à suivre une donnée sensible dans le flot de contrôle, afin de déterminer si elle atteint un puits (un flot sortant du programme). Cet algorithme, en combinaison avec l’interprétation abstraite pour les valeurs, a pour objectif de calculer de manière aussi exhaustive que possible l’ensemble des comportements d’une application. L’un des problèmes de cette approche est le nombre élevé de faux-positifs, qui impose un post-traitement humain. Il est donc essentiel de pouvoir augmenter la précision de l’analyse en augmentant k.k-l-CFA est notoirement connu comme étant très combinatoire, sa complexité étant exponentielle dans la valeur de k. La première contribution de cette thèse est de concevoir un modèle et une implémentation la plus efficace possible, en séparant soigneusement les parties statiques et dynamiques de l’analyse, pour permettre le passage à l’échelle. La seconde contribution de cette thèse est de proposer une nouvelle variante de CFA basée sur k-l-CFA, et appelée *-CFA, qui consiste à faire du paramètre k une propriété de chaque variante, de façon à ne l’augmenter que dans les contextes qui le justifient.Afin d’évaluer l’efficacité de notre implémentation de k-l-CFA, nous avons effectué une comparaison avec le framework Wala. Ensuite, nous validons l’analyse de taches et la détection de comportements avec le Benchmark DroidBench. Enfin, nous présentons les apports de l’algorithme *-CFA par rapport aux algorithmes standards de CFA dans le contexte d’analyse de taches et de détection de comportements. / Without executing an application, is it possible to predict the target method of a call site? Is it possible to know the types and values that an expression can contain? Is it possible to determine exhaustively the set of behaviors that an application can perform? In all three cases, the answer is yes, as long as a certain approximation is accepted.There is a class of algorithms - little known outside of academia - that can simulate and analyze a program to compute conservatively all information that can be conveyed in an expression. In this thesis, we present these algorithms called CFAs (Control flow analysis), and more specifically the multivariant k-l-CFA algorithm.We combine k-l-CFA algorithm with taint analysis, which consists in following tainted sensitive data inthe control flow to determine if it reaches a sink (an outgoing flow of the program).This combination with the integration of abstract interpretation for the values, aims to identify asexhaustively as possible all behaviors performed by an application.The problem with this approach is the high number of false positives, which requiresa human post-processing treatment.It is therefore essential to increase the accuracy of the analysis by increasing k.k-l-CFA is notoriously known as having a high combinatorial complexity, which is exponential commensurately with the value of k.The first contribution of this thesis is to design a model and most efficient implementationpossible, carefully separating the static and dynamic parts of the analysis, to allow scalability.The second contribution of this thesis is to propose a new CFA variant based on k-l-CFA algorithm -called *-CFA - , which consists in keeping locally for each variant the parameter k, and increasing this parameter in the contexts which justifies it.To evaluate the effectiveness of our implementation of k-l-CFA, we make a comparison with the Wala framework.Then, we do the same with the DroidBench benchmark to validate out taint analysis and behavior detection. Finally , we present the contributions of *-CFA algorithm compared to standard CFA algorithms in the context of taint analysis and behavior detection.
Ahmed, Md Salman
11 January 2022
Proactive approaches for preventing attacks through security measurements are crucial for preventing sophisticated attacks. However, proactive measures must employ qualitative security metrics and systemic measurement methodologies to assess security guarantees, as some metrics (e.g., entropy) used for evaluating security guarantees may not capture the capabilities of advanced attackers. Also, many proactive measures (e.g., data pointer protection or data flow integrity) suffer performance bottlenecks. This dissertation identifies and represents attack vectors as metrics using the knowledge from advanced exploits and demonstrates the effectiveness of the metrics by quantifying attack surface and enabling ways to tune performance vs. security of existing defenses by identifying and prioritizing key attack vectors for protection. We measure attack surface by quantifying the impact of fine-grained Address Space Layout Randomization (ASLR) on code reuse attacks under the Just-In-Time Return-Oriented Programming (JITROP) threat model. We conduct a comprehensive measurement study with five fine-grained ASLR tools, 20 applications including six browsers, one browser engine, and 25 dynamic libraries. Experiments show that attackers only need several seconds (1.5-3.5) to find various code reuse gadgets such as the Turing Complete gadget set. Experiments also suggest that some code pointer leaks allow attackers to find gadgets more quickly than others. Besides, the instruction-level single-round randomization can restrict Turing Complete operations by preventing up to 90% of gadgets. This dissertation also identifies and prioritizes critical data pointers for protection to enable the capability to tune between performance vs. security. We apply seven rule-based heuristics to prioritize externally manipulatable sensitive data objects/pointers. Our evaluations using 33 ground truths vulnerable data objects/pointers show the successful detection of 32 ground truths with a 42% performance overhead reduction compared to AddressSanitizer. Our results also suggest that sensitive data objects are as low as 3%, and on average, 82% of data objects do not need protection for real-world applications. / Doctor of Philosophy / Proactive approaches for preventing attacks through security measurements are crucial to prevent advanced attacks because reactive measures can become challenging, especially when attackers enter sophisticated attack phases. A key challenge for the proactive measures is the identification of representative metrics and measurement methodologies to assess security guarantees, as some metrics used for evaluating security guarantees may not capture the capabilities of advanced attackers. Also, many proactive measures suffer performance bottlenecks. This dissertation identifies and represents attack elements as metrics using the knowledge from advanced exploits and demonstrates the effectiveness of the metrics by quantifying attack surface and enabling the capability to tune performance vs. security of existing defenses by identifying and prioritizing key attack elements. We measure the attack surface of various software applications by quantifying the available attack elements of code reuse attacks in the presence of fine-grained Address Space Layout Randomization (ASLR), a defense in modern operating systems. ASLR makes code reuse attacks difficult by making the attack components unavailable. We perform a comprehensive measurement study with five fine-grained ASLR tools, real-world applications, and libraries under an influential code reuse attack model. Experiments show that attackers only need several seconds (1.5-3.5) to find various code reuse elements. Results also show the influence of one attack element over another and one defense strategy over another strategy. This dissertation also applies seven rule-based heuristics to prioritize externally manipulatable sensitive data objects/pointers – a type of attack element – to enable the capability to tune between performance vs. security. Our evaluations using 33 ground truths vulnerable data objects/pointers show the successful identification of 32 ground truths with a 42% performance overhead reduction compared to AddressSanitizer, a memory error detector. Our results also suggest that sensitive data objects are as low as 3% of all objects, and on average, 82% of objects do not need protection for real-world applications.
Page generated in 0.1324 seconds