1 |
Generation of dynamic control-dependence graphs for binary programsPogulis, Jakob January 2014 (has links)
Dynamic analysis of binary files is an area of computer science that has many purposes. It is useful when it comes to debugging software in a development environment and the developer needs to know which statements affected the value of a specific variable. But it is also useful when analyzing a software for potential vulnerabilities, where data controlled by a malicious user could potentially result in the software executing adverse commands or executing malicious code. In this thesis a tool has been developed to perform dynamic analysis of x86 binaries in order to generate dynamic control-dependence graphs over the execution. These graphs can be used to determine which conditional statements that resulted in a certain outcome. The tool has been developed for x86 Linux systems using the dynamic binary instrumentation framework PIN, developed and maintained by Intel. Techniques for utilizing the additional information about the control flow for a program available during the dynamic analysis in order to improve the control flow information have been implemented and tested. The basic theory of dynamic analysis as well as dynamic slicing is discussed, and a basic overview of the implementation of a dynamic analysis tool is presented. The impact on the performance of the dynamic analysis tool for the techniques used to improve the control flow graph is significant, but approaches to improving the performance are discussed.
|
2 |
Program Dependence Graph Generation and Analysis for Source Code Plagiarism Detection / Generering och analys av programberoendegrafer för detektering av plagiat i källkodHolma, Niklas January 2012 (has links)
Systems and tools that finds similarities among essays and reports are widely used by todays universities and schools to detect plagiarism. Such tools are however insufficient when used for source code comparisons since they are fragile to the most simplest forms of diguises. Other methods that analyses intermediate forms such as token strings, syntax trees and graph representations have shown to be more effective than using simple textual matching methods. In this master thesis report we discuss how program dependence graphs, an abstract representation of a programs semantics, can be used to find similar procedures. We also present an implementation of a system that constructs approximated program dependence graphs from the abstract syntax tree representation of a program. Matching procedures are found by testing graph pairs for either sub-graph isomorphism or graph monomorphism depending on whether structured transfer of control has been used. Under a scenario based evaluation our system is compared to Moss, a popular plagiarism detection tool. The result shows that our system is more or least as effective than Moss in finding plagiarized procedured independently on the type of modifications used. / System och verktyg som hittar likheter mellan uppsatser och rapporter används i stor omfattning av dagens universitet och skolor för att hitta plagiat bland studenters inlämningar. Sådana verktyg är dock otillräckliga när de används för att jämföra programkod eftersom de är svaga mot de enklaste formerna av modifikationer. Andra metoder som analyserar mellanstegsformer såsom tokensträngar, syntaxträd och grafrepresentationer har visat sig vara mer effektiva än att använda sig av enkla textuella metoder. I denna examensuppsats diskuterar vi hur programberoendegrafer, en abstrakt representation av en programs semantik, kan användas för att hitta jämförelsevis liknande procedurer. Vi presenterar också ett system som konstruerar approximerade programberoendegrafer från det abstrakta syntaxträdet av ett program. Matchande procedurer hittas genom att testa grafpar för antingen sub-graf isomorfism eller monomorfism beroende på om strukturerad byte av kontrolflöde har använts. I en scenariobaserad utvärdering jämför vi vårt system mot Moss, ett populärt verktyg för att detektera plagiat. Resultaten visar att vårt system är lika eller mer effektivt som Moss att detektera plagierade procedurer oberoende av de typer av modifikationer som använts.
|
3 |
A Static Slicing Tool for Sequential Java ProgramsDevaraj, Arvind January 2007 (has links) (PDF)
A program slice consists of a subset of the statements of a program that can potentially affect values computed at some point of interest. Such a point of interest along with a set of variables is called a slicing criterion. Slicing tools are useful for several applications, such as program understanding, testing, program integration, and so forth. Slicing object oriented programs has some special problems that need to be addressed due to features like inheritance, polymorphism and dynamic binding. Alias analysis is important for precision of slices. In this thesis we implement a slicing tool for sequential Java programs in the Soot framework. Soot is a front-end for Java developed at McGill University and it provides several forms of intermediate code. We have integrated the slicer into the framework. We also propose an improved technique for intraprocedural points-to analysis. We have implemented this technique and compare the results of the analysis with those for a flow-insensitive scheme in Soot. Performance results of the slicer are reported for several benchmarks.
|
4 |
Dependence analysis for inferring information flow properties in Spark ADA programsThiagarajan, Hariharan January 1900 (has links)
Master of Science / Department of Computing and Information Sciences / John Hatcliff / With the increase in development of safety and security critical systems, it is important to have more sophisticated methods for engineering such systems. It can be difficult to understand and verify critical properties of these systems because of their ever growing size and complexity. Even a small error in a complex system may result in casualty or significant monetary loss. Consequently, there is a rise in the demand for scalable and accurate techniques to enable faster development and verification of high assurance systems.
This thesis focuses on discovering dependencies between various parts of a system and leveraging that knowledge to infer information flow properties and to verify security policies specified for the system. The primary contribution of this thesis is a technique to build dependence graphs for languages which feature abstraction and refinement. Inter-procedural slicing and inter-procedural chopping are the techniques used to analyze the properties of the system statically.
The approach outlined in this thesis provides a domain-specific language to query the information flow properties and to specify security policies for a critical system. The spec- ified policies can then be verified using existing static analysis techniques. All the above contributions are integrated with a development environment used to develop the critical system. The resulting software development tool helps programmers develop, infer, and verify safety and security systems in a single unified environment.
|
5 |
Automatic Generation of Simulation Models from DesignsAxling, Erik January 2007 (has links)
<p>When working with embedded systems, secure and fast applications are desired. To achieve this the applications needs to be analyzed and optimized so that they will not be deadlocked or communicate inefficiently. For this purpose an analysis program that can track communications, deadlocks and response times is needed. Operating System Embedded, OSE, is a wide spread real-time operating system that is used in embedded systems. OSE-applications are excellent candidates for analysis and there exists such a tool, VirtualTime, for that purpose. To analyze an OSE-application a model needs to be written that VirtualTime can analyze. This takes up time and effort as the models can require a lot of work to write.</p><p>In this thesis we have investigated and implemented a prototype that translates OSE-application code into VirtualTime simulation model code. We used the transformation tool TXL to translate communication and timing behaviors. In the translation one needs to preserve the communication and timing behavior and throw away other unnecessary code in the OSE-application. This complicates the translation and sophisticated methods like backward slicing might be necessary. A proposed method in this thesis could help with the problem.</p>
|
6 |
Software Architecture-Based Failure PredictionMohamed, ATEF 28 September 2012 (has links)
Depending on the role of software in everyday life, the cost of a software failure can sometimes be unaffordable. During system execution, errors may occur in system components and failures may be manifested due to these errors. These errors differ with respect to their effects on system behavior and consequent failure manifestation manners. Predicting failures before their manifestation is important to assure system resilience. It helps avoid the cost of failures and enables systems to perform corrective actions prior to failure occurrences. However, effective runtime error detection and failure prediction techniques encounter a prohibitive challenge with respect to the control flow representation of large software systems with intricate control flow structures. In this thesis, we provide a technique for failure prediction from runtime errors of large software systems. Aiming to avoid the possible difficulties and inaccuracies of the existing Control Flow Graph (CFG) structures, we first propose a Connection Dependence Graph (CDG) for control flow representation of large software systems. We describe the CDG structure and explain how to derive it from program source code. Second, we utilize the proposed CDG to provide a connection-based signature approach for control flow error detection. We describe the monitor structure and present the error checking algorithm. Finally, we utilize the detected errors and erroneous state parameters to predict failure occurrences and modes during system runtime. We craft runtime signatures based on these errors and state parameters. Using system error and failure history, we determine a predictive function (an estimator) for each failure mode based on these signatures. Our experimental evaluation for these techniques uses a large open-source software (PostgreSQL 8.4.4 database system). The results show highly efficient control flow representation, error detection, and failure prediction techniques. This work contributes to software reliability by providing a simple and accurate control flow representation and utilizing it to detect runtime errors and predict failure occurrences and modes with high accuracy. / Thesis (Ph.D, Computing) -- Queen's University, 2012-09-25 23:44:12.356
|
7 |
A Study of Backward Compatible Dynamic Software UpdateJanuary 2015 (has links)
abstract: Dynamic software update (DSU) enables a program to update while it is running. DSU aims to minimize the loss due to program downtime for updates. Usually DSU is done in three steps: suspending the execution of an old program, mapping the execution state from the old program to a new one, and resuming execution of the new program with the mapped state. The semantic correctness of DSU depends largely on the state mapping which is mostly composed by developers manually nowadays. However, the manual construction of a state mapping does not necessarily ensure sound and dependable state mapping. This dissertation presents a methodology to assist developers by automating the construction of a partial state mapping with a guarantee of correctness.
This dissertation includes a detailed study of DSU correctness and automatic state mapping for server programs with an established user base. At first, the dissertation presents the formal treatment of DSU correctness and the state mapping problem. Then the dissertation presents an argument that for programs with an established user base, dynamic updates must be backward compatible. The dissertation next presents a general definition of backward compatibility that specifies the allowed changes in program interaction between an old version and a new version and identified patterns of code evolution that results in backward compatible behavior. Thereafter the dissertation presents formal definitions of these patterns together with proof that any changes to programs in these patterns will result in backward compatible update. To show the applicability of the results, the dissertation presents SitBack, a program analysis tool that has an old version program and a new one as input and computes a partial state mapping under the assumption that the new version is backward compatible with the old version.
SitBack does not handle all kinds of changes and it reports to the user in incomplete part of a state mapping. The dissertation presents a detailed evaluation of SitBack which shows that the methodology of automatic state mapping is promising in deal with real world program updates. For example, SitBack produces state mappings for 17-75% of the changed functions. Furthermore, SitBack generates automatic state mapping that leads to successful DSU. In conclusion, the study presented in this dissertation does assist developers in developing state mappings for DSU by automating the construction of state mappings with a correctness guarantee, which helps the adoption of DSU ultimately. / Dissertation/Thesis / Doctoral Dissertation Computer Science 2015
|
8 |
Automatic Generation of Simulation Models from DesignsAxling, Erik January 2007 (has links)
When working with embedded systems, secure and fast applications are desired. To achieve this the applications needs to be analyzed and optimized so that they will not be deadlocked or communicate inefficiently. For this purpose an analysis program that can track communications, deadlocks and response times is needed. Operating System Embedded, OSE, is a wide spread real-time operating system that is used in embedded systems. OSE-applications are excellent candidates for analysis and there exists such a tool, VirtualTime, for that purpose. To analyze an OSE-application a model needs to be written that VirtualTime can analyze. This takes up time and effort as the models can require a lot of work to write. In this thesis we have investigated and implemented a prototype that translates OSE-application code into VirtualTime simulation model code. We used the transformation tool TXL to translate communication and timing behaviors. In the translation one needs to preserve the communication and timing behavior and throw away other unnecessary code in the OSE-application. This complicates the translation and sophisticated methods like backward slicing might be necessary. A proposed method in this thesis could help with the problem.
|
9 |
Cloneless: Code Clone Detection via Program Dependence Graphs with Relaxed ConstraintsSimko, Thomas J 01 June 2019 (has links) (PDF)
Code clones are pieces of code that have the same functionality. While some clones may structurally match one another, others may look drastically different. The inclusion of code clones clutters a code base, leading to increased costs through maintenance. Duplicate code is introduced through a variety of means, such as copy-pasting, code generated by tools, or developers unintentionally writing similar pieces of code. While manual clone identification may be more accurate than automated detection, it is infeasible due to the extensive size of many code bases. Software code clone detection methods have differing degree of success based on the analysis performed. This thesis outlines a method of detecting clones using a program dependence graph and subgraph isomorphism to identify similar subgraphs, ultimately illuminating clones. The project imposes few constraints when comparing code segments to potentially reveal more clones.
|
10 |
Program Slicing for Modern Programming LanguagesGalindo Jiménez, Carlos Santiago 24 September 2025 (has links)
[ES] Producir software eficiente y efectivo es una tarea que parece ser tan difícil ahora como lo era para los primeros ordenadores. Con cada mejora de hardware y herramientas de desarrollo (como son compiladores y analizadores), la demanda de producir software más rápido y más complejo ha ido aumentando. Por tanto, todos estos análisis auxiliares ahora son una parte integral del desarrollo de programas complejos.
La fragmentación de programas es una técnica de análisis estático, que da respuesta a ¿Qué partes del programa pueden afectar a esta instrucción? Su aplicación principal es la depuración de programas, porque puede acotar la zona de código a la que el programador debe prestar atención mientras busca la causa de un error. También tiene otras muchas aplicaciones, como pueden ser la paralelización y especialización de programas, la comprensión de programas y el mantenimiento. En los últimos años, su uso más común ha sido como preproceso a otros análisis con alto coste computacional, para reducir el tamaño del programa a procesar, y, por tanto, el tiempo de ejecución de estos. La estructura de datos más popular para fragmentar programas es el system dependence graph (SDG), un grafo dirigido que representa las instrucciones de un programa como vértices, y sus dependencias como arcos. Los dos tipos principales de dependencias son las de control y las de datos, que encapsulan el flujo de control y datos en todas las ejecuciones posibles de un programa.
El área de lenguajes de programación está en eterno cambio, ya sea por la aparición de nuevos lenguajes o por el lanzamiento de nuevas características en lenguajes existentes, como pueden ser Java o Erlang. Sin embargo, la fragmentación de programas se definió originalmente para el paradigma imperativo. Aun así, hay características populares en lenguajes imperativos, como las arrays y las excepciones, que aún no tienen una representación eficiente y/o completa en el SDG. Otros paradigmas, como el funcional o el orientado a objetos, sufren también de un soporte parcial en el SDG.
Esta tesis presenta mejoras para construcciones comunes en la programación moderna, dividiendo contribuciones en las enfocadas a dependencias de control y las enfocadas a datos. Para las primeras, especificamos una nueva representación de instrucciones catch, junto a una descripción completa del resto de instrucciones relacionadas con excepciones. También analizamos las técnicas punteras para saltos incondicionales (p.e., break), y mostramos los riesgos de combinarlas con otras técnicas para objetos, llamadas o excepciones. A continuación, ponemos nuestra mirada en la concurrencia, con una formalización de un depurador de especificaciones CSP reversible y causal-consistente. En cuanto a las dependencias de datos, se enfocan en técnicas sensibles al contexto (es decir, más precisas en presencia de rutinas y sus llamadas). Exploramos las dependencias de datos generadas en programas concurrentes por memoria compartida, redefiniendo las dependencias de interferencia para hacerlas sensibles al contexto. A continuación, damos un pequeño rodeo por el campo de la indecidibilidad, en el que demostramos que ciertos tipos de análisis de datos sobre programas con estructuras de datos complejas son indecidibles. Finalmente, ampliamos un trabajo previo sobre la fragmentación de estructuras de datos complejas, combinándolo con la fragmentación tabular, que la hace sensible al contexto.
Además, se han desarrollado o extendido múltiples librerías de código con las mejoras mencionadas anteriormente. Estas librerías nos han permitido realizar evaluaciones empíricas para algunos de los capítulos, y también han sido publicadas bajo licencias libres, que permiten a otros desarrolladores e investigadores extenderlas y contrastarlas con sus propuestas, respectivamente. Las herramientas resultantes son dos fragmentadores de código para Java y Erlang, y un depurador de CSP reversible y causal-consistente. / [CA] La producció de programari eficient i eficaç és una tasca que resulta tan difícil hui dia com ho va ser durant l'adveniment dels ordinadors. Per cada millora de maquinari i ferramentes per al desenvolupament, augmenta sovint la demanda de programes, així com la seua complexitat. Com a conseqüència, totes aquestes anàlisis auxiliars esdevenen una part integral del desenvolupament de programari.
La fragmentació de programes és una tècnica d'anàlisi estàtica, que respon a "Quines parts d'aquest programa poden afectar a aquesta instrucció?". L'aplicació principal d'aquesta tècnica és la depuració de programes, per la seua capacitat de reduir la llargària d'un programa sense canviar el seu funcionament respecte a una instrucció que està fallant, delimitant així l'àrea del codi en què el programador busca l'origen de l'errada. Tot i això, té moltes altres aplicacions, com la paral·lelització i especialització de programes o la comprensió de programes i el seu manteniment. Durant els darrers anys, l'ús més freqüent de la fragmentació de programes ha sigut com a <<preprocés>> abans d'altres anàlisis amb un alt cost computacional, per tal de reduir-ne el temps requerit per realitzar-les. L'estructura de dades més popular per fragmentar programes és el system dependence graph (SDG), un graf dirigit representant-ne les instruccions d'un programa amb vèrtexs i les seues dependències amb arcs. Els dos tipus principals de dependència són el de control i el de dades, aquests encapsulen el flux de control i dades a totes les possibles execucions d'un programa.
L'àrea dels llenguatges de programació s'hi troba en constant evolució, o bé per l'aparició de nous llenguatges, o bé per noves característiques per als preexistents, com poden ser Java o Erlang. No obstant això, la fragmentació de programes s'hi va definir originalment per al paradigma imperatiu. Tot i que, també hi trobem característiques populars als llenguatges imperatius, com els arrays i les excepcions, que encara no en tenen una representació eficient i/o completa al SDG. Altres paradigmes, com el funcional o l'orientat a objectes, pateixen també d'un suport reduit al SDG.
Aquesta tesi presenta millores per a construccions comunes de la programació moderna, dividint les contribucions entre aquelles enfocades a les dependències de control i aquelles enfocades a dades. Per a les primeres, hi especifiquem una nova representació d'instruccions catch, junt amb una descripció de la resta d'instruccions relacionades amb excepcions. També hi analitzem les tècniques capdavanteres de fragmentació de salts incondicionals, i hi mostrem els riscs de combinar-ne-les amb altres tècniques per a objectes, instruccions de crida i excepcions. A continuació, hi posem la nostra atenció en la concurrència, amb una formalització d'un depurador d'especificacions CSP reversible i causal-consistent. Respecte a les dependències de dades, dirigim els nostres esforços a produir tècniques sensibles al context (és a dir, que es mantinguen precises en presència de procediments). Hi explorem les dependències de dades generades en programes concurrents amb memòria compartida, redefinint-ne les dependències d'interferència per a fer-ne-les sensibles al context. Seguidament, hi demostrem la indecidibilitat d'alguns tipus d'anàlisis de dades per a programes amb estructures de dades complexes. Finalment, hi ampliem un treball previ sobre la fragmentació d'estructures de dades complexes, combinant-lo amb la fragmentació tabular, fent-hi-la sensible al context.
A més a més, s'han desenvolupat o estés diverses llibreries de codi amb les millores esmentades prèviament. Aquestes llibreries ens han permés avaluar empíricament alguns dels capítols i també han sigut publicades sota llicències lliures, fet que permet a altres desenvolupadors i investigadors poder estendre-les i contrastar-les, respectivament. Les ferramentes resultants són dos fragmentadors de codi per a Java i Erlang, i un depurador CSP. / [EN] Producing efficient and effective software is a task that has remained difficult since the advent of computers. With every improvement on hardware and developer tooling (e.g., compilers and checkers), the demand for software has increased even further. This means that auxiliary analyses have become integral in developing complex software systems.
Program slicing is a static analysis technique that gives answers to "What parts of the program can affect a given statement?", and similar questions. Its main application is debugging, as it can reduce the amount of code on which a programmer must look for a mistake or bug. Other applications include program parallelization and specialisation, program comprehension, and software maintenance. Lately, it has mostly been applied as a pre-processing step in other expensive static analyses, to lower the size of the program and thus the analyses' runtime. The most popular data structure in program slicing is the system dependence graph (SDG), which represents statements as nodes and dependences as arcs between them. The two main types of dependences are control and data dependences, which encapsulate the control and data flow throughout every possible execution of a program.
Programming languages are an ever-expanding subject, with new features coming to new releases of popular and up-and-coming languages like Python, Java, Erlang, Rust, and Go. However, program slicing was originally defined for (and has been mostly focused on) imperative programming languages. Even then, some popular elements of the imperative paradigm, such as arrays and exceptions do not have an efficient or sometimes complete representation in the SDG. Other paradigms, such as functional or object-oriented also suffer from partial support in the SDG.
This thesis presents improvements for common programming constructs, and its contributions are split into control and data dependence. For the former, we (i) specify a new representation of catch statements, along with a full description of other exception-handling constructs. We also (ii) analyse the current state-of-the-art technique for unconditional jumps (e.g., break or return), and show the risks of combining it with other popular techniques. Then, we focus on concurrency, with a (iii) formalisation of a reversible, causal-consistent debugger for CSP specifications. Switching to data dependences, we focus our contributions on making existing techniques context-sensitive (i.e., more accurate in the presence of routines or functions). We explore the data dependences involved in shared-memory concurrent programs, (iv) redefining interference dependence to make it context-sensitive. Afterwards, we take a small detour to (v) explore the decidability of various data analyses on programs with (and without) complex data structures and routine calls. Finally, we (vi) extend our previous work on slicing complex data structures to combine it with tabular slicing, which provides context-sensitivity.
Additionally, throughout this thesis, multiple supporting software libraries have been written or extended with the aforementioned improvements to program slicing. These have been used to provide empirical evaluations, and are available under libre software licenses, such that other researchers and software developers may extend or contrast them against their own proposals. The resulting tools are two program slicers for Java and Erlang, and a causal-consistent reversible debugger for CSP. / Galindo Jiménez, CS. (2024). Program Slicing for Modern Programming Languages [Tesis doctoral]. Universitat Politècnica de València. https://doi.org/10.4995/Thesis/10251/211183
|
Page generated in 0.0267 seconds