Global ETD Search

1	Searching the space of representations : reasoning through transformations for mathematical problem solving Raggi, Daniel January 2016 (has links) The role of representation in reasoning has been long and widely regarded as crucial. It has remained one of the fundamental considerations in the design of information-processing systems and, in particular, for computer systems that reason. However, the process of change and choice of representation has struggled to achieve a status as a task for the systems themselves. Instead, it has mostly remained a responsibility for the human designers and programmers. Many mathematical problems have the characteristic of being easy to solve only after a unique choice of representation has been made. In this thesis we examine two classes of problems in discrete mathematics which follow this pattern, in the light of automated and interactive mechanical theorem provers. We present a general notion of structural transformation, which accounts for the changes of representation seen in such problems, and link this notion to the existing Transfer mechanism in the interactive theorem prover Isabelle/HOL. We present our mechanisation in Isabelle/HOL of some specific transformations identified as key in the solutions of the aforementioned mathematical problems. Furthermore, we present some tools that we developed to extend the functionalities of the Transfer mechanism, designed with the specific purpose of searching efficiently the space of representations using our set of transformations. We describe some experiments that we carried out using these tools, and analyse these results in terms of how close the tools lead us to a solution, and how desirable these solutions are. The thorough qualitative analysis we present in this thesis reveals some promise as well as some challenges for the far-reaching problem of representation in reasoning, and the automation of the processes of change and choice of representation. 511
2	The Logic of Hereditary Harrop Formulas as a Specification Logic for Hybrid Battell, Chelsea January 2016 (has links) Hybrid is a two-level logical framework that supports higher-order abstract syntax (HOAS), where a specification logic (SL) extends the class of object logics (OLs) we can reason about. We develop a new Hybrid SL and formalize its metatheory, proving weakening, contraction, exchange, and cut admissibility; results that greatly simplify reasoning about OLs in systems providing HOAS. The SL is a sequent calculus defined as an inductive type in Coq and we prove properties by structural induction over SL sequents. We also present a generalized SL and metatheory statement, allowing us to prove many cases of such theorems in a general way and understand how to identify and prove the difficult cases. We make a concrete and measurable improvement to Hybrid with the new SL formalization and provide a technique for abstracting such proofs, leading to a condensed presentation, greater understanding, and a generalization that may be instantiated to other logics. cut admissibility structural rules interactive theorem proving inductive reasoning Coq logical frameworks higher-order abstract syntax
3	Low-Level Static Analysis for Memory Usage and Control Flow Recovery Bockenek, Joshua Alexander 07 March 2023 (has links) Formal characterization of the memory used by a program is an important basis for security analyses, compositional verification, and identification of noninterference. However, soundly proving memory usage requires operating on the assembly level due to the semantic gap between high-level languages and the code that processors actually execute. Automated methods, such as model checking, would not be able to handle many interesting functions due to the undecidability of memory usage. Fully-interactive methods do not scale well either. Sound control flow recovery (CFR) is also important for binary decompilation, verification, patching, and security analysis. It lifts raw unstructured data into a form that allows reasoning over behavior and semantics. However, doing so requires interpreting the behavior of the program when indirect or dynamic control flow exists, creating a recursive dependency. This dissertation tackles the first property with two contributions that perform proof generation combined with interactive theorem proving in a semi-automated manner: an untrusted tool extracts as much information as it can from the functions under test and then generates all the necessary proofs to be completed in a theorem prover. The first, Floyd-style approach still requires significant manual effort but provides good flexibility and ensures no paths are analyzed more than once. In contrast, the second, Hoare-style approach sacrifices some flexibility and avoidance of repeated path evaluation in order to achieve much greater automation. However, neither approach can handle the dynamic control flow caused by indirect branching. The second property is handled by the second set of contributions of this dissertation. These two contributions provide fully-automated methods of recovering control flow from binaries even in the presence of indirect branching. When such dynamic control flow cannot be overapproximatively resolved, it is clearly noted in the resultant output. In the first approach to control flow recovery, a structured memory representation allows for general analysis of control flow in the presence of indirection, gaining scalability by utilizing context-free function analysis. It supports various aliasing conditions via the usage of nondeterminism, with multiple output states potentially being produced from a given input state. The second approach adds function context and abstract interpretation-inspired modeling of the C++ exception handling (EH) application binary interface (ABI), allowing for the discovery of previously-unknown paths while maintaining or increasing automation. / Doctor of Philosophy / Modern computer programs are so complicated that individual humans cannot manually check all but the smallest programs to make sure they are correct and secure. This is even worse if you want to reduce the trusted computing base (TCB), the stuff that you have to assume is working right in order to say a program will execute correctly. The TCB includes your computer itself, but also whatever tools were used to take the programs written by programmers and transform them into a form suitable for running on a computer. Such tools are often called compilers. One method of reducing the TCB is to examine the lowest-level representation of that program, the assembly or even machine code that is actually run by your computer. This poses unique challenges, because operating on such a low level means you do not have a lot of the structure that a more abstract, higher-level representation provides. Also, sometimes you want to formally state things about a program's behavior; that is, say things about what it does with a high degree of confidence based on mathematical principles. You may also want to verify that one or more of those statements are true. If you want to be detailed about that behavior, you may need to know all of the chunks, or regions, in random-access memory (RAM) that are used by that program. RAM, henceforth referred to as just ``memory'', is your computer's first place of storage for the information used by running programs. This is distinct from long-term storage devices like hard disk drives (HDDs) or solid-state drives (SSDs), which programs do not normally have direct access to. Unfortunately, there is no one single approach that can automatically determine with absolute certainty for all cases the exact regions of memory that are read or written. This is called undecidability, and means that you need to approximate those memory regions a lot of the time if you want to have a significant degree of automation. An underapproximation, an approach that only gives you some of the regions, is not useful for formal statements as it might miss out on some behavior; it is unsound. This means that you need an overapproximation, an approach that is guaranteed to give you at least the regions read or written. Therefore, the first contribution of this dissertation is a preliminary approach to such an overapproximation. This approach is based on the work of Robert L. Floyd, focusing on the direct control flow (where the steps of a program go) in an individual function (structured program component). It still requires a lot of user effort, including having to manually specify the regions in memory that were possibly used and do a lot of work to prove that those regions are (overapproximatively) correct, so our tests were limited in scope. The second contribution automated a lot of the manual work done for the first approach. It is based on the work of Charles Antony Richard Hoare, who developed a verification approach focusing on the syntax (the textual form) of programs. This contribution produces what we call formal memory usage certificates (FMUCs), which are formal statements that the regions of memory they describe are the only ones possibly affected by the functions under test. These statements also come with proofs, which for our work are like scripts used to verify that the things the FMUCs assert about the corresponding functions can be shown to be true given the assumptions our FMUCs have. Sometimes those proofs are incomplete, though, such as when there is a loop (repeated bit of code) in a function under test or one function calls (executes) another. In those cases, a user has to finish the proof, in the first case by weakening (removing information from) the FMUC's statements about the loop and in the second by composing, or combining, the FMUCs of the two functions. Additionally, this second approach cannot handle dynamic control flow. Such control flow occurs when the low-level instructions a program uses to move to another place in that program do not have a pre-stored location to go to. Instead, that location is supplied as the program is running. This is opposed to direct control flow, where the place to go to is hard-coded into the program when it is compiled. The tool also cannot not deal with aliasing, which is when different state parts (value-holding components) of a program contain the same value and that value is used as the numeric address or identifier of a location in memory. Specifically, it cannot deal with potential aliasing, when there is not enough information available to determine if the state parts alias or not. Because of that, we had to add extra assumptions to the FMUCs that limited them to those cases where ambiguous memory-referencing state parts referred to separate memory locations. Finally, it specifically requires assembly as input; you cannot directly supply a binary to it. This is also true of the first contribution. Because of this, we were able to test on more functions than before, but not a lot more. Not being able to deal with dynamic control flow is a big problem, as almost all programs use it. For example, when a function reaches its end, it has to figure out where to return to based on the current state of the program (in the previous contribution, this was done manually). This means that control flow recovery (CFR) is very important for many applications, including decompilation (converting a program back into a higher-level form), patching (updating a program in place without modifying the original code and recompiling it), and low-level analysis or verification in general. However, as you may have noticed from earlier in this paragraph, in order to deal with such dynamic control flow you need to figure out what the possible destinations are for the individual control flow transfers. That can require knowing where you came from in the program, which means that analysis of dynamic control flow requires context (in this context, information previously obtained in the program). Even worse, it is another undecidable problem that requires overapproximation. To soundly recover control flow, we developed Hoare graphs (HGs), the third contribution of this dissertation. HGs use memory models that take the form of forests, or collections of tree data structures. A single tree represents a region in memory that may have multiple symbolic references, or abstract representations of a value. The children of the tree represent regions used in the program that are enclosed within their parent tree elements. Now, instead of assuming that all ambiguous memory regions are separate, we can use them under various aliasing conditions. We have also implemented support for some forms of dynamic control flow. Those that are not supported are clearly marked in the resultant HG. No user interaction is required even when loops are present thanks to a methodology that automatically reduces the amount of information present at a re-executed instruction until the information stabilizes. Function composition is also automatic now thanks to a method that treats each function as its own context in a safe and automated way, reducing memory consumption of our tool and allowing larger programs to be examined. In the process we did lose the ability to deal with recursion (functions that call themselves or call other functions that call back to the original), though. Lastly, we provided the ability to directly load binaries into the tool, no external disassembly (converting machine code into human-readable instructions) needed. This all allowed much greater testing than before, with applications to multiple programs and program libraries. The fourth and final contribution of this dissertation iterates on the HG work by narrowing focus to the concept of exceptional control flow. Specifically, it models the kind of exception handling used by C++ programs. This is important as, if you want to explore a program's behavior, you need to know all the places it goes to. If you use a tool that does not model exception handling, you may end up missing paths of execution caused by unwinding. This is when an exception is thrown and propagates up through the program's current stack of function calls, potentially reaching programmer-supplied handling for that exception. Despite this, commonplace tools for static, low-level program analysis do not model such unwinding. The control flow graph (CFG) produced by our exception-aware tool are called exceptional interprocedural control flow graphs (EICFGs). These provide information about the exceptions being thrown and what paths they take in the program when they are thrown. Additional improvements are a better methodology for handling dynamic control flow as well adding back in support for recursion. All told, this allowed us to explore even more programs than ever before. Formal Verification x86-64 Assembly Interactive Theorem Proving Static Binary Analysis Memory Usage Control Flow Recovery Exception Handling
4	Reasoning Using Higher-Order Abstract Syntax in a Higher-Order Logic Proof Environment: Improvements to Hybrid and a Case Study Martin, Alan J. 24 January 2011 (has links) We present a series of improvements to the Hybrid system, a formal theory implemented in Isabelle/HOL to support specifying and reasoning about formal systems using higher-order abstract syntax (HOAS). We modify Hybrid's type of terms, which is built definitionally in terms of de Bruijn indices, to exclude at the type level terms with `dangling' indices. We strengthen the injectivity property for Hybrid's variable-binding operator, and develop rules for compositional proof of its side condition, avoiding conversion from HOAS to de Bruijn indices. We prove representational adequacy of Hybrid (with these improvements) for a lambda-calculus-like subset of Isabelle/HOL syntax, at the level of set-theoretic semantics and without unfolding Hybrid's definition in terms of de Bruijn indices. In further work, we prove an induction principle that maintains some of the benefits of HOAS even for open terms. We also present a case study of the formalization in Hybrid of a small programming language, Mini-ML with mutable references, including its operational semantics and a type-safety property. This is the largest case study in Hybrid to date, and the first to formalize a language with mutable references. We compare four variants of this formalization based on the two-level approach adopted by Felty and Momigliano in other recent work on Hybrid, with various specification logics (SLs), including substructural logics, formalized in Isabelle/HOL and used in turn to encode judgments of the object language. We also compare these with a variant that does not use an intermediate SL layer. In the course of the case study, we explore and develop new proof techniques, particularly in connection with context invariants and induction on SL statements. formal proof higher-order abstract syntax higher-order logic Hybrid interactive theorem proving Isabelle/HOL representational adequacy set-theoretic semantics variable binding
5	Reasoning Using Higher-Order Abstract Syntax in a Higher-Order Logic Proof Environment: Improvements to Hybrid and a Case Study Martin, Alan J. 24 January 2011 (has links) We present a series of improvements to the Hybrid system, a formal theory implemented in Isabelle/HOL to support specifying and reasoning about formal systems using higher-order abstract syntax (HOAS). We modify Hybrid's type of terms, which is built definitionally in terms of de Bruijn indices, to exclude at the type level terms with `dangling' indices. We strengthen the injectivity property for Hybrid's variable-binding operator, and develop rules for compositional proof of its side condition, avoiding conversion from HOAS to de Bruijn indices. We prove representational adequacy of Hybrid (with these improvements) for a lambda-calculus-like subset of Isabelle/HOL syntax, at the level of set-theoretic semantics and without unfolding Hybrid's definition in terms of de Bruijn indices. In further work, we prove an induction principle that maintains some of the benefits of HOAS even for open terms. We also present a case study of the formalization in Hybrid of a small programming language, Mini-ML with mutable references, including its operational semantics and a type-safety property. This is the largest case study in Hybrid to date, and the first to formalize a language with mutable references. We compare four variants of this formalization based on the two-level approach adopted by Felty and Momigliano in other recent work on Hybrid, with various specification logics (SLs), including substructural logics, formalized in Isabelle/HOL and used in turn to encode judgments of the object language. We also compare these with a variant that does not use an intermediate SL layer. In the course of the case study, we explore and develop new proof techniques, particularly in connection with context invariants and induction on SL statements. formal proof higher-order abstract syntax higher-order logic Hybrid interactive theorem proving Isabelle/HOL representational adequacy set-theoretic semantics variable binding
6	Reasoning Using Higher-Order Abstract Syntax in a Higher-Order Logic Proof Environment: Improvements to Hybrid and a Case Study Martin, Alan J. 24 January 2011 (has links) We present a series of improvements to the Hybrid system, a formal theory implemented in Isabelle/HOL to support specifying and reasoning about formal systems using higher-order abstract syntax (HOAS). We modify Hybrid's type of terms, which is built definitionally in terms of de Bruijn indices, to exclude at the type level terms with `dangling' indices. We strengthen the injectivity property for Hybrid's variable-binding operator, and develop rules for compositional proof of its side condition, avoiding conversion from HOAS to de Bruijn indices. We prove representational adequacy of Hybrid (with these improvements) for a lambda-calculus-like subset of Isabelle/HOL syntax, at the level of set-theoretic semantics and without unfolding Hybrid's definition in terms of de Bruijn indices. In further work, we prove an induction principle that maintains some of the benefits of HOAS even for open terms. We also present a case study of the formalization in Hybrid of a small programming language, Mini-ML with mutable references, including its operational semantics and a type-safety property. This is the largest case study in Hybrid to date, and the first to formalize a language with mutable references. We compare four variants of this formalization based on the two-level approach adopted by Felty and Momigliano in other recent work on Hybrid, with various specification logics (SLs), including substructural logics, formalized in Isabelle/HOL and used in turn to encode judgments of the object language. We also compare these with a variant that does not use an intermediate SL layer. In the course of the case study, we explore and develop new proof techniques, particularly in connection with context invariants and induction on SL statements. formal proof higher-order abstract syntax higher-order logic Hybrid interactive theorem proving Isabelle/HOL representational adequacy set-theoretic semantics variable binding
7	Reasoning Using Higher-Order Abstract Syntax in a Higher-Order Logic Proof Environment: Improvements to Hybrid and a Case Study Martin, Alan J. January 2010 (has links) We present a series of improvements to the Hybrid system, a formal theory implemented in Isabelle/HOL to support specifying and reasoning about formal systems using higher-order abstract syntax (HOAS). We modify Hybrid's type of terms, which is built definitionally in terms of de Bruijn indices, to exclude at the type level terms with `dangling' indices. We strengthen the injectivity property for Hybrid's variable-binding operator, and develop rules for compositional proof of its side condition, avoiding conversion from HOAS to de Bruijn indices. We prove representational adequacy of Hybrid (with these improvements) for a lambda-calculus-like subset of Isabelle/HOL syntax, at the level of set-theoretic semantics and without unfolding Hybrid's definition in terms of de Bruijn indices. In further work, we prove an induction principle that maintains some of the benefits of HOAS even for open terms. We also present a case study of the formalization in Hybrid of a small programming language, Mini-ML with mutable references, including its operational semantics and a type-safety property. This is the largest case study in Hybrid to date, and the first to formalize a language with mutable references. We compare four variants of this formalization based on the two-level approach adopted by Felty and Momigliano in other recent work on Hybrid, with various specification logics (SLs), including substructural logics, formalized in Isabelle/HOL and used in turn to encode judgments of the object language. We also compare these with a variant that does not use an intermediate SL layer. In the course of the case study, we explore and develop new proof techniques, particularly in connection with context invariants and induction on SL statements. formal proof higher-order abstract syntax higher-order logic Hybrid interactive theorem proving Isabelle/HOL representational adequacy set-theoretic semantics variable binding
8	Proof-producing resolution of indirect jumps in the binary intermediate representation BIR / Bevis-producerande bestämning av indirekta hopp i den binära mellanliggande representationen BIR Westerberg, Adrian January 2021 (has links) HolBA is a binary analysis library that can be used to formally verify binary programs using contracts. It is developed in the interactive theorem prover HOL4 to achieve a high degree of trust in verification, the result of verification is a machine-checked proof demonstrating its correctness. This thesis presents two proof-producing procedures. The first resolve indirect jumps in BIR, the binary intermediate language used in HolBA, given their possible targets. The second transfers contracts proved on resolved BIR programs without indirect jumps to the original ones containing indirect jumps. This allows the existing weakest precondition generator to automatically prove contracts on loop-free BIR fragments containing indirect jumps. The implemented proof-producing procedures were evaluated on a small binary program and generated synthetic BIR programs. It was found that the first proof-producing procedure is not very efficient, which could pose a problem when verifying large binary programs. Future work could include improving the efficiency of the first proof-producing procedure and integrate it with an external tool that automatically finds possible targets of indirect jumps. / HolBA är ett bibliotek för binär analys som kan användas för att formellt verifiera binära program med kontrakt. Det är utvecklat i den interaktiva teorembevisaren HOL4 för att åstadkomma en hög grad av tillit till verifiering, resultatet av verifiering är ett maskin-kontrollerat bevis som demonstrerar dess korrekthet. Detta arbete presenterar två bevis-producerande procedurer. Den första bestämmer indirekta hopp i BIR, den binära mellanliggande representationen som används i HolBA, givet deras möjliga mål. Den andra överför kontrakt bevisade för bestämda BIR program utan indirekta hopp till originalen med indirekta hopp. Detta möjliggör den existerande svagaste förutsättning generatorn att automatiskt bevisa kontrakt för sling-fria BIR fragment som innehåller indirekta hopp. De implementerade bevis-producerande procedurerna utvärderades med ett litet binärt program och med genererade syntetiska BIR program. Det visades att den första bevis-producerande proceduren inte är särskilt effektiv, vilket skulle kunna vara ett problem vid verifiering av stora binära program. Framtida arbete skulle kunna inkludera att förbättra effektiviteten för den första bevis-producerande proceduren och att integrera den med ett externt verktyg som automatiskt kan hitta de möjliga målen för indirekta hopp. Formal verification Binary analysis Interactive theorem proving Indirect jump resolution Formell verifiering Binär analys Interaktiv teorembevisning Indirekt hopp bestämning Computer Sciences Datavetenskap (datalogi)

Search results