1

Recompiling DSP applications to x86 using LLVM IR

Stenberg, David January 2014 (has links)
This thesis describes the design and implementation of a prototype LLVM compiler backend, x86-64p, that compiles code written for a DSP architecture, FADER, into executables for the x86-64 architecture. The prototype takes LLVM IR generated for the FADER architecture and compiles x86-64 executables that emulate the properties of the DSP architecture, e.g. the multiple address spaces, the big-endianness and the support for fixed-point arithmetic. The backend is compared to a previous solution, C-Emu, that converts the DSP code to normal C code which is then compiled with a normal x86-64 compiler. The two solutions are compared in terms of correctness, debuggability and performance. The created prototype handles code containing low-level architectural assumptions better than C-Emu. However, the added emulation reduces the debuggability and performance of the generated executables; we have measured a runtime overhead of up to a factor of two compared to C-Emu. We also present some possible solutions for these issues.
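
The kind of emulation such a backend must inject can be illustrated with a short, self-contained sketch. The helpers below are illustrative only and are not taken from the x86-64p backend: a saturating Q31 fixed-point multiply and a big-endian load, two of the DSP properties the abstract mentions that plain x86-64 code does not provide natively. FADER's actual word width, rounding and saturation rules are assumptions here.

```cpp
#include <algorithm>
#include <cstdint>

// Saturating Q31 fixed-point multiply: widen to 64 bits, rescale by the
// fractional width, and clamp to the 32-bit range (assumed DSP behaviour).
int32_t q31_mul(int32_t a, int32_t b) {
    int64_t p = (static_cast<int64_t>(a) * static_cast<int64_t>(b)) >> 31;
    p = std::clamp<int64_t>(p, INT32_MIN, INT32_MAX);
    return static_cast<int32_t>(p);
}

// Big-endian 32-bit load: the DSP lays words out big-endian, so on
// little-endian x86-64 the bytes must be reassembled in the opposite order.
uint32_t load_be32(const uint8_t* mem) {
    return (uint32_t(mem[0]) << 24) | (uint32_t(mem[1]) << 16) |
           (uint32_t(mem[2]) << 8)  |  uint32_t(mem[3]);
}
```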
2

Dekódování binárního kódu do vyšší formy reprezentace / Binary-Code Decoding to a High-Level Representation

Macko, Lukáš January 2015 (has links)
The thesis deals with reverse techniques in software engineering. It presents practical applications of software reverse engineering and the tools and approaches used. The topic of instruction decoding is discussed in detail. Two basic methods are presented: linear sweep and recursive descent. Their strengths and weaknesses are highlighted. Subsequently, a decompiler developed by AVG Technologies is introduced. The decompiler is retargetable; this feature allows applications from multiple platforms to be decompiled into various target languages. The aim of the thesis is to design and implement an algorithm for decoding binary files into a high-level representation. The designed algorithm is based on a modified recursive descent algorithm that uses control-flow information. In order to achieve more accurate decoding results, symbol table records and other additional information are used. The proposed algorithm was implemented in the AVG Technologies retargetable decompiler. The tests showed that the implemented algorithm improves function detection in decoded programs. Furthermore, the implemented solution makes it possible to decode files that could not be analysed with the previous version of the decompiler.
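
For readers unfamiliar with the two decoding strategies, the sketch below shows the core of a recursive-descent sweep, the approach the thesis modifies: decoding follows control flow from known entry points rather than scanning addresses linearly, so data embedded in a code section is not misread as instructions. The `Insn` record and the `decode_at` routine are hypothetical placeholders for an ISA-specific decoder, not part of the AVG Technologies decompiler.

```cpp
#include <cstdint>
#include <set>
#include <vector>

// Hypothetical decoded-instruction record; a real decoder is ISA-specific.
struct Insn {
    uint64_t addr, size;
    bool is_branch, is_call, is_terminator;   // terminator: ret or unconditional jump
    uint64_t target;                          // valid when is_branch or is_call
};

Insn decode_at(const std::vector<uint8_t>& image, uint64_t addr);  // assumed to exist

// Recursive-descent sweep: decode from each worklist address, follow branch
// and call targets, and stop a linear run at terminators or revisited code.
void recursive_descent(const std::vector<uint8_t>& image,
                       std::vector<uint64_t> worklist,
                       std::set<uint64_t>& visited) {
    while (!worklist.empty()) {
        uint64_t addr = worklist.back();
        worklist.pop_back();
        while (addr < image.size() && visited.insert(addr).second) {
            Insn insn = decode_at(image, addr);
            if (insn.is_branch || insn.is_call)
                worklist.push_back(insn.target);   // explore the branch/call target later
            if (insn.is_terminator)
                break;                             // the fall-through path ends here
            addr += insn.size;                     // otherwise continue with the next instruction
        }
    }
}
```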
3

Generický zpětný překlad programů v bajtkódu do vyšší formy reprezentace / Generic Decompilation of Bytecode into High-Level Representation

Mrázek, Petr January 2013 (has links)
The work describes the methods and principles of decompilation and gives basic information about reverse engineering and its use in both software engineering and engineering in general. Furthermore, it introduces the decompiler developed within the Lissom project at BUT FIT. The goal of the work is to design and implement a retargetable decompiler for bytecode that extends the original decompiler.
4

Rekonstrukce datových typů při zpětném překladu kódu / Reconstruction of Data Types for Decompilation

Matula, Peter January 2013 (has links)
This document describes methods for the reconstruction of data types during decompilation. It defines the concept of reverse engineering and introduces the decompiler developed by the Lissom project. It presents existing methods for reconstructing simple and complex data types, and explains in detail approaches based on data-flow analysis and on the analysis of memory-operation offsets. The core of this thesis is the design of a new technique for reconstructing simple and complex data types, suitable for deployment in the retargetable decompiler environment of the Lissom project. The basic principles of the new technique, its implementation and the related changes in the decompiler and its intermediate language are described. The solution is tested, and the conclusion discusses the achievements, shortcomings and directions for further work.
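
One simple form of offset-based reconstruction, mentioned in the abstract, can be sketched as follows: collect the byte offsets and access widths observed on a base pointer and synthesize a struct layout from them. This is a deliberately simplified illustration of the general idea, not the technique designed in the thesis; the field naming, padding and width handling are assumptions.

```cpp
#include <cstdint>
#include <map>
#include <string>
#include <vector>

// One observed memory access relative to a base pointer: byte offset and
// access width, as collected from load/store instructions in the IR.
struct Access { int64_t offset; unsigned bytes; };

// Offset-based reconstruction sketch: group accesses by offset, keep the
// widest access per offset, and emit one field per offset with padding for
// unobserved gaps, approximating the original struct layout.
std::string infer_struct(const std::vector<Access>& accesses) {
    std::map<int64_t, unsigned> fields;              // offset -> widest access seen
    for (const Access& a : accesses)
        fields[a.offset] = std::max(fields[a.offset], a.bytes);

    std::string out = "struct inferred {\n";
    int64_t pos = 0;
    int idx = 0;
    for (auto [off, bytes] : fields) {
        if (off > pos)                               // unobserved bytes become padding
            out += "    char pad" + std::to_string(idx) + "[" + std::to_string(off - pos) + "];\n";
        out += "    uint" + std::to_string(bytes * 8) + "_t field" + std::to_string(idx++) + ";\n";
        pos = off + bytes;
    }
    return out + "};\n";
}
```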
5

Compositional Decompilation using LLVM IR

Eklind, Robin January 2015 (has links)
Decompilation or reverse compilation is the process of translating low-level machine-readable code into high-level human-readable code. The problem is non-trivial due to the amount of information lost during compilation, but it can be divided into several smaller problems which may be solved independently. This report explores the feasibility of composing a decompilation pipeline from independent components, and the potential of exposing those components to the end-user. The components of the decompilation pipeline are conceptually grouped into three modules. Firstly, the front-end translates a source language (e.g. x86 assembly) into LLVM IR, a platform-independent low-level intermediate representation. Secondly, the middle-end structures the LLVM IR by identifying high-level control flow primitives (e.g. pre-test loops, 2-way conditionals). Lastly, the back-end translates the structured LLVM IR into a high-level target programming language (e.g. Go). The control flow analysis stage of the middle-end uses subgraph isomorphism search algorithms to locate control flow primitives in CFGs, both of which are described using Graphviz DOT files. The decompilation pipeline has been proven capable of recovering nested pre-test and post-test loops (e.g. while, do-while), and 1-way and 2-way conditionals (e.g. if, if-else) from LLVM IR. Furthermore, the data-driven design of the control flow analysis stage facilitates extensions to identify new control flow primitives. There is huge potential for future development. The Go output could be made more idiomatic by extending the post-processing stage, using components such as Grind by Russ Cox, which moves variable declarations closer to their usage. The language-agnostic aspects of the design will be validated by implementing components in other languages, e.g. data flow analysis in Haskell. Additional back-ends (e.g. Python output) will be implemented to verify that the general decompilation tasks (e.g. control flow analysis, data flow analysis) are handled by the middle-end. (BSc dissertation written during an ERASMUS exchange from Uppsala University to the University of Portsmouth.)
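
As a toy illustration of recognizing a control flow primitive in a CFG, the function below checks for the 2-way conditional (if-else) shape rooted at a given block: two distinct successors that each fall through to the same merge block. The report matches primitives generically with subgraph isomorphism over DOT-described graphs; this hard-coded structural check and the minimal CFG representation are simplifications introduced here for illustration.

```cpp
#include <map>
#include <string>
#include <vector>

// Tiny CFG: each basic block name maps to the names of its successors.
using CFG = std::map<std::string, std::vector<std::string>>;

// Simplified check for the 2-way conditional primitive rooted at `cond`:
// exactly two distinct successors, each with a single successor, and both
// of those successors are the same merge block.
bool is_if_else(const CFG& cfg, const std::string& cond, std::string& merge_out) {
    auto it = cfg.find(cond);
    if (it == cfg.end() || it->second.size() != 2) return false;
    const std::string& t = it->second[0];
    const std::string& f = it->second[1];
    if (t == f) return false;
    auto ti = cfg.find(t), fi = cfg.find(f);
    if (ti == cfg.end() || fi == cfg.end()) return false;
    if (ti->second.size() != 1 || fi->second.size() != 1) return false;
    if (ti->second[0] != fi->second[0]) return false;
    merge_out = ti->second[0];                 // both branches join at the merge block
    return true;
}
```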
6

Analýza a převod kódů do vyššího programovacího jazyka / Code Analysis and Transformation To a High-Level Language

Křoustek, Jakub Unknown Date (has links)
This paper describes methods and procedures used for code analysis and transformation. It gives basic information about the discipline of reverse engineering and its use in information technology. The primary objective is the construction of a generic reverse compiler, or decompiler, i.e. a tool that can translate a binary (or, optionally, symbolic machine code) back into a high-level language. This operation is highly dependent on the concrete instruction set and processor architecture. The problem is solved by describing the semantics of each instruction in a special language designed for this purpose. The output is high-level language code that is functionally equivalent to the input. The tool is therefore able to work with any instruction set, and code written for that instruction set can be transformed into the chosen high-level language. This proposal is implemented in practice as part of the Lissom project. A generic decompiler is a completely new idea. The thesis contains entirely new compiler-theory techniques and optimizations devised by the author.
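
The idea of driving translation from per-instruction semantic descriptions, rather than hard-coding each ISA, can be sketched with a small data-driven table. The template syntax below is purely hypothetical and is not the description language developed within the Lissom project; it only illustrates how a single translation routine can serve any instruction set for which such descriptions exist.

```cpp
#include <cstdio>
#include <map>
#include <string>

// Hypothetical, much-simplified stand-in for an instruction-semantics
// description: each mnemonic maps to a C-like expression template in which
// %0 and %1 are replaced by the instruction's operands.
const std::map<std::string, std::string> kSemantics = {
    {"add", "%0 = %0 + %1;"},
    {"sub", "%0 = %0 - %1;"},
    {"mov", "%0 = %1;"},
};

// Expand the template for one decoded instruction into high-level code.
std::string translate(const std::string& mnemonic,
                      const std::string& op0, const std::string& op1) {
    std::string tmpl = kSemantics.at(mnemonic);
    for (std::string::size_type pos; (pos = tmpl.find("%0")) != std::string::npos; )
        tmpl.replace(pos, 2, op0);
    for (std::string::size_type pos; (pos = tmpl.find("%1")) != std::string::npos; )
        tmpl.replace(pos, 2, op1);
    return tmpl;
}

int main() {
    // "add r1, r2"  ->  "r1 = r1 + r2;"
    std::printf("%s\n", translate("add", "r1", "r2").c_str());
}
```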
7

Migrace zdrojových kódů pomocí dekompilace / Source-Code Migration Using Decompilation

Korec, Tomáš January 2014 (has links)
This thesis deals with source-code migration of high-level programming languages using decompilation. The migration tool developed within the thesis is built on top of the middle-end and back-end parts of the Lissom project decompiler. Several compilers generating LLVM IR code from the input languages are discussed, and the compilers suitable for integration into the migration tool were chosen. The compiled LLVM IR code is the input of the decompiler's optimizing middle-end. The output of the migration tool is code in the C language or a Python-like language, generated by the decompiler's back-end. The input languages are Fortran and its dialects, C/C++/Objective-C/Objective-C++, and D. The thesis describes the problems connected with migrating these languages, their solutions, and ways to improve the quality and readability of the produced source code.
8

Statická detekce malware nad LLVM IR / Static Behavioral Malware Detection over LLVM IR

Surovič, Marek January 2016 (has links)
This thesis deals with methods for behavioral malware detection that use techniques of formal analysis and verification. The basis is the inference of tree automata from system-call dependency graphs, which are obtained by static analysis of LLVM IR. As part of the thesis, a detector prototype is implemented that uses the LLVM compiler infrastructure. For the experimental evaluation of the detector, a C/C++ compiler capable of generating malware mutations by means of obfuscating transformations is used. The results of preliminary experiments and possible future extensions of the detector are discussed in the conclusion of the thesis.
9

Rozvoj instrumentace programu při překladu / Development of Instrumentation during Compilation

Ševčík, Václav January 2020 (has links)
The focus of this master's thesis is on instrumentation during the compilation process in the LLVM compiler. The tool makes it possible to instrument memory accesses and functions. The instrumentation is realized by adding a new pass to LLVM's optimization phase. Information about variables is managed by the created framework, which is linked with the program. The instrumentation overhead increases the program's running time by about 14 % with indirect addressing switched off and by about 23 % with indirect addressing switched on. The main benefit of the work is the possibility of easy instrumentation of a program, which can even monitor the operation of local variables (through indirect addressing) and supports multithreaded programs. The framework is part of the Testos tool set, where it provides automatic instrumentation for the Spectra tool.
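
A minimal sketch of what an LLVM instrumentation pass of this kind can look like is shown below, using the new pass manager: before every store, a call to a runtime hook is inserted, and the hook is expected to be provided by a framework linked with the program. The hook name `__trace_store`, the choice to instrument only stores, and the exact API (which varies between LLVM versions) are assumptions, not the thesis's implementation; pass registration is omitted.

```cpp
#include "llvm/IR/DerivedTypes.h"
#include "llvm/IR/Function.h"
#include "llvm/IR/IRBuilder.h"
#include "llvm/IR/Instructions.h"
#include "llvm/IR/Module.h"
#include "llvm/IR/PassManager.h"

using namespace llvm;

namespace {
// Sketch of a memory-access instrumentation pass: before every store, insert
// a call to a runtime hook `__trace_store(addr)` provided by a linked-in
// framework (the hook name is illustrative, not the thesis's actual API).
struct TraceStoresPass : PassInfoMixin<TraceStoresPass> {
  PreservedAnalyses run(Function &F, FunctionAnalysisManager &) {
    Module *M = F.getParent();
    LLVMContext &Ctx = M->getContext();
    // void __trace_store(ptr): declared on demand in the instrumented module.
    FunctionCallee Hook = M->getOrInsertFunction(
        "__trace_store", Type::getVoidTy(Ctx), PointerType::getUnqual(Ctx));
    bool Changed = false;
    for (BasicBlock &BB : F)
      for (Instruction &I : BB)
        if (auto *SI = dyn_cast<StoreInst>(&I)) {
          IRBuilder<> B(SI);                       // insert right before the store
          B.CreateCall(Hook, {SI->getPointerOperand()});
          Changed = true;
        }
    return Changed ? PreservedAnalyses::none() : PreservedAnalyses::all();
  }
};
} // namespace
```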
