Global ETD Search

1	Tools, Techniques, and Trade-offs when Porting Large Software Systems to New Environments Kågström, Simon January 2008 (has links) Computer hardware and software evolve very fast. With the advent of chip-multiprocessors and symmetric multithreading, multiprocessor hardware configurations are becoming prevalent. For software, new hardware and requirements such as security, performance and maintainability drive the development of new runtime environments, virtual machines and programming methodologies. These trends present problems when porting legacy software. Multiprocessor hardware require ports of uniprocessor operating system kernels while new software environments might require that programs have to be ported to different languages. This thesis examines the tradeoff between performance and development effort for software porting with case studies in operating system porting to multiprocessors and tool support for porting C and C++ applications to Java virtual machines. The thesis consists of seven papers. The first paper is a survey of existing multiprocessor development approaches and focuses on the tradeoff between performance and implementation effort. The second and third papers describe the evolution a traditional lock-based multiprocessor port, going from a serialized “giant locked” port and evolving into a coarse-grained implementation. The fourth paper instead presents an alternative porting approach which aims to minimize development effort. The fifth paper describes a tool for efficient instrumentation of programs, which can be used during the development of large software systems such as operating system kernels. The sixth and seventh papers finally describe a binary translator which translates MIPS binaries into Java bytecode to allow low-effort porting of C and C++ applications to Java virtual machines. The first main contributions of this thesis is an in-depth investigation of the techniques used when porting operating system kernels to multiprocessors, focusing on development effort and performance. The traditional approach used in the second and third papers required longer development time than expected, and the alternative approach in the fourth paper can therefore be preferable in some cases. The second main contribution is the development of a binary translator that targets portability of C and C++ applications to J2ME devices. The last two papers show that the approach is functional and has good enough performance to be feasible in real-life situations. Operating systems Binary Translation
2	Dynamic Binary Translation on the .NET Platform Wright, Patrick Andrew 26 August 2014 (has links) Emulation is the practice of simulating one computer system on another. There are many methods of implementing an emulator. They exist on a performance continuum from simple interpretation to dynamic binary translation extended with various optimizations. Optimizations are diverse, including just in time compilation, large translation units, shadow stack, register mapping and many more. The goal of this thesis is to develop a high performance, portable emulator for the ARM v4 architecture without requiring substantial code analysis. This thesis describes the implementation of a dynamic binary translator translating to an intermediate language targeting a virtual machine. Targeting a virtual machine ensures that the emulator is portable. Optimizations implemented include forming large translation units and branch straightening in hot regions. The particular combination of translating to intermediate form for a virtual machine, and creating large translation units from hot regions does not seem to appear in the literature. The performance of the described dynamic binary translator exceeds the performance of an interpreter on the same platform by an order of magnitude. Code analysis was only used to straighten branches in hot regions. While many popular dynamic binary translation optimizations are not readily applicable when using a virtual machine target, the performance achieved shows that using virtual machine as translation target is viable method of implementing dynamic binary translator. / Graduate Dynamic binary translation
3	Low overhead dynamic binary translation for ARM D'Antras, Bernard January 2017 (has links) Driven by Moore's Law, many computer architectures - ARM, x86, MIPS, PowerPC, SPARC - have evolved from 32-bit to 64-bit. To support existing applications, these have all kept support for a 32-bit compatibility mode. However, this comes at a cost in hardware complexity, power consumption and development time. Dynamic binary translation - recompiling binaries into the new instruction set at runtime - can be used instead of specific hardware for this purpose. While this approach has previously been used to assist architecture transition, these translators have all traded-off performance and transparency, a measure of how accurately they emulate the 32-bit environment. This thesis addresses ARM's transition from AArch32 to AArch64 through MAMBO-X64, a dynamic binary translator developed to support this transition. A range of novel optimizations were devised to improve translation performance while maintaining strict transparency. This follows a common theme of exploiting existing hardware features such as hardware return prediction, virtual memory and virtualization extensions to offset translation overheads. HyperMAMBO-X64 - a variant of MAMBO-X64 integrated in a hypervisor - was also developed to support system-level translation while remaining transparent to guest operating systems. Results demonstrate that the cost of binary translation is reduced, delivering performance competitive with the manufacturer's hardware. Performance in several benchmarks even exceeds that from the integrated compatibility mode. Thus MAMBO-X64 not only provides a means for architectural upgrade, but also an alternative to the expense of the legacy support currently employed. 004
4	Specification-Driven Dynamic Binary Translation Tröger, Jens January 2005 (has links) Machine emulation allows for the simulation of a real or virtual machine, the source machine, on various host computers. A machine emulator interprets programs that are compiled for the emulated machine, but normally at a much reduced speed. Therefore, in order to increase the executions peed of such interpreted programs, a machine emulator may apply different dynamic optimization techniques. In our research we focus on emulators for real machines, i.e. existing computer architectures, and in particular on dynamic binary translation as the optimization technique. With dynamic binary translation, the machine instructions of the interpreted source program are translated in to machine instructions for the host machine during the interpretation of the program. Both, the machine emulator and its dynamic binary translator a resource and host machine specific, respectively, and are therefore traditionally hand-written. In this thesis we introduce the Walkabout/Yirr-Ma framework. Walkabout, initially developed by Sun Micro systems, allows among other things for the generation of instrumented machine emulators from a certain type of machine specification files. We extended Walkabout with our generic dynamic optimization framework ‘Yirr-Ma’ which defines an interface for the implementation of various dynamic optimizers: by instrumenting a Walkabout emulator’s instruction interpretation functions, Yirr-Ma observes and intercepts the interpretation of a source machine program, and applies dynamic optimizations to selected traces of interpreted instructions on demand. One instance of Yirr-Ma’s interface for dynamic optimizers implements our specification-driven dynamic binary translator, the major contribution of this thesis. At first we establish two things: a formal framework that describes the process of machine emulation by abstracting from real machines, and different classes of applicable dynamic optimizations. We define dynamic optimizations by a set of functions over the abstracted machine, and dynamic binary translation as one particular optimization function. Using this formalism, we then derive the upper bound for quality of dynamically translated machine instructions. Yirr-Ma’s dynamic binary translator implements the optimization functions of our formal framework by modules which are either generated from, or parameterized by, machine specification files. They thus allow for the adaptation of the dynamic binary translator to different source and host machines without hand-writing machine dependent code. Machine emulation dynamic optimization dynamic binary translation
5	Parallelization of virtual machines for efficient multi-processor emulation Chakravarthy, Ramapriyan Sreenath 09 November 2010 (has links) Simulation is an essential tool for computer systems research. The speed of the simulator has a first-order effect on what experiments can be run. The recent shift in the industry to multi-core processors has put even more pressure on simulation performance, especially since most simulators are single-threaded. In this thesis, we present QEMU-MP, a parallelized version of a fast functional simulator called QEMU. / text Computer simulation Functional simulation Parallelization QEMU Binary translation
6	Sistema de tradução binária de dois níveis para execução multi-ISA / Tow-level binary translation system for multiple-isa execution Fajardo Junior, Jair January 2011 (has links) Atualmente, a adição de uma nova função implementada em hardware em um processador não deve impor nenhuma mudança no conjunto de instruções (ISA – Instruction Set Architecture) suportado para atingir melhorias em seu desempenho. O objetivo é manter a compatibilidade retroativa e futura de programas já compilados. Todavia, este fato se torna, muitas vezes, um fator impeditivo para o aprimoramento ou desenvolvimento de uma nova arquitetura. Desta maneira, a utilização de mecanismos de Tradução Binária abre novas oportunidades aos projetistas, já que estes mecanismos permitem a execução de programas já compilados em arquiteturas que suportam conjuntos de instruções diferentes do previsto inicialmente. Assim, para eliminar o custo adicional apresentado por estes sistemas de tradução, será proposto um novo mecanismo de tradução binária dinâmico de dois níveis. Enquanto o primeiro nível é responsável pela tradução de facto das instruções do conjunto nativo para instruções de uma linguagem de máquina intermediária, o segundo nível otimiza estas instruções já traduzidas para serem executadas na arquitetura alvo. O sistema é totalmente flexível, pois pode suportar a tradução de conjuntos de instruções completamente diferentes; assim como a utilização de arquiteturas de hardware com as mais diversas características. Este trabalho apresenta o primeiro esforço nesta direção: um estudo de caso onde ocorre a tradução de código x86 para MIPS (linguagem intermediária), que será otimizado para ser executado em uma arquitetura que realiza reconfiguração dinâmica. Resta demonstrado que é possível manter a compatibilidade binária, com melhoria no desempenho em torno de 45% em média e consumo de energia semelhante ao da execução nativa. / In these days, every new added hardware feature must not change the underlying instruction set architecture (ISA), in order to avoid adaptation or recompilation of existing code. Therefore, Binary Translation (BT) opens new possibilities for designers, previously tied to a specific ISA and all its legacy hardware issues, since it allows the execution of already compiled applications on different architectures. To overcome the BT inherent performance penalty, we propose a new mechanism based on a dynamic two-level binary translation system. While the first level is responsible for the BT de facto to an intermediate machine language, the second level optimizes the already translated instructions to be executed on the target architecture. The system is totally flexible, supporting the porting of radically different ISAs and the employment of different target architectures. This work presents the first effort towards this direction: it translates code implemented in the x86 ISA to MIPS assembly (the intermediate language), which will be optimized by the target architecture: a dynamically reconfigurable architecture. In this work is showed that is possible to maintain binary compatibility with performance improvements on average 45% and similar energy consumption when compared to native execution. Microeletrônica Sistemas embarcados Binary translation Embedded systems Reconfigurable architectures
7	Tracerory - Dynamic Tracematches and Unread Memory Detection for C/C++ Eyolfson, Jonathan January 2011 (has links) Dynamic binary translation allows us to analyze a program during execution without the need for a compiler or the program's source code. In this work, we present two applications of dynamic binary translation: tracematches and unread memory detection. Libraries are ubiquitous in modern software development. Each library requires that its clients follow certain conventions, depending on the domain of the library. Tracematches are a particularly expressive notation for specifying library usage conventions, but have only been implemented on top of Java. In this work, we leverage dynamic binary translation to enable the use of tracematches on executables, particularly for compiled C/C++ programs. The presence of memory that is never read, or memory writes that are never read during execution is wasteful, and may be also be indicative of bugs. In addition to tracematches, we present an unread memory detector. We built this detector using dynamic binary translation. We have implemented a tool which monitors tracematches on top of the Pin framework along with unread memory. We describe the operation of our tool using a series of motivating examples and then present our overall monitoring approach. Finally, we include benchmarks showing the overhead of our tool on 4 open source projects and report qualitative results. runtime monitoring dynamic binary translation Electrical and Computer Engineering
8	A framework for rapid development of dynamic binary translators Holm, David January 2004 (has links) <p>Binary recompilation and translation play an important role in computer systems today. It is used by systems such as Java and .NET, and system emulators like VMWare and VirtualPC. A dynamic binary translator have several things in common with a regular compiler but as they usually have to translate code in real-time several constraints have to be made, especially when it comes to making code optimisations.</p><p>Designing a dynamic recompiler is a complex process that involves repetitive tasks. Translation tables have to be constructed for the source architecture which contains the data necessary to translate each instruction into binary code that can be executed on the target architecture. This report presents a method that allows a developer to specify how the source and target architectures work using a set of scripting languages. The purpose of these languages is to relocate the repetitive tasks to computer software, so that they do not have to be performed manually by programmers. At the end of the report a simple benchmark is used to evaluate the performance of a basic IA32 emulator running on a PowerPC target that have been implemented using the system described here. The results of the benchmark is compared to the results of running the same benchmark on other, existing, emulators in order to show that the system presented here can compete with the existing methods used today.</p><p>Several ongoing research projects are looking into ways of designing binary translators. Most of these projects focus on ways of optimising code in real-time and how to solve the problems related to binary translation, such as handling self-modifying code.</p> Emulation Binary Translation Dynamic Recompilation JIT-compiler Computer science Datalogi
9	Tracerory - Dynamic Tracematches and Unread Memory Detection for C/C++ Eyolfson, Jonathan January 2011 (has links) Dynamic binary translation allows us to analyze a program during execution without the need for a compiler or the program's source code. In this work, we present two applications of dynamic binary translation: tracematches and unread memory detection. Libraries are ubiquitous in modern software development. Each library requires that its clients follow certain conventions, depending on the domain of the library. Tracematches are a particularly expressive notation for specifying library usage conventions, but have only been implemented on top of Java. In this work, we leverage dynamic binary translation to enable the use of tracematches on executables, particularly for compiled C/C++ programs. The presence of memory that is never read, or memory writes that are never read during execution is wasteful, and may be also be indicative of bugs. In addition to tracematches, we present an unread memory detector. We built this detector using dynamic binary translation. We have implemented a tool which monitors tracematches on top of the Pin framework along with unread memory. We describe the operation of our tool using a series of motivating examples and then present our overall monitoring approach. Finally, we include benchmarks showing the overhead of our tool on 4 open source projects and report qualitative results. runtime monitoring dynamic binary translation Electrical and Computer Engineering
10	Guided automatic binary parallelisation Zhou, Ruoyu January 2018 (has links) For decades, the software industry has amassed a vast repository of pre-compiled libraries and executables which are still valuable and actively in use. However, for a significant fraction of these binaries, most of the source code is absent or is written in old languages, making it practically impossible to recompile them for new generations of hardware. As the number of cores in chip multi-processors (CMPs) continue to scale, the performance of this legacy software becomes increasingly sub-optimal. Rewriting new optimised and parallel software would be a time-consuming and expensive task. Without source code, existing automatic performance enhancing and parallelisation techniques are not applicable for legacy software or parts of new applications linked with legacy libraries. In this dissertation, three tools are presented to address the challenge of optimising legacy binaries. The first, GBR (Guided Binary Recompilation), is a tool that recompiles stripped application binaries without the need for the source code or relocation information. GBR performs static binary analysis to determine how recompilation should be undertaken, and produces a domain-specific hint program. This hint program is loaded and interpreted by the GBR dynamic runtime, which is built on top of the open-source dynamic binary translator, DynamoRIO. In this manner, complicated recompilation of the target binary is carried out to achieve optimised execution on a real system. The problem of limited dataflow and type information is addressed through cooperation between the hint program and JIT optimisation. The utility of GBR is demonstrated by software prefetch and vectorisation optimisations to achieve performance improvements compared to their original native execution. The second tool is called BEEP (Binary Emulator for Estimating Parallelism), an extension to GBR for binary instrumentation. BEEP is used to identify potential thread-level parallelism through static binary analysis and binary instrumentation. BEEP performs preliminary static analysis on binaries and encodes all statically-undecided questions into a hint program. The hint program is interpreted by GBR so that on-demand binary instrumentation codes are inserted to answer the questions from runtime information. BEEP incorporates a few parallel cost models to evaluate identified parallelism under different parallelisation paradigms. The third tool is named GABP (Guided Automatic Binary Parallelisation), an extension to GBR for parallelisation. GABP focuses on loops from sequential application binaries and automatically extracts thread-level parallelism from them on-the-fly, under the direction of the hint program, for efficient parallel execution. It employs a range of runtime schemes, such as thread-level speculation and synchronisation, to handle runtime data dependences. GABP achieves a geometric mean of speedup of 1.91x on binaries from SPEC CPU2006 on a real x86-64 eight-core system compared to native sequential execution. Performance is obtained for SPEC CPU2006 executables compiled from a variety of source languages and by different compilers.

Search results