81 |
An automated OpenCL FPGA compilation framework targeting a configurable, VLIW chip multiprocessor
Parker, Samuel J. January 2015 (has links)
Modern systems-on-chip augment their baseline CPU with coprocessors and accelerators to increase overall computational capacity and power efficiency, and have thus evolved into heterogeneous systems. Several languages have been developed to enable this paradigm shift, including CUDA and OpenCL. This thesis discusses a unified compilation environment that enables heterogeneous system design through OpenCL and a customised VLIW chip multiprocessor (CMP) architecture known as the LE1. An LLVM compilation framework was researched and a prototype developed to enable the execution of OpenCL applications on the LE1 CPU. The framework fully automates the compilation flow and supports work-item coalescing to better utilise the CPU cores and alleviate the effects of thread divergence. The thesis describes both the software stack and the target hardware architecture in detail, and evaluates the scalability of the proposed framework on a cycle-accurate simulator by executing 12 benchmarks across 240 different machine configurations, with further results from a development branch of the compiler. The problems generally scale well on the LE1 architecture up to eight cores, beyond which the memory system becomes a serious bottleneck. Results demonstrate superlinear performance on certain benchmarks (9x for the bitonic sort benchmark with 8 dual-issue cores), with further improvements from compiler optimisations (14x for bitonic with the same configuration).
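Work-item coalescing, mentioned in the abstract above, rewrites a per-work-item kernel body into a loop over the work-items assigned to each core, so one thread of control executes many logical work-items. A minimal Python sketch of the idea, with an invented vector-add kernel and sizes (not taken from the thesis):

```python
def kernel_body(gid, a, b, out):
    # Per-work-item OpenCL-style kernel body: out[gid] = a[gid] + b[gid]
    out[gid] = a[gid] + b[gid]

def coalesced_kernel(first, last, a, b, out):
    # Coalesced form: one core iterates over its contiguous range of
    # work-items instead of running one hardware thread per work-item.
    for gid in range(first, last):
        kernel_body(gid, a, b, out)

def launch(global_size, num_cores, a, b, out):
    # Split the NDRange evenly across cores (remainder goes to the last core).
    chunk = global_size // num_cores
    for core in range(num_cores):
        first = core * chunk
        last = global_size if core == num_cores - 1 else first + chunk
        coalesced_kernel(first, last, a, b, out)

a = list(range(8))
b = list(range(8))
out = [0] * 8
launch(8, 3, a, b, out)
print(out)  # [0, 2, 4, 6, 8, 10, 12, 14]
```

Because each core now runs a plain loop, divergent work-items become ordinary branches inside the loop body rather than stalled threads.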
|
82 |
An LLVM backend for the Open Modelica Compiler / Konstruktion av en LLVM-baserad backend för OpenModelica-kompilatorn
Tinnerholm, John January 2019 (has links)
This thesis presents the construction and evaluation of an LLVM-based code generator, an LLVM backend. An LLVM-based backend was introduced into the OpenModelica compiler to examine the advantages and disadvantages of compiling Modelica and MetaModelica to LLVM IR instead of C. To answer this question, the LLVM backend was compared against the existing interpreter and C code generator using four different schemes with corresponding cases, for both optimised and unoptimised code. From the experiments, it was concluded that an LLVM backend can be used to improve runtime and compile-time performance in the OpenModelica Interactive environment.
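Compiling to LLVM IR rather than C means emitting the IR's typed, SSA-form instructions directly. A much-simplified, hypothetical emitter for a function returning the sum of two i32 arguments, purely to show the shape of the target (unrelated to the actual OpenModelica backend):

```python
def emit_add_function(name):
    # Emit textual LLVM IR for: i32 name(i32 a, i32 b) { return a + b; }
    lines = [
        f"define i32 @{name}(i32 %a, i32 %b) {{",
        "entry:",
        "  %sum = add i32 %a, %b",   # SSA: %sum is assigned exactly once
        "  ret i32 %sum",
        "}",
    ]
    return "\n".join(lines)

ir = emit_add_function("plus")
print(ir)
```

The resulting text can be handed to LLVM's own tools for optimisation and native code generation, which is what makes the IR an attractive backend target compared with generating C.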
|
83 |
Retargeting a C Compiler for a DSP Processor / Anpassning av en C-kompilator för kodgenerering till en DSP-processor
Antelius, Henrik January 2004 (has links)
The purpose of this thesis is to retarget a C compiler for a DSP processor. Developing a new compiler from scratch is a major task; instead, modifying an existing compiler so that it generates code for another target is a common way to develop compilers for new processors. This is called retargeting. This thesis describes how this was done with the LCC C compiler for the Motorola DSP56002 processor.
|
84 |
A Template-Based Code Generator for the OpenModelica Compiler
Lindberg, Rickard January 2010 (has links)
A new, template-based code generator has been implemented for the OpenModelica compiler. All data needed for target code generation is collected in a new data structure that is then sent to templates, which generate target code based on that data. This simplifies the implementation of the code generator and also makes it possible to write a different set of templates to generate target code in a different language.

The new, template-based code generator currently only supports generation of target code for simulating Modelica models. In that scenario it translates models at roughly the same speed as the old code generator.
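The template approach separates the model data from the target syntax: the compiler fills one data structure, and per-language templates render it. A toy illustration using Python's `string.Template`, with invented template contents and field names (the real code generator uses the Susan template language, not Python):

```python
from string import Template

# Data collected by the compiler front end, independent of target language.
sim_data = {"model": "Pendulum", "n_states": 2}

# One template per target language; swapping templates changes the output
# language without touching the data-collection code.
c_tmpl = Template("void ${model}_init(void) { /* ${n_states} states */ }")
py_tmpl = Template("def ${model}_init():  # ${n_states} states\n    pass")

c_code = c_tmpl.substitute(sim_data)
py_code = py_tmpl.substitute(sim_data)
print(c_code)   # void Pendulum_init(void) { /* 2 states */ }
```

Retargeting to a new language then means writing a new set of templates against the same data structure.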
|
85 |
Data mining flow graphs in a dynamic compiler
Jocksch, Adam 11 1900 (has links)
This thesis introduces FlowGSP, a general-purpose sequence mining algorithm for flow graphs. FlowGSP ranks sequences according to the frequency with which they occur and according to their relative cost. The thesis also presents two parallel implementations of FlowGSP: the first uses Java threads and is designed for workstations equipped with multi-core CPUs; the second is distributed in nature and intended for use on clusters. The thesis also presents results from applying FlowGSP to mine program profiles in the context of the development of a dynamic optimizing compiler; interpreting patterns within raw profiling data is extremely difficult and heavily reliant on human intuition. FlowGSP has been tested on performance-counter profiles collected from the IBM WebSphere Application Server. This investigation identifies a number of sequences which are known to be typical of WebSphere Application Server behavior, as well as some sequences which were previously unknown.
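The core of a GSP-style miner is counting how often candidate sequences occur and ranking them by support weighted by cost. A drastically simplified sketch over linear paths with invented instruction-attribute data (not FlowGSP itself, which mines general flow graphs with gaps and windows):

```python
from collections import Counter

def mine_sequences(paths, length):
    # Count every contiguous subsequence of the given length across all
    # paths, weighting each occurrence by the path's profiled cost.
    support = Counter()
    for path, cost in paths:
        for i in range(len(path) - length + 1):
            support[tuple(path[i:i + length])] += cost
    return support.most_common()

# Toy "profile": attribute sequences seen on hot paths, with relative costs.
paths = [
    (["load", "add", "store"], 3),
    (["load", "add", "add"], 2),
    (["add", "store", "load"], 1),
]
ranked = mine_sequences(paths, 2)
print(ranked[0])  # (('load', 'add'), 5) -- the highest-weighted pair
```

Ranking by weighted support is what surfaces the expensive recurring patterns that would be hard to spot by eye in raw profile data.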
|
86 |
Impacts of Compiler Optimizations on Address Bus Energy: An Empirical Study
Tomiyama, Hiroyuki 01 October 2004 (has links)
No description available.
|
87 |
Constraint Programming Techniques for Optimal Instruction Scheduling
Malik, Abid 03 1900 (has links)
Modern processors have multiple pipelined functional units and can issue more than one instruction per clock cycle. This puts great pressure on the instruction scheduling phase in a compiler to expose maximum instruction-level parallelism. Basic blocks and superblocks are commonly used regions of code in a program for instruction scheduling. Instruction scheduling coupled with register allocation is also a well-studied problem for producing better machine code. Scheduling basic blocks and superblocks optimally, with or without register allocation, is NP-complete, and is done sub-optimally in production compilers using heuristic approaches.

In this thesis, I present a constraint programming approach to the superblock and basic block instruction scheduling problems for both idealized and realistic architectures. Basic block scheduling with register allocation and no spilling allowed is also considered. My models for both basic block and superblock scheduling are optimal and fast enough to be incorporated into production compilers. I experimentally evaluated my optimal schedulers on the SPEC 2000 integer and floating point benchmarks. On this benchmark suite, the optimal schedulers were very robust and scaled to the largest basic blocks and superblocks. Depending on the architectural model, between 99.991% and 99.999% of all basic blocks and superblocks were solved to optimality. The schedulers routinely solved the largest blocks, including blocks with up to 2600 instructions. My results compare favorably to the best previous optimal approaches, which are based on integer programming and enumeration.

My approach for basic block scheduling without allowing spilling was good enough to solve 97.496% of all basic blocks in the SPEC 2000 benchmark. The approach solved basic blocks as large as 50 instructions, for both idealized and realistic architectures, within reasonable time limits. Again, my results compare favorably to recent work on optimal integrated code generation, which is based on integer programming.
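Exact schedulers of this kind search the space of dependence-respecting instruction orders for one of minimal length. A brute-force stand-in over a tiny invented block (plain enumeration, not the thesis's constraint model, and single-issue rather than multi-issue) shows what "optimal" means here:

```python
from itertools import permutations

def schedule_length(order, deps, latency):
    # Issue one instruction per cycle; an instruction cannot start until
    # every predecessor's result is available (issue cycle + latency).
    finish = {}
    cycle = 0
    for instr in order:
        ready = max((finish[p] for p in deps[instr]), default=0)
        start = max(cycle, ready)
        finish[instr] = start + latency[instr]
        cycle = start + 1
    return max(finish.values())

def optimal_schedule(deps, latency):
    # Enumerate all dependence-respecting orders and keep the shortest.
    # Only feasible for tiny blocks, but guaranteed optimal.
    best = None
    for order in permutations(deps):
        if any(order.index(p) > order.index(i) for i in deps for p in deps[i]):
            continue  # violates a dependence edge
        length = schedule_length(order, deps, latency)
        if best is None or length < best:
            best = length
    return best

# Tiny DAG: b and c depend on a; d depends on b and c. b has latency 3.
deps = {"a": [], "b": ["a"], "c": ["a"], "d": ["b", "c"]}
latency = {"a": 1, "b": 3, "c": 1, "d": 1}
print(optimal_schedule(deps, latency))  # 5: issue c under b's latency
```

Constraint programming replaces this factorial enumeration with propagation and pruning over cycle variables, which is what lets the exact approach scale to blocks of thousands of instructions.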
|
88 |
An LLVM Compiler for CAL
Ullah, Haseeb, Mofakhar, Amir January 2011 (has links)
Massively parallel architectures are gaining momentum thanks to the opportunities for both high performance and low power consumption. After being a matter for experiments in academia, manycores are now in production, and industries with products that rely on high performance, for example in the fields of telecom and radar, are in the process of adopting them. In order to encourage adoption, there is a need for compiler technologies that can make appropriate use of the opportunities of these architectures. Programming languages with constructs that the compiler can easily identify as parallel are a reasonable starting point for these technologies. For example, in CAL (Caltrop Actor Language) [1], a program is organized as a network of actors, and these actors can be executed in parallel. As a first approach, if there were enough processors for all actors in the network, each actor could be mapped onto one processor.

Writing a compiler is a comprehensive software engineering project, and there are a number of tools, data structures and algorithms that have to be chosen and that facilitate the task. Among the tools are lexer and parser generators; among the data structures are abstract syntax trees and intermediate representations for code generation; among the algorithms are those for analyzing properties of the program and for translating between different data structures. LLVM (Low Level Virtual Machine) [7] is a compiler infrastructure that offers a well-defined, language- and target-independent intermediate representation for programs. LLVM also provides compile-time, link-time and run-time optimization.

This report discusses the front end of a compiler for CAL, including generation of LLVM code. The work described in this report translates CAL programs into LLVM's intermediate representation and includes a runtime system that handles intercommunication links between actors and actor scheduling. This is an intermediate step towards generating LLVM code for parallel architectures. With this compiler, application developers who need to evaluate different parallel platforms will be able to write their applications in CAL; only the back end, from LLVM to the platform, will need to be re-programmed.
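An actor network of the kind the report compiles can be emulated directly: actors with input FIFOs, connected by channels, fired by a simple scheduler. A toy Python model with invented actors (the real runtime manages channels and scheduling over LLVM-compiled actor code):

```python
from collections import deque

class Actor:
    # Minimal actor: fires when its input queue has a token, applies
    # its action, and pushes the result to the downstream actor.
    def __init__(self, action, out=None):
        self.inbox = deque()
        self.action = action
        self.out = out

    def fire(self):
        if not self.inbox:
            return False  # no token available: the actor cannot fire
        result = self.action(self.inbox.popleft())
        if self.out is not None:
            self.out.inbox.append(result)
        return True

results = []
sink = Actor(results.append)              # consumes tokens
double = Actor(lambda x: 2 * x, out=sink)  # doubles and forwards tokens

for token in [1, 2, 3]:
    double.inbox.append(token)

# Round-robin scheduler: keep firing until no actor can fire.
actors = [double, sink]
while any(a.fire() for a in actors):
    pass
print(results)  # [2, 4, 6]
```

Because each actor only touches its own queues, independent actors in the network could run on separate cores, which is the mapping the report works towards.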
|