• Refine Query
  • Source
  • Publication year
  • to
  • Language
  • 24
  • 4
  • 2
  • Tagged with
  • 33
  • 33
  • 16
  • 13
  • 8
  • 7
  • 7
  • 7
  • 5
  • 5
  • 5
  • 5
  • 5
  • 4
  • 4
  • About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
11

Lifting the Abstraction Level of Compiler Transformations

Tang, Xiaolong 16 December 2013 (has links)
Production compilers implement optimizing transformation rules for built-in types. What justifies applying these optimizing rules is the axioms that hold for built-in types and the built-in operations supported by these types. Similar axioms also hold for user-defined types and the operations defined on them, and therefore justify a set of optimization rules that may apply to user-defined types. Production compilers, however, do not attempt to construct and apply these optimization rules to user-defined types. Built-in types together the axioms that apply to them are instances of more general algebraic structures. So are user-defined types and their associated axioms. We use the technique of generic programming, a programming paradigm to design efficient, reusable software libraries, to identify the commonality of classes of types, whether built-in or user-defined, convey the semantics of the classes of types to compilers, design scalable and effective program analysis for them, and eventually apply optimizing rules to the operations on them. In generic programming, algorithms and data structures are defined in terms of such algebraic structures. The same definitions are reused for many types, both built-in and user-defined. This dissertation applies generic programming to compiler analyses and transformations. Analyses and transformations are specified for general algebraic structures, and they apply to all types, both built-in and primitive types.
12

An Optimization Compiler Framework Based on Polyhedron Model for GPGPUs

Liu, Lifeng 31 May 2017 (has links)
No description available.
13

INSTRUCTION SCHEDULING TO HIDE LOAN/STORE LATENCY IN IRREGULAR ARCHITECTURE EMBEDDED PROCESSORS

BHALGAT, ASHISH ZUMBARLAL 11 October 2001 (has links)
No description available.
14

Applying Polyhedral Transformation to Fortran Programs

Gururaghavendran, Ashwin 31 March 2011 (has links)
No description available.
15

Automatic Task Formation Techniques for the Multi-level Computing Architecture

Stewart, Kirk 30 July 2008 (has links)
The Multi-Level Computing Architecture (MLCA) is a multiprocessor system-on-chip architecture designed for multimedia applications. It provides a programming model that simplifies the process of writing parallel applications by eliminating the need for explicit synchronization. However, developers must still invest effort to design applications that fully exploit the MLCA’s multiprocessing capabilities. We present a set of compiler techniques to streamline the process of developing applications for the MLCA. We present an algorithm to automatically partition a sequential application into tasks that can be executed in parallel. We also present code generation algorithms to translate annotated, sequential C code to the MLCA’s programming model. We provide an experimental evaluation of these techniques, performed with a prototype compiler based upon the open-source ORC compiler and integrated with the MLCA Optimizing Compiler. This evaluation shows that the performance of automatically generated code compares favourably to that of manually written code.
16

Automatic Task Formation Techniques for the Multi-level Computing Architecture

Stewart, Kirk 30 July 2008 (has links)
The Multi-Level Computing Architecture (MLCA) is a multiprocessor system-on-chip architecture designed for multimedia applications. It provides a programming model that simplifies the process of writing parallel applications by eliminating the need for explicit synchronization. However, developers must still invest effort to design applications that fully exploit the MLCA’s multiprocessing capabilities. We present a set of compiler techniques to streamline the process of developing applications for the MLCA. We present an algorithm to automatically partition a sequential application into tasks that can be executed in parallel. We also present code generation algorithms to translate annotated, sequential C code to the MLCA’s programming model. We provide an experimental evaluation of these techniques, performed with a prototype compiler based upon the open-source ORC compiler and integrated with the MLCA Optimizing Compiler. This evaluation shows that the performance of automatically generated code compares favourably to that of manually written code.
17

A High Performance Register Allocator for Vector Architectures with a Unified Register-Set

Su, Yu-Dan 29 June 2012 (has links)
This thesis describes a compiler optimization targeted for machines with unified, vector-based register sets. This optimization combines register allocation and instruction scheduling. It examines places where the code performs computations on scalar variables. The goal is to identify instances where the same operation is performed. For example, a program might calculate ¡§base+offset¡¨ and then calculate ¡§i+j¡¨. Even though these computations are unrelated, yet they use the same operator; if ¡§base¡¨ and ¡§i¡¨ are packed into one vector register, while ¡§offset¡¨ and ¡§j¡¨ are packed into another, then these two computations can be performed simultaneously through the vectors¡¦ parallel addition operation. This would reduce the execution time of the compiled code. Although other researchers have considered similar packing methods, their work has been limited by the hardware that they were studying. Such hardware usually imposed high costs for moving data between scalar and vector register banks. This present thesis, however, considers a novel hardware architecture that imposes no such costs. As a consequence, we are able to obtain significant speedups. The architecture that we consider is a Graphics Processing Unit (GPU) for embedded systems that is under development at this university. This GPU has a single register set for integers, float, and vectors.
18

Optimizing System Performance and Dependability Using Compiler Techniques

Rajagopalan, Mohan January 2006 (has links)
As systems become more complex, there are increasing demands for improvement with respect to attributes such as performance, dependability, and security. Optimization is defined as theprocess of making the most effective use of a set of resources with respect to some attribute. Existing optimization techniques, however, have two fundamental limitations. They target individual parts of a system without considering the potentially significant global picture, and they are designed to improve a single attribute at a time. These limitations impose significant restrictions on the kinds of optimization possible, the effectiveness of the techniques, and the ability to improvethe optimization process itself.This dissertation presents holistic system optimization, a new approach to optimization based on taking a broad view of a system. Unlike current approaches, holistic optimizations consider different kinds of interactions at multiple levels in a system, and target a variety of metrics uniformly. A key component of this research has been the use of proven compiler techniques to ensure transparency, automation, and correctness. These techniques have been implemented in Cassyopia, a software prototype of a framework for performing holistic optimization.The core of this work is three new holistic optimizations, which are also presented. The first describes profile-directed static optimizations designed to improve the performance of eventbased programs by spanning boundaries that separate code that raises events from handlers that field them. The second, system call clustering, improves the system call behavior of an entire program by grouping together calls that can be executed in a single boundary crossing. In thiscase, the optimization spans kernel and user address spaces. Finally, authenticated system calls optimize system security through a novel implementation of an efficient system call monitor. This example demonstrates how the new approach can be used to create new optimizations that not only span address space boundaries but also target attributes such as dependability. All of these optimizations involve the application of standard compiler techniques in non-traditional contexts and demonstrate how systems can be improved beyond what is possible using existing techniques.
19

Compiler Optimization Effects on Register Collisions

Tan, Jonathan S 01 June 2018 (has links) (PDF)
We often want a compiler to generate executable code that runs as fast as possible. One consideration toward this goal is to keep values in fast registers to limit the number of slower memory accesses that occur. When there are not enough physical registers available for use, values are ``spilled'' to the runtime stack. The need for spills is discovered during register allocation wherein values in use are mapped to physical registers. One factor in the efficacy of register allocation is the number of values in use at one time (register collisions). Register collision is affected by compiler optimizations that take place before register allocation. Though the main purpose of compiler optimizations is to make the overall code better and faster, some optimizations can actually increase register collisions. This may force the register allocation process to spill. This thesis studies the effects of different compiler optimizations on register collisions.
20

Compiler Techniques for Transformation Verification, Energy Efficiency and Cache Modeling

Bao, Wenlei 13 September 2018 (has links)
No description available.

Page generated in 0.1262 seconds