• Refine Query
  • Source
  • Publication year
  • to
  • Language
  • 116
  • 33
  • 28
  • 7
  • 2
  • 1
  • 1
  • 1
  • 1
  • 1
  • 1
  • 1
  • Tagged with
  • 282
  • 175
  • 115
  • 98
  • 93
  • 68
  • 44
  • 40
  • 40
  • 37
  • 37
  • 36
  • 35
  • 34
  • 31
  • About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
1

A prototype Prolog-based code generator environment

Fan, Wei Yi January 1993 (has links)
No description available.
2

The design and implementation of RRCGS (Retargetable RISC Code Generator for the SUIF)

Alfawzan, Thalaya January 2000 (has links)
No description available.
3

Explicit data graph compilation

Smith, Aaron Lee, 1977- 19 August 2010 (has links)
Technology trends such as growing wire delays, power consumption limits, and diminishing clock rate improvements, present conventional instruction set architectures such as RISC, CISC, and VLIW with difficult challenges. To show continued performance growth, future microprocessors must exploit concurrency power efficiently. An important question for any future system is the division of responsibilities between programmer, compiler, and hardware to discover and exploit concurrency. In this research we develop the first compiler for an Explicit Data Graph Execution (EDGE) architecture and show how to solve the new challenge of compiling to a block-based architecture. In EDGE architectures, the compiler is responsible for partitioning the program into a sequence of structured blocks that logically execute atomically. The EDGE ISA defines the structure of, and the restrictions on, these blocks. The TRIPS prototype processor is an EDGE architecture that employs four restrictions on blocks intended to strike a balance between software and hardware complexity. They are: (1) fixed block sizes (maximum of 128 instructions), (2) restricted number of loads and stores (no more than 32 may issue per block), (3) restricted register accesses (no more than eight reads and eight writes to each of four banks per block), and (4) constant number of block outputs (each block must always generate a constant number of register writes and stores, plus exactly one branch). The challenges addressed in this thesis are twofold. First, we develop the algorithms and internal representations necessary to support the new structural constraints imposed by the block-based EDGE execution model. This first step provides correct execution and demonstrates the feasibility of EDGE compilers. Next, we show how to optimize blocks using a dataflow predication model and provide results showing how the compiler is meeting this challenge on the SPEC2000 benchmarks. Using basic blocks as the baseline performance, we show that optimizations utilizing the dataflow predication model achieve up to 64% speedup on SPEC2000 with an average speedup of 31%. / text
4

Aspect structure of compilers

Paudel, Jeeva 16 September 2009
Compilers are among the most widely-studied pieces of software; and, modularizing these valuable artifacts is a recurring theme in research. However, modularization of cross-cutting concerns in compilers is not yet well explored. Even today, implementation of one compiler concern scatters across and tangles with the implementation of several other concerns, thereby leading to a mismatch between different compiler modules and the operations they represent. Essentially, current compiler implementations fail to explicitly identify the control dependencies of different phases, and separately characterize the actions to execute during those phases. As a result, information about their program-execution path remains non-intuitive: it stays hidden within the program structure and cuts-across several phase implementations. Consequently, this makes compiler designs and artifacts difficult to comprehend, maintain and reuse. Such limitations occur primarily as a result of the inability of mainstream object-oriented languages, such as Java, to organize the cross-cutting concerns into clean modular units.<p> This thesis demonstrates how such modularity-issues in compilers can be addressed with the help of a relatively new, yet powerful programming paradigm called aspect-oriented programming.
5

Aspect structure of compilers

Paudel, Jeeva 16 September 2009 (has links)
Compilers are among the most widely-studied pieces of software; and, modularizing these valuable artifacts is a recurring theme in research. However, modularization of cross-cutting concerns in compilers is not yet well explored. Even today, implementation of one compiler concern scatters across and tangles with the implementation of several other concerns, thereby leading to a mismatch between different compiler modules and the operations they represent. Essentially, current compiler implementations fail to explicitly identify the control dependencies of different phases, and separately characterize the actions to execute during those phases. As a result, information about their program-execution path remains non-intuitive: it stays hidden within the program structure and cuts-across several phase implementations. Consequently, this makes compiler designs and artifacts difficult to comprehend, maintain and reuse. Such limitations occur primarily as a result of the inability of mainstream object-oriented languages, such as Java, to organize the cross-cutting concerns into clean modular units.<p> This thesis demonstrates how such modularity-issues in compilers can be addressed with the help of a relatively new, yet powerful programming paradigm called aspect-oriented programming.
6

Methodology of dynamic compiler option selection based on static program analysis implementation and evaluation /

Park, Eun Jung. January 2007 (has links)
Thesis (M.S.)--University of Delaware, 2007. / Principal faculty advisor: Guang R. Gao, Dept. of Electrical and Computer Engineering. Includes bibliographical references.
7

A parallel functional language compiler for message-passing multicomputers

Junaidu, Sahalu B. January 1998 (has links)
The research presented in this thesis is about the design and implementation of Naira, a parallel, parallelising compiler for a rich, purely functional programming language. The source language of the compiler is a subset of Haskell 1.2. The front end of Naira is written entirely in the Haskell subset being compiled. Naira has been successfully parallelised and it is the largest successfully parallelised Haskell program having achieved good absolute speedups on a network of SUN workstations. Having the same basic structure as other production compilers of functional languages, Naira's parallelisation technology should carry forward to other functional language compilers. The back end of Naira is written in C and generates parallel code in the C language which is envisioned to be run on distributed-memory machines. The code generator is based on a novel compilation scheme specified using a restricted form of Milner's 7r-calculus which achieves asynchronous communication. We present the first working implementation of this scheme on distributed-memory message-passing multicomputers with split-phase transactions. Simulated assessment of the generated parallel code indicates good parallel behaviour. Parallelism is introduced using explicit, advisory user annotations in the source' program and there are two major aspects of the use of annotations in the compiler. First, the front end of the compiler is parallelised so as to improve its efficiency at compilation time when it is compiling input programs. Secondly, the input programs to the compiler can themselves contain annotations based on which the compiler generates the multi-threaded parallel code. These, therefore, make Naira, unusually and uniquely, both a parallel and a parallelising compiler. We adopt a medium-grained approach to granularity where function applications form the unit of parallelism and load distribution. We have experimented with two different task distribution strategies, deterministic and random, and have also experimented with thread-based and quantum- based scheduling policies. Our experiments show that there is little efficiency difference for regular programs but the quantum-based scheduler is the best in programs with irregular parallelism. The compiler has been successfully built, parallelised and assessed using both idealised and realistic measurement tools: we obtained significant compilation speed-ups on a variety of simulated parallel architectures. The simulated results are supported by the best results obtained on real hardware for such a large program: we measured an absolute speedup of 2.5 on a network of 5 SUN workstations. The compiler has also been shown to have good parallelising potential, based on popular test programs. Results of assessing Naira's generated unoptimised parallel code are comparable to those produced by other successful parallel implementation projects.
8

Prefetching for complex memory access patterns

Ainsworth, Sam January 2018 (has links)
Modern-day workloads, particularly those in big data, are heavily memory-latency bound. This is because of both irregular memory accesses, which have no discernable pattern in their memory addresses, and large data sets that cannot fit in any cache. However, this need not be a barrier to high performance. With some data structure knowledge it is typically possible to bring data into the fast on-chip memory caches early, so that it is already available by the time it needs to be accessed. This thesis makes three contributions. I first contribute an automated software prefetching compiler technique to insert high-performance prefetches into program code to bring data into the cache early, achieving 1.3x geometric mean speedup on the most complex processors, and 2.7x on the simplest. I also provide an analysis of when and why this is likely to be successful, which data structures to target, and how to schedule software prefetches well. Then I introduce a hardware solution, the configurable graph prefetcher. This uses the example of breadth-first search on graph workloads to motivate how a hardware prefetcher armed with data-structure knowledge can avoid the instruction overheads, inflexibility and limited latency tolerance of software prefetching. The configurable graph prefetcher sits at the L1 cache and observes memory accesses, which can be configured by a programmer to be aware of a limited number of different data access patterns, achieving 2.3x geometric mean speedup on graph workloads on an out-of-order core. My final contribution extends the hardware used for the configurable graph prefetcher to make an event-triggered programmable prefetcher, using a set of a set of very small micro-controller-sized programmable prefetch units (PPUs) to cover a wide set of workloads. I do this by developing a highly parallel programming model that can be used to issue prefetches, thus allowing high-throughput prefetching with low power and area overheads of only around 3%, and a 3x geometric mean speedup for a variety of memory-bound applications. To facilitate its use, I then develop compiler techniques to help automate the process of targeting the programmable prefetcher. These provide a variety of tradeoffs from easiest to use to best performance.
9

The implementation of an optimizing compiler for Icon.

Walker, Kenneth William. January 1991 (has links)
There are many optimizations that can be applied while translating Icon programs. These optimizations and the analyses needed to apply them are of interest for two reasons. First, Icon's unique combination of characteristics requires developing new techniques for implementing them. Second, these optimizations are used in variety of languages and Icon can be used as a medium for extending the state of the art. Many of these optimizations require detailed control of the generated code. Previous production implementations of the Icon programming language have been interpreters. The virtual machine code of an interpreter is seldom flexible enough to accommodate these optimizations and modifying the virtual machine to add the flexibility destroys the simplicity that justified using an interpreter in the first place. These optimizations can only reasonably be implemented in a compiler. In order to explore these optimizations for Icon programs, a compiler was developed. This dissertation describes the compiler and the optimizations it employs. It also describes a run-time system designed to support the analyses and optimizations. Icon variables are untyped. The compiler contains a type inferencing system that determines what values variables and expression may take on during program execution. This system is effective in the presence of values with pointer semantics and of assignments to components of data structures. The compiler stores intermediate results in temporary variables rather than on a stack. A simple and efficient algorithm was developed for determining the lifetimes of intermediate results in the presence of goal-directed evaluation. This allows an efficient allocation of temporary variables to intermediate results. The compiler uses information from type inferencing and liveness analysis to simplify generated code. Performance measurements on a variety of Icon programs show these optimizations to be effective.
10

Performance improvement through predicated execution in VLIW machines

Biglari-Abhari, Morteza. January 2000 (has links) (PDF)
Bibliography: leaves 136-153. Investigates techniques to achieve performance improvement in Very Long Instruction Word machines through predicated execution.

Page generated in 0.1045 seconds