  • About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.

Advanced microarchitecture and circuit design techniques for on-chip memories in CMOS technology

Hsu, Steven K. January 1900
Thesis (Ph. D.)--Oregon State University, 2007. / Printout. Includes bibliographical references (leaves 107-115). Also available on the World Wide Web.

Asymmetric clustering using a register cache

Morrison, Roger Allen. January 1900
Thesis (M.S.)--Oregon State University, 2006. / Printout. Includes bibliographical references (leaves 19-20). Also available on the World Wide Web.

An aggressive live range splitting and coalescing framework for efficient register allocation

Kaluskar, Vivek P. January 2003
Thesis (M.S. in C.S.)--College of Computing, Georgia Institute of Technology, 2004. Directed by Santosh Pande. / Includes bibliographical references (leaves 73-74).

An aggressive live range splitting and coalescing framework for efficient register allocation

Kaluskar, Vivek P. 01 December 2003
No description available.

Transparent spilling and refilling of partitioned overlapping register window register organizations with a remote instruction pointer

Mayhew, David Evan 24 October 2005
Register allocation is critical to processor performance. Registers are the fastest storage system available to a processor. The more capable a register set's organization is at maintaining process context, the fewer memory accesses the processor will need to make. Overlapping register windows have better context maintenance capabilities than single register set organizations, but overlapping register windows also show significant performance degradation if program behavior causes the register window store to overflow. Program behavior makes window overflow of simple overlapping register window organizations unavoidable. Attempts to minimize the impact of overflow by increasing the size of the register store negatively impact register access time, increase device count, and increase context switch latency. The combination of a transparent spill and refill mechanism and a small register store allows the store to perform like a much larger store, but does not negatively impact register cycle time, and it decreases context switch latency. Transparent register spilling and refilling can be accomplished by the inclusion of a set of simple state machines and dedicated register and memory ports. The transparent spill/refill mechanism's external port interfaces very well with established peripheral processing capabilities on many multi-processor architectures. The inclusion of an instruction repetition capability can facilitate global register storage and retrieval, and can decrease context switch latency. Register performance can be further enhanced by partitioning the register set into data-typed register groups. Register partitioning allows a high degree of parallelism without necessitating the inclusion of register sets with high port counts and register access conflicts. Partitioned register sets can be spatially proximate to processing units whose functionality is optimized for operations on specific data types.
A remote instruction pointer with a partitioned code address register set and processing capability can decrease branch latency, improve call/return performance, and simplify general case return address maintenance. A partitioned, transparently spilled/refilled register organization minimizes explicit register storing and retrieving, supports the creation of large register-based working sets, and facilitates a simple parallel processing paradigm that allows a high degree of sub-processing-unit independence. / Ph. D.
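The overflow behavior the abstract describes can be illustrated with a toy model. The sketch below is not taken from the thesis: it assumes a conventional register window store that holds a fixed number of windows, spills one window to memory when a call arrives while the store is full, and refills one when a return needs a window that is no longer resident. The function name `count_window_traps` and the `"call"`/`"ret"` trace format are hypothetical, chosen only for illustration.

```python
def count_window_traps(call_trace, num_windows):
    """Count overflow and underflow events for a trace of "call"/"ret" events.

    Toy model (an assumption, not the thesis design): the register store
    holds `num_windows` windows. A call with a full store spills the oldest
    window to memory; a return whose caller window was spilled refills it.
    """
    if num_windows < 1:
        raise ValueError("num_windows must be at least 1")
    resident = 1   # windows currently held in the register store
    spilled = 0    # windows currently saved in memory
    overflows = 0
    underflows = 0
    for event in call_trace:
        if event == "call":
            if resident == num_windows:
                # Store is full: the oldest window goes to memory.
                spilled += 1
                overflows += 1
            else:
                resident += 1
        elif event == "ret":
            if resident > 1:
                resident -= 1
            elif spilled > 0:
                # Caller's window is in memory: bring it back.
                spilled -= 1
                underflows += 1
            else:
                raise ValueError("return without a matching call")
        else:
            raise ValueError(f"unknown event: {event!r}")
    return overflows, underflows


# Five nested calls followed by five returns.
trace = ["call"] * 5 + ["ret"] * 5
small_store = count_window_traps(trace, num_windows=4)
large_store = count_window_traps(trace, num_windows=8)
```

In this model a deeper call chain than the store can hold always produces spills, which is the cost that enlarging the store or, as the abstract proposes, spilling and refilling transparently is meant to address.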

Very large register file for BLAS-3 operations.

January 1995
by Aylwin Chung-Fai, Yu. / Thesis (M.Phil.)--Chinese University of Hong Kong, 1995. / Includes bibliographical references (leaves 117-118).
Contents:
  Abstract
  Acknowledgement
  List of Tables
  List of Figures
  Chapter 1. Introduction
    1.1 BLAS-3 Operations
    1.2 Organization of Thesis
    1.3 Contribution
  Chapter 2. Background Studies
    2.1 Registers & Cache Memory
    2.2 Previous Research
    2.3 Problem of Register & Cache
    2.4 BLAS-3 Operations On RISC Microprocessor
  Chapter 3. Compiler Optimization Techniques for BLAS-3 Operations
    3.1 One-Dimensional Q-Way J-Loop Unrolling
    3.2 Two-Dimensional P×Q-Ways I×J-Loops Unrolling
    3.3 Addition of Code to Remove Redundant Code
    3.4 Simulation Result
    3.5 Summary
  Chapter 4. Architectural Model of Very Large Register File
    4.1 Architectural Model
    4.2 Traditional Register File vs. Very Large Register File
  Chapter 5. Ideal Case Study of Very Large Register File
    5.1 Matrix Multiply
    5.2 LU Decomposition
    5.3 Convolution
  Chapter 6. Worst Case Study of Very Large Register File
    6.1 Matrix Multiply
    6.2 LU Decomposition
    6.3 Convolution
  Chapter 7. Proposed Case Study of Very Large Register File
    7.1 Matrix Multiply
    7.2 LU Decomposition
    7.3 Convolution
    7.4 Comparison
  Chapter 8. Conclusion & Future Work
    8.1 Summary
    8.2 Future Work
  Bibliography

Single-level dynamic register caching architecture for high-performance superscalar processors

Liebert, John A. January 1900
Thesis (M.S.)--Oregon State University, 2007. / Printout. Includes bibliographical references (leaves 30-32). Also available on the World Wide Web.

Asymmetric clustering using a register cache

Morrison, Roger Allen 18 April 2006
Graduation date: 2006 / Conventional register files spread porting resources uniformly across all registers. This paper proposes a method called Asymmetric Clustering using a Register Cache (ACRC). ACRC utilizes a fast register cache that concentrates valuable register file ports on the most active registers, thereby reducing the total register file area and power consumption. A cluster of functional units and a highly ported register cache execute the majority of instructions, while a second cluster with a full register file having fewer read ports processes instructions with source registers not found in the register cache. An ‘in-cache’ marking system tracks the contents of the register cache and routes instructions to the correct cluster. This system utilizes logic similar to the ‘ready’ bit system found in wake-up and select logic, keeping the additional logic required to a minimum. When using a 256-entry register file, this design reduces the total register file area by an estimated 65% while exhibiting similar IPC performance compared to a non-clustered 8-way processor. As the feature size becomes smaller and processor clocks become faster, the number of clock cycles needed to access the register file will increase. Therefore, the smaller register file area requirement and subsequent smaller register file delay of ACRC will lead to better IPC performance than conventional processors.
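The routing rule in the abstract can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the thesis implementation: the ‘in-cache’ marks are modeled as one boolean per architectural register, an instruction goes to the register-cache cluster only when every source register is marked, and the policy that decides which registers are cached is left out because the abstract does not describe it. The names `Instr`, `route`, and the cluster labels `"cache"` and `"full"` are hypothetical.

```python
from dataclasses import dataclass
from typing import Sequence, Tuple


@dataclass(frozen=True)
class Instr:
    dest: int               # destination register number
    srcs: Tuple[int, ...]   # source register numbers


def route(instr: Instr, in_cache: Sequence[bool]) -> str:
    """Pick a cluster for one instruction.

    Returns "cache" when every source register is marked in-cache, so the
    highly ported register cache can supply all operands; otherwise returns
    "full", meaning the cluster backed by the full register file.
    """
    return "cache" if all(in_cache[r] for r in instr.srcs) else "full"


# Assumed state: an 8-register machine with r1 and r2 held in the cache.
in_cache = [False] * 8
in_cache[1] = True
in_cache[2] = True

hit = route(Instr(dest=3, srcs=(1, 2)), in_cache)    # both sources cached
miss = route(Instr(dest=4, srcs=(1, 5)), in_cache)   # r5 is not cached
```

Checking one bit per source operand mirrors the ‘ready’ bit check the abstract compares it to, which is why the added logic can stay small.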

Low-power high-performance register file design for chip multiprocessors

Khasawneh, Shadi Turki. January 2006
Thesis (M.S.)--State University of New York at Binghamton, Department of Computer Science, Thomas J. Watson School of Engineering and Applied Science, 2006. / Includes bibliographical references.

Managing datapath resources in an out-of-order processor for performance and energy efficiency

Zeng, Hui. January 2009
Thesis (Ph. D.)--State University of New York at Binghamton, Thomas J. Watson School of Engineering and Applied Science, Department of Computer Science, 2009.
