• Refine Query
  • Source
  • Publication year
  • to
  • Language
  • 15
  • 10
  • 3
  • Tagged with
  • 31
  • 31
  • 31
  • 20
  • 13
  • 8
  • 7
  • 7
  • 6
  • 6
  • 4
  • 4
  • 3
  • 3
  • 3
  • About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
11

Instruction history management for high-performance microprocessors

Bhargava, Ravindra Nath 28 August 2008 (has links)
Not available / text
12

Scalable primary cache memory architectures

Agarwal, Vikas 28 August 2008 (has links)
Not available / text
13

Efficient adaptation of multiple microprocessor resources for energy reduction using dynamic optimization

Hu, Shiwen 28 August 2008 (has links)
Not available / text
14

Microarchitecture techniques to improve the design of superscalar microprocessors

Chamdani, Joseph Irawan 05 1900 (has links)
No description available.
15

Design of a microprocessor based instrumentation module for signal processing applications

Ramachandran, Ashok. January 1984 (has links)
Call number: LD2668 .T4 1984 R35 / Master of Science
16

CounterDataFlow architecture : design and performance

Miller, Michael F., (Michael Frederic) 17 July 1997 (has links)
The counterflow pipeline concept was originated by Sproull and Sutherland to demonstrate the concept of asynchronous circuits. This architecture relies on distributed decision making and localized clocking and data movement. We have taken these ideas and reformulated them into a substantially faster more scalable architecture that has the same distributed decision making and locality for clocking and data, but adds very aggressive speculation, no stalls, and other desirable characteristics. A high level Java simulator has been built to explore the design tradeoffs and evaluate performance. / Graduation date: 1998
17

Clock routing for high performance microprocessor designs.

January 2011 (has links)
Tian, Haitong. / Chinese abstract is on unnumbered page. / Thesis (M.Phil.)--Chinese University of Hong Kong, 2011. / Includes bibliographical references (p. 65-74). / Abstracts in English and Chinese. / Abstract --- p.i / Acknowledgement --- p.iii / Chapter 1 --- Introduction --- p.1 / Chapter 1.1 --- Motivations --- p.1 / Chapter 1.2 --- Our Contributions --- p.2 / Chapter 1.3 --- Organization of the Thesis --- p.3 / Chapter 2 --- Background Study --- p.4 / Chapter 2.1 --- Traditional Clock Routing Problem --- p.4 / Chapter 2.2 --- Tree-Based Clock Routing Algorithms --- p.5 / Chapter 2.2.1 --- Clock Routing Using H-tree --- p.5 / Chapter 2.2.2 --- Method of Means and Medians(MMM) --- p.6 / Chapter 2.2.3 --- Geometric Matching Algorithm (GMA) --- p.8 / Chapter 2.2.4 --- Exact Zero-Skew Algorithm --- p.9 / Chapter 2.2.5 --- Deferred Merge Embedding (DME) --- p.10 / Chapter 2.2.6 --- Boundary Merging and Embedding (BME) Algorithm --- p.14 / Chapter 2.2.7 --- Planar Clock Routing Algorithm --- p.17 / Chapter 2.2.8 --- Useful-skew Tree Algorithm --- p.18 / Chapter 2.3 --- Non-Tree Clock Distribution Networks --- p.19 / Chapter 2.3.1 --- Grid (Mesh) Structure --- p.20 / Chapter 2.3.2 --- Spine Structure --- p.20 / Chapter 2.3.3 --- Hybrid Structure --- p.21 / Chapter 2.4 --- Post-grid Clock Routing Problem --- p.22 / Chapter 2.5 --- Limitations of the Previous Work --- p.24 / Chapter 3 --- Post-Grid Clock Routing Problem --- p.26 / Chapter 3.1 --- Introduction --- p.26 / Chapter 3.2 --- Problem Definition --- p.27 / Chapter 3.3 --- Our Approach --- p.30 / Chapter 3.3.1 --- Delay-driven Path Expansion Algorithm --- p.31 / Chapter 3.3.2 --- Pre-processing to Connect Critical ports --- p.34 / Chapter 3.3.3 --- Post-processing to Reduce Capacitance --- p.36 / Chapter 3.4 --- Experimental Results --- p.39 / Chapter 3.4.1 --- Experiment Setup --- p.39 / Chapter 3.4.2 --- Validations of the Delay and Slew Estimation --- p.39 / Chapter 3.4.3 --- Comparisons with the Tree Grow (TG) Approach --- p.41 / Chapter 3.4.4 --- Lowest Achievable Delays --- p.42 / Chapter 3.4.5 --- Simulation Results --- p.42 / Chapter 4 --- Non-tree Based Post-Grid Clock Routing Problem --- p.44 / Chapter 4.1 --- Introduction --- p.44 / Chapter 4.2 --- Handling Ports with Large Load Capacitances --- p.46 / Chapter 4.2.1 --- Problem Ports Identification --- p.47 / Chapter 4.2.2 --- Non-Tree Construction --- p.47 / Chapter 4.2.3 --- Wire Link Selection --- p.48 / Chapter 4.3 --- Path Expansion in Non-tree Algorithm --- p.51 / Chapter 4.4 --- Limitations of the Non-tree Algorithm --- p.51 / Chapter 4.5 --- Experimental Results --- p.51 / Chapter 4.5.1 --- Experiment Setup --- p.51 / Chapter 4.5.2 --- Validations of the Delay and Slew Estimation --- p.52 / Chapter 4.5.3 --- Lowest Achievable Delays --- p.53 / Chapter 4.5.4 --- Results on New Benchmarks --- p.53 / Chapter 4.5.5 --- Simulation Results --- p.55 / Chapter 5 --- Efficient Partitioning-based Extension --- p.57 / Chapter 5.1 --- Introduction --- p.57 / Chapter 5.2 --- Partition-based Extension --- p.58 / Chapter 5.3 --- Experimental Results --- p.61 / Chapter 5.3.1 --- Experiment Setup --- p.61 / Chapter 5.3.2 --- Running Time Improvement with Partitioning Technique --- p.61 / Chapter 6 --- Conclusion --- p.63 / Bibliography --- p.65
18

A partitioning-based approach to the fitting problem in special architecture EPLDs

Goller, Steffan 01 January 1992 (has links)
In this thesis, we describe an architecture-driven fitting algorithm for an Application-Specific EPLD, the CY7C361, from Cypress Semiconductor. Traditional placement and routing tools for PLDs perform placement and routing separately. Several placement possibilities are created and the router tries to realize the connections between the physical locations of the cells on the chip. The Cypress CY7C361 has a very unique chip architecture with a highly limited connectivity between the physical cells. Therefore, it is necessary to consider the mutability when the placement of cells is performed. The combination of the two stages is called fitting. The specific architecture-dependent constraints, imposed on the connectivity of the CY7C361 chip were used to develop a hierarchical partitioning structure of the algorithm. This approach limits very effectively the solution space in the early stage of the search for a solution of the fitting problem. The partitioning approach for the fitting algorithm is not limited on the Cypress CY7C361. It can be applied to other architectures with similar connectivity restrictions, too.
19

Power estimation of superscalar microprocessor using VHDL model

Zhang, Wanpeng 22 November 1999 (has links)
Power optimization becomes more and more important due to the design cost and reliability. Sometimes high power consumption means expensive package cost and low reliability. The first step in optimizing power consumption is determining where power is consumed within a processor. While system-level code tracing and bit transition calculation are not enough to estimate the power distribution, transistor-level HSPICE simulation to model a microprocessor is too complex and time-consuming. In our research, a VHDL model with enhanced signal tracing function will be developed based on the existing VHDL behavior model. The power consumption of superscalar microprocessor in terms of switching activity and capacitance will be carefully studied. Two factors served as the basis for study: accessibility and importance for power calculations. A brief examination of the datapath suggests that the register file, the instruction cache and data cache are some of the major contributors to power usage. It was therefore decided to track the input and output bit transitions to these modules. These transitions are counted along with the number of accesses to each of the modules. In order to gather all of this data, the original VHDL model simulator has been enhanced. As instructions pass through the CPU, additional code is required to track and record the necessary information. For each individual instruction in the ISA, various information is recorded based on the elements in the processor that the instruction affects. For instance, if the simulator is about to execute a load instruction, the instruction uses the programmer counter, the instruction bus, data bus, the address bus, the ALU (adder) and the register file. The information being recorded for each of these elements must be updated to reflect the execution of that particular load instruction. Also, the inside circuit of each module, i.e. register file, instruction cache and data cache and the 6-transistor memory cell layout considering the 0.25��m CMOS technology will be studied in order to extract the capacitance. We do not need very accurate, absolute power estimation, therefore, we will try to keep the model simple. / Graduation date: 2000
20

Characterization and Avoidance of Critical Pipeline Structures in Aggressive Superscalar Processors

Sassone, Peter G. 20 July 2005 (has links)
In recent years, with only small fractions of modern processors now accessible in a single cycle, computer architects constantly fight against propagation issues across the die. Unfortunately this trend continues to shift inward, and now the even most internal features of the pipeline are designed around communication, not computation. To address the inward creep of this constraint, this work focuses on the characterization of communication within the pipeline itself, architectural techniques to avoid it when possible, and layout co-design for early detection of problems. I present work in creating a novel detection tool for common case operand movement which can rapidly characterize an applications dataflow patterns. The results produced are suitable for exploitation as a small number of patterns can describe a significant portion of modern applications. Work on dynamic dependence collapsing takes the observations from the pattern results and shows how certain groups of operations can be dynamically grouped, avoiding unnecessary communication between individual instructions. This technique also amplifies the efficiency of pipeline data structures such as the reorder buffer, increasing both IPC and frequency. I also identify the same sets of collapsible instructions at compile time, producing the same benefits with minimal hardware complexity. This technique is also done in a backward compatible manner as the groups are exposed by simple reordering of the binarys instructions. I present aggressive pipelining approaches for these resources which avoids the critical timing often presumed necessary in aggressive superscalar processors. As these structures are designed for the worst case, pipelining them can produce greater frequency benefit than IPC loss. I also use the observation that the dynamic issue order for instructions in aggressive superscalar processors is predictable. Thus, a hardware mechanism is introduced for caching the wakeup order for groups of instructions efficiently. These wakeup vectors are then used to speculatively schedule instructions, avoiding the dynamic scheduling when it is not necessary. Finally, I present a novel approach to fast and high-quality chip layout. By allowing architects to quickly evaluate what if scenarios during early high-level design, chip designs are less likely to encounter implementation problems later in the process.

Page generated in 0.1693 seconds