1 |
High-performance hybrid wave-pipeline scheme as it applies to adder micro-architectures
Levy, James E., January 2005
Thesis (M.S.)--Washington State University. / Includes bibliographical references.
|
2 |
Various perspectives of loop pipelining
Ugurdag, Hasan Fatih, January 1995
No description available.
|
3 |
A high performance low power mesochronous pipeline architecture for computer systems
Tatapudi, Suryanarayana Bhimeshwara, January 2006
Thesis (Ph. D.)--Washington State University, May 2006. / Includes bibliographical references (p. 94-96).
|
4 |
High Speed Circuit Design Based on a Hybrid of Conventional and Wave Pipelining
Sulistyo, Jos Budi, 03 October 2005
The increasing capabilities of multimedia appliances demand arithmetic circuits with higher speed and reasonable power dissipation. A common technique to attain those goals is synchronous pipelining, which increases the throughput of a circuit at the expense of longer latency, and it is therefore suitable where throughput takes priority over latency.
Two synchronous pipelining approaches, conventional pipelining and wave pipelining, are commonly employed. Conventional pipelining uses registers to divide the circuit into shorter paths and synchronize among sub-blocks, while wave pipelining uses the delay of combinational elements to perform those tasks. As wave pipelining does not introduce additional registers, in principle, it can attain a higher throughput and lower power consumption. However, its throughput is limited by delay variations, while delay balancing often leads to increased power dissipation.
This dissertation proposes a hybrid pipelining method called HyPipe, which divides the circuit into sub-blocks using conventional pipelining, and applies wave pipelining to each sub-block. Each sub-block is derived from a single base circuit, leading to a better delay balance and greater throughput than with heterogeneous circuits. Another requirement for wave pipelining to achieve high speed is short signal rise and fall times. Since CMOS wide-NAND and wide-NOR gates exhibit long rise and fall times and large delay variations, they should be decomposed. We show that the straightforward decomposition using alternating levels of NAND and NOR gates results in large delay variations. Therefore, we propose a new decomposition method using only one gate type. Our method reduces delay variations by up to 39%, and it is appropriate for wave pipelining based on standard-cells or sea-of-gates.
We laid out a 4x4 HyPipe multiplier as a proof of concept and performed a post-layout SPICE simulation. The multiplier achieves a throughput of 4.17 billion multiplications per second or a clock period of 2.52 four-load inverter delays, which is almost twice the speed of any existing multiplier in the open literature. When the supply voltage is reduced to 1.2 V from 1.8 V, its power consumption is reduced from 76.2 mW to 18.2 mW while performing 2.33 billion multiplications per second. / Ph. D.
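The figures quoted in this abstract can be cross-checked with a little arithmetic. The sketch below is not from the dissertation: it assumes one multiplication completes per clock cycle and that the 76.2 mW figure corresponds to the 4.17 billion multiplications per second operating point at 1.8 V. The function and variable names are illustrative only.

```python
# Figures quoted in the abstract above.
throughput_18v = 4.17e9   # multiplications per second (assumed to be the 1.8 V operating point)
period_in_fo4 = 2.52      # clock period, in four-load inverter delays
power_18v_w = 76.2e-3     # watts at 1.8 V (assumed to pair with throughput_18v)
throughput_12v = 2.33e9   # multiplications per second at 1.2 V
power_12v_w = 18.2e-3     # watts at 1.2 V


def clock_period_ps(throughput):
    """Clock period in picoseconds, assuming one result per clock cycle."""
    return 1e12 / throughput


def energy_per_op_pj(power_w, throughput):
    """Average energy per multiplication in picojoules."""
    return power_w / throughput * 1e12


period_ps = clock_period_ps(throughput_18v)
fo4_delay_ps = period_ps / period_in_fo4   # implied delay of one four-load inverter
energy_18v_pj = energy_per_op_pj(power_18v_w, throughput_18v)
energy_12v_pj = energy_per_op_pj(power_12v_w, throughput_12v)

print(f"clock period:     {period_ps:.1f} ps")
print(f"implied FO4:      {fo4_delay_ps:.1f} ps")
print(f"energy at 1.8 V:  {energy_18v_pj:.1f} pJ per multiplication")
print(f"energy at 1.2 V:  {energy_12v_pj:.1f} pJ per multiplication")
```

Under these assumptions the clock period is roughly 240 ps, and lowering the supply voltage trades about 44% of the throughput for less than half the energy per multiplication.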
|
5 |
Implementation and comparison of two wakeup logic for out-of-order superscalar microprocessors
Lee, Hsien-Yen, 22 August 2002
The wakeup logic in out-of-order superscalar microprocessors is responsible for resolving data dependency hazards between instructions. Its performance is critical because it may prevent the processor from having deeper pipelines or from achieving the highest possible IPC (Instructions Per Cycle).
In this thesis, we implemented the circuit and layout for two types of wakeup logic (CAM-type and RAM-type) used in modern microprocessors. Both implementations were simulated extensively using a circuit-level simulator, HSPICE, with full parasitic loads. We then compared the CAM-type and RAM-type wakeup circuits.
The simulation results show that the CAM-type wakeup logic performs better than the RAM-type wakeup logic when the processor employs a larger number of physical registers. The performance impact of varying the other superscalar design parameters, such as instruction window size and issue width, is similar for both types of wakeup logic. / Graduation date: 2003
|
6 |
Advanced middleware support for distributed data-intensive applications
Du, Wei, January 2005
Thesis (Ph. D.)--Ohio State University, 2005. / Title from first page of PDF file. Document formatted into pages; contains xix, 183 p.; also includes graphics (some col.). Includes bibliographical references (p. 170-183). Available online via OhioLINK's ETD Center
|
7 |
Application of Data Pipelining Technology in Cheminformatics and Bioinformatics
Mao, Linyong, 12 1900
Submitted to the faculty of the University Graduate School in partial fulfillment of the requirements for the degree Master of Sciences in the School of Informatics, Indiana University, December 2002 / Data pipelining is the processing, analysis, and mining of large volumes of data through a branching network of computational steps. A data pipelining system consists of a collection of modular computational components and a network for streaming data between them. By defining a logical path for data through a network of computational components and configuring each component accordingly, a user can create a protocol to perform virtually any desired function with data and extract knowledge from them. A set of data pipelines was constructed to explore the relationship between the biodegradability and structural properties of halogenated aliphatic compounds in a data set in which each compound has one degradation rate and nine structure-derived properties. After training, the data pipeline was able to calculate the degradation rates of new compounds with relatively good accuracy. A second set of data pipelines was generated to cluster new DNA sequences. The data pipelining technology was applied to identify a core sequence to represent a DNA cluster and to construct the 95% confidence distance interval for the cluster. The results show that 74% of the DNA sequences were correctly clustered and that there was no false clustering.
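The idea of modular components connected by a stream of records can be illustrated with a minimal sketch. This is not the system used in the thesis: the component names (`add_field`, `keep_if`, `run_protocol`), the record fields, and the sample values are all hypothetical, and the example shows only a linear path rather than a branching network.

```python
from typing import Callable, Iterable, Iterator, List

Record = dict
Component = Callable[[Iterator[Record]], Iterator[Record]]


def add_field(name: str, fn: Callable[[Record], object]) -> Component:
    """Build a component that adds a computed field to every record."""
    def component(stream: Iterator[Record]) -> Iterator[Record]:
        for rec in stream:
            yield {**rec, name: fn(rec)}
    return component


def keep_if(predicate: Callable[[Record], bool]) -> Component:
    """Build a component that passes on only the records matching a predicate."""
    def component(stream: Iterator[Record]) -> Iterator[Record]:
        return (rec for rec in stream if predicate(rec))
    return component


def run_protocol(source: Iterable[Record], components: List[Component]) -> List[Record]:
    """Chain the components in order and stream every record through them."""
    stream = iter(source)
    for component in components:
        stream = component(stream)
    return list(stream)


# Hypothetical input records; these values are made up for illustration.
compounds = [
    {"name": "A", "halogens": 1, "carbons": 2},
    {"name": "B", "halogens": 3, "carbons": 2},
    {"name": "C", "halogens": 2, "carbons": 2},
]

protocol = [
    add_field("halogen_ratio", lambda r: r["halogens"] / r["carbons"]),
    keep_if(lambda r: r["halogen_ratio"] >= 1.0),
]

result = run_protocol(compounds, protocol)
print([rec["name"] for rec in result])
```

Because each component consumes and produces an iterator, records flow through one at a time, so a protocol can be reconfigured by reordering or swapping entries in the list without touching the components themselves.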
|
8 |
Exploiting level sensitive latches in wire pipelining
Seth, Vikram, 17 February 2005
The present research presents procedures for exploiting level-sensitive latches in wire pipelining. The user supplies a Steiner tree with a signal source and a set of destinations (sinks), together with the location in the rectangular plane, the capacitive load, and the required arrival time of each destination. The user also defines a library of non-clocked (buffer) elements and clocked elements (flip-flop and latch), also known as synchronous elements. The first procedure performs concurrent repeater and synchronous element insertion in a bottom-up manner to find the minimum latency that can be achieved between the source and the destinations. The second procedure takes an additional input (the required latency) for each destination, derived from the first procedure, and finds the repeater and synchronous element assignments for all internal nodes of the Steiner tree that minimize the overall area used. These procedures exploit the latency and area advantages of latch-based pipelining over flip-flop-based pipelining. The second procedure suggests two methods to tackle the challenges that exist in a latch-based design. The deferred delay padding technique is introduced, which removes the short-path violations for latches at minimal extra cost.
|
10 |
Toward a software pipelining framework for many-core chips
Ributzka, Juergen, January 2009
Thesis (M.S.)--University of Delaware, 2009. / Principal faculty advisor: Guang R. Gao, Dept. of Electrical & Computer Engineering. Includes bibliographical references.
|