Global ETD Search

Return to search

Efficient fault tolerance for pipelined structures and its application to superscalar and dataflow machines

Silicon reliability has reemerged as a very important problem in digital system design. As voltage and device dimensions shrink, combinational logic is becoming sensitive to temporary errors caused by single event upsets, transistor and interconnect aging and circuit variability. In particular, computational functional units are very challenging to protect because current redundant execution techniques have a high power and area overhead, cannot guarantee detection of some errors and cause a substantial performance degradation. As traditional worst-case design rules that guarantee error avoidance become too conservative to be practical, new microarchitectures need to be investigated to address this problem. To this end, this dissertation introduces Self-Imposed Temporal Redundancy (SITR), a speculative microarchitectural temporal redundancy technique suitable for pipelined computational functional units. SITR is able to detect most temporary errors, is area and energy-efficient and can be easily incorporated in an out-of-order microprocessor. SITR can also be used as a throttling mechanism against thermal viruses and, in some cases, allows designers to design very aggressive bypass networks capable of achieving high instruction throughput, by tolerating timing violations. To address the performance degradation caused by redundant execution, this dissertation proposes using a tiled-data ow model of computation because it enables the design of scalable, resource-rich computational substrates. Starting with the WaveScalar tiled-data flow architecture, we enhance the reliability of its datapath, including computational logic, interconnection network and storage structures. Computations are performed speculatively using SITR while traditional information redundancy techniques are used to protect data transmission and storage. Once a value has been verified, confirmation messages are transmitted to consumer instructions. Upon error detection, nullification messages are sent to the instructions affected by the error. Our experimental results demonstrate that the slowdown due to redundant computation and error recovery on the tiled-data flow machine is consistently smaller than on a superscalar von Neumann architecture. However, the number of additional messages required to support SITR execution is substantial, increasing power consumption. To reduce this overhead without significantly affecting performance, we introduce wave-based speculation, a mechanism targeted for data flow architectures that enables speculation only when it is likely to benefit performance. / text

http://hdl.handle.net/2152/18262

Fault-tolerant computing

Computers, Pipeline--Reliability

Data flow computing

Computer architecture

Identifer	oai:union.ndltd.org:UTEXAS/oai:repositories.lib.utexas.edu:2152/18262
Date	10 October 2012
Creators	Mizan, Elias, 1976-
Source Sets	University of Texas
Language	English
Detected Language	English
Format	electronic
Rights	Copyright is held by the author. Presentation of this material on the Libraries' web site by University Libraries, The University of Texas at Austin was made possible under a limited license grant from the author who has retained all copyrights in the works.

Page generated in 0.0016 seconds

Efficient fault tolerance for pipelined structures and its application to superscalar and dataflow machines

Description

Links & Downloads

Tags

Additional Fields