Return to search

Hardware Error Detection Using AN-Codes

Due to the continuously decreasing feature sizes and the increasing complexity of integrated circuits, commercial off-the-shelf (COTS) hardware is becoming less and less reliable. However, dedicated reliable hardware is expensive and usually slower than commodity hardware. Thus, economic pressure will most likely result in the usage of unreliable COTS hardware in safety-critical systems.

The usage of unreliable, COTS hardware in safety-critical systems results in the need for software-implemented solutions for handling execution errors caused by this unreliable hardware. In this thesis, we provide techniques for detecting hardware errors that disturb the execution of a program. The detection provided facilitates handling of these errors, for example, by retry or graceful degradation.

We realize the error detection by transforming unsafe programs that are not guaranteed to detect execution errors into safe programs that detect execution errors with a high probability. Therefore, we use arithmetic AN-, ANB-, ANBD-, and ANBDmem-codes. These codes detect errors that modify data during storage or transport and errors that disturb computations as well. Furthermore, the error detection provided is independent of the hardware used.

We present the following novel encoding approaches:
- Software Encoded Processing (SEP) that transforms an unsafe binary into a safe execution at runtime by applying an ANB-code, and
- Compiler Encoded Processing (CEP) that applies encoding at compile time and provides different levels of safety by using different arithmetic codes.

In contrast to existing encoding solutions, SEP and CEP allow to encode applications whose data and control flow is not completely predictable at compile time.

For encoding, SEP and CEP use our set of encoded operations also presented in this thesis. To the best of our knowledge, we are the first ones that present the encoding of a complete RISC instruction set including boolean and bitwise logical operations, casts, unaligned loads and stores, shifts and arithmetic operations.

Our evaluations show that encoding with SEP and CEP significantly reduces the amount of erroneous output caused by hardware errors. Furthermore, our evaluations show that, in contrast to replication-based approaches for detecting errors, arithmetic encoding facilitates the detection of permanent hardware errors.
This increased reliability does not come for free. However, unexpectedly the runtime costs for the different arithmetic codes supported by CEP compared to redundancy increase only linearly, while the gained safety increases exponentially.

Identiferoai:union.ndltd.org:DRESDEN/oai:qucosa.de:bsz:14-qucosa-69872
Date08 July 2011
CreatorsSchiffel, Ute
ContributorsTechnische Universität Dresden, Fakultät Informatik, Prof. Christof Fetzer, Prof. Christof Fetzer, Prof. Dr. Wolfgang Ehrenberger
PublisherSaechsische Landesbibliothek- Staats- und Universitaetsbibliothek Dresden
Source SetsHochschulschriftenserver (HSSS) der SLUB Dresden
LanguageEnglish
Detected LanguageEnglish
Typedoc-type:doctoralThesis
Formatapplication/pdf

Page generated in 0.0047 seconds