
Fault-Tolerant Average Execution Time Optimization for General-Purpose Multi-Processor System-On-Chips

Because of developments in semiconductor technology, fault tolerance is important not only for safety-critical systems but also for general-purpose (non-safety-critical) systems. For general-purpose systems, however, the goal is not to guarantee that deadlines are always met but to minimize the average execution time (AET) while ensuring fault tolerance. For a given job and a given probability that no soft (transient) error occurs, we define mathematical formulas for the AET when using voting (active replication), rollback-recovery with checkpointing (RRC), and a combination of the two (CRV), with bus communication overhead included. For a given multi-processor system-on-chip (MPSoC), we define integer linear programming (ILP) models that minimize the AET, including bus communication overhead, when: (1) selecting the number of checkpoints when using RRC or a combination that includes RRC, (2) finding the number of processors and the job-to-processor assignment when using voting or a combination that includes voting, and (3) selecting the fault tolerance scheme (voting, RRC, or CRV) for each job and defining how it is used. Experiments demonstrate significant savings in AET.

Identifier oai:union.ndltd.org:UPSALLA1/oai:DiVA.org:liu-17705
Date January 2009
Creators Väyrynen, Mikael
Publisher Linköpings universitet, Institutionen för datavetenskap, Linköpings universitet, Institutionen för systemteknik, Linköpings universitet, Institutionen för datavetenskap
Source Sets DiVA Archive at Upsalla University
Language English
Detected Language English
Type Student thesis, info:eu-repo/semantics/masterThesis, text
Format application/pdf
Rights info:eu-repo/semantics/openAccess
