Spelling suggestions: "subject:"rollback recovery with checkpoint"" "subject:"rollback recovery with checkpoints""
1 |
Fault-Tolerant Average Execution Time Optimization for General-Purpose Multi-Processor System-On-ChipsVäyrynen, Mikael January 2009 (has links)
<p>Fault tolerance is due to the semiconductor technology development important, not only for safety-critical systems but also for general-purpose (non-safety critical) systems. However, instead of guaranteeing that deadlines always are met, it is for general-purpose systems important to minimize the average execution time (AET) while ensuring fault tolerance. For a given job and a soft (transient) no-error probability, we define mathematical formulas for AET using voting (active replication), rollback-recovery with checkpointing (RRC) and a combination of these (CRV) where bus communication overhead is included. And, for a given multi-processor system-on-chip (MPSoC), we define integer linear programming (ILP) models that minimize the AET including bus communication overhead when: (1) selecting the number of checkpoints when using RRC or a combination where RRC is included, (2) finding the number of processors and job-to-processor assignment when using voting or a combination where voting is used, and (3) defining fault tolerance scheme (voting, RRC or CRV) per job and defining its usage for each job. Experiments demonstrate significant savings in AET.</p>
|
2 |
Fault-Tolerant Average Execution Time Optimization for General-Purpose Multi-Processor System-On-ChipsVäyrynen, Mikael January 2009 (has links)
Fault tolerance is due to the semiconductor technology development important, not only for safety-critical systems but also for general-purpose (non-safety critical) systems. However, instead of guaranteeing that deadlines always are met, it is for general-purpose systems important to minimize the average execution time (AET) while ensuring fault tolerance. For a given job and a soft (transient) no-error probability, we define mathematical formulas for AET using voting (active replication), rollback-recovery with checkpointing (RRC) and a combination of these (CRV) where bus communication overhead is included. And, for a given multi-processor system-on-chip (MPSoC), we define integer linear programming (ILP) models that minimize the AET including bus communication overhead when: (1) selecting the number of checkpoints when using RRC or a combination where RRC is included, (2) finding the number of processors and job-to-processor assignment when using voting or a combination where voting is used, and (3) defining fault tolerance scheme (voting, RRC or CRV) per job and defining its usage for each job. Experiments demonstrate significant savings in AET.
|
Page generated in 0.1264 seconds