  • About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
61

A Single Fault-Tolerant Dual Channel Controller

Lowery, Thomas J. 01 January 1984
The advent of VLSI technology makes it feasible to offer fault tolerance, a once-expensive system attribute, to a wide variety of applications. This can be accomplished by using off-the-shelf single-board computers and peripherals as the heart of the system. Custom-designed boards can then be added to meet the specific requirements of each application.
62

Modeling reconfiguration algorithms for regular architecture

DeBrunner, Linda Sumners 12 October 2005
Three models are proposed to evaluate and design distributed reconfigurable systems for fault tolerant, highly reliable applications. These models serve as valuable tools for developing fault tolerant systems. In each model, cells work together in parallel to change the global structure through a series of separate actions. In the Local Supervisor Model (LSM), selected cells guide the reconfiguration process. In the Tessellation Automata Model (TAM), each cell determines its next state based on its state and its neighbors' states, and communicates its state information to its neighbors. In the Interconnected Finite State Machine Model (IFSMM), each cell determines its next state and outputs based on its state and its inputs. The hierarchical nature of the TAM and IFSMM provides advantages in evaluating, comparing, and designing systems. The use of each of these models in describing systems is demonstrated. The IFSMM is emphasized since it is the most versatile of the three models. The IFSMM is used to identify algorithm weaknesses and improvements, compare existing algorithms, and develop a novel design for a reconfigurable hypercube. / Ph. D.
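The tessellation-automaton idea in the abstract above (each cell computes its next state from its own state and its neighbors' states) can be illustrated with a minimal sketch. The three cell states, the ring topology, and the takeover rule are assumptions chosen for illustration; they are not taken from the dissertation.

```python
from typing import List

# Hypothetical cell states, not taken from the dissertation.
FAULTY, ACTIVE, SPARE = "F", "A", "S"

def next_state(left: str, own: str, right: str) -> str:
    """Assumed local rule: a spare becomes active when a neighbor is faulty."""
    if own == SPARE and FAULTY in (left, right):
        return ACTIVE
    return own

def step(cells: List[str]) -> List[str]:
    """One synchronous update of a ring of cells.

    Every cell reads the old states of its two neighbors, so all cells
    change in parallel, as in a cellular (tessellation) automaton.
    """
    n = len(cells)
    return [
        next_state(cells[(i - 1) % n], cells[i], cells[(i + 1) % n])
        for i in range(n)
    ]
```

Calling `step(["A", "F", "S", "A"])` lets the spare next to the faulty cell take over. No central controller is involved: the global reconfiguration emerges from the local rule alone.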
63

Group-based checkpoint/rollback recovery for large scale message-passing systems

Ho, Chun-yin., 何俊賢. January 2008
Computer Science / Master / Master of Philosophy
64

Automated fault localization: a statistical predicate analysis approach

Hu, Peifeng., 胡佩鋒. January 2006
Computer Science / Doctoral / Doctor of Philosophy
65

Consul: A communication substrate for fault-tolerant distributed programs.

Mishra, Shivakant. January 1992
As human dependence on computing technology increases, so does the need for computer system dependability. This dissertation introduces Consul, a communication substrate designed to help improve system dependability by providing a platform for building fault-tolerant, distributed systems based on the replicated state machine approach. The key issues in this approach--ensuring replica consistency and reintegrating recovering replicas--are addressed in Consul by providing abstractions called fault-tolerant services. These include a broadcast service to deliver messages to a collection of processes reliably and in some consistent order, a membership service to maintain a consistent system-wide view of which processes are functioning and which have failed, and a recovery service to recover a failed process. Fault-tolerant services are implemented in Consul by a unified collection of protocols that provide support for managing communication, redundancy, failures, and recovery in a distributed system. At the heart of Consul is Psync, a protocol that provides for multicast communication based on a context graph that explicitly records the partial (or causal) order of messages. This graph also serves as the basis for novel algorithms used in the ordering, membership, and recovery protocols. The ordering protocol combines the semantics of the operations encoded in messages with the partial order provided by Psync to increase the concurrency of the application. Similarly, the membership protocol exploits the partial ordering to allow different processes to conclude that a failure has occurred at different times relative to the sequence of messages received, thereby reducing the amount of synchronization required. The recovery protocol combines checkpointing with the replay of messages stored in the context graph to recover the state of a failed process. 
Moreover, this collection of protocols is implemented in a highly configurable manner, thus allowing a system builder to easily tailor an instance of Consul from this collection of building-block protocols. Consul is built in the x-Kernel and executes standalone on a collection of Sun 3 workstations. Initial testing and performance studies have been done using two applications: a replicated directory and a distributed wordgame. These studies show that the semantic-based order is more efficient than a total order in many situations, and that the overhead imposed by the checkpointing, membership, and recovery protocols is insignificant.
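The abstract describes a context graph that records the partial (causal) order of messages and is replayed after a checkpoint during recovery. The sketch below shows one simple way such a graph could be represented. The class `ContextGraph` and its methods are hypothetical names for illustration and are not the Psync or Consul interface.

```python
from graphlib import TopologicalSorter  # Python 3.9+

class ContextGraph:
    """Toy context graph: each message lists the messages it directly follows."""

    def __init__(self):
        self.preds = {}  # message id -> set of direct predecessor ids

    def add(self, msg, context=()):
        """Record `msg` as sent in the context of the given earlier messages."""
        self.preds[msg] = set(context)

    def precedes(self, a, b):
        """True if `a` causally precedes `b` (reachable via predecessor edges)."""
        stack, seen = list(self.preds[b]), set()
        while stack:
            m = stack.pop()
            if m == a:
                return True
            if m not in seen:
                seen.add(m)
                stack.extend(self.preds[m])
        return False

    def concurrent(self, a, b):
        """True if neither message causally precedes the other."""
        return a != b and not self.precedes(a, b) and not self.precedes(b, a)

    def replay_order(self):
        """One replay order consistent with the partial order (for recovery)."""
        return list(TopologicalSorter(self.preds).static_order())
```

Messages that are concurrent in the graph may be delivered in either order. This is the freedom that an ordering protocol based on operation semantics can exploit, instead of forcing a total order on all messages.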
66

Fault tolerance and reliability patterns

Unknown Date
The need to achieve dependability in critical infrastructures has become indispensable for government and commercial enterprises. This need has become more necessary with the proliferation of malicious attacks on critical systems, such as healthcare, aerospace and airline applications. Additionally, due to the widespread use of web services in critical systems, the need to ensure their reliability is paramount. We believe that patterns can be used to achieve dependability. We conducted a survey of fault tolerance, reliability and web service products and patterns to better understand them. One objective of our survey is to evaluate the state of these patterns, and to investigate which standards are being used in products and their tool support. Our survey found that these patterns are insufficient, and many web services products do not use them. In light of this, we wrote some fault tolerance and web services reliability patterns and present an analysis of them. / by Ingrid A. Buckley. / Thesis (M.S.C.S.)--Florida Atlantic University, 2008. / Includes bibliography. / Electronic reproduction. Boca Raton, Fla., 2008. Mode of access: World Wide Web.
67

Towards a methodology for building reliable systems

Unknown Date
Reliability is a key system characteristic that is an increasing concern for current systems. Greater reliability is necessary due to the new ways in which services are delivered to the public. Services are used by many industries, including health care, government, telecommunications, tools, and products. We have defined an approach to incorporate reliability along the stages of system development. We first did a survey of existing dependability patterns to evaluate their possible use in this methodology. We have defined a systematic methodology that helps the designer apply reliability in all steps of the development life cycle in the form of patterns. A systematic failure enumeration process to define corresponding countermeasures was proposed as a guideline to define where reliability is needed. We introduced the idea of failure patterns which show how failures manifest and propagate in a system. We also looked at how to combine reliability and security. Finally, we defined an approach to certify the level of reliability of an implemented web service. All these steps lead towards a complete methodology. / by Ingrid A. Buckley. / Thesis (Ph.D.)--Florida Atlantic University, 2012. / Includes bibliography. / Electronic reproduction. Boca Raton, Fla., 2012. Mode of access: World Wide Web.
68

Fault-tolerant and security mechanisms for mobile agent systems

January 2006
Leung Kwai Ki. / Thesis (M.Phil.)--Chinese University of Hong Kong, 2006. / Includes bibliographical references (leaves 152-161). / Abstracts in English and Chinese.
Contents:
Abstract / 論文摘要 (abstract in Chinese) / Acknowledgements
Chapter 1. Introduction / 1.1 Contributions of this thesis / 1.2 Thesis structure
Chapter 2. Mobile Agent Paradigm
Chapter 3. Analysis on Fault-tolerant Mechanisms / 3.1 Design considerations / 3.1.1 Infrastructure failure / 3.1.2 Unfavorable outcomes / 3.1.3 Exactly-once property / 3.1.4 Blocking / 3.1.5 Network partitioning / 3.1.6 Domino effect / 3.1.7 Inter-agent communications and global consistency / 3.1.8 Platform dependent and platform independent approaches / 3.1.9 ACID in mobile agent systems / 3.2 Basic Mechanisms / 3.2.1 Replication mechanisms / 3.2.2 Checkpointing and logging / 3.2.3 Comparison between the replication and checkpointing mechanisms / 3.2.4 Rollback / 3.2.5 Diagnosis and planning / 3.3 Analysis of current approaches / 3.3.1 Infrastructure failure handling / 3.3.2 Unfavorable outcomes prevention / 3.3.3 Diagnosis and planning / 3.3.4 Summary / 3.4 Related work of analysing fault-tolerant mechanisms / 3.5 Summary
Chapter 4. Flexible Monitor Chain / 4.1 Overview / 4.2 Assumptions / 4.3 Protocol / 4.4 Different scenarios of failure / 4.5 Performance evaluation / 4.5.1 Simulation model / 4.5.2 Results and discussions / 4.6 Discussions / 4.6.1 Preservation of the exactly-once property / 4.6.2 High flexibility in the management of monitors / 4.6.3 High stability / 4.6.4 Feasibility to be applied in an open environment / 4.6.5 Overcoming the problem of network partitioning / 4.6.6 Lightweightedness / 4.6.7 Global consistency and domino effect / 4.7 Summary
Chapter 5. Transaction and Rollback Models / 5.1 Simple E-Marketplace / 5.2 Transaction and rollback models / 5.2.1 Distributed transaction without rollback (M1) / 5.2.2 A chained-transaction (M2) / 5.2.3 A chained-transaction with flexible rollback scheme (M3) / 5.3 Performance evaluation / 5.3.1 Experimental setup / 5.3.2 Results and discussions / 5.4 Summary
Chapter 6. Dependent Partial Rollback / 6.1 Overview / 6.2 Formal representation / 6.3 Assumptions / 6.4 Protocol / 6.5 Discussions / 6.5.1 Assumption: Weak migration and the effect of a stage / 6.5.2 Assumption: Failure free environment / 6.5.3 Assumption: guarantee of rollback / 6.5.4 Assumption: Domino effect / 6.5.5 Platform independence / 6.5.6 High efficiency / 6.5.7 Stage-based design / 6.5.8 Autonomy / 6.5.9 High flexibility / 6.6 Related Works / 6.7 Implementation of SEMP with dependent partial rollback / 6.8 Summary
Chapter 7. Analysis on Security Mechanisms / 7.1 Classifications of security issues / 7.2 Analysis of current approaches / 7.2.1 Encrypting functions and data / 7.2.2 Computing with encrypted functions / 7.2.3 Trusted environment / 7.2.4 Limitation of execution time / 7.2.5 Execution tracing / 7.3 Execution tracing / 7.4 Summary
Chapter 8. Execution Tracing with Randomly-Selected Hosts / 8.1 Overview / 8.2 Assumptions / 8.3 Protocol / 8.4 Performance evaluation / 8.4.1 Simple agent system / 8.4.2 Experimental setup / 8.4.3 Results and discussions / 8.5 Discussions / 8.5.1 Detect the modifications of the code and data / 8.5.2 Against masquerade / 8.5.3 Against skip from re-execution / 8.5.4 Against collaboration / 8.5.5 Higher privacy / 8.5.6 Low workload on the trusted host / 8.5.7 Feasible to be used in the open environment / 8.5.8 Secure data collection / 8.5.9 Comparison with the existing approaches / 8.5.10 Weaknesses / 8.6 Optimizations / 8.6.1 Sampling / 8.6.2 Inserting sub-state and request on demand / 8.7 Summary
Chapter 9. FTS Framework / 9.1 Assumptions / 9.2 Abstract framework / 9.2.1 Different agents and their duties / 9.2.2 Messaging / 9.3 Implementation in Jade / 9.3.1 Characteristics of Jade / 9.3.2 Core implementation details / 9.4 Performance Evaluation / 9.4.1 Experimental Setup / 9.4.2 Experimental Results / 9.5 Discussions / 9.5.1 High worker survivability / 9.5.2 Low blocking chance / 9.5.3 Trusted Third Party Hosts / 9.6 Summary
Chapter 10. Conclusions and Future Works
Bibliography / Publications
69

On fault tolerance, performance, and reliability for wireless and sensor networks. / CUHK electronic theses & dissertations collection

January 2005
The emerging mobile wireless environment poses exciting challenges for distributed fault-tolerant (FT) computing. This thesis develops a message logging and recovery protocol on top of Wireless CORBA to complement FT-CORBA specified for wired networks. It employs the storage available at the access bridge (AB) as the stable storage for logging messages and saving checkpoints on behalf of mobile hosts (MHs). Our approach engages both the quasi-sender-based and the receiver-based message logging techniques and conducts seamless handoff in the presence of failures.
Then we extend the analysis of the program execution time without and with checkpointing in the presence of MH failures from wired to wireless networks. Due to the underlying message-passing communication mechanism, we employ the number of received computational messages instead of time to indicate the completion of program execution at an MH. Handoff is another distinct factor that should be taken into consideration in mobile wireless environments. Three checkpointing strategies, deterministic, random, and time-based checkpointing, are investigated. In our approach, failures may occur during checkpointing and recovery periods.
Furthermore, we extend the traditional reliability analysis. Wireless networks inherit the unique handoff characteristic which leads to different communication structures of various types with a number of components and links. Therefore, the traditional definition of two-terminal reliability is no longer applicable. We propose a new term, end-to-end mobile reliability, to integrate those different communication structures into one metric, which includes not only failure parameters but also service parameters. Nevertheless, it is still a monotonically decreasing function of time. With the proposed end-to-end mobile reliability, we can identify the reliability importance of imperfect components in wireless networks.
Finally, to obtain a long network lifetime without sacrificing crucial aspects of quality of service (area coverage, sensing reliability, and network connectivity) in wireless sensor networks, we present sensibility-based sleeping configuration protocols (SSCPs) with two sensing models: Boolean sensing model (BSM) and collaborative sensing model (CSM). (Abstract shortened by UMI.)
Chen Xinyu. / "June 2005." / Adviser: Michael R. Lyu. / Source: Dissertation Abstracts International, Volume: 67-07, Section: B, page: 3889. / Thesis (Ph.D.)--Chinese University of Hong Kong, 2005. / Includes bibliographical references (p. 180-198). / Electronic reproduction. Hong Kong : Chinese University of Hong Kong, [2012] System requirements: Adobe Acrobat Reader. Available via World Wide Web. / Electronic reproduction. [Ann Arbor, MI] : ProQuest Information and Learning, [200-] System requirements: Adobe Acrobat Reader. Available via World Wide Web. / Abstract in English and Chinese. / School code: 1307.
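The abstract names three checkpointing strategies (deterministic, random, and time-based) and measures progress in received messages. The sketch below shows one plausible reading of each. The abstract does not define the strategies, so the policy definitions, the function names, and the parameters `interval`, `p`, and `period` are all assumptions. Failures during checkpointing and recovery, which the thesis considers, are not modelled here.

```python
import random

def deterministic(interval):
    """Assumed policy: checkpoint after every `interval` received messages."""
    def should_checkpoint(msgs_since_ckpt, secs_since_ckpt):
        return msgs_since_ckpt >= interval
    return should_checkpoint

def randomized(p, rng):
    """Assumed policy: checkpoint after each message with probability `p`."""
    def should_checkpoint(msgs_since_ckpt, secs_since_ckpt):
        return rng.random() < p
    return should_checkpoint

def time_based(period):
    """Assumed policy: checkpoint once `period` seconds have elapsed."""
    def should_checkpoint(msgs_since_ckpt, secs_since_ckpt):
        return secs_since_ckpt >= period
    return should_checkpoint

def run(policy, arrival_times):
    """Return the 1-based indices of messages after which a checkpoint is taken."""
    checkpoints, msgs, last_t = [], 0, 0.0
    for i, t in enumerate(arrival_times, start=1):
        msgs += 1
        if policy(msgs, t - last_t):
            checkpoints.append(i)
            msgs, last_t = 0, t  # counters restart at each checkpoint
    return checkpoints

arrivals = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0]
print(run(deterministic(3), arrivals))
print(run(time_based(2.5), arrivals))
print(run(randomized(0.3, random.Random(0)), arrivals))
```

With evenly spaced arrivals the deterministic and time-based policies coincide. They diverge when message arrivals are bursty, which is one reason to compare them.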
70

Coverage-based testing strategies and reliability modeling for fault-tolerant software systems. / CUHK electronic theses & dissertations collection

January 2006
Software permeates our modern society, and its complexity and criticality are ever increasing. Thus the need to tolerate software faults, particularly for critical applications, is evident. While fault-tolerant software is seen as a necessity, it also remains a controversial technique and there is a lack of conclusive assessment about its effectiveness.
This thesis aims at providing a quantitative assessment scheme for a comprehensive evaluation of fault-tolerant software including reliability model comparisons and trade-off studies with software testing techniques. First of all, we propose a comprehensive procedure for assessing fault-tolerant software for software reliability engineering, which is composed of four tasks: modeling, experimentation, evaluation and economics. Our ultimate objective is to construct a systematic approach to predicting the achievable reliability based on the software architecture and testing evidence, through an investigation of testing and modeling techniques for fault-tolerant software.
Motivated by the lack of real-world project data for investigating software testing and fault tolerance techniques together, we conduct a real-world project and engage multiple programming teams to independently develop program versions based on an industry-scale avionics application. Detailed experiments are conducted to study the nature, source, type, detectability, and effect of faults uncovered in the program versions, and to learn the relationship among these faults and the correlation of their resulting failures. Coverage-based testing as well as mutation testing techniques are adopted to reproduce mutants with real faults, which facilitate the investigation of the effectiveness of data flow coverage, mutation coverage, and fault coverage for design diversity.
Then, based on the preliminary experimental data, further experimentation and detailed analyses on the correlations among these faults and the relation to their resulting failures are studied. The results are further applied to the current reliability modeling techniques for fault-tolerant software to examine their effectiveness and accuracy.
Next, we investigate the effect of code coverage on fault detection, which is the underlying intuition of coverage-based testing strategies. From our experimental data, we find that code coverage is a moderate indicator of fault detection capability on the whole test set. But the effect of code coverage on fault detection varies under different testing profiles. The correlation between the two measures is high with exceptional test cases, but weak in normal testing. Moreover, our study shows that code coverage can be used as a good filter to reduce the size of the effective test set, although it is more evident for exceptional test cases.
Furthermore, to investigate some "variants" as well as "invariants" of fault-tolerant software, we perform an empirical investigation on evaluating reliability features by a comprehensive comparison between two projects: our project and the NASA 4-University project. Based on the same specification for program development, these two projects encounter some common as well as different features. The testing results of two comprehensive operational testing procedures involving hundreds of thousands of test cases are collected and compared. Similar as well as dissimilar faults are observed and analyzed, indicating common problems related to the same application in both projects. The small number of coincident failures in the two projects, nevertheless, provides supporting evidence for N-version programming, while the observed reliability improvement implies some trends in software development over the past twenty years.
Finally, we formulate the relationship between code coverage and fault detection. Although our two current models are in simple mathematical formats, they can predict the percentage of faults detected from the code coverage achieved for a certain test set. We further incorporate such formulation into traditional reliability growth models, not only for fault-tolerant software, but also for general software systems. Our empirical evaluations show that our new reliability model can achieve more accurate reliability assessment than the traditional non-homogeneous Poisson model.
Cai Xia. / "September 2006." / Adviser: Rung Tsong Michael Lyu. / Source: Dissertation Abstracts International, Volume: 68-03, Section: B, page: 1715. / Thesis (Ph.D.)--Chinese University of Hong Kong, 2006. / Includes bibliographical references (p. 165-181). / Electronic reproduction. Hong Kong : Chinese University of Hong Kong, [2012] System requirements: Adobe Acrobat Reader. Available via World Wide Web. / Electronic reproduction. [Ann Arbor, MI] : ProQuest Information and Learning, [200-] System requirements: Adobe Acrobat Reader. Available via World Wide Web. / Abstracts in English and Chinese. / School code: 1307.
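The abstract's remark that few coincident failures support N-version programming can be made concrete with a generic majority voter. This is a textbook-style sketch, not the voting scheme used in the thesis. The function name `majority_vote` is hypothetical.

```python
from collections import Counter

def majority_vote(outputs):
    """Return the value produced by a strict majority of versions, else None."""
    value, count = Counter(outputs).most_common(1)[0]
    return value if count > len(outputs) / 2 else None

# One version fails on its own: the voter masks the fault.
print(majority_vote([42, 42, 41]))
# Two versions fail identically (a coincident failure): the wrong value wins.
print(majority_vote([7, 7, 42]))
# All versions disagree: no majority, so the voter reports no result.
print(majority_vote([1, 2, 3]))
```

The second call shows why the rate of coincident failures matters. Voting only masks faults that the independently developed versions do not share.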
