121

Cross-Layer Fault-Tolerant Design and Analysis for High Manufacturing Yield and System Reliability

Guo, Jianghao 26 May 2016 (has links)
No description available.
122

Implementation of Logic Fault Tolerance on a Dynamically Reconfigurable FPGA

Jayarama, Kiran January 2016 (has links)
No description available.
123

A Foundation for Fault Tolerant Components

Leal, William Milo 17 December 2001 (has links)
No description available.
124

Scalable design of fault-tolerance for wireless sensor networks

Demirbas, Murat 29 September 2004 (has links)
No description available.
125

High performance and network fault tolerant MPI with multi-pathing over InfiniBand

Vishnu, Abhinav 11 December 2007 (has links)
No description available.
126

Network Fault Resilient MPI for Multi-Rail InfiniBand Clusters

Pai Raikar, Siddhesh Prakash Sunita January 2011 (has links)
No description available.
127

FEASIBILITY STUDIES OF STATISTIC MULTIPLEXED COMPUTING

Celik, Yasin January 2018 (has links)
In 2012, when Professor Shi introduced me to the concept of Statistic Multiplexed Computing (SMC), I was skeptical. It contradicted everything I have learned and heard about distributed and parallel computing. However, I did believe that unhandled failures in any application will negatively impact its scalability. For that, I agreed to take on the feasibility study of SMC for practical applications. After six+ years research and experimentations, it became clear to me that the most widely believed misconception is “either performance or reliability” when upscaling a distributed application. This conception was the result of the direct use of hop-by-hop communication protocols in distributed application construction. Terminology: Hop-by-hop data protocol is a two-sided reliable lossless data communication protocol for transmitting data between a sender and a receiver. Either the sender or the receiver crash will cause data losses. Examples: MPI, RPC, RMI, OpenMP. End-to-end data protocol is a single-sided reliable lossless data communication protocol for transmitting data between application programs. All runtime available processors, networks and storage will be automatically dispatched to the best effort support of the reliable communication regardless transient and permanent device failures. Examples: HDFS, Blockchain, Fabric and SMC. Active end-to-end data protocol is a single-sided reliable lossless data communication pro- tocol for transmitting data and automatically synchronizing application programs. Example: SMC (AnkaCom, AnkaStore (this dissertation)). Unlike the hop-by-hop protocols, the use of end-to-end protocol forms an application- dependent overlay network. An overlay network for distributed and parallel computing application, such as Blockchain, has been proven to defy the “common wisdom” for two important distributed computing challenges: a) Extreme scale computing without single-point failures is practically feasible. Thus, all transaction or data losses can be eliminated. b) Extreme scale synchronized transaction replication is practically feasible. Thus, the CAP conjecture and theorem become irrelevant. Unlike passive overlay networks, such as the HDFS and Blockchain, this dissertation study proves that an active overlay network can deliver higher performance, higher reliability and security at the same time as the application up scales. Although application-level security is not part of this dissertation, it is easy to see that application-level end-to-end protocols will fundamentally eliminate the “man-in-the-middle” attacks. This will nullify many well-known attacks. With the zero-single-point failure and zero impact synchronous replication features, SMC applications are naturally resistant to DDoS and ransomware attacks. This dissertation explores practical implementations of the SMC concept for compute intensive (CI) and data intensive (DI) applications. This defense will disclose the details of CI and DI runtime implementations and results of inductive computational experiments. The computational environments include the NSF Chameleon bare-metal HPC cloud and Temple’s TCloud cluster. / Computer and Information Science
128

Decentralized Crash-Resilient Runtime Verification

Kazemlou, Shokoufeh January 2017 (has links)
This is the final revision of my M.Sc. Thesis. / Runtime Verification is a technique to extract information from a running system in order to detect executions violating a given correctness specification. In this thesis, we study distributed synchronous/asynchronous runtime verification of systems. In our setting, there is a set of distributed monitors that have only partial views of a large system and are subject to failures. In this context, it is unavoidable that monitors may have different views of the underlying system, and therefore may have different valuations of the correctness property. In this thesis, we propose an automata-based synchronous monitoring algorithm that copes with f crash failures in a distrbuted setting. The algorithm solves the synchronous monitoring problem in f + 1 rounds of communication, and significantly reduces the message size overhead. We also propose an algorithm for distributed crash-resilient asynchronous monitoring that consistently monitors the system under inspection without any communication between monitors. Each local monitor emits a verdict set solely based on its own partial observation, and the intersection of the verdict sets will be the same as the verdict computed by a centralized monitor that has full view of the system. / Thesis / Master of Science (MSc)
129

Challenges with Providing Reliability Assurance for Self-Adaptive Cyber-Physical Systems

Riaz, Sana, Kabir, Sohag, Campean, Felician, Mokryani, Geev, Dao, Cuong D., Angarita-Marquez, Jorge L., Al-Ja'afreh, Mohammad A.A. 03 February 2023 (has links)
Self-adaptive systems are evolving systems that can adjust their behaviour to accommodate dynamic requirements or to better serve their goal. These systems can vary in their architecture, operation, or adaptation strategies depending on the application. Moreover, evaluation can happen in different ways depending on the system architecture and its requirements. Because of their dynamism and complexity, self-adaptive systems are prone to situations such as adaptation faults, inconsistencies in context, or low performance on tasks. It is therefore important to have reliability assurance for the system, to monitor situations that can compromise system functionality. In this paper, we provide a brief background on the different types of self-adaptive systems and the various ways a system can evolve. We discuss the mechanisms that have been applied over the last two decades for reliability evaluation of such systems, and we identify challenges and limitations as research opportunities related to the reliability evaluation of self-adaptive systems. / This research was undertaken as part of the "Model-based Reliability Evaluation for Autonomous Systems with Evolving Architectures" project funded by the University of Bradford under the SURE Grant scheme.
130

A Low-latency Consensus Algorithm for Geographically Distributed Systems

Arun, Balaji 15 May 2017 (has links)
This thesis presents Caesar, a novel multi-leader Generalized Consensus protocol for geographically replicated systems. Caesar achieves near-perfect availability, provides high performance (low latency and high throughput) compared to the existing state of the art, and tolerates replica failures. Recently, a number of state-of-the-art consensus protocols implementing the Generalized Consensus definition have been proposed. However, the major limitation of these existing approaches is significant performance degradation when the application workload produces conflicting requests. Caesar's main goal is to overcome this limitation by changing the way a fast decision is taken: its ordering protocol does not reject a fast decision for a client request if a quorum of nodes reply with different dependency sets for that request. It switches to a slow decision only if there is no chance to agree on the proposed order for that request. Caesar achieves this using a combination of wait conditions and logical timestamps. The effectiveness of Caesar is demonstrated through an evaluation study performed on Amazon's EC2 infrastructure using 5 geo-replicated sites. Caesar outperforms multi-leader competitors (e.g., EPaxos) by as much as 1.7x in the presence of 30% conflicting requests, and single-leader competitors (e.g., Multi-Paxos) by as much as 3.5x. Unlike existing protocols, Caesar is also resilient under heavy client loads. / Master of Science
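The fast-decision rule described in the abstract can be caricatured in a few lines of Python. This is a loose, hypothetical sketch of the quorum check only, not the Caesar protocol: reply contents, timestamps, and replica names are invented for illustration.

```python
# Hypothetical sketch of a Caesar-style fast-path check: differing
# dependency sets from the fast quorum are merged rather than rejected;
# only a refused logical timestamp forces the slow path.

def fast_path_decision(replies, proposed_ts):
    # 'replies' are (timestamp_accepted, dependency_set) pairs collected
    # from a fast quorum of replicas for one client request.
    if all(ok for ok, _ in replies):
        deps = set().union(*(d for _, d in replies))  # merge, don't reject
        return ("fast", proposed_ts, deps)
    # Some replica could not accept the proposed logical timestamp, so
    # there is no chance to agree on this order: fall back to the slow phase.
    return ("slow", None, None)

# Quorum members disagree on dependencies, yet the decision stays fast.
print(fast_path_decision([(True, {"req1"}), (True, {"req2"})], proposed_ts=7))
```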
