Global ETD Search

1	APPLICATION AWARE FOR BYZANTINE FAULT TOLERANCE Chai, Hua 09 December 2014 (has links) No description available. Computer Engineering Byzantine fault tolerance
2	Scalable Byzantine State Machine Replication: Designs, Techniques, and Implementations Arun, Balaji 02 July 2021 (has links) State machine replication (SMR) is one of the most widely studied and used methodology for building highly available distributed applications and services. SMR replicates a service across a set of computing hosts, and executes client operations on the replicas in an agreed- upon total order, ensuring linearizability of the replicated shared state. The problem of determining a total order reduces to one of computing consensus. State-of-the-art consensus protocols are inadequate for newer classes of applications such as Blockchains and for geographically distributed infrastructures. The widely used Crash Fault Tolerance (CFT) fault model of consensus protocols is prone to malicious and adversarial behaviors as well as non-crash faults such as software bugs. The Byzantine fault-tolerance (BFT) model and its trust-based variant, the hybrid model, permit stronger failure adversaries. However, state-of-the-art Byzantine and hybrid consensus protocols have performance limitations in geographically distributed environments: they designate a primary replica for proposing total-orders, which becomes a bottleneck and yields sub-optimal latencies for faraway clients. Additionally, they do not scale to hundreds of replicas and provide consistent performance as the system size grows. To overcome these limitations and develop highly scalable SMR solutions, this dissertation presents two leaderless consensus protocols, namely ezBFT and Dester, for the Byzantine and hybrid models, respectively. These protocols enable every replica to receive and order client commands. Additionally, they exchange command dependencies to collectively order commands without relying on a primary. Our experimental evaluations in a 7-node geographically distributed setup reveals that ezBFT improves client-side latency by as much as 40% over state-of-the-art BFT protocols including PBFT, FaB, and Zyzzyva. Dester, for the hybrid model, reduces latency by as much as 30% over ezBFT. Next, the dissertation presents a new paradigm called DQBFT for designing consensus protocols that can scale to hundreds of nodes in geographically distributed environments. Since leaderless protocols exchange command dependencies, they do not scale to hundreds of nodes. DQBFT overcomes this scalability limitation by decentralizing only the heavy task of replicating commands and centralizing the process of ordering the commands. While DQBFT can be used to enhance existing primary-based protocols, Destiny is a hybrid instantiation of the DQBFT paradigm using linear communication for better scalability than naive instantiations. Experimental evaluations in a 193-node geographically distributed setup reveal that Destiny achieves ≈ 3× better throughput and ≈50% better latency than state-of-the-art BFT protocols including Hotstuff, SBFT, and Hybster. Lastly, the dissertation presents two techniques for designing and implementing BFT protocols with reduced development costs. The dissertation presents Bumblebee, a methodology for manually transforming CFT protocols to tolerate Byzantine faults using trusted execution environments that are increasingly available in commodity hardware. Bumblebee is based on the observation that CFT protocols are incapable of tolerating non-malicous non-crash faults, but they are nevertheless deployed in many production systems. Bumblebee provides a Generic Algorithm that can represent protocols in both CFT and hybrid fault models, thus allowing easy construction of hybrid protocols using CFT protocols as baselines. The dissertation constructs hybrid instantiations of CFT protocols including Paxos, Raft, and M2Paxos. Experimental evaluations of the hybrid variants reveal that they perform at par with native hybrid protocols, but incur a 30% overhead over their CFT counterparts. Hybrid protocols rely on the integrity of trusted execution environments, which are increasingly subject to security exploits. To withstand exploits, the dissertation presents DuoBFT, a protocol that exposes both the BFT and hybrid fault models within a single consensus protocol. This enables consensus under both fault models within the same protocol and without additional redundancy, allowing DuoBFT to achieve the performance of hybrid protocols and the security of BFT protocols. Experimental evaluations reveal that DuoBFT achieves the best of both hybrid and BFT fault models with less than 10% overhead. / Doctor of Philosophy / Computers are ubiquitous; they perform some of the most complex and safety-critical tasks such as controlling aircraft, managing the financial markets, and maintaining sensitive medical records. The undeniable fact is that computers are faulty. They are prone to crash and can behave arbitrarily. Even the most robust computers such as those that are sent to the outer space eventually fail. External phenomenon such as power outages and network disruptions affect their operation. To make computing systems reliable, researchers and practitioners have long focussed on interconnecting many individual computers and programming them to effectively be duplicates of one another. This way when one computer fails in a system, the rest of the computers still ensure that the system as a whole is operational. Duplication requires that multiple computers effectively perform the same task. In order for multiple computers to perform the same task together, they should first agree on the task. More generally, since computing systems perform multiple tasks, they should agree on the sequence of tasks that they will individually perform and follow the agreement. This is what is known as the State Machine Replication technique. State Machine Replication (SMR) is a powerful technique that is applicable to numerous computing applications. Blockchain systems, the technology behind the cryptocurrencies such as Bitcoin and Ethereum, uses the SMR technique. In the context of Blockchain, the added challenge in that some of the computers involved in SMR can be programmed by adversarial parties and could act in a way to jeopardize the integrity of the whole system. For Bitcoin and Ethereum, this could mean embezzlement of hundreds or even millions of dollars worth of cryptocurrencies. Certain SMR systems are capable of tolerating such intrusions and ensure system integrity. Such systems are deemed to be Byzantine tolerant. This dissertation presents designs, techniques, and implementations of Byzantine State Machine Replication systems. The problems addressed in this dissertation are those that plague existing Byzantine SMR systems making them suboptimal for newer applications such as Blockchains. First, when computers that participate in SMR are spread around the world, their performance is dependent on the communication latencies between any two pair of computers. Second, the number of computers required is proportional the number of adversarial computers that need to be tolerated. Consequently, certain SMR systems for Blockchains require hundreds of computers to tolerate heavy adversarial behavior. Many existing SMR technique perform poorly under these scenarios. The techniques presented in this dissertation address various permutations of these challenges. Byzantine Fault Tolerance State Machine Replication
3	Secure Store : A Secure Distributed Storage Service Lakshmanan, Subramanian 12 August 2004 (has links) As computers become pervasive in environments that include the home and community, new applications are emerging that will create and manipulate sensitive and private information. These applications span systems ranging from personal to mobile and hand held devices. They would benefit from a data storage service that protects the integrity and confidentiality of the stored data and is highly available. Such a data repository would have to meet the needs of a variety of applications, handling data with varying security and performance requirements. Providing simultaneously both high levels of security and high levels of performance may not be possible when many nodes in the system are under attack. The agility approach to building secure distributed services advocates the principle that the overhead of providing strong security guarantees should be incurred only by those applications that require such high levels of security and only at times when it is necessary to defend against high threat levels. A storage service that is designed for a variety of applications must follow the principles of agility, offering applications a range of options to choose from for their security and performance requirements. This research presents secure store, a secure and highly available distributed store to meet the performance and security needs of a variety of applications. Secure store is designed to guarantee integrity, confidentiality and availability of stored data even in the face of limited number of compromised servers. Secure store is designed based on the principles of agility. Secure store integrates two well known techniques, namely replication and secret-sharing, and exploits the tradeoffs that exist between security and performance to offer applications a range of options to choose from to suit their needs. This thesis makes several contributions, including (1) illustration of the the principles of agility, (2) a novel gossip-style secure dissemination protocol whose performance is comparable to the best-possible benign-case protocol in the absence of any malicious activity, (3) demonstration of the performance benefits of using weaker consistency models for data access, and (4) a technique called collective endorsement that can be used in other secure distributed applications. Consistency Secure dissemination Byzantine fault tolerance Storage security Distributed storage
4	Tuplebiz : um espaço de tuplas distribuido e com suporte a transações resilientes a falhas bizantinas / Tuplebiz: a distributed tuple space resilient to byzantine faults Souza, Gisele Pinheiro January 2012 (has links) Os modelos de coordenação de comunicação possibilitam a cooperação entre os diversos processos que fazem parte de um sistema distribuído. O modelo de coordenação de espaço de dados compartilhado, o qual é representado pelo espaço de tuplas, permite que a comunicação tenha tanto desacoplamento referencial quanto temporal. Devido essas características, o espaço de tuplas é frequentemente usado em aplicações pervasivas e paralelas. A habilidade de tolerar a falhas é importante para ambos os tipos de aplicações. Para aplicações pervasivas na área médica, uma falha pode custar vidas. Nesse contexto, esse trabalho propõe o Tuplebiz, um espaço de tuplas distribuído que suporta transações em um ambiente sujeito a falhas bizantinas. As falhas bizantinas encapsulam uma variedade de comportamentos faltosos que podem ocorrer no sistema. O Tuplebiz é dividido em partições de dados para facilitar a distribuição entre diferentes servidores. Cada partição garante tolerância a falhas por meio de replicação de máquina de estados. Adicionalmente, o Tuplebiz também provê transações que possuem as propriedades ACID, isto é, as propriedades de atomicidade, consistência, isolamento e durabilidade. O gerente de transações é responsável por garantir o isolamento das transações. Testes de desempenho e injeção de falhas foram realizados. A latência do Tuplebiz sem falhas é aproximadamente 2,8 vezes maior que a latência de um sistema não replicado. Os testes de injeção tiveram como base um framework de testes de injeção de falhas para sistemas tolerantes a falhas bizantinas. Os testes avaliaram os seguintes tipos de falha: mensagens perdidas, atrasos de envio de mensagens, corrupção de mensagens, suspensão do sistema e crash. A latência no caso de falhas foi maior que no caso sem falhas, mas todas as falhas foram suportadas pelo Tuplebiz. Como estudo de caso, é revisada a integração do Tuplebiz com a Guaraná, uma linguagem específica de domínio usada para modelar soluções de integração de sistemas. As tarefas de uma solução de integração na Guaraná são centralizadas atualmente. A proposta de integração prevê a distribuição das tarefas entre diferentes servidores. / The coordination models enable the communication among the process in a distributed system. The shared data model is time and referential decoupled, which is represented by tuple spaces. For this reason, the tuple space is used by parallel and pervasive applications. The fault tolerance is very important for both type of application. For healthcare applications, the fault can cost a life. In this context, this work introduces the Tuplebiz, a distributed tuple space that supports transactions in environment where byzantine faults can occur. Byzantine faults include many types of system faults. The Tuplebiz is spitted in partitions. The main idea behind it is to distribute the tuple space among servers. Each partition guarantees the fault tolerance by using state machine replication. Furthermore, Tuplebiz has transaction support, which follows the ACID properties (atomicity, consistency, isolation, durability). The transaction manager is responsible for maintaining the isolation. Performance and fault injection tests were made in order to evaluate the Tuplebiz. The Tuplebiz latency is approximately 2.8 times bigger than the one for a non replicated system. The injection tests were based on an injection fault framework for byzantine faults. The tests applied were: lost message, delay message, corrupted message, system suspension and crash. The latency was worst on those cases; however the Tuplebiz was able to deal with all of them. Also, a case is presented. This case shows the integration between Tuplebiz and Guaraná, which is a domain specific language, used for designing Enterprise Application Integration applications. The solution integration tasks are centralized nowadays. The integration approach aims to distribute the tasks among servers. Sistemas operacionais Tolerancia : Falhas Tuple space Byzantine fault Transaction
5	Tuplebiz : um espaço de tuplas distribuido e com suporte a transações resilientes a falhas bizantinas / Tuplebiz: a distributed tuple space resilient to byzantine faults Souza, Gisele Pinheiro January 2012 (has links) Os modelos de coordenação de comunicação possibilitam a cooperação entre os diversos processos que fazem parte de um sistema distribuído. O modelo de coordenação de espaço de dados compartilhado, o qual é representado pelo espaço de tuplas, permite que a comunicação tenha tanto desacoplamento referencial quanto temporal. Devido essas características, o espaço de tuplas é frequentemente usado em aplicações pervasivas e paralelas. A habilidade de tolerar a falhas é importante para ambos os tipos de aplicações. Para aplicações pervasivas na área médica, uma falha pode custar vidas. Nesse contexto, esse trabalho propõe o Tuplebiz, um espaço de tuplas distribuído que suporta transações em um ambiente sujeito a falhas bizantinas. As falhas bizantinas encapsulam uma variedade de comportamentos faltosos que podem ocorrer no sistema. O Tuplebiz é dividido em partições de dados para facilitar a distribuição entre diferentes servidores. Cada partição garante tolerância a falhas por meio de replicação de máquina de estados. Adicionalmente, o Tuplebiz também provê transações que possuem as propriedades ACID, isto é, as propriedades de atomicidade, consistência, isolamento e durabilidade. O gerente de transações é responsável por garantir o isolamento das transações. Testes de desempenho e injeção de falhas foram realizados. A latência do Tuplebiz sem falhas é aproximadamente 2,8 vezes maior que a latência de um sistema não replicado. Os testes de injeção tiveram como base um framework de testes de injeção de falhas para sistemas tolerantes a falhas bizantinas. Os testes avaliaram os seguintes tipos de falha: mensagens perdidas, atrasos de envio de mensagens, corrupção de mensagens, suspensão do sistema e crash. A latência no caso de falhas foi maior que no caso sem falhas, mas todas as falhas foram suportadas pelo Tuplebiz. Como estudo de caso, é revisada a integração do Tuplebiz com a Guaraná, uma linguagem específica de domínio usada para modelar soluções de integração de sistemas. As tarefas de uma solução de integração na Guaraná são centralizadas atualmente. A proposta de integração prevê a distribuição das tarefas entre diferentes servidores. / The coordination models enable the communication among the process in a distributed system. The shared data model is time and referential decoupled, which is represented by tuple spaces. For this reason, the tuple space is used by parallel and pervasive applications. The fault tolerance is very important for both type of application. For healthcare applications, the fault can cost a life. In this context, this work introduces the Tuplebiz, a distributed tuple space that supports transactions in environment where byzantine faults can occur. Byzantine faults include many types of system faults. The Tuplebiz is spitted in partitions. The main idea behind it is to distribute the tuple space among servers. Each partition guarantees the fault tolerance by using state machine replication. Furthermore, Tuplebiz has transaction support, which follows the ACID properties (atomicity, consistency, isolation, durability). The transaction manager is responsible for maintaining the isolation. Performance and fault injection tests were made in order to evaluate the Tuplebiz. The Tuplebiz latency is approximately 2.8 times bigger than the one for a non replicated system. The injection tests were based on an injection fault framework for byzantine faults. The tests applied were: lost message, delay message, corrupted message, system suspension and crash. The latency was worst on those cases; however the Tuplebiz was able to deal with all of them. Also, a case is presented. This case shows the integration between Tuplebiz and Guaraná, which is a domain specific language, used for designing Enterprise Application Integration applications. The solution integration tasks are centralized nowadays. The integration approach aims to distribute the tasks among servers. Sistemas operacionais Tolerancia : Falhas Tuple space Byzantine fault Transaction
6	Tuplebiz : um espaço de tuplas distribuido e com suporte a transações resilientes a falhas bizantinas / Tuplebiz: a distributed tuple space resilient to byzantine faults Souza, Gisele Pinheiro January 2012 (has links) Os modelos de coordenação de comunicação possibilitam a cooperação entre os diversos processos que fazem parte de um sistema distribuído. O modelo de coordenação de espaço de dados compartilhado, o qual é representado pelo espaço de tuplas, permite que a comunicação tenha tanto desacoplamento referencial quanto temporal. Devido essas características, o espaço de tuplas é frequentemente usado em aplicações pervasivas e paralelas. A habilidade de tolerar a falhas é importante para ambos os tipos de aplicações. Para aplicações pervasivas na área médica, uma falha pode custar vidas. Nesse contexto, esse trabalho propõe o Tuplebiz, um espaço de tuplas distribuído que suporta transações em um ambiente sujeito a falhas bizantinas. As falhas bizantinas encapsulam uma variedade de comportamentos faltosos que podem ocorrer no sistema. O Tuplebiz é dividido em partições de dados para facilitar a distribuição entre diferentes servidores. Cada partição garante tolerância a falhas por meio de replicação de máquina de estados. Adicionalmente, o Tuplebiz também provê transações que possuem as propriedades ACID, isto é, as propriedades de atomicidade, consistência, isolamento e durabilidade. O gerente de transações é responsável por garantir o isolamento das transações. Testes de desempenho e injeção de falhas foram realizados. A latência do Tuplebiz sem falhas é aproximadamente 2,8 vezes maior que a latência de um sistema não replicado. Os testes de injeção tiveram como base um framework de testes de injeção de falhas para sistemas tolerantes a falhas bizantinas. Os testes avaliaram os seguintes tipos de falha: mensagens perdidas, atrasos de envio de mensagens, corrupção de mensagens, suspensão do sistema e crash. A latência no caso de falhas foi maior que no caso sem falhas, mas todas as falhas foram suportadas pelo Tuplebiz. Como estudo de caso, é revisada a integração do Tuplebiz com a Guaraná, uma linguagem específica de domínio usada para modelar soluções de integração de sistemas. As tarefas de uma solução de integração na Guaraná são centralizadas atualmente. A proposta de integração prevê a distribuição das tarefas entre diferentes servidores. / The coordination models enable the communication among the process in a distributed system. The shared data model is time and referential decoupled, which is represented by tuple spaces. For this reason, the tuple space is used by parallel and pervasive applications. The fault tolerance is very important for both type of application. For healthcare applications, the fault can cost a life. In this context, this work introduces the Tuplebiz, a distributed tuple space that supports transactions in environment where byzantine faults can occur. Byzantine faults include many types of system faults. The Tuplebiz is spitted in partitions. The main idea behind it is to distribute the tuple space among servers. Each partition guarantees the fault tolerance by using state machine replication. Furthermore, Tuplebiz has transaction support, which follows the ACID properties (atomicity, consistency, isolation, durability). The transaction manager is responsible for maintaining the isolation. Performance and fault injection tests were made in order to evaluate the Tuplebiz. The Tuplebiz latency is approximately 2.8 times bigger than the one for a non replicated system. The injection tests were based on an injection fault framework for byzantine faults. The tests applied were: lost message, delay message, corrupted message, system suspension and crash. The latency was worst on those cases; however the Tuplebiz was able to deal with all of them. Also, a case is presented. This case shows the integration between Tuplebiz and Guaraná, which is a domain specific language, used for designing Enterprise Application Integration applications. The solution integration tasks are centralized nowadays. The integration approach aims to distribute the tasks among servers. Sistemas operacionais Tolerancia : Falhas Tuple space Byzantine fault Transaction
7	BYZANTINE FAULT TOLERANT COORDINATION FOR WEB SERVICES ATOMIC TRANSACTIONS Zhang, Honglei 20 December 2007 (has links) No description available. replicas BYZANTINE WEB SERVICES BYZANTINE FAULT coordinator 2PC di¿¿¿¿¿¿¿erent
8	Highly available storage with minimal trust Mahajan, Prince 05 July 2012 (has links) Storage services form the core of modern Internet-based services spanning commercial, entertainment, and social-networking sectors. High availability is crucial for these services as even an hour of unavailability can cost them millions of dollars in lost revenue. Unfortunately, it is difficult to build highly available storage services that provide useful correctness properties. Both benign (system crashes, power out- ages etc.) and Byzantine faults (memory or disk corruption, software or configuration errors etc.) plague the availability of these services. Furthermore, the goal of high availability conflicts with our desire to provide good performance and strong correctness guarantees. For example, the Consistency, Availability, and Partition- resilience (CAP) theorem states that a storage service that must be available despite network partitions cannot enforce strong consistency. Similarly, the tradeoff between latency and durability dictates that a low-latency service cannot ensure durability in the presence of data-center wide failures. This dissertation explores the theoretical and practical limits of storage services that can be safe and live despite the presence of benign and Byzantine faults. On the practical front, we use cloud storage as a deployment model to build Depot, a highly available storage service that addresses the above challenges. Depot minimizes the trust clients have to put in the third party storage provider. As a result, Depot clients can continue functioning despite benign or Byzantine faults of the cloud servers. Yet, Depot provides stronger availability, durability, and consistency properties than those provided by many of the existing cloud deployments, without incurring prohibitive performance cost. For example, in contrast to Amazon S3’s eventual consistency, Depot provides a variation of causal consistency on each volume, while tolerating Byzantine faults. On the theoretical front, we explore the consistency-availability tradeoffs. Tradeoffs between consistency and availability have proved useful for designers in deciding how much to strengthen consistency if high availability is desired or how much to compromise availability if strong consistency is essential. We explore the limits of such tradeoffs by attempting to answer the question: What are the semantics that can be implemented without compromising availability? In this work, we investigate this question for both fail-stop and Byzantine failure models. An immediate benefit of answering this question is that we can compare and contrast the consistency provided by Depot with that achievable by an optimal implementation. More crucially, this result complements the CAP theorem. While, the CAP theorem defines a set of properties that cannot be achieved, this work identifies the limits of properties that can be achieved. / text Cloud storage Byzantine fault tolerance Fork-Join-Causal (FJC) consistency Fork consistency
9	UpRight fault tolerance Clement, Allen Grogan 13 November 2012 (has links) Experiences with computer systems indicate an inconvenient truth: computers fail and they fail in interesting ways. Although using redundancy to protect against fail-stop failures is common practice, non-fail-stop computer and network failures occur for a variety of reasons including power outage, disk or memory corruption, NIC malfunction, user error, operating system and application bugs or misconfiguration, and many others. The impact of these failures can be dramatic, ranging from service unavailability to stranding airplane passengers on the runway to companies closing. While high-stakes embedded systems have embraced Byzantine fault tolerant techniques, general purpose computing continues to rely on techniques that are fundamentally crash tolerant. In a general purpose environment, the current best practices response to non-fail-stop failures can charitably be described as pragmatic: identify a root cause and add checksums to prevent that error from happening again in the future. Pragmatic responses have proven effective for patching holes and protecting against faults once they have occurred; unfortunately the initial damage has already been done, and it is difficult to say if the patches made to address previous faults will protect against future failures. We posit that an end-to-end solution based on Byzantine fault tolerant (BFT) state machine replication is an efficient and deployable alternative to current ad hoc approaches favored in general purpose computing. The replicated state machine approach ensures that multiple copies of the same deterministic application execute requests in the same order and provides end-to-end assurance that independent transient failures will not lead to unavailability or incorrect responses. An efficient and effective end-to-end solution covers faults that have already been observed as well as failures that have not yet occurred, and it provides structural confidence that developers won't have to track down yet another failure caused by some unpredicted memory, disk, or network behavior. While the promise of end-to-end failure protection is intriguing, significant technical and practical challenges currently prevent adoption in general purpose computing environments. On the technical side, it is important that end-to-end solutions maintain the performance characteristics of deployed systems: if end-to-end solutions dramatically increase computing requirements, dramatically reduce throughput, or dramatically increase latency during normal operation then end-to-end techniques are a non-starter. On the practical side, it is important that end-to-end approaches be both comprehensible and easy to incorporate: if the cost of end-to-end solutions is rewriting an application or trusting intricate and arcane protocols, then end-to-end solutions will not be adopted. In this thesis we show that BFT state machine replication can and be used in deployed systems. Reaching this goal requires us to address both the technical and practical challenges previously mentioned. We revisiting disparate research results from the last decade and tweak, refine, and revise the core ideas to fit together into a coherent whole. Addressing the practical concerns requires us to simplify the process of incorporating BFT techniques into legacy applications. / text Fault tolerance Dependability Byzantine fault tolerance UpRight fault tolerance Replicated state machine
10	Securing multi-robot systems with inter-robot observations and accusations Wardega, Kacper Tomasz 24 May 2023 (has links) In various industries, such as manufacturing, logistics, agriculture, defense, search and rescue, and transportation, Multi-robot systems (MRSs) are increasingly gaining popularity. These systems involve multiple robots working together towards a shared objective, either autonomously or under human supervision. However, as MRSs operate in uncertain or even adversarial environments, and the sensors and actuators of each robot may be error-prone, they are susceptible to faults and security threats unique to MRSs. Classical techniques from distributed systems cannot detect or mitigate these threats. In this dissertation, novel techniques are proposed to enhance the security and fault-tolerance of MRSs through inter-robot observations and accusations. A fundamental security property is proposed for MRSs, which ensures that forbidden deviations from a desired multi-robot motion plan by the system supervisor are detected. Relying solely on self-reported motion information from the robots for monitoring deviations can leave the system vulnerable to attacks from a single compromised robot. The concept of co-observations is introduced, which are additional data reported to the supervisor to supplement the self-reported motion information. Co-observation-based detection is formalized as a method of identifying deviations from the expected motion plan based on discrepancies in the sequence of co-observations reported. An optimal deviation-detecting motion planning problem is formulated that achieves all the original application objectives while ensuring that all forbidden plan-deviation attacks trigger co-observation-based detection by the supervisor. A secure motion planner based on constraint solving is proposed as a proof-of-concept to implement the deviation-detecting security property. The security and resilience of MRSs against plan deviation attacks are further improved by limiting the information available to attackers. An efficient algorithm is proposed that verifies the inability of an attacker to stealthily perform forbidden plan deviation attacks with a given motion plan and announcement scheme. Such announcement schemes are referred to as horizon-limiting. An optimal horizon-limiting planning problem is formulated that maximizes planning lookahead while maintaining the announcement scheme as horizon-limiting. Co-observations and horizon-limiting announcements are shown to be efficient and scalable in protecting MRSs, including systems with hundreds of robots, as evidenced by a case study in a warehouse setting. Lastly, the Decentralized Blocklist Protocol (DBP), a method for designing Byzantine-resilient decentralized MRSs, is introduced. DBP is based on inter-robot accusations and allows cooperative robots to identify misbehavior through co-observations and share this information through the network. The method is adaptive to the number of faulty robots and is widely applicable to various decentralized MRS applications. It also permits fast information propagation, requires fewer cooperative observers of application-specific variables, and reduces the worst-case connectivity requirement, making it more scalable than existing methods. Empirical results demonstrate the scalability and effectiveness of DBP in cooperative target tracking, time synchronization, and localization case studies with hundreds of robots. The techniques proposed in this dissertation enhance the security and fault-tolerance of MRSs operating in uncertain and adversarial environments, aiding in the development of secure MRSs for emerging applications. Computer engineering Byzantine fault tolerance Multi-agent systems Multi-robot systems

Search results