Global ETD Search

141	Architecture hybride tolérante aux fautes pour l'amélioration de la robustesse des circuits et systèmes intégrés numériques / A Hybrid Fault-Tolerant Architecture for Robustness Improvement of Digital Integrated Circuits and Systems
Tran, Duc Anh 21 December 2012
The evolution of CMOS technology consists in the continuous downscaling of transistor feature sizes, which allows the production of increasingly complex integrated circuits and systems with higher performance, lower power consumption and lower manufacturing cost. However, each new CMOS technology node faces reliability problems due to increasing fault and error densities. Consequently, fault-tolerance techniques, which employ redundant resources to guarantee correct operation of digital circuits and systems despite the presence of faults, have become essential in digital design. This thesis studies a novel hybrid fault-tolerant architecture for improving the robustness of digital circuits and systems. It targets all kinds of errors in the combinational part of logic circuits, i.e. permanent (hard) errors, transient errors (SETs) and timing errors. By combining information redundancy for error detection, time redundancy for transient error correction and hardware redundancy for permanent error correction, the proposed architecture allows significant power savings while occupying a silicon area similar to that of existing solutions. Furthermore, it can also be used in other applications, such as dealing with aging phenomena, tolerating faults in pipeline architectures, and being combined with advanced protection schemes against transient errors in the sequential part of logic circuits (SEUs).
Fault tolerance Digital circuit Combinational logic
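The combination of redundancies described in entry 141 can be illustrated with a small Python sketch. This is only a software analogy under stated assumptions, not the circuit proposed in the thesis: a parity bit stands in for the error-detecting code (information redundancy), a retry on the same unit stands in for time redundancy, and a switch to a spare unit stands in for hardware redundancy. All names (`run_hybrid`, `good_unit`, `stuck_unit`, `TransientUnit`) are hypothetical.

```python
def parity(v):
    """Parity bit of a non-negative integer (1 if the number of set bits is odd)."""
    return bin(v).count("1") % 2


def good_unit(x):
    """Fault-free unit: returns the 8-bit result and its predicted parity bit."""
    y = (x + 1) & 0xFF
    return y, parity(y)


def stuck_unit(x):
    """Hypothetical permanent fault: output bit 0 is always flipped."""
    y, p = good_unit(x)
    return y ^ 1, p


class TransientUnit:
    """Hypothetical transient fault: only the first call is corrupted."""

    def __init__(self):
        self.calls = 0

    def __call__(self, x):
        self.calls += 1
        y, p = good_unit(x)
        return (y ^ 4, p) if self.calls == 1 else (y, p)


def run_hybrid(units, x, max_retries=1):
    """Detect with parity, retry on the same unit, then fall back to the next unit.

    Returns (value, index of the unit that produced it, total attempts).
    """
    attempts = 0
    for index, unit in enumerate(units):
        for _ in range(1 + max_retries):
            attempts += 1
            value, predicted = unit(x)
            if parity(value) == predicted:  # information redundancy: error check
                return value, index, attempts
            # mismatch: retry on the same unit (time redundancy)
        # error persisted: move on to the spare (hardware redundancy)
    raise RuntimeError("error persists on every unit")
```

In this sketch a transient error costs one extra attempt on the same unit, while a permanent error exhausts the retries and is only resolved by the spare.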
142	A framework for maximizing the survivability of network dependent services
Aktop, Baris 03 1900
Approved for public release; distribution is unlimited.
As a consequence of developments in information technology and the Internet, the world is becoming increasingly dependent on distributed systems and network services. Unfortunately, the security of these services has not kept pace with the advances in information technology itself. Security practitioners accept that a system connected to an unbounded network, e.g., the Internet, will be vulnerable to attacks regardless of its security features. However, the emerging discipline of survivability can help ensure that such systems deliver essential services and maintain essential properties, such as integrity, confidentiality and performance, despite the presence of intrusions. Although survivability has been accepted as a means of addressing the security problems of current network services, the studies done on network survivability so far are not mature enough and lack quantifiable metrics. To address this lack of a network survivability measure, a global connectivity metric is developed in this thesis. Additionally, an election protocol based on this metric is designed for the SAAM prototype to enhance the survivability of the SAAM server.
Lieutenant Junior Grade, Turkish Navy
Computer networks Security measures Network survivability fault tolerance SAAM
143	Prévention et détection des interférences inter-aspects : méthode et application à l'aspectisation de la tolérance aux fautes / Aspect-oriented programs testing
Lauret, Jimmy 15 May 2013
Aspect-oriented programming (AOP) separates the different concerns of a software system to improve modularity. AOP offers many benefits, since it allows the functional code to be separated from the non-functional code, thus improving the reuse and configurability of computer systems. Configurability is essential to ensure the resilience of computer systems, since it allows the dependability mechanisms to be modified. However, the aspect-oriented programming paradigm introduces new challenges for testing. In large systems where multiple non-functional concerns coexist, an AOP implementation of these concerns can be problematic. Because they share the same data flow and the same control flow, aspects implementing different concerns can write into variables read by other aspects, or interrupt the control flow common to several aspects, thus preventing the execution of some of them. In this thesis we focus more specifically on interference between aspects implementing fault tolerance mechanisms. This interference is due to a missing declaration of fine-grained precedence between aspects or to an incorrect precedence declaration. To better control the assembly of the various aspects composing a fault tolerance mechanism, we have developed a method combining avoidance of interference with runtime detection of interference at code level. The purpose of avoidance is to prevent the introduction of interference by requiring a precedence declaration between aspects when they are integrated. Detection exposes, during testing, the errors introduced in the precedence declarations. Both facets of our approach rely on an extension of AspectJ called AIRIA. AIRIA's constructs allow instrumentation, and therefore the detection of interference between aspects, together with compilation facilities that implement interference avoidance. Our approach is tool-supported and aims to limit debugging time: the tester can focus directly on the points where an interference occurs. Finally, we illustrate our approach on a case study: a duplex replication protocol. In this context, the protocol is implemented using fine-grained aspects, allowing better configurability of the replication policy. We show that the assembly of these fine-grained aspects gives rise to data-flow and control-flow interference that is detected by our instrumentation approach. We define a set of interfering aspects for this example, and show how our approach allows the detection of interference.
Test Fault tolerance Avoidance Detection Interference Aspect Resilience Composition
144	Conception de réseaux optiques en tenant compte de la tolérance aux fautes d’un ensemble quelconque de liens / Optical network design considering fault tolerance to any set of link failures
Jara, Nicolás 25 July 2018
The rapid increase in bandwidth demand on existing networks has caused growth in the use of technologies based on WDM optical networks. Nevertheless, in this decade researchers have recognized a “Capacity Crunch” in optical networks, i.e. the transmission capacity limit of optical fiber is expected to be reached in the near future. This situation calls for an evolution of current WDM optical network architectures. For example, optical networks are currently operated statically, which uses network resources inefficiently. Dynamic optical networks address this inefficiency, but they have not been deployed because the network cost savings obtained so far are not enough to convince industry. The design of dynamic optical networks decomposes into different tasks, in which engineers must organize the way the main system resources are used. All of these tasks have to guarantee a certain level of quality of service, pre-established in the Service Level Agreement. We therefore propose a new fast and accurate analytical method to evaluate the blocking probability in these systems. This evaluation allows network designers to quickly solve higher-order problems. More specifically, network operators face the challenge of deciding which wavelength is going to be used by each user (known as Wavelength Assignment), the number of wavelengths needed on each network link (known as Wavelength Dimensioning), the set of paths enabling each network user to transmit (known as Routing) and how to deal with link failures while the network is operating (known as Fault Tolerance). This thesis proposes a joint solution to these problems, and it may provide sufficient network cost savings to encourage telecommunications companies to migrate from the current static operation to a dynamic one.
Dynamic optical networks Optical networks Blocking probability Fault tolerance
145	Agentes móveis em grades oportunistas: uma abordagem para tolerância a falhas / Mobile Agents in opportunistic grids: an approach for tolerating failures
Pinheiro, Vinicius Gama 24 April 2009
Opportunistic grids are distributed environments built to leverage the idle computational power of resources geographically spread across different administrative domains. These environments are characterized by high heterogeneity and variation in resource availability. In this context, the mobile agent paradigm arises as a promising alternative to overcome the challenges of building opportunistic grids. These agents can be used to implement mechanisms that allow applications to keep making progress even in the presence of failures. These mechanisms can be used individually or combined in a flexible manner to suit different scenarios of resource availability. In this work, we describe the architecture of the MAG middleware (Mobile Agents for Grid Computing Environment) and what it can do in an opportunistic grid environment. We use this middleware as a foundation for the implementation of a fault tolerance mechanism based on task replication and checkpointing. Finally, we analyze experimental and simulation results.
mobile agents opportunistic computing fault-tolerance opportunistic grids
146	Implementação de mecanismos tolerantes a falhas em uma arquitetura SOA com QoS / Implementation of fault tolerant mechanisms in a SOA architecture with QoS
Oliveira, Edvard Martins de 28 August 2013
This master's thesis aims to evaluate the integration of fault tolerance mechanisms in a Web Services architecture with multiple modules. The architecture used is named WSARCH and was developed for the study of interactions and interoperability between services. WSARCH is an architecture conceived to support tests and experiments involving Web Services concepts. The fault tolerance mechanisms were integrated into the modules of the architecture, tested, compared and evaluated. The performance evaluation showed that the fault tolerance mechanisms introduced were efficient and presented appropriate results. The reputation techniques used in service selection performed satisfactorily and were considered an important advance in the mechanisms of the architecture.
Distributed systems Fault tolerance QoS SOA Web service Web services
147	Resilient system design and efficient link management for the wireless communication of an ocean current turbine test bed
Unknown Date
To ensure that a system is robust and will continue operating even when facing disruptive or traumatic events, we have created a methodology for system architects and designers that can be used to locate risks and hazards in a design and enable the development of more robust and resilient system architectures. It uncovers design vulnerabilities by conducting a complete exploration of a system's component operational state space, observing the system from multi-dimensional perspectives, and it conducts a quantitative design space analysis by means of probabilistic risk assessment using Bayesian Networks. Furthermore, we developed a tool that automates this methodology and demonstrated its use in an assessment of the OCTT PHM communication system architecture.
To boost the robustness of a wireless communication system and efficiently allocate bandwidth, manage throughput, and ensure quality of service on a wireless link, we created a wireless link management architecture that applies sensor fusion to gather and store platform networked sensor metrics, uses time series forecasting to predict the platform position, and manages data transmission for the links (class-based packet scheduling and capacity allocation). To validate our architecture, we developed a link management tool that forecasts the link quality and uses cross-layer scheduling and allocation to modify capacity allocation at the IP layer for various packet flows (HTTP, SSH, RTP) and to prevent congestion and priority inversion.
Wireless sensor networks (WSNs) are vulnerable to a plethora of different fault types and external attacks after their deployment. To maintain trust in these systems and increase WSN reliability in various scenarios, we developed a framework for node fault detection and prediction in WSNs. Individual wireless sensor nodes sense characteristics of an object or environment. After a smart device successfully connects to a WSN's base station, these sensed metrics are gathered from each node in the network, sent to the device and stored on it, in real time. The framework issues alerts identifying nodes that are classified as faulty, and when specific sensors exceed a percentage of a threshold (normal range) it is capable of discerning between faulty sensor hardware and anomalous sensed conditions. Furthermore, we developed two proof-of-concept prototype applications based on this framework.
Includes bibliography.
Dissertation (Ph.D.)--Florida Atlantic University, 2013.
Fault tolerance (Engineering) Reliability (Engineering) Sensor networks -- Security measures Systems engineering
148	Desenvolvimento de um sistema microcomputador tolerante a falha com arquitetura em anel / Development of a fault tolerant microcomputer system with ring architecture
Fischer, Deisy Piedade Munhoz 26 October 1990
A fault-tolerant, triple modular redundant (TMR) microcomputer system is presented. This system is characterized by a ring architecture implemented with three processor modules. The ring structure is a loop-type architecture in which adjacent modules are connected by single communication links. The modules receive data from one or more sources (depending on whether these sources are replicated or not). This information is then processed and made ready for voting. The data is passed between adjacent modules over the connecting links. Fault tolerance is achieved by each of the three processors being able to examine the computational results of its neighbour. Thus, each module receives two versions of each calculation: its own result and the one received from its neighbour. Each module then performs the voting in software, using a strategy of voting on partial data. If no fault has occurred, all three modules are expected to produce the same result. The result of the voting (comparison) is indicated in each module by an error condition flag. In the event of a fault in one of the modules, the fault is recognized by the software voting and an error is reported, but the system continues to operate correctly in spite of the failure of a single module. The triple modular redundant ring system can tolerate a single fault in one of the modules. The calculations are not carried out in a tightly synchronized manner; instead, the processors are loosely synchronized by software. The system was implemented using three Z-80 based microcomputer boards. Each microcomputer board has its own disk-controller board. The system accesses a single video terminal. The software monitor comprises three identical modules, one for each of the three microcomputers. Each software monitor module resides in the local memory of its microcomputer. The application software runs under the CP/M operating system. Programs written for non-redundant versions are executed in a fault-tolerant manner without modification. Our objective was to develop a general-purpose system with high availability.
Fault tolerance Microcomputer Ring architecture
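The software voting described in entry 148 can be sketched in a few lines of Python. This is a minimal illustration under assumptions, not the Z-80 implementation from the thesis: the three module results are passed in as a list, module i is assumed to receive the result of module i-1 over the ring, and the system output is taken as the majority value. The function name `ring_vote` is hypothetical.

```python
from collections import Counter


def ring_vote(results):
    """Compare each module's result with its ring neighbour and vote.

    results: one hashable result per module, in ring order.
    Returns (voted_value, error_flags), where error_flags[i] is True when
    module i disagrees with the result received from module i-1.
    """
    n = len(results)
    # Each module compares its own result with the one received from its neighbour.
    error_flags = [results[i] != results[(i - 1) % n] for i in range(n)]
    # A majority over all copies masks a single faulty module.
    value, count = Counter(results).most_common(1)[0]
    if count <= n // 2:
        raise RuntimeError("no majority: more than one module is faulty")
    return value, error_flags
```

With results `[7, 9, 7]`, the flags are raised in the faulty module and in the module that receives its result, while the voted value remains 7, which matches the single-fault tolerance stated in the abstract.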
149	Improving fault tolerance support in wireless sensor network macroprogramming / Evoluindo o suporte à tolerância a falhas na macroprogramação de redes de sensores sem fio
Nogueira, Guilherme de Maio 01 December 2014
Wireless Sensor Networks (WSNs) are distributed sensing network systems composed of tiny networked devices. These systems are employed to develop applications for sensing and acting on the environment. Each network device, or node, is equipped with sensors and sometimes actuators as well. WSN nodes typically have limited power, processing, and storage capability, and WSNs are also subject to faults, especially when deployed in harsh natural environments such as forests and plantations. Given these limitations, application developers often need to design fault-tolerance mechanisms. Although some fault-tolerance mechanisms are implemented in hardware, most are left to software. Moreover, WSN application development mostly occurs at a low level, close to the operating system, which forces developers to focus away from application logic and dive into the technical background of WSNs, and to implement fault-tolerance mechanisms alongside the application for lack of generic libraries or components for that purpose. High-level programming solutions have been proposed, such as macroprogramming languages and frameworks; however, few deal with fault tolerance. This dissertation aims to incorporate fault-tolerance features into Srijan, an open-source WSN macroprogramming framework based on a mixed declarative-imperative language called Abstract Task Graph (ATaG). We augment the Srijan framework to support code generation for dealing with devices that crash or report meaningless values. We present our implementation of these features, along with an evaluation of the tool, demonstrating that it is possible to provide a macroprogramming framework with appropriate support for developing fault-tolerant WSN applications.
Fault tolerance Macroprogramming Srijan Wireless sensor networks
150	Multicast-Based Interactive-Group Object-Replication For Fault Tolerance
Soria-Rodriguez, Pedro 25 October 1999
Distributed systems are clusters of computers working together on one task. The sharing of information across different architectures, and the timely and efficient use of network resources for communication among computers, are some of the problems involved in the implementation of a distributed system. In the case of a low latency system, the network utilization and the responsiveness of the communication mechanism are even more critical. This thesis introduces a new approach for the distribution of messages to computers in the system, in which the Common Object Request Broker Architecture (CORBA) is used in conjunction with IP multicast to implement a fault-tolerant, low latency distributed system. Fault tolerance is achieved by replication of the current state of the system across several hosts. An update of the current state is initiated by a client application that contacts one of the state object replicas. The new information needs to be distributed to all the members of the distributed system (the object replicas). This state update is accomplished by using a two-phase commit protocol, which is implemented using a binary tree structure along with IP multicast to reduce network utilization, distribute the computation load associated with state propagation, and achieve faster communication among the members of the distributed system. The use of IP multicast enhances the speed of message distribution, while the two-phase commit protocol encapsulates IP multicast to produce a reliable multicast service that is suitable for fault-tolerant, distributed low latency applications. Finally, the binary tree structure is essential for sharing the load of collecting and processing the state commit responses.
multicast CORBA commit protocol fault tolerance Electronic data processing Distributed processing Fault-tolerant computing

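The state update described in entry 150 (a two-phase commit whose responses are collected along a binary tree) can be sketched as a single-process Python simulation. The sketch makes several assumptions that are not in the abstract: there is no CORBA or real IP multicast (a plain loop stands in for the multicast delivery), the replicas are stored heap-style in a list so that node i has children 2i+1 and 2i+2, and the names `Replica`, `collect_votes` and `update_state` are hypothetical.

```python
class Replica:
    """One state object replica taking part in the two-phase commit."""

    def __init__(self, name, healthy=True):
        self.name = name
        self.healthy = healthy
        self.state = None    # last committed state
        self.pending = None  # update staged during phase 1

    def prepare(self, new_state):
        """Phase 1: stage the update and vote yes (True) or no (False)."""
        if not self.healthy:
            return False
        self.pending = new_state
        return True

    def finish(self, commit):
        """Phase 2: apply the staged update on commit, discard it on abort."""
        if commit and self.pending is not None:
            self.state = self.pending
        self.pending = None


def collect_votes(votes, i=0):
    """Combine votes up the binary tree: a subtree says yes only if all its nodes do."""
    if i >= len(votes):
        return True
    return votes[i] and collect_votes(votes, 2 * i + 1) and collect_votes(votes, 2 * i + 2)


def update_state(replicas, new_state):
    """Run one two-phase commit round; returns True if the update was committed."""
    # Stand-in for IP multicast: the same prepare message reaches every replica.
    votes = [r.prepare(new_state) for r in replicas]
    commit = collect_votes(votes)
    # The decision is delivered to every replica in the same way.
    for r in replicas:
        r.finish(commit)
    return commit
```

In a real deployment each inner node would combine only its children's responses before answering its parent, which is how the tree spreads the response-collection load; the recursion above mimics that aggregation order.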