Global ETD Search

31	AFIDS : arquitetura para injeção de falhas em sistemas distribuídos / AFIDS - architecture for fault injection in distributed systems Sotoma, Irineu January 1997 (has links) Sistemas distribuídos já são de amplo uso atualmente e seu crescimento tende a se acentuar devido a popularização da Internet. Cada vez mais computadores se interligam e trocam informações entre si. Nestes sistemas, requerimentos como confiabilidade, disponibilidade e desempenho são de fundamental importância para a satisfação do usuário. Estes requerimentos podem ser atendidos aproveitando-se da redundância já existente com as maquinas interligadas. Mas para atingir os requisitos de confiabilidade e disponibilidade, protocolos tolerantes a falhas devem ser construídos. Tolerância a falhas visa continuar a fornecer o serviço de algum protocolo, aplicação ou sistema a despeito da ocorrência de falhas durante a sua execução. Tolerância a falhas pode ser implementada por hardware ou por software através de mascaramento ou recuperação de falhas. Recentemente, a injeção de falhas implementada por software tem sido um dos principais métodos utilizados para validar protocolos tolerantes a falhas em sistemas distribuídos, e muitas ferramentas tem sido construídas. Contudo, não ha nenhuma biblioteca de classes orientada a objetos para auxiliar novos pesquisadores na construção da sua própria ferramenta de injeção de falhas. Este trabalho apresenta uma proposta de uma arquitetura orientada a objetos escrita em C++ para sistemas operacionais UNIX usando sockets, de modo a alcançar aquele objetivo. Esta arquitetura é chamada de AFIDS (Architecture for Fault Injection in Distributed Systems). AFIDS pretende fornecer uma estrutura básica que aborda as questões principais no processo de injeção de falhas implementada por software: a) a Geraldo de parâmetros de falhas para o experimento, b) o controle da localização, tipo e tempo da injeção de falhas, c) a coleta de dados do experimento, d) a injeção efetiva da falha e a análise dos dados coletados de modo a obter medidas de dependabilidade sobre o protocolo tolerante a falhas. Ou seja, AFIDS pretende ser um framework para a construed° de ferramentas de injeção de falhas. Segundo [BOO 961: "Através do uso de frameworks maduros, o esforço de desenvolvimento torna-se mais fácil, porque os principais elementos funcionais podem ser reutilizados". AFIDS leva em consideração várias questões de projeto que foram obtidas através da analise de oito ferramentas de injeção de falhas por software para sistemas distribuídos: FIAT [SEG 88], EFA [ECH 92. ECH 94], SFI [ROS 93], DOCTOR [HAN 93], PFI [DAW 94], CSFI [CAR 95a], SockPFI [DAW 95] e ORCHESTRA [DAW 96]. Para auxiliar a construção de AFIDS, é apresentada uma ferramenta de injeção de falhas que utiliza um objeto injetor de falhas por processo do protocolo sob teste. AFIDS e esta ferramenta são implementadas em C++ usando sockets em Linux. Para que AFIDS se tome estável e consistente e necessário que outras ferramentas sejam construídas baseadas nela. Isto é enfatizado porque, segundo [BOO 96]: "Um framework são começa a alcançar maturidade apos a sua aplicação em pelo menos três ou mais aplicações distintas". / Currently, distributed systems are already in wide use. Because of the Internet popularization their growth tend to arise. More and more computers interconnect and share information. In these systems, requirements such as reliability, availability and performance are fundamental in order to satisfy the users. These requirements can be reached taking advantage of the redundancy already associated with the computers interconnected. However, to reach the reliability and availability requirements, fault tolerant protocols must be built. Fault tolerance aims to provide continuous service of some protocol, application or system in despite of fault occurrence during its execution. Fault tolerance can be implemented in hardware or software using fault masking or recovery. Recently, the software-implemented fault injection has been one of the main methods used to validate fault tolerant protocols in distributed systems, and many tools has been built. However, there is no object-oriented class library to aid new researchers on the buildin g of own fault injection tool. This work presents a proposal of an objectoriented architecture written in C++ for UNIX operating systems using sockets, in order to reach that purpose. This architecture is called AFIDS (Architecture for Fault Injection in Distributed Systems). AFIDS intends to provide a basic structure that addresses the main issues of the process of software-implemented fault injection: a) the generation of fault parameters for the experiment, b) the control of the location, type and time of the fault injection, c) the data collection of the experiment, d) the effective injection of the faults, and e) the analysis of collected data in order to obtain dependability measures about the fault tolerant protocol sob test. According to [BOO 96]: "By using mature frameworks, the effort of the development team is made even easier, because now major functional elements can be reused.". AFIDS regards various design issues that were obtained from the analysis of eight tools of software-implemented fault injection for distributed systems: FIAT [SEG 88], EFA [ECH 92, ECH 94], SFI EROS 93], DOCTOR [HAN 93], PFI [DAW 94], CSFI [CAR 95a], SockPFI [DAW 95] e ORCHESTRA [DAW 96]. In order to aid the AFIDS building, a fault injection tool that uses one injector object in each process of protocol under test is shown. AFIDS and this tool are implemented in C++ using sockets on Linux operating system. AFIDS will become stable and consistent after the building of others tools based on it. This is emphasized because, according to [BOO 96]: "A framework does not even begin to reach maturity until it has been applied in at least three or more distinct applications". Confiabilidade : Computadores Tolerancia : Falhas Sistemas distribuidos Injecao : Falhas Orientacao : Objetos Fault injection Distributed systems Fault tolerance Object orientation
32	Reintegração de servidores em sistemas distribuídos / Reintegration of failed server in distributed systems Pasin, Marcia January 1998 (has links) Sistemas distribuídos representam uma plataforma ideal para implementação de sistemas computacionais com alta confiabilidade e disponibilidade devido a redundância fornecida por um grande número de estações interligadas. Falhas de um servidor podem ser contornadas pela reconfiguração do sistema. Entretanto falhas em seqüência que afetem múltiplas estações comprometem não apenas o desempenho do sistema, mas também a continuidade do serviço e sua confiabilidade. Assim, servidores falhos, que tenham sido isolados do sistema, devem ser reintegrados tão logo quanto possível para não comprometer a disponibilidade do sistema computacional. Este trabalho trata da atualização do estado de servidores e da troca de informação que o servidor recuperado realiza para integrar-se aos demais membros do sistema através de um procedimento chamado reintegração do servidor. E assumido um ambiente distribuído que garante alta confiabilidade em aplicações convencionais através da técnica de replicação de arquivos. O servidor a ser reintegrado faz parte de um grupo de replicação e volta a participar ativamente do grupo tão logo seja reintegrado. Para tanto, considera-se a estratégia de replicação por copia primaria e um sistema distribuído experimental, compatível com o NFS, desenvolvido na UFRGS para aplicar a reintegração de servidor. Os métodos de atualização de arquivos para a reintegração do servidor foram implementadas no ambiente UNIX. / Distributed systems are an ideal platform to develop high reliable computer applications due to the redundancy supplied by a great number of interconnected workstations. Failed stations can be masked reconfiguring the system. However, sequential faults, that affect multiple stations, not just decrease the performance of the system, but also affect the continuity of the service and its reliability. Thus, failed stations working as servers, that have been isolated from the system, should be reintegrated as soon as possible to not impair the system availability. This work is exactly about methods to update the state of failed servers. It deals also with the change of information that the recovered server accomplishes to be integrated to the other members of the service group through a process called reintegration of server. It is assumed a distributed environment that guarantees high reliability in conventional applications through replication of files. The server to be reintegrated is part of a replication group and it participates actively of the service group again as soon as it is reintegrated. Our approach is based on a primary copy. The file actualization methods to the reintegration of server were implemented in an UNIX environment. To illustrate our approach we will describe how the integration of repaired server can be made a fault-tolerant system. The experimental distributed system, compatible with NFS, was designed at the UFRGS. Confiabilidade : Computadores Tolerancia : Falhas Reintegracao : Servidores Sistemas operacionais distribuidos Distributed file systems Fault tolerance Reintegration of failed server
33	AFIDS : arquitetura para injeção de falhas em sistemas distribuídos / AFIDS - architecture for fault injection in distributed systems Sotoma, Irineu January 1997 (has links) Sistemas distribuídos já são de amplo uso atualmente e seu crescimento tende a se acentuar devido a popularização da Internet. Cada vez mais computadores se interligam e trocam informações entre si. Nestes sistemas, requerimentos como confiabilidade, disponibilidade e desempenho são de fundamental importância para a satisfação do usuário. Estes requerimentos podem ser atendidos aproveitando-se da redundância já existente com as maquinas interligadas. Mas para atingir os requisitos de confiabilidade e disponibilidade, protocolos tolerantes a falhas devem ser construídos. Tolerância a falhas visa continuar a fornecer o serviço de algum protocolo, aplicação ou sistema a despeito da ocorrência de falhas durante a sua execução. Tolerância a falhas pode ser implementada por hardware ou por software através de mascaramento ou recuperação de falhas. Recentemente, a injeção de falhas implementada por software tem sido um dos principais métodos utilizados para validar protocolos tolerantes a falhas em sistemas distribuídos, e muitas ferramentas tem sido construídas. Contudo, não ha nenhuma biblioteca de classes orientada a objetos para auxiliar novos pesquisadores na construção da sua própria ferramenta de injeção de falhas. Este trabalho apresenta uma proposta de uma arquitetura orientada a objetos escrita em C++ para sistemas operacionais UNIX usando sockets, de modo a alcançar aquele objetivo. Esta arquitetura é chamada de AFIDS (Architecture for Fault Injection in Distributed Systems). AFIDS pretende fornecer uma estrutura básica que aborda as questões principais no processo de injeção de falhas implementada por software: a) a Geraldo de parâmetros de falhas para o experimento, b) o controle da localização, tipo e tempo da injeção de falhas, c) a coleta de dados do experimento, d) a injeção efetiva da falha e a análise dos dados coletados de modo a obter medidas de dependabilidade sobre o protocolo tolerante a falhas. Ou seja, AFIDS pretende ser um framework para a construed° de ferramentas de injeção de falhas. Segundo [BOO 961: "Através do uso de frameworks maduros, o esforço de desenvolvimento torna-se mais fácil, porque os principais elementos funcionais podem ser reutilizados". AFIDS leva em consideração várias questões de projeto que foram obtidas através da analise de oito ferramentas de injeção de falhas por software para sistemas distribuídos: FIAT [SEG 88], EFA [ECH 92. ECH 94], SFI [ROS 93], DOCTOR [HAN 93], PFI [DAW 94], CSFI [CAR 95a], SockPFI [DAW 95] e ORCHESTRA [DAW 96]. Para auxiliar a construção de AFIDS, é apresentada uma ferramenta de injeção de falhas que utiliza um objeto injetor de falhas por processo do protocolo sob teste. AFIDS e esta ferramenta são implementadas em C++ usando sockets em Linux. Para que AFIDS se tome estável e consistente e necessário que outras ferramentas sejam construídas baseadas nela. Isto é enfatizado porque, segundo [BOO 96]: "Um framework são começa a alcançar maturidade apos a sua aplicação em pelo menos três ou mais aplicações distintas". / Currently, distributed systems are already in wide use. Because of the Internet popularization their growth tend to arise. More and more computers interconnect and share information. In these systems, requirements such as reliability, availability and performance are fundamental in order to satisfy the users. These requirements can be reached taking advantage of the redundancy already associated with the computers interconnected. However, to reach the reliability and availability requirements, fault tolerant protocols must be built. Fault tolerance aims to provide continuous service of some protocol, application or system in despite of fault occurrence during its execution. Fault tolerance can be implemented in hardware or software using fault masking or recovery. Recently, the software-implemented fault injection has been one of the main methods used to validate fault tolerant protocols in distributed systems, and many tools has been built. However, there is no object-oriented class library to aid new researchers on the buildin g of own fault injection tool. This work presents a proposal of an objectoriented architecture written in C++ for UNIX operating systems using sockets, in order to reach that purpose. This architecture is called AFIDS (Architecture for Fault Injection in Distributed Systems). AFIDS intends to provide a basic structure that addresses the main issues of the process of software-implemented fault injection: a) the generation of fault parameters for the experiment, b) the control of the location, type and time of the fault injection, c) the data collection of the experiment, d) the effective injection of the faults, and e) the analysis of collected data in order to obtain dependability measures about the fault tolerant protocol sob test. According to [BOO 96]: "By using mature frameworks, the effort of the development team is made even easier, because now major functional elements can be reused.". AFIDS regards various design issues that were obtained from the analysis of eight tools of software-implemented fault injection for distributed systems: FIAT [SEG 88], EFA [ECH 92, ECH 94], SFI EROS 93], DOCTOR [HAN 93], PFI [DAW 94], CSFI [CAR 95a], SockPFI [DAW 95] e ORCHESTRA [DAW 96]. In order to aid the AFIDS building, a fault injection tool that uses one injector object in each process of protocol under test is shown. AFIDS and this tool are implemented in C++ using sockets on Linux operating system. AFIDS will become stable and consistent after the building of others tools based on it. This is emphasized because, according to [BOO 96]: "A framework does not even begin to reach maturity until it has been applied in at least three or more distinct applications". Confiabilidade : Computadores Tolerancia : Falhas Sistemas distribuidos Injecao : Falhas Orientacao : Objetos Fault injection Distributed systems Fault tolerance Object orientation
34	Software tolerante a falhas para aplicações tempo real Denardin, Fernanda Kruel January 1997 (has links) Esta dissertação aborda um ramo da computação que se encontra em crescente desenvolvimento: a computação em tempo real. Os sistemas de computação tempo real surgiram a partir da necessidade de substituição do controle humano, que muitas vezes é falho, em situações complexas ou críticas, onde máxima confiabilidade e disponibilidade são exigidas para garantir a segurança do sistema. A área de aplicação diferencia-se de outras convencionais por possuir diferentes tipos de restrições de tempo e operar em ambientes não-determinísticos. Entretanto, atualmente tais sistemas estão tornando-se grandes, complexos, distribuídos, adaptativos e cada vez mais presentes nas aplicações do dia-a-dia,o que tende a exigir soluções mais simples e generalizadas. Pelo fato de tais sistemas normalmente atuarem sobre aplicações críticas, importante salientar que, em algumas situações, pequenos erros no sistema podem levar a grandes catástrofes. Mesmo atrasos mínimos no tempo de resposta são problemáticos, podendo ocasionar degradações ou ações erradas no mundo físico controlado pelo sistema tempo real. Como nestes casos máxima confiabilidade e disponibilidade são exigidas para garantir a sua segurança, tornou-se importante a construção de sistemas tempo real tolerantes a falhas. Dessa forma, é visivelmente crescente a necessidade de utilização de mecanismos capazes de abordar os requisitos de tempo real e tolerância a falhas de forma integrada durante o desenvolvimento do sistema. Assim, o processo de desenvolvimento de sistemas tempo real confiáveis torna-se mais simples e mais eficiente. A necessidade de maior conhecimento do uso de tolerância a falhas para obter segurança no funcionamento de aplicações tempo real levou ao desenvolvimento deste trabalho, onde buscou-se um caminho de solução para a adequação das técnicas de tolerância a falhas a estas aplicações. Sabe-se que para produzir software confiável e, desta forma de maior qualidade, além do emprego de boas técnicas de engenharia de software, é necessário compreender os principais conceitos e técnicas de tolerância a falhas. Por outro lado, é importante ter-se conhecimento dos mecanismos oferecidos pelas diversas camadas de software de um sistema - protocolo de comunicação, sistema operacional e linguagem de programação - para apoiar estas atividades de tolerância a falhas. Este trabalho busca analisar os mecanismos e técnicas usados na implementação de software tolerante a falhas frente às situações mencionadas, uma vez que nem todas as técnicas conhecidas podem ser indistintamente aplicáveis a estas situações. Os resultados desta análise são organizados na forma de uma taxonomia, visando assim auxiliar projetistas de desenvolvimento de software a tomarem decisões importantes na construção de sistemas tempo real tolerantes a falhas. Os mecanismos são agrupados de acordo com o nível de implementação: sistemas operacionais, linguagens de programação e protocolos de comunicação, destacando suas características e aplicabilidade. Por fim uso da classificação é demonstrado com a análise de três casos-exemplo. / This dissertation is about a, computer science field which is in growing development, that is, real-time computation. Real-time computing systems have emerged from the necessity of substituting. human control which is sometimes failed in complex or critical situations. In these ones maximum availability and reliability are requested in order to guarantee the system dependability. The application area differs from the conventional ones because it has particular time constraints and operates in nondeterministic environments. Nevertheless, nowadays such systems are becoming large, complex, distributed and adaptive but tend to demand simpler and generalized solutions as they are more present in daily applications. Since such systems normally act on critical applications it is important to reinforce, that in some situations, subtle systems errors may generate big catastrophes. Even slight delays in response time are troublesome and they may cause degradation or wrong acts in physical world controlled by real-time systems. In these cases maximum reliability and availability are requested in order to guarantee system dependability. Thereby, the requirement of including mechanisms capable of achieving real-time and fault tolerance in an integrated way during the system design has been increased. Thus, the developing process of reliable real-time systems becomes simpler and more effective. The necessity of improving designers knowledge on using fault tolerance in order to obtain dependability on real-time applications has motivated this study. Our main goal has been to find an adequate way of using fault tolerance techniques to these applications. It is known that the development of reliable software not only requires appropriate software engineering techniques but also demands understanding of main politics and mechanisms used to implement fault tolerance techniques in these situations. Otherwise, it is very important to know the related support that is offered by the different software levels of a system - communication protocol, operating system and programming language. This study has as purpose analyzing the mechanisms and techniques used in implementation of fault-tolerant software applied to the previously mentioned situations. The basic supposition is that not all the known techniques may be applied indistinctly to these situations. The properties of the software are organized according to a taxonomy, where the mechanisms are bracketed in groups according to implementation level: operating systems, programming languages and communication protocols. In this presentation, the characteristics and applicability of the software tools are stood out in order to help developing-software designers to decide what is important to build faulttolerant software. Finally, the use of the classification is demonstrated by analyzing three case-examples. Confiabilidade : Computadores Tolerancia : Falhas Sistemas : Tempo real Sistemas tolerantes : Falhas Real-time Fault tolerance Fault-tolerant software Fault-tolerant realtime systems
35	Software tolerante a falhas para aplicações tempo real Denardin, Fernanda Kruel January 1997 (has links) Esta dissertação aborda um ramo da computação que se encontra em crescente desenvolvimento: a computação em tempo real. Os sistemas de computação tempo real surgiram a partir da necessidade de substituição do controle humano, que muitas vezes é falho, em situações complexas ou críticas, onde máxima confiabilidade e disponibilidade são exigidas para garantir a segurança do sistema. A área de aplicação diferencia-se de outras convencionais por possuir diferentes tipos de restrições de tempo e operar em ambientes não-determinísticos. Entretanto, atualmente tais sistemas estão tornando-se grandes, complexos, distribuídos, adaptativos e cada vez mais presentes nas aplicações do dia-a-dia,o que tende a exigir soluções mais simples e generalizadas. Pelo fato de tais sistemas normalmente atuarem sobre aplicações críticas, importante salientar que, em algumas situações, pequenos erros no sistema podem levar a grandes catástrofes. Mesmo atrasos mínimos no tempo de resposta são problemáticos, podendo ocasionar degradações ou ações erradas no mundo físico controlado pelo sistema tempo real. Como nestes casos máxima confiabilidade e disponibilidade são exigidas para garantir a sua segurança, tornou-se importante a construção de sistemas tempo real tolerantes a falhas. Dessa forma, é visivelmente crescente a necessidade de utilização de mecanismos capazes de abordar os requisitos de tempo real e tolerância a falhas de forma integrada durante o desenvolvimento do sistema. Assim, o processo de desenvolvimento de sistemas tempo real confiáveis torna-se mais simples e mais eficiente. A necessidade de maior conhecimento do uso de tolerância a falhas para obter segurança no funcionamento de aplicações tempo real levou ao desenvolvimento deste trabalho, onde buscou-se um caminho de solução para a adequação das técnicas de tolerância a falhas a estas aplicações. Sabe-se que para produzir software confiável e, desta forma de maior qualidade, além do emprego de boas técnicas de engenharia de software, é necessário compreender os principais conceitos e técnicas de tolerância a falhas. Por outro lado, é importante ter-se conhecimento dos mecanismos oferecidos pelas diversas camadas de software de um sistema - protocolo de comunicação, sistema operacional e linguagem de programação - para apoiar estas atividades de tolerância a falhas. Este trabalho busca analisar os mecanismos e técnicas usados na implementação de software tolerante a falhas frente às situações mencionadas, uma vez que nem todas as técnicas conhecidas podem ser indistintamente aplicáveis a estas situações. Os resultados desta análise são organizados na forma de uma taxonomia, visando assim auxiliar projetistas de desenvolvimento de software a tomarem decisões importantes na construção de sistemas tempo real tolerantes a falhas. Os mecanismos são agrupados de acordo com o nível de implementação: sistemas operacionais, linguagens de programação e protocolos de comunicação, destacando suas características e aplicabilidade. Por fim uso da classificação é demonstrado com a análise de três casos-exemplo. / This dissertation is about a, computer science field which is in growing development, that is, real-time computation. Real-time computing systems have emerged from the necessity of substituting. human control which is sometimes failed in complex or critical situations. In these ones maximum availability and reliability are requested in order to guarantee the system dependability. The application area differs from the conventional ones because it has particular time constraints and operates in nondeterministic environments. Nevertheless, nowadays such systems are becoming large, complex, distributed and adaptive but tend to demand simpler and generalized solutions as they are more present in daily applications. Since such systems normally act on critical applications it is important to reinforce, that in some situations, subtle systems errors may generate big catastrophes. Even slight delays in response time are troublesome and they may cause degradation or wrong acts in physical world controlled by real-time systems. In these cases maximum reliability and availability are requested in order to guarantee system dependability. Thereby, the requirement of including mechanisms capable of achieving real-time and fault tolerance in an integrated way during the system design has been increased. Thus, the developing process of reliable real-time systems becomes simpler and more effective. The necessity of improving designers knowledge on using fault tolerance in order to obtain dependability on real-time applications has motivated this study. Our main goal has been to find an adequate way of using fault tolerance techniques to these applications. It is known that the development of reliable software not only requires appropriate software engineering techniques but also demands understanding of main politics and mechanisms used to implement fault tolerance techniques in these situations. Otherwise, it is very important to know the related support that is offered by the different software levels of a system - communication protocol, operating system and programming language. This study has as purpose analyzing the mechanisms and techniques used in implementation of fault-tolerant software applied to the previously mentioned situations. The basic supposition is that not all the known techniques may be applied indistinctly to these situations. The properties of the software are organized according to a taxonomy, where the mechanisms are bracketed in groups according to implementation level: operating systems, programming languages and communication protocols. In this presentation, the characteristics and applicability of the software tools are stood out in order to help developing-software designers to decide what is important to build faulttolerant software. Finally, the use of the classification is demonstrated by analyzing three case-examples. Confiabilidade : Computadores Tolerancia : Falhas Sistemas : Tempo real Sistemas tolerantes : Falhas Real-time Fault tolerance Fault-tolerant software Fault-tolerant realtime systems
36	Software tolerante a falhas para aplicações tempo real Denardin, Fernanda Kruel January 1997 (has links) Esta dissertação aborda um ramo da computação que se encontra em crescente desenvolvimento: a computação em tempo real. Os sistemas de computação tempo real surgiram a partir da necessidade de substituição do controle humano, que muitas vezes é falho, em situações complexas ou críticas, onde máxima confiabilidade e disponibilidade são exigidas para garantir a segurança do sistema. A área de aplicação diferencia-se de outras convencionais por possuir diferentes tipos de restrições de tempo e operar em ambientes não-determinísticos. Entretanto, atualmente tais sistemas estão tornando-se grandes, complexos, distribuídos, adaptativos e cada vez mais presentes nas aplicações do dia-a-dia,o que tende a exigir soluções mais simples e generalizadas. Pelo fato de tais sistemas normalmente atuarem sobre aplicações críticas, importante salientar que, em algumas situações, pequenos erros no sistema podem levar a grandes catástrofes. Mesmo atrasos mínimos no tempo de resposta são problemáticos, podendo ocasionar degradações ou ações erradas no mundo físico controlado pelo sistema tempo real. Como nestes casos máxima confiabilidade e disponibilidade são exigidas para garantir a sua segurança, tornou-se importante a construção de sistemas tempo real tolerantes a falhas. Dessa forma, é visivelmente crescente a necessidade de utilização de mecanismos capazes de abordar os requisitos de tempo real e tolerância a falhas de forma integrada durante o desenvolvimento do sistema. Assim, o processo de desenvolvimento de sistemas tempo real confiáveis torna-se mais simples e mais eficiente. A necessidade de maior conhecimento do uso de tolerância a falhas para obter segurança no funcionamento de aplicações tempo real levou ao desenvolvimento deste trabalho, onde buscou-se um caminho de solução para a adequação das técnicas de tolerância a falhas a estas aplicações. Sabe-se que para produzir software confiável e, desta forma de maior qualidade, além do emprego de boas técnicas de engenharia de software, é necessário compreender os principais conceitos e técnicas de tolerância a falhas. Por outro lado, é importante ter-se conhecimento dos mecanismos oferecidos pelas diversas camadas de software de um sistema - protocolo de comunicação, sistema operacional e linguagem de programação - para apoiar estas atividades de tolerância a falhas. Este trabalho busca analisar os mecanismos e técnicas usados na implementação de software tolerante a falhas frente às situações mencionadas, uma vez que nem todas as técnicas conhecidas podem ser indistintamente aplicáveis a estas situações. Os resultados desta análise são organizados na forma de uma taxonomia, visando assim auxiliar projetistas de desenvolvimento de software a tomarem decisões importantes na construção de sistemas tempo real tolerantes a falhas. Os mecanismos são agrupados de acordo com o nível de implementação: sistemas operacionais, linguagens de programação e protocolos de comunicação, destacando suas características e aplicabilidade. Por fim uso da classificação é demonstrado com a análise de três casos-exemplo. / This dissertation is about a, computer science field which is in growing development, that is, real-time computation. Real-time computing systems have emerged from the necessity of substituting. human control which is sometimes failed in complex or critical situations. In these ones maximum availability and reliability are requested in order to guarantee the system dependability. The application area differs from the conventional ones because it has particular time constraints and operates in nondeterministic environments. Nevertheless, nowadays such systems are becoming large, complex, distributed and adaptive but tend to demand simpler and generalized solutions as they are more present in daily applications. Since such systems normally act on critical applications it is important to reinforce, that in some situations, subtle systems errors may generate big catastrophes. Even slight delays in response time are troublesome and they may cause degradation or wrong acts in physical world controlled by real-time systems. In these cases maximum reliability and availability are requested in order to guarantee system dependability. Thereby, the requirement of including mechanisms capable of achieving real-time and fault tolerance in an integrated way during the system design has been increased. Thus, the developing process of reliable real-time systems becomes simpler and more effective. The necessity of improving designers knowledge on using fault tolerance in order to obtain dependability on real-time applications has motivated this study. Our main goal has been to find an adequate way of using fault tolerance techniques to these applications. It is known that the development of reliable software not only requires appropriate software engineering techniques but also demands understanding of main politics and mechanisms used to implement fault tolerance techniques in these situations. Otherwise, it is very important to know the related support that is offered by the different software levels of a system - communication protocol, operating system and programming language. This study has as purpose analyzing the mechanisms and techniques used in implementation of fault-tolerant software applied to the previously mentioned situations. The basic supposition is that not all the known techniques may be applied indistinctly to these situations. The properties of the software are organized according to a taxonomy, where the mechanisms are bracketed in groups according to implementation level: operating systems, programming languages and communication protocols. In this presentation, the characteristics and applicability of the software tools are stood out in order to help developing-software designers to decide what is important to build faulttolerant software. Finally, the use of the classification is demonstrated by analyzing three case-examples. Confiabilidade : Computadores Tolerancia : Falhas Sistemas : Tempo real Sistemas tolerantes : Falhas Real-time Fault tolerance Fault-tolerant software Fault-tolerant realtime systems

Page generated in 0.068 seconds