1 |
Fiesta++: a software implemented fault injection tool for transient fault injection
Chaudhari, Ameya Suhas 26 January 2015
Computer systems, even when correctly designed, can suffer from temporary errors due to radiation particles striking the circuit or changes in the operating conditions such as the temperature or the voltage. Such transient errors can cause systems to malfunction or even crash. Fault injection is a technique used for simulating the effect of such errors on the system. Fault injection tools inject errors in either the software running on the processors or in the underlying computer hardware to simulate the effect of a fault and observe the system behavior. These tools can be used to determine the different responses of the system to such errors and estimate the probability of occurrence of errors in the computations performed by the system. They can also be used to test the fault tolerance capabilities of the system under test or any proposed technique for providing fault tolerance in circuits or software. As a part of this thesis, I have developed a software implemented fault injection tool, Fiesta++, for evaluating the fault tolerance and fault response of software applications. Software implemented fault injection tools inject faults into the software state of the application as it runs on a processor. Since such fault injection tools are used to conduct experiments on applications executing natively on a processor, the experiments can be carried out at almost the same speed as the application execution and can be run on the same hardware as used by the software application in the field. Fiesta++ offers two modes of operation: whitebox and blackbox. The whitebox mode assumes that users have some degree of knowledge of the structure of the software under test and allows them to specify fault injection targets in terms of the application variables and fault injection time in terms of code locations and events at run time. It can be used for precise fault injection to get reproducible outcomes from the fault injection experiments. 
The blackbox mode is intended for the case where the user has little or no knowledge of the application code structure. In this mode, Fiesta++ provides the user with a view of the active process memory and an array of associated information that the user can use to inject faults.
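As an illustration of what injecting a transient fault into software state can look like, the sketch below flips a single bit in a named variable. It is not the Fiesta++ implementation or its interface: the function names, the dictionary standing in for application state, and the single-bit-flip fault model are all assumptions made for this example.

```python
import random
import struct


def flip_bit_int(value: int, bit: int, width: int = 32) -> int:
    """Return value with one bit inverted (assumed single-bit transient fault model)."""
    if not 0 <= bit < width:
        raise ValueError("bit index outside the word")
    return (value ^ (1 << bit)) & ((1 << width) - 1)


def flip_bit_float(value: float, bit: int) -> float:
    """Flip one bit of the IEEE 754 double-precision encoding of value."""
    (raw,) = struct.unpack("<Q", struct.pack("<d", value))
    (out,) = struct.unpack("<d", struct.pack("<Q", flip_bit_int(raw, bit, 64)))
    return out


def inject(state: dict, target: str, rng: random.Random) -> int:
    """Whitebox-style injection (hypothetical): corrupt one named integer variable.

    Returns the index of the flipped bit so the experiment can be reproduced.
    """
    bit = rng.randrange(32)
    state[target] = flip_bit_int(state[target], bit)
    return bit
```

Seeding the random generator and recording the returned bit index is one way to make an injection repeatable, in the spirit of the reproducible outcomes the whitebox mode aims for.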
|
2 |
Fault Injection Attacks on RSA and CSIDH
Chiu, TingHung 16 May 2024
Fault injection attacks are a powerful technique that intentionally induces faults during computations to leak secret information. This thesis studies fault injection attack techniques. It first categorizes fault attack methods by fault model and fault analysis and gives examples of fault attacks on symmetric key and public key cryptosystems. It then demonstrates fault injection attacks on RSA-CRT and constant-time CSIDH. A fault attack consists of two main components: fault modeling, which examines methods for injecting faults in a target device, and fault analysis, which analyzes the resulting faulty outputs to deduce secrets in each cryptosystem. The thesis aims to provide a comprehensive survey of fault attack research, suggest directions for further study on securing real-world cryptosystems against fault injection attacks, test fault injection attacks on RSA-CRT, and demonstrate and evaluate fault injection attacks on constant-time CSIDH. / Master of Science / Fault injection attacks are attacks in which the attacker intentionally induces a fault in a device during operation to obtain or recover secret information. The induced fault disturbs the operation and produces a faulty output, which gives information to the attacker. Many cryptographic algorithms and devices have been shown to be vulnerable to fault injection attacks. Cryptography is essential nowadays, as it is used to secure and protect confidential data. If a cryptosystem is broken, many systems today will be compromised. Thus, this thesis focuses on fault injection attacks on cryptosystems. It introduces the background of fault injection attacks, categorizes them into different types, and provides examples of attacks on cryptosystems. The thesis studies how the attacks work, including how an attack induces the fault in the device and how it analyzes the faulty output obtained.
Specifically, I examine how these attacks affect two commonly used classes of encryption methods: symmetric key cryptography and public key cryptography. Additionally, I implement the fault injection attack on RSA-CRT and Commutative Supersingular Isogeny Diffie-Hellman (CSIDH). This research aims to understand potential attack methods on different cryptosystems, so that mitigation or protection can be explored in the future.
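To make the idea of fault analysis on RSA-CRT concrete, here is a minimal sketch of the classic textbook attack in which one corrupted half of a CRT signature reveals a prime factor of the modulus. The abstract does not say which analysis the thesis uses, so this is an assumption for illustration only. The toy primes, the function names, and the simulated fault (a flipped low bit in the mod-q half) are likewise hypothetical.

```python
from math import gcd


def rsa_crt_sign(m: int, p: int, q: int, d: int, fault_in_q_half: bool = False) -> int:
    """Sign m with RSA-CRT (Garner recombination), optionally corrupting the q half."""
    sp = pow(m, d % (p - 1), p)          # signature modulo p
    sq = pow(m, d % (q - 1), q)          # signature modulo q
    if fault_in_q_half:
        sq ^= 1                          # simulated fault: flip the lowest bit
    h = (pow(q, -1, p) * (sp - sq)) % p  # requires Python 3.8+ for the modular inverse
    return (sq + q * h) % (p * q)


def recover_factor(faulty_sig: int, m: int, e: int, n: int) -> int:
    """The faulty signature is still correct mod p, so sig^e - m shares the factor p with n."""
    return gcd((pow(faulty_sig, e, n) - m) % n, n)


# Toy parameters, far too small for real use.
P, Q, E, D = 61, 53, 17, 2753
N = P * Q
MESSAGE = 65

good = rsa_crt_sign(MESSAGE, P, Q, D)
bad = rsa_crt_sign(MESSAGE, P, Q, D, fault_in_q_half=True)
factor = recover_factor(bad, MESSAGE, E, N)
```

In this sketch the fault is injected in software. In a physical attack the same corrupted output would come from disturbing the device during one of the two half-exponentiations.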
|
3 |
Thwarting Electromagnetic Fault Injection Attack Utilizing Timing Attack Countermeasure
Ghodrati, Marjan 23 January 2018
The role of embedded systems in modern life has grown continuously over the years. Moreover, embedded systems are taking on highly critical functions with stricter security requirements than ever before. Electromagnetic fault injection (EMFI) is an efficient class of physical attacks that can compromise the immunity of secure cryptographic algorithms. Despite successful EMFI attacks, the effects of electromagnetic injection on a processor are not well understood. In particular, there is no solid knowledge of how EMFI affects the circuit and makes it deviate from proper functionality. The effects of EM glitches on the global networks of a chip, such as the power, clock, and reset networks, are also unknown. We believe that a deeper understanding of the EM effect on a chip is needed to properly model EMFI and develop effective countermeasures. In this thesis, we present a bottom-up analysis of EMFI effects on a RISC microprocessor. We study these effects at three levels: the wire level, the chip-network level, and the gate level, considering parameters such as EM-injection location and timing. We conclude that EMFI induces local timing errors, implying that current timing attack detection and prevention techniques can be adapted to overcome EMFI. To further validate our hypothesis, we integrate a configurable timing sensor into our microprocessor to evaluate its effectiveness against EMFI. / Master of Science / In the current technology era, embedded systems play a critical role in everyone's life. They collect very precise and private information about their users, so they can become a target for attackers who want to steal this valuable information. As a result, the security of these devices has become a serious issue.
Electromagnetic fault injection (EMFI) is an efficient class of physical attacks that can inject faults into the state of a processor and make it deviate from its proper functionality. Despite its growing popularity among attackers, the limitations and capabilities of this attack are not well understood. Several detection techniques have been proposed so far, but most of them are either very expensive to implement or not very effective. We believe that a deeper understanding of the EM effect on a chip is needed to properly model EMFI and develop effective countermeasures. In this research work, we perform a bottom-up analysis of EM fault injection on a RISC microprocessor, study it comprehensively at the wire level, chip-network level, and gate level, and finally propose a solution for it.
|
4 |
Radiation Hardened System Design with Mitigation and Detection in FPGA
Sandberg, Hampus January 2016
FPGAs are attractive devices as they enable the designer to make changes to the system during its lifetime. This is important in the early stages of development, when all the details of the final system might not yet be known. In a research environment such as CERN, many FPGAs are used for this very reason and also because they enable high-speed communication and processing. The biggest problem at CERN is that the systems might have to operate in a radioactive environment, which is very harsh on electronics. ASICs can be designed to withstand high levels of radiation and are used in many places, but they are expensive in terms of cost and time and are not very flexible. There is therefore a need to understand whether FPGAs can be used in these places and what needs to be done to make that possible. Mitigation techniques can be used to prevent a fault caused by radiation from disrupting the system. How this can be done, and the importance of understanding the underlying architecture of the FPGA, is discussed in this thesis. A simulation tool for injecting faults into the design is proposed in order to verify that the techniques used work as expected, which might not always be the case. The methods that provided the best protection against faults during simulation were added to a system design implemented on a flash-based FPGA mounted on a board. This board was installed in the CERN Proton Synchrotron for 99 days, during which the system was continuously monitored. During this time 11 faults were detected, and the system was still functional at the end of the test. The results from the simulation and the hardware test show that, with reasonable effort, it is possible to use commercially available FPGAs in a radioactive environment.
|
5 |
Um ambiente para descrição de cenários detalhados de falhas / An environment for detailed fault scenarios description
Munaretti, Ruthiano Simioni January 2010
Using several fault injection tools in the same test experiment provides more support for the results obtained, making the activity more effective and less prone to errors of interpretation. In this respect, faultloads play an important role, since they are the main input supplied to these tools. However, the mechanisms that existing fault injection tools offer for specifying faultloads have low usability and expressiveness. For this reason, this work addresses a methodology in which detailed test scenarios involving fault injection experiments can be specified in a simple, homogeneous, and standardized way. To that end, it proposes an environment for specifying these faultloads, called jFaultload. This environment uses a subset of the Java language to specify faultloads and is also responsible for translating the Java faultload into the faultload format of each fault injector used in a given experiment. As an example and validation of the proposed environment, the tools FIRMAMENT, MENDOSUS, and FAIL/FCI are integrated into it, making the test scenario richly detailed. The service under test, chosen to demonstrate the usability and expressiveness of the proposed solution, was a video streaming session over the RTP protocol, on which a test campaign was carried out with the FIRMAMENT injector. / Use of two or more fault injection tools in a test campaign enriches the scenario obtained from a test execution. Faultloads represent the main input for these tools but their specification mechanisms lack usability and expressiveness.
This thesis presents a full test scenario featuring the use of jFaultload, which uses Java for the specification of faultloads and translates them into the specific formats appropriate to each available fault injector. FIRMAMENT, MENDOSUS and FAIL/FCI, fault injectors for communication systems, were integrated into the environment and complete the test scenario. The service under test used to demonstrate the usability and expressiveness of our solution is a video streaming session using the RTP protocol, on which a test campaign was executed with the FIRMAMENT fault injector.
|
6 |
Analyzing the Impact of Radiation-induced Failures in All Programmable System-on-Chip Devices / Avaliação do impacto de falhas induzidas pela radiação em dispositivos sistemas-em-chip totalmente programáveis
Tambara, Lucas Antunes January 2017
The recent advance of the semiconductor industry has allowed the integration of complex components and system architectures into a single silicon die. Nowadays, state-of-the-art FPGAs include not only the programmable logic fabric but also hard-core parts, such as hard-core general-purpose processors, dedicated processing blocks, interfaces to various peripherals, on-chip bus structures, and analog blocks. These new devices are commonly called All Programmable System-on-Chip (APSoC) devices. One of the major concerns about radiation effects on APSoCs is that radiation-induced errors may have different probability and criticality in their heterogeneous hardware parts at both the device and design levels. For this reason, this work performs a deep investigation of the radiation effects on APSoCs and of the correlation between the sensitivity of hardware and software resources in overall system performance. Several novel static and dynamic experiments were performed on different hardware parts of an APSoC to better understand the trade-offs between reliability and performance of each part separately.
Results show that there is a trade-off between design cross section and performance to be analyzed when developing a system on an APSoC. It is therefore essential to take into account each available design option and all the system parameters involved, such as the execution time and the workload of the system, and not only its cross section. As an example, results show that it is possible to increase the performance of a system up to 5,000 times by changing its architecture with a small impact on cross section (an increase of up to 8 times), significantly increasing the operational reliability of the system. This work also proposes a reliability analysis flow based on fault injection for estimating the reliability trend of hardware-only designs, software-only designs, and hardware and software co-designs. It aims to accelerate the search for the design scheme with the best trade-off between performance and reliability among the possible ones. The methodology takes into account four groups of parameters: area resources and performance; the number of output errors and critical bits; radiation measurements, such as static and dynamic cross sections; and Mean Workload Between Failures. The obtained results show that the proposed flow is a suitable method for estimating the reliability trend of system designs on APSoCs before radiation experiments.
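The trade-off between cross section and performance can be illustrated with a small calculation. The sketch below assumes a commonly used definition of Mean Workload Between Failures, namely the workload completed per execution divided by the expected number of errors per execution (cross section times particle flux times execution time). The abstract does not state the formula, so treat it as an assumption. The flux, cross section, and timing values are made-up placeholders chosen only to mirror the 5,000 times speedup and 8 times cross-section increase quoted above.

```python
def mean_workload_between_failures(
    workload: float,
    cross_section_cm2: float,
    flux_per_cm2_s: float,
    exec_time_s: float,
) -> float:
    """Workload completed per failure, under the assumed definition w / (sigma * flux * t)."""
    expected_errors_per_run = cross_section_cm2 * flux_per_cm2_s * exec_time_s
    return workload / expected_errors_per_run


FLUX = 1.0e5  # particles / (cm^2 * s), placeholder value

# Baseline design: slow, but with a small cross section (placeholder numbers).
baseline = mean_workload_between_failures(1.0, 1.0e-9, FLUX, 5000.0)

# Accelerated design: 5,000x faster, with an 8x larger cross section.
accelerated = mean_workload_between_failures(1.0, 8.0e-9, FLUX, 1.0)

# Ratio of useful work between failures: 5000 / 8 = 625 under this definition.
gain = accelerated / baseline
```

Under this assumed definition the flux cancels out of the ratio, so the gain depends only on the speedup and on the growth in cross section. This is why a design with a larger cross section can still be the more reliable choice operationally.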
|
7 |
Fault Detection in Autonomous Robots
Christensen, Anders L 27 June 2008
In this dissertation, we study two new approaches to fault detection for autonomous robots. The first approach involves the synthesis of software components that give a robot the capacity to detect faults which occur in itself. Our hypothesis is that hardware faults change the flow of sensory data and the actions performed by the control program. By detecting these changes, the presence of faults can be inferred. In order to test our hypothesis, we collect data in three different tasks performed by real robots. During a number of training runs, we record sensory data from the robots both while they are operating normally and after a fault has been injected. We use back-propagation neural networks to synthesize fault detection components based on the data collected in the training runs. We evaluate the performance of the trained fault detectors in terms of the number of false positives and the time it takes to detect a fault.
The results show that good fault detectors can be obtained. We extend the set of possible faults and go on to show that a single fault detector can be trained to detect several faults in both a robot's sensors and actuators. We show that fault detectors can be synthesized that are robust to variations in the task. Finally, we show how a fault detector can be trained to allow one robot to detect faults that occur in another robot.
The second approach involves the use of firefly-inspired synchronization to allow the presence of faulty robots to be determined by other non-faulty robots in a swarm robotic system. We take inspiration from the synchronized flashing behavior observed in some species of fireflies. Each robot flashes by lighting up its on-board red LEDs and neighboring robots are driven to flash in synchrony. The robots always interpret the absence of flashing by a particular robot as an indication that the robot has a fault. A faulty robot can stop flashing periodically for one of two reasons. The fault itself can render the robot unable to flash periodically.
Alternatively, the faulty robot might be able to detect the fault itself using endogenous fault detection and decide to stop flashing.
Thus, catastrophic faults in a robot can be directly detected by its peers, while the presence of less serious faults can be detected by the faulty robot itself, and actively communicated to neighboring robots. We explore the performance of the proposed algorithm both on a real world swarm robotic system and in simulation. We show that failed robots are detected correctly and in a timely manner, and we show that a system composed of robots with simulated self-repair capabilities can survive relatively high failure rates.
We conclude that i) fault injection and learning can give robots the capacity to detect faults that occur in themselves, and that ii) firefly-inspired synchronization can enable robots in a swarm robotic system to detect and communicate faults.
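The rule that a robot which stops flashing is treated as faulty can be sketched as a simple timeout check. This is only the detection half of the idea: the firefly-inspired synchronization itself is not modelled here. The class name, the timeout parameter, and the use of explicit timestamps are assumptions for illustration and are not taken from the dissertation.

```python
class FlashMonitor:
    """Tracks when each neighboring robot last flashed (hypothetical helper)."""

    def __init__(self, timeout: float) -> None:
        self.timeout = timeout               # longest tolerated silence, in seconds
        self.last_seen: dict[str, float] = {}

    def observe_flash(self, robot_id: str, now: float) -> None:
        """Record that robot_id was seen flashing at time now."""
        self.last_seen[robot_id] = now

    def suspected_faulty(self, now: float) -> list[str]:
        """Return the robots whose silence has exceeded the timeout."""
        return sorted(
            robot for robot, seen in self.last_seen.items()
            if now - seen > self.timeout
        )


monitor = FlashMonitor(timeout=3.0)
for t in range(6):
    monitor.observe_flash("robot_a", float(t))      # keeps flashing every second
    if t < 2:
        monitor.observe_flash("robot_b", float(t))  # stops flashing after t = 1

suspects = monitor.suspected_faulty(now=5.0)
```

The same check covers both cases described above: a robot that cannot flash because of a catastrophic fault, and a robot that detects its own fault and deliberately stops flashing.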
|
8 |
Injeção de falhas de comunicação em ambientes distribuídos
Oliveira, Gustavo Menezes January 2011
Communication Fault Injection in Distributed Environments. The demand for dependability in distributed applications keeps growing. Fault tolerance techniques are therefore important components of the software development process, and they require the reproduction of specific fault scenarios to allow a proper evaluation. In these cases, it falls to the test engineer to integrate experiments on the target application with auxiliary tools that emulate a faithful environment for test execution. However, such tools, called communication fault injectors, are often not available to the community or, at best, offer poor functionality, whether through incompatibility with current systems or through superficial implementation of specific functions (prototypes). Another obstacle to experimental evaluation of distributed applications is support for distributed faults: communication fault injectors are not necessarily able to reproduce the behaviors required to emulate a suitable distributed environment. This work therefore studies and proposes a solution for fault injection in distributed environments, in particular network partitioning, which led to the PIE fault injector. PIE (Partitioning Injection Environment) is a communication fault injector aimed at injecting network partitions. Its distributed architecture allows centralized control of the environment by the test engineer. Thus, a single fault load can be easily replicated to the other nodes of the experimental environment.
Despite adopting an experiment coordinator, each node interprets its fault load and processes it locally during testing, which keeps PIE's intrusiveness low and avoids unexpected behavior in the target application. To assess this proposal, experiments were run with different target applications provided by the JGroups framework, with a set of fault scenarios specific to each application. These experiments demonstrated the feasibility and usefulness of the model and architecture of the PIE fault injector with respect to its functionality, intrusiveness, and correctness of the experimental results.
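A network partition can be emulated by having each node drop messages that cross a partition boundary. The sketch below shows that rule applied locally, as one plausible reading of a node interpreting its own copy of the fault load. It is not PIE's actual design or fault load format: the class name, the list-of-groups representation, and the choice to leave unlisted nodes untouched are assumptions made for this example.

```python
from typing import Iterable


class PartitionFilter:
    """Local message filter built from a partition description (hypothetical format)."""

    def __init__(self, groups: Iterable[Iterable[str]]) -> None:
        # Map every listed node to the index of the partition it belongs to.
        self.group_of: dict[str, int] = {}
        for index, group in enumerate(groups):
            for node in group:
                self.group_of[node] = index

    def delivers(self, src: str, dst: str) -> bool:
        """True if a message from src to dst should pass under the partition."""
        src_group = self.group_of.get(src)
        dst_group = self.group_of.get(dst)
        if src_group is None or dst_group is None:
            return True  # nodes not named in the fault load are left untouched
        return src_group == dst_group

    def filter_messages(self, messages: list[tuple[str, str, str]]) -> list[tuple[str, str, str]]:
        """Keep only the (src, dst, payload) messages that stay inside one partition."""
        return [msg for msg in messages if self.delivers(msg[0], msg[1])]


# The same description can be handed to every node, which then applies it locally.
fault_load = [["n1", "n2"], ["n3", "n4"]]
node_filter = PartitionFilter(fault_load)
traffic = [("n1", "n2", "hello"), ("n1", "n3", "ping"), ("n4", "n3", "ack")]
delivered = node_filter.filter_messages(traffic)
```

Because every node evaluates the same description on its own traffic, no message needs to pass through a central coordinator during the experiment.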
|