1 |
GemV: A Validated Micro-architecture Vulnerability Estimation Tool. January 2016.
Several decades of transistor technology scaling have brought the threat of soft errors to modern embedded processors. Several techniques have been proposed to protect these systems from soft errors; however, their effectiveness in protecting the computation cannot be ascertained without accurate and quantitative estimation of system reliability. Vulnerability, a metric that defines the probability of system failure (reliability) through analytical models, is the most effective mechanism for our current estimation and early design space exploration needs. Previous vulnerability estimation tools are based around the Sim-Alpha simulator, which has been shown to have several limitations. In this thesis, I present gemV: an accurate and comprehensive vulnerability estimation tool based on gem5, a popular cycle-accurate micro-architectural simulator that can model several different processors close to their real hardware form. GemV can be used for fast and early design space exploration, and also to evaluate the protection afforded by commodity processors. gemV is comprehensive, since it models almost all sequential components of the processor. gemV is accurate because of fine-grain vulnerability tracking, accurate vulnerability modeling of squashed instructions, and accurate vulnerability modeling of shared data structures in gem5. gemV has been thoroughly validated against extensive fault injection experiments and achieves 97% accuracy with 95% confidence. A micro-architect can use gemV to discover micro-architectural variants of a processor that minimize vulnerability for an allowed performance penalty. A software developer can use gemV to explore the performance-vulnerability trade-off by choosing different algorithms and compiler optimizations, while a system designer can use gemV to explore the performance-vulnerability trade-offs of choosing different Instruction Set Architectures (ISAs). / Dissertation/Thesis / Masters Thesis Computer Science 2016
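For intuition, a minimal sketch of the kind of bookkeeping a vulnerability estimator performs (an illustration of mine, not gemV's actual implementation or data): a bit's residency in a structure counts as vulnerable only if the value is consumed again later, so vulnerability can be accumulated from a timestamped read/write trace of each modeled component.

    # Hypothetical sketch: accumulate vulnerable bit-cycles from an access trace.
    # An interval write->read (or read->read) is vulnerable, since a flipped bit
    # would be consumed; an interval ending in an overwrite is not. Names are illustrative.
    def vulnerable_cycles(trace, width_bits):
        """trace: list of (cycle, op) with op in {'write', 'read'}, sorted by cycle."""
        vulnerable = 0
        last_write = None
        for cycle, op in trace:
            if op == 'write':
                last_write = cycle
            elif op == 'read' and last_write is not None:
                vulnerable += (cycle - last_write) * width_bits
                last_write = cycle      # value stays live until it is overwritten
        return vulnerable

    trace = [(0, 'write'), (40, 'read'), (90, 'write'), (200, 'write')]  # toy trace
    total_cycles, width = 200, 32
    avf_like = vulnerable_cycles(trace, width) / (total_cycles * width)  # ~0.2 here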
|
2 |
Clock Jitter in Communication Systems. Martwick, Andrew Wayne. 21 May 2018.
For reliable digital communication between devices, the sources that contribute to data sampling errors must be properly modeled and understood. Clock jitter is one such error source, occurring during data transfer between integrated circuits. Like electrical noise, clock jitter is a noise source in a communication link, but it is a time-domain noise variable that affects many different parts of the sampling process. This dissertation models the effect of clock jitter on sampling in communication systems with the degree of accuracy needed for modern high-speed data communication. The models developed and presented here have been used to develop the clocking specifications and silicon budgets for industry standards such as PCI Express, USB 3.0, GDDR5 memory, and HBM memory interfaces.
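As a first-order illustration of why sampling-clock jitter matters (a standard textbook bound, not the dissertation's full link model), the jitter-limited SNR for a full-scale sine of frequency f sampled with RMS aperture jitter sigma_j is about -20*log10(2*pi*f*sigma_j):

    # First-order, jitter-limited SNR for sampling a full-scale sine wave.
    # This is the classic aperture-jitter bound, used here only for illustration.
    import math

    def jitter_limited_snr_db(signal_freq_hz, rms_jitter_s):
        return -20.0 * math.log10(2.0 * math.pi * signal_freq_hz * rms_jitter_s)

    # Example: a 4 GHz signal sampled with 100 fs RMS jitter.
    print(jitter_limited_snr_db(4e9, 100e-15))   # about 52 dB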
|
3 |
Modeling and Mitigation of Soft Errors in Nanoscale SRAMs. Jahinuzzaman, Shah M. January 2008.
Energetic-particle-induced (alpha particles, cosmic neutrons, etc.) single-event data upsets, or soft errors, have emerged as a key reliability concern for SRAMs in sub-100 nanometre technologies. Low operating voltage, small node capacitance, high packing density, and a lack of error-masking mechanisms are primarily responsible for the soft error susceptibility of SRAMs. In addition, since SRAM occupies the majority of die area in systems-on-chip (SoCs) and microprocessors, leakage reduction techniques such as supply voltage reduction and gated grounding are applied to SRAMs in order to limit overall chip leakage. These leakage reduction techniques exponentially increase the soft error rate in SRAMs. The soft error rate is further accentuated by process variations, which are prominent in scaled-down technologies. In this research, we address these concerns and propose techniques to characterize and mitigate soft errors in nanoscale SRAMs.
We develop a comprehensive analytical model of the critical charge, which is key to assessing the soft error susceptibility of SRAMs. The model is based on the dynamic behaviour of the cell and a simple decoupling technique for the non-linearly coupled storage nodes. It describes the critical charge in terms of NMOS and PMOS transistor parameters, cell supply voltage, and noise current parameters. Consequently, it enables characterizing the spread of critical charge due to process-induced variations in these parameters and due to manufacturing defects, such as resistive contacts or vias. In addition, the model can estimate the improvement in critical charge when MIM capacitors are added to the cell in order to improve soft error robustness. The model is validated by SPICE simulations (90nm CMOS) and by a radiation test. The critical charge calculated by the model is in good agreement with SPICE simulations, with a maximum discrepancy of less than 5%. The soft error rate estimated by the model for low-voltage (sub-0.8 V) operation is within 10% of the soft error rate measured in the radiation test. Therefore, the model can serve as a reliable alternative to time-consuming SPICE simulations for optimizing the critical charge, and hence the soft error rate, at the design stage.
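A rough sketch of how a critical-charge figure is commonly obtained (my own simplified illustration; the thesis derives the critical charge analytically rather than by this brute-force search): the particle strike is modeled as a double-exponential current pulse injected at a storage node, and the deposited charge is swept until the cell flips.

    # Illustrative only: bisection on injected charge with a double-exponential
    # strike pulse I(t) = (Q/(t_a - t_b)) * (exp(-t/t_a) - exp(-t/t_b)).
    # 'cell_flips' stands in for a transient SPICE run on the actual bit-cell netlist.
    import math

    def strike_current(t, q, tau_a=200e-12, tau_b=50e-12):
        return (q / (tau_a - tau_b)) * (math.exp(-t / tau_a) - math.exp(-t / tau_b))

    def cell_flips(q_coulombs, q_crit_assumed=1.5e-15):
        # Placeholder for a circuit simulation; the threshold here is an assumption.
        return q_coulombs >= q_crit_assumed

    def find_qcrit(lo=0.0, hi=10e-15, steps=40):
        for _ in range(steps):
            mid = 0.5 * (lo + hi)
            lo, hi = (lo, mid) if cell_flips(mid) else (mid, hi)
        return hi

    print(find_qcrit())   # converges to the assumed 1.5 fC threshold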
In order to limit the soft error rate further, we propose an area-efficient multiword-based error correction code (MECC) scheme. The MECC scheme combines four 32-bit data words to form a composite 128-bit ECC word and uses an optimized 4-input transmission-gate XOR logic. Thus, MECC significantly reduces the area overhead for check-bit storage and the delay penalty for error correction. In addition, MECC interleaves two composite words in a row to limit cosmic-neutron-induced multi-bit errors. The ground potentials of the composite words are controlled to minimize leakage power without compromising read data stability. However, the use of composite words involves a unique write operation in which one data word is written while the other three data words are read to update the check bits. A power-efficient word-line signaling technique is developed to facilitate this write operation. A 64 kb SRAM macro with MECC is designed and fabricated in a commercial 90nm CMOS technology. Measurement results show that the SRAM consumes 534 μW at 100 MHz with a data latency of 3.3 ns for single-bit error correction. This translates into an 82% per-bit energy saving and an 8x speed improvement over recently reported multiword ECC schemes. An accelerated neutron radiation test carried out at TRIUMF in Vancouver confirms that the proposed MECC scheme can correct up to 85% of soft errors.
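A behavioural sketch of the composite-word idea (with assumptions of mine: a generic SEC Hamming code stands in for the optimized transmission-gate XOR check-bit logic): writing one 32-bit word requires reading the other three so that the check bits of the 128-bit composite can be recomputed.

    # Illustrative model of the MECC write flow, not the thesis's circuit.
    def hamming_check_bits(data: int, k: int = 128) -> int:
        """Check bits of a k-bit data word under a standard SEC Hamming code."""
        r = 0
        while (1 << r) < k + r + 1:
            r += 1                       # r = 8 for k = 128
        code, d, pos = 0, 0, 1
        while d < k:                     # scatter data into non-power-of-two positions
            if pos & (pos - 1):
                if (data >> d) & 1:
                    code |= 1 << (pos - 1)
                d += 1
            pos += 1
        checks = 0
        for p in range(r):               # check bit p covers positions with bit p set
            parity = 0
            for i in range(pos - 1):
                if (i + 1) & (1 << p) and (code >> i) & 1:
                    parity ^= 1
            checks |= parity << p
        return checks

    def mecc_write(words, idx, new_word):
        """Write one 32-bit word: the other three are read to refresh the check bits."""
        words[idx] = new_word
        composite = sum(w << (32 * i) for i, w in enumerate(words))
        return words, hamming_check_bits(composite)

    words = [0x11111111, 0x22222222, 0x33333333, 0x44444444]
    words, check = mecc_write(words, 2, 0xDEADBEEF)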
|
4 |
Real-Time Systems with Radiation-Hardened Processors: A GPU-based Framework to Explore Tradeoffs. Alhowaidi, Mohammad. January 2012.
Radiation-hardened processors are designed to be resilient against soft errors, but such processors are slower than Commercial Off-The-Shelf (COTS) processors as well as significantly costlier. In order to mitigate the high costs, software techniques such as task re-executions must be deployed together with adequately hardened processors to provide reliability. This leads to a huge design space comprising the hardening level of the processors and the number of re-executions of each task in the system. Each configuration in this design space represents a tradeoff between processor load, reliability, and cost. Reliability comes at the price of higher costs, due to higher levels of hardening, and of performance degradation, due to hardening or to re-executions. Thus, the tradeoffs between performance, reliability, and cost must be carefully studied. Pertinent questions that arise in such a design scenario are (i) how many times must a task be re-executed and (ii) what should the hardening level be, such that the system reliability requirement is satisfied? In order to evaluate such tradeoffs efficiently, in this thesis we propose a novel framework that harnesses the computational power of Graphics Processing Units (GPUs). Our framework is based on a system failure probability analysis that connects the probability of failure of tasks to the overall system reliability. Based on the characteristics of this probabilistic analysis as well as real-time deadlines, we derive bounds on the design space to prune infeasible solutions. Finally, we illustrate the benefits of our proposed framework with several experiments.
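The core probabilistic argument can be sketched as follows (a simplified model, not the exact analysis of the thesis): a task that fails a single execution with probability p and is re-executed k times fails overall only when all k+1 attempts fail, so exploring re-execution assignments amounts to checking a product of such terms against a reliability goal and the deadline.

    # Toy design-space sweep: pick per-task re-execution counts so that system
    # reliability meets a goal and total execution time fits the deadline.
    # Failure probabilities and times are made-up illustrative numbers.
    from itertools import product

    tasks = [            # (execution_time_ms, fail_prob_per_execution)
        (10.0, 1e-4),
        (25.0, 5e-5),
    ]
    deadline_ms, reliability_goal = 120.0, 1.0 - 1e-9

    feasible = []
    for reexecs in product(range(4), repeat=len(tasks)):          # 0..3 re-executions
        time = sum(t * (k + 1) for (t, _), k in zip(tasks, reexecs))
        reliability = 1.0
        for (_, p), k in zip(tasks, reexecs):
            reliability *= 1.0 - p ** (k + 1)     # fails only if all attempts fail
        if time <= deadline_ms and reliability >= reliability_goal:
            feasible.append((reexecs, time, reliability))

    print(min(feasible, key=lambda c: c[1]))      # cheapest feasible configuration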
|
5 |
The effects of the compiler optimizations in embedded processors reliability. Lins, Filipe Maciel. January 2017.
The recent advances in embedded processors increase compiler complexity and the use of heterogeneous resources, such as Field Programmable Gate Arrays (FPGAs) and Graphics Processing Units (GPUs), integrated with the processors. Additionally, Commercial off-the-shelf (COTS) devices are increasingly used in safety-critical applications instead of radiation-hardened chips, because COTS parts can be more flexible and inexpensive, have a faster time-to-market, and consume less power. However, even with these advantages, a system that uses COTS devices in safety-critical applications must still guarantee high reliability, because such devices are susceptible to failures. In the case of real-time applications, timing requirements must also be respected. As a case study, this work uses the Zynq, a COTS device classified as an All Programmable System-on-Chip (APSoC) with an embedded ARM Cortex-A9 processor. In this research, the impact of faults affecting the register file on the reliability of embedded processors was investigated. To that end, fault-injection and heavy-ion radiation experiments were performed, and an evaluation was made of how different compiler optimization levels modify the usage and failure probability of the processor register file. Six representative benchmarks were selected, each compiled with three different optimization levels. Exhaustive fault-injection campaigns were performed to measure the register-file Architectural Vulnerability Factor (AVF) of each code and configuration, identifying the registers most likely to generate a Silent Data Corruption (SDC) or a Single Event Functional Interruption (SEFI). The observed reliability variations were also correlated with register-file utilization. Finally, two of the selected benchmarks, each compiled with two different optimization levels, were irradiated with heavy ions. The results show that the best performance, the lowest register-file usage, or the lowest AVF does not always yield the highest Mean Workload Between Failures (MWBF). For example, the Matrix Multiplication (MxM) application achieves its best performance at the highest optimization level, yet in the fault-injection experiments the highest reliability is obtained at the lowest optimization level, which has the lowest AVFs and the lowest register-file usage. The results also show that the impact of the optimizations is strongly related to the executed algorithm and to how the compiler optimizes it.
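For clarity, a minimal sketch of the two metrics used above (my own simplified formulation with invented numbers, not the thesis's measured data): the AVF of the register file can be estimated as the fraction of injected faults that end in an SDC or SEFI, while the MWBF relates the useful data produced by one run to how often runs fail under a given particle flux.

    # Illustrative metric calculations with made-up campaign and beam numbers.
    def avf(n_sdc, n_sefi, n_injections):
        """Fraction of injected register-file faults that caused a failure."""
        return (n_sdc + n_sefi) / n_injections

    def mwbf(data_per_run_bytes, cross_section_cm2, flux_part_cm2_s, exec_time_s):
        """Mean workload between failures: useful data per run over failures per run."""
        failures_per_run = cross_section_cm2 * flux_part_cm2_s * exec_time_s
        return data_per_run_bytes / failures_per_run

    print(avf(n_sdc=180, n_sefi=45, n_injections=10_000))      # 0.0225
    # A faster -O3 binary can reach a higher MWBF even with a larger cross section,
    # because it is exposed to the beam for less time per unit of work.
    print(mwbf(1_048_576, 2.0e-9, 1.0e3, 0.50))                # -O0-like run
    print(mwbf(1_048_576, 3.0e-9, 1.0e3, 0.20))                # -O3-like run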
|
6 |
Convolutional neural network reliability on an APSoC platform: a traffic-sign recognition case study. Lopes, Israel da Costa. January 2017.
Deep learning has numerous applications in computer vision, speech recognition, natural language processing, and other areas of commercial interest. Computer vision, in turn, has many applications in distinct areas, ranging from entertainment to relevant and critical applications. Face recognition and manipulation (Snapchat) and object description in pictures (OneDrive) are examples of entertainment applications, while industrial inspection, medical diagnostics, object recognition in satellite imagery (used in rescue and defense missions), autonomous cars, and Advanced Driver-Assistance Systems (ADAS) are examples of relevant and critical applications. Some of the most important integrated-circuit companies, such as Xilinx, Intel, and Nvidia, are betting on dedicated platforms for accelerating the training and deployment of deep learning and other computer vision algorithms for autonomous cars and ADAS because of their high computational requirements. Thus, implementing a deep learning system that achieves high performance at low area utilization and power consumption is a big challenge. Moreover, electronic equipment for the automotive industry must remain reliable under radiation effects, manufacturing defects, and aging, since a system failure can lead to a car accident. Thus, a Convolutional Neural Network (CNN) VHSIC (Very High Speed Integrated Circuit) Hardware Description Language (VHDL) automatic generator was developed to reduce the design time associated with implementing deep learning algorithms in hardware. As a case study, a CNN was trained with the Convolutional Architecture for Fast Feature Embedding (Caffe) framework to classify 6 traffic-sign classes, achieving an average accuracy of about 89.8% on the German Traffic-Sign Recognition Benchmark (GTSRB) dataset, which contains traffic-sign images in complex scenarios. This CNN was implemented on a Zynq-7000 All Programmable System-on-Chip (APSoC), achieving about 313 Frames Per Second (FPS) on 32x32-normalized images, with the APSoC consuming only 2.057 W, whereas an embedded Graphics Processing Unit (GPU), in its minimum operation mode, consumes 10 W. The reliability of the proposed CNN was investigated by random accumulated fault injection, emulated in the configuration bits of the APSoC's Programmable Logic (PL), achieving 80.5% reliability under Single-Bit Upsets (SBUs) when both critical Silent Data Corruptions (SDCs) and time-outs are considered. Regarding multiple faults, the reliability of the proposed CNN decreases exponentially with the number of accumulated faults. Hence, the reliability of the proposed CNN must be increased by applying hardening techniques during the design flow.
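As a small illustration of the reliability figures quoted above (a sketch of mine; the injection counts and the independence assumption are invented, only the 80.5% figure comes from the abstract): under single upsets, reliability is the fraction of injections causing neither a critical SDC nor a time-out, and if each accumulated fault independently left the design functional with that probability, reliability would fall roughly geometrically with the number of accumulated faults.

    # Hypothetical sketch: per-injection reliability and a crude independence-based
    # extrapolation to accumulated faults. The counts below are invented.
    def reliability_sbu(n_injections, n_critical_sdc, n_timeouts):
        return 1.0 - (n_critical_sdc + n_timeouts) / n_injections

    r1 = reliability_sbu(n_injections=10_000, n_critical_sdc=1_450, n_timeouts=500)
    print(r1)                              # 0.805, matching the figure quoted above

    # If each accumulated configuration-bit fault were independent, reliability with
    # n piled-up faults would decay roughly geometrically (a strong simplification).
    for n in (1, 2, 5, 10):
        print(n, r1 ** n)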
|
7 |
Applying dual core lockstep in embedded processors to mitigate radiation induced soft errors. Oliveira, Ádria Barros de. January 2017.
Embedded processors operating in safety- or mission-critical systems are not allowed to fail; any failure in such applications could lead to unacceptable consequences, such as risk to life or significant damage to property or the environment. Embedded systems operating in aerospace applications are particularly susceptible to radiation-induced soft errors, although radiation effects can also be observed at ground level. Soft errors affect processors by modifying values stored in memory elements, such as registers and data memory. These faults may lead the processor to execute an application incorrectly, generating output errors or causing hangs and crashes. Recent advances in embedded systems concern the integration of hard-core processors and FPGAs. Such devices, called All Programmable Systems-on-Chip (APSoCs), are also susceptible to radiation effects. Aiming to address this fault-tolerance problem, this work presents Dual-Core LockStep (DCLS) as a fault-tolerance technique to mitigate radiation-induced faults affecting processors embedded in APSoCs. Lockstep is a redundancy-based method used to detect and correct soft errors. The proposed DCLS is implemented on a hard-core ARM Cortex-A9 embedded in a Zynq-7000 APSoC. The efficiency of the approach was validated on applications running both bare-metal and on top of the FreeRTOS operating system. Heavy-ion experiments and fault-injection emulation were performed to analyze the system's susceptibility to bit-flips. The results show that the approach is able to decrease the system cross section with a high rate of protection: the DCLS system successfully mitigated up to 78% of the injected faults. Software optimizations were also evaluated to better understand the trade-offs between performance and reliability. By analyzing different software partitions, it was observed that the execution time of an application block must be much longer than the verification time to keep the performance penalty low. The compiler optimization assessment demonstrates that using the O3 level increases the application's vulnerability to soft errors: because O3 uses more registers than the other optimization levels, the system becomes more susceptible to faults. On the other hand, results from the radiation experiments show that the O3 level provides a higher Mean Workload Between Failures (MWBF); as the application runs faster, more data are correctly computed before an error occurs.
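A behavioural sketch of the lockstep flow described above (illustrative pseudocode of mine, not the thesis's implementation): both cores execute the same application block, a verification point compares their checkpointed state, and a mismatch triggers a rollback to the last consistent checkpoint. The partitioning observation is that block execution time should dominate the comparison and checkpoint time.

    # Illustrative dual-core lockstep skeleton (sequentially simulated here).
    # run_block, the core state, and the fault model are all stand-ins.
    import copy, random

    def run_block(state, block_id, faulty=False):
        state['acc'] += block_id                    # stand-in for real computation
        if faulty and random.random() < 0.3:        # injected transient fault
            state['acc'] ^= 0x1
        return state

    def lockstep(n_blocks=10):
        core_a, core_b = {'acc': 0}, {'acc': 0}
        checkpoint = copy.deepcopy(core_a)
        for blk in range(n_blocks):
            core_a = run_block(core_a, blk, faulty=True)
            core_b = run_block(core_b, blk)
            if core_a != core_b:                    # verification point: compare state
                core_a = copy.deepcopy(checkpoint)  # rollback both cores and re-execute
                core_b = copy.deepcopy(checkpoint)
                core_a = run_block(core_a, blk)
                core_b = run_block(core_b, blk)
            checkpoint = copy.deepcopy(core_a)      # commit a new consistent checkpoint
        return core_a['acc']

    print(lockstep())                               # 45 regardless of injected faults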
|
8 |
On Efficiency and Accuracy of Data Flow Tracking Systems. Jee, Kangkook. January 2015.
Data Flow Tracking (DFT) is a technique broadly used in a variety of security applications such as attack detection, privacy leak detection, and policy enforcement. Although effective, DFT inherits the high overhead common to in-line monitors, which hinders its adoption in production systems. Typically, the runtime overhead of DFT systems ranges from 3× to 100× when applied to pure binaries, and from 1.5× to 3× when inserted during compilation. Many performance optimization approaches have been introduced to mitigate this problem by relaxing propagation policies under certain conditions, but these typically introduce inaccurate taint tracking that leads to over-tainting or under-tainting.
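To make the propagation-policy discussion concrete, here is a tiny byte-level shadow-memory sketch (an illustration of mine, not libdft's API): copies propagate taint, arithmetic unions the taint of its sources, and loading a constant clears it; relaxing the union rule is exactly the kind of optimization that risks under-tainting.

    # Minimal byte-granularity taint propagation over a toy register machine.
    shadow = {}                                   # location -> set of taint tags

    def taint(loc, tag):          shadow[loc] = {tag}
    def t(loc):                   return shadow.get(loc, set())

    def mov(dst, src):            shadow[dst] = set(t(src))          # copy: propagate
    def add(dst, src1, src2):     shadow[dst] = t(src1) | t(src2)    # combine: union
    def mov_imm(dst):             shadow[dst] = set()                # constant: clear

    taint('buf[0]', 'network')                    # byte read from an untrusted source
    mov('eax', 'buf[0]')
    add('ebx', 'eax', 'ecx')
    mov_imm('ecx')
    print(t('ebx'))                               # {'network'}: taint flows to ebx
    print(t('ecx'))                               # set(): overwritten by a constant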
Despite acknowledging these performance/accuracy trade-offs, the DFT literature consistently fails to provide insights about their implications. A core reason, we believe, is the lack of established methodologies for understanding accuracy.
In this dissertation, we attempt to address both efficiency and accuracy issues. To this end, we begin with libdft, a DFT framework for COTS binaries running atop commodity OSes, and then introduce two major optimization approaches based on statically and dynamically analyzing program binaries.
The first optimization approach extracts the DFT tracking logic and abstracts it using TFA. We then apply classic compiler optimizations to eliminate redundant tracking logic and to minimize interference with the target program. As a result, this optimization achieves a 2× speed-up over the baseline performance measured for libdft. The second optimization approach decouples the tracking logic from execution so that they run in parallel, leveraging modern multi-core hardware. Applied to libdft, this approach runs up to four times as fast while consuming fewer CPU cycles.
We then present a generic methodology and tool for measuring the accuracy of arbitrary DFT systems in the context of real applications. With TaintMark, a prototype implementation for the Android framework, we discovered that TaintDroid's various performance optimizations lead to serious accuracy issues, and that certain optimizations should be removed to vastly improve accuracy at little performance cost. The TaintMark approach is inspired by black-box differential testing principles for detecting inaccuracies in DFTs, but it also addresses numerous practical challenges that arise when applying those principles to real, complex applications. We introduce the TaintMark methodology by using it to understand taint-tracking accuracy trade-offs in TaintDroid, a well-known DFT system for Android.
While the aforementioned works focus on the efficiency and accuracy issues of DFT systems that dynamically track data flow, we also explore another design choice that statically tracks information flow by analyzing and instrumenting the application source code. We apply this approach to the different problem of integer error detection in order to reduce the number of false alarms.
|
9 |
Timing vulnerability factor analysis in master-slave D flip-flops. Zimpeck, Alexandra Lackmann. January 2016.
Technology scaling has brought undesirable side effects of maintaining the exponential growth rate, raising important issues related to the reliability and robustness of electronic systems. Modern super-pipelined microprocessors typically contain many millions of devices with ever-decreasing node capacitances. This makes circuits more sensitive to environmental variations and increases the probability of soft errors. Soft errors in sequential circuits occur when a single energetic particle deposits enough charge near a sensitive node. Master-slave flip-flops are the sequential elements most widely adopted as registers in pipelines and finite state machines. If a bit-flip happens inside them, they lose the previously stored information and may cause incorrect system operation. To provide reliable systems that can cope with radiation effects, this work analyzes the Timing Vulnerability Factor (TVF) of several master-slave D flip-flop topologies in pipeline stages under different operating conditions. The effective time window during which a bit-flip can still be captured by the next stage is defined as the Window of Vulnerability (WOV). TVF corresponds to the time during which a flip-flop is vulnerable to radiation-induced soft errors, according to the WOV and the clock frequency. In the first step of this work, the dependence of the TVF on fault propagation to the next stage is determined, through combinational logic with different propagation delays and with different nanometer technology models, including high-performance and low-power versions. All these simulations were made under the nominal conditions pre-defined in the technology files. Variability manifests itself as an increase or decrease with respect to the initial specification, where the main problem is the uncertainty about the value stored in sequential elements. Accordingly, the second step of this work evaluates the impact that environmental variability has on the TVF. Some simulations were redone considering supply-voltage and temperature variations in different master-slave D flip-flop topologies and configurations. To achieve better results, the TVF values should be decreased, since this reduces the vulnerability to bit-flips. Longer propagation delays between two sequential elements and higher clock frequencies both help to reduce TVF values. Moreover, all this information can easily be integrated into Electronic Design Automation (EDA) tools to help identify the most vulnerable master-slave flip-flops before mitigating them or replacing them with radiation-hardened ones.
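One simplified reading of the relationship described above (my own formulation; the thesis's exact WOV derivation may differ): if an upset can only be latched downstream when it arrives early enough in the cycle to traverse the downstream combinational logic, then WOV is roughly T_clk - t_comb and TVF is roughly WOV / T_clk = 1 - t_comb * f, which shrinks both with longer propagation delays and with higher clock frequency, matching the trend reported here.

    # Illustrative TVF estimate under the simplifying assumption WOV = T_clk - t_comb.
    def tvf(clock_freq_hz, comb_delay_s):
        t_clk = 1.0 / clock_freq_hz
        wov = max(t_clk - comb_delay_s, 0.0)      # window in which an upset is latched
        return wov / t_clk

    for f_ghz in (0.5, 1.0, 2.0):
        print(f_ghz, "GHz ->", round(tvf(f_ghz * 1e9, 300e-12), 2))
    # 0.5 GHz -> 0.85, 1.0 GHz -> 0.7, 2.0 GHz -> 0.4: higher frequency and longer
    # combinational delay both reduce the estimated TVF.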
|