• Refine Query
  • Source
  • Publication year
  • to
  • Language
  • 54
  • 3
  • 1
  • 1
  • Tagged with
  • 69
  • 69
  • 32
  • 30
  • 22
  • 21
  • 21
  • 19
  • 18
  • 13
  • 12
  • 12
  • 12
  • 12
  • 12
  • About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
11

Design of Soft Error Robust High Speed 64-bit Logarithmic Adder

Shah, Jaspal Singh January 2008 (has links)
Continuous scaling of the transistor size and reduction of the operating voltage have led to a significant performance improvement of integrated circuits. However, the vulnerability of the scaled circuits to transient data upsets or soft errors, which are caused by alpha particles and cosmic neutrons, has emerged as a major reliability concern. In this thesis, we have investigated the effects of soft errors in combinational circuits and proposed soft error detection techniques for high speed adders. In particular, we have proposed an area-efficient 64-bit soft error robust logarithmic adder (SRA). The adder employs the carry merge Sklansky adder architecture in which carries are generated every 4 bits. Since the particle-induced transient, which is often referred to as a single event transient (SET) typically lasts for 100~200 ps, the adder uses time redundancy by sampling the sum outputs twice. The sampling instances have been set at 110 ps apart. In contrast to the traditional time redundancy, which requires two clock cycles to generate a given output, the SRA generates an output in a single clock cycle. The sampled sum outputs are compared using a 64-bit XOR tree to detect any possible error. An energy efficient 4-input transmission gate based XOR logic is implemented to reduce the delay and the power in this case. The pseudo-static logic (PSL), which has the ability to recover from a particle induced transient, is used in the adder implementation. In comparison with the space redundant approach which requires hardware duplication for error detection, the SRA is 50% more area efficient. The proposed SRA is simulated for different operands with errors inserted at different nodes at the inputs, the carry merge tree, and the sum generation circuit. The simulation vectors are carefully chosen such that the SET is not masked by error masking mechanisms, which are inherently present in combinational circuits. Simulation results show that the proposed SRA is capable of detecting 77% of the errors. The undetected errors primarily result when the SET causes an even number of errors and when errors occur outside the sampling window.
12

Design of Soft Error Robust High Speed 64-bit Logarithmic Adder

Shah, Jaspal Singh January 2008 (has links)
Continuous scaling of the transistor size and reduction of the operating voltage have led to a significant performance improvement of integrated circuits. However, the vulnerability of the scaled circuits to transient data upsets or soft errors, which are caused by alpha particles and cosmic neutrons, has emerged as a major reliability concern. In this thesis, we have investigated the effects of soft errors in combinational circuits and proposed soft error detection techniques for high speed adders. In particular, we have proposed an area-efficient 64-bit soft error robust logarithmic adder (SRA). The adder employs the carry merge Sklansky adder architecture in which carries are generated every 4 bits. Since the particle-induced transient, which is often referred to as a single event transient (SET) typically lasts for 100~200 ps, the adder uses time redundancy by sampling the sum outputs twice. The sampling instances have been set at 110 ps apart. In contrast to the traditional time redundancy, which requires two clock cycles to generate a given output, the SRA generates an output in a single clock cycle. The sampled sum outputs are compared using a 64-bit XOR tree to detect any possible error. An energy efficient 4-input transmission gate based XOR logic is implemented to reduce the delay and the power in this case. The pseudo-static logic (PSL), which has the ability to recover from a particle induced transient, is used in the adder implementation. In comparison with the space redundant approach which requires hardware duplication for error detection, the SRA is 50% more area efficient. The proposed SRA is simulated for different operands with errors inserted at different nodes at the inputs, the carry merge tree, and the sum generation circuit. The simulation vectors are carefully chosen such that the SET is not masked by error masking mechanisms, which are inherently present in combinational circuits. Simulation results show that the proposed SRA is capable of detecting 77% of the errors. The undetected errors primarily result when the SET causes an even number of errors and when errors occur outside the sampling window.
13

Techniques for Enhancing Reliability in VLSI Circuits

Hyman Jr, Ransford Morel 01 January 2011 (has links)
Reliability is an important issue in very large scale integration(VLSI) circuits. In the absence of a focus on reliability in the design process, a circuit's functionality can be compromised. Since chips are fabricated in bulk, if reliability issues are diagnosed during the manufacturing of the design, the faulty chips must be tossed, which reduces product yield and increases cost. Being aware of this situation, chip designers attempt to resolve as many issues dealing with reliability on the front-end of the design phase (architecture or system-level modeling) to minimize the cost of errors in the design which increases as the design phase matures. Chip designers have been known to allocate a large amount of resources to reliability of a chip to maintain confidence in their product as well as to reduce the cost due to errors found in the design. The reliability of a design is often degraded by various causes ranging from soft errors, electro-migration, hot carrier injection, negative bias temperature instability (NBTI), crosstalk, power supply noise and variations in the physical design. Given the continuing scaling down of circuit designs achievable by the advancement in technology, the issues pertaining to reliability have a greater impact within the design. Given this problem along with the demand for high-performance designs, chip designers are faced with objective to design reliable circuits, that are high performance and energy-efficient. This is especially important given the huge growth in mobile battery-operated electronic devices in the market. In prior research, there has been significant contributions to increasing the reliability of VLSI designs, however such techniques are often computationally expensive or power intensive. In this dissertation, we develop a set of new techniques to generate reliable designs by minimizing soft error, peak power and variation effects. Several techniques at the architectural level to detect soft errors with minimal performance overhead, that make use of data, information, temporal and spatial redundancy are proposed. The techniques are designed in such a way that much of their latency overhead can be hidden by the latency of other functional operations. It is shown that the proposed methodologies can be implemented with negligible or minimal performance overhead hidden by critical path operations in the datapath. In designs with large peak power values, high current spikes cause noise within the power supply creating timing issues in the circuit which affect its functionality. A path clustering algorithm is proposed which attempts to normalize the current draw in the circuit over the circuit's clock period by delaying the start times of certain paths. By reducing the number of paths starting at a time instance, we reduce the amount of current drawn from the power supply is reduced. Experimental results indicate a reduction of up to 72\% in peak power values when tested on the ISCAS '85 and OpenCores benchmarks. Variations in VLSI designs come from process, voltage supply, and Temperature (PVT). These variations in the design cause non-ideal behavior at random internal nodes which impacts the timing of the design. A variation aware circuit level design methodology is presented in this dissertation in which the architecture dynamically stretches the clock when the effect of an variation effects are observed within the circuit during computations. While previous research efforts found are directed towards reducing variation effects, this technique offers an alternative approach to adapt dynamically to variation effects. The design technique is shown to increase in timing yield on ITC '99 benchmark circuits by an average of 41\% with negligible area overhead.
14

Efficient modeling of soft error vulnerability in microprocessors

Nair, Arun Arvind 11 July 2012 (has links)
Reliability has emerged as a first class design concern, as a result of an exponential increase in the number of transistors on the chip, and lowering of operating and threshold voltages with each new process generation. Radiation-induced transient faults are a significant source of soft errors in current and future process generations. Techniques to mitigate their effect come at a significant cost of area, power, performance, and design effort. Architectural Vulnerability Factor (AVF) modeling has been proposed to easily estimate the processor's soft error rates, and to enable the designers to make appropriate cost/reliability trade-offs early in the design cycle. Using cycle-accurate microarchitectural or logic gate-level simulations, AVF modeling captures the masking effect of program execution on the visibility of soft errors at the output. AVF modeling is used to identify structures in the processor that have the highest contribution to the overall Soft Error Rate (SER) while running typical workloads, and used to guide the design of SER mitigation mechanisms. The precise mechanisms of interaction between the workload and the microarchitecture that together determine the overall AVF is not well studied in literature, beyond qualitative analyses. Consequently, there is no known methodology for ensuring that the workload suite used for AVF modeling offers sufficient SER coverage. Additionally, owing to the lack of an intuitive model, AVF modeling is reliant on detailed microarchitectural simulations for understanding the impact of scaling processor structures, or design space exploration studies. Microarchitectural simulations are time-consuming, and do not easily provide insight into the mechanisms of interactions between the workload and the microarchitecture to determine AVF, beyond aggregate statistics. These aforementioned challenges are addressed in this dissertation by developing two methodologies. First, beginning with a systematic analysis of the factors affecting the occupancy of corruptible state in a processor, a methodology is developed that generates a synthetic workload for a given microarchitecture such that the SER is maximized. As it is impossible for every bit in the processor to simultaneously contain corruptible state, the worst-case realizable SER while running a workload is less than the sum of their circuit-level fault rates. The knowledge of the worst-case SER enables efficient design trade-offs by allowing the architect to validate the coverage of the workload suite and select an appropriate design point, and to identify structures that may potentially have high contribution to SER. The methodology induces 1.4X higher SER in the core as compared to the highest SER induced by SPEC CPU2006 and MiBench programs. Second, a first-order analytical model is proposed, which is developed from the first principles of out-of-order superscalar execution that models the AVF induced by a workload in microarchitectural structures, using inexpensive profiling. The central component of this model is a methodology to estimate the occupancy of correct-path state in various structures in the core. Owing to its construction, the model provides fundamental insight into the precise mechanism of interaction between the workload and the microarchitecture to determine AVF. The model is used to cheaply perform sizing studies for structures in the core, design space exploration, and workload characterization for AVF. The model is used to quantitatively explain results that may appear counter-intuitive from aggregate performance metrics. The Mean Absolute Error in determining AVF of a 4-wide out-of-order superscalar processor using model is less than 7% for each structure, and the Normalized Mean Square Error for determining overall SER is 9.0%, as compared to cycle-accurate microarchitectural simulation. / text
15

Reliability-centric probabilistic analysis of VLSI circuits

Rejimon, Thara 01 June 2006 (has links)
Reliability is one of the most serious issues confronted by microelectronics industry as feature sizes scale down from deep submicron to sub-100-nanometer and nanometer regime. Due to processing defects and increased noise effects, it is almost impractical to come up with error-free circuits. As we move beyond 22nm, devices will be operating very close to their thermal limit making the gates error-prone and every gate will have a finite propensity of providing erroneous outputs. Additional factors increasing the erroneous behaviors are low operating voltages and extremely high frequencies. These types of errors are not captured by current defect and fault tolerant mechanisms as they might not be present during the testing and reconfiguration. Hence Reliability-centric CAD analysis tool is becoming more essential not only to combat defect and hard faults but also errors that are transient and probabilistic in nature.In this dissertation, we address three broad categories of errors. First, we focus on random pattern testability of logic circuits with respect to hard or permanent faults. Second, we model the effect of single-event-upset (SEU) at an internal node to primary outputs. We capture the temporal nature of SEUs by adding timing information to our model. Finally, we model the dynamic error in nano-domain computing, where reliable computation has to be achieved with "systemic" unreliable devices, thus making the entire computation process probabilistic rather than deterministic in nature.Our central theoretical scheme relies on Bayesian Belief networks that are compact efficient models representing joint probability distribution in a minimal graphical structure that not only uses conditional independencies to model the underlying probabilistic dependence but also uses them for computational advantage. We used both exact and approximate inference which has let us achieve order of magnitude improvements in both accuracy and speed and have enabled us t o study larger benchmarks than the state-of-the-art. We are also able to study error sensitivities, explore design space, and characterize the input space with respect to errors and finally, evaluate the effect of redundancy schemes.
16

Convolutional neural network reliability on an APSoC platform a traffic-sign recognition case study / Confiabilidade de uma rede neural convolucional em uma plataforma APSoC: um estudo para reconhecimento de placas de trânsito

Lopes, Israel da Costa January 2017 (has links)
O aprendizado profundo tem inúmeras aplicações na visão computacional, reconhecimento de fala, processamento de linguagem natural e outras aplicações de interesse comercial. A visão computacional, por sua vez, possui muitas aplicações em áreas distintas, indo desde o entretenimento à aplicações relevantes e críticas. O reconhecimento e manipulação de faces (Snapchat), e a descrição de objetos em fotos (OneDrive) são exemplos de aplicações no entretenimento. Ao passo que, a inspeção industrial, o diagnóstico médico, o reconhecimento de objetos em imagens capturadas por satélites (usadas em missões de resgate e defesa), os carros autônomos e o Sistema Avançado de Auxílio ao Motorista (SAAM) são exemplos de aplicações relevantes e críticas. Algumas das empresas de circuitos integrados mais importantes do mundo, como Xilinx, Intel e Nvidia estão apostando em plataformas dedicadas para acelerar o treinamento e a implementação de algoritmos de aprendizado profundo e outras alternativas de visão computacional para carros autônomos e SAAM devido às suas altas necessidades computacionais. Assim, implementar sistemas de aprendizado profundo que alcançam alto desempenho com o custo de baixa utilização de área e dissipação de potência é um grande desafio. Além do mais, os circuitos eletrônicos para a indústria automotiva devem ser confiáveis mesmo sob efeitos da radiação, defeitos de fabricação e efeitos do envelhecimento. Assim, um gerador automático de VHSIC (Very High Speed Integrated Circuit) Hardware Description Language (VHDL) para Redes Neurais Convolucionais (RNC) foi desenvolvido para reduzir o tempo associado a implementação de algoritmos de aprendizado profundo em hardware. Como estudo de caso, uma RNC foi treinada pela ferramenta Convolutional Architecture for Fast Feature Embedding (Caffe), de modo a classificar 6 classes de placas de trânsito, alcançando uma precisão de cerca de 89,8% no conjunto de dados German Traffic-Sign Recognition Benchmark (GTSRB), que contém imagens de placas de trânsito em cenários complexos. Essa RNC foi implementada num All-Programmable System-on- Chip (APSoC) Zynq-7000, resultando em 313 Frames Por Segundo (FPS) em imagens normalizadas para 32x32, com o APSoC dissipando uma potência de somente 2.057 W, enquanto uma Graphics Processing Unit (GPU) embarcada, em seu modo de operação mínimo, dissipa 10 W. A confiabilidade da RNC proposta foi investigada por injeções de falhas acumuladas e aleatórias por emulação nos bits de configuração da Lógica Programável (LP) do APSoC, alcançando uma confiabilidade de 80,5% sob Single-Bit-Upset (SBU) onde foram considerados ambos os Dados Corrompidos Silenciosos (DCSs) críticos e os casos em que o sistema não respondeu no tempo esperado (time-outs). Em relação às falhas múltiplas, a confiabilidade da RNC decresce exponencialmente com o número de falhas acumuladas. Em vista disso, a confiabilidade da RNC proposta deve ser aumentada através do uso de técnicas de proteção durante o fluxo de projeto. / Deep learning has a plethora of applications in computer vision, speech recognition, natural language processing and other applications of commercial interest. Computer vision, in turn, has many applications in distinct areas, ranging from entertainment applications to relevant and critical applications. Face recognition and manipulation (Snapchat), and object description in pictures (OneDrive) are examples of entertainment applications. Industrial inspection, medical diagnostics, object recognition in images captured by satellites (used in rescue and defense missions), autonomous cars and Advanced Driver-Assistance System (ADAS) are examples of relevant and critical applications. Some of the most important integrated circuit companies around the world, such as Xilinx, Intel and Nvidia are waging in dedicated platforms for accelerating the training and deployment of deep learning and other computer vision algorithms for autonomous cars and ADAS due to their high computational requirement. Thus, implementing a deep learning system that achieves high performance with low area utilization and power consumption costs is a big challenge. Besides, electronic equipment for automotive industry must be reliable even under radiation effects, manufacturing defects and aging effects, inasmuch as if a system failure occurs, a car accident can happen. Thus, a Convolutional Neural Network (CNN) VHSIC (Very High Speed Integrated Circuit) Hardware Description Language (VHDL) automatic generator was developed to reduce the design time associated to the implementation of deep learning algorithms in hardware. As a case study, a CNN was trained by the Convolutional Architecture for Fast Feature Embedding (Caffe) framework, in order to classify 6 traffic-sign classes, achieving an average accuracy of about 89.8% on the German Traffic-Sign Recognition Benchmark (GTSRB) dataset, which contains trafficsigns images in complex scenarios. This CNN was implemented on a Zynq-7000 All- Programmable System-on-Chip (APSoC), achieving about 313 Frames Per Second (FPS) on 32x32-normalized images, with the APSoC consuming only 2.057W, while an embedded Graphics Processing Unit (GPU), in its minimum operation mode, consumes 10W. The proposed CNN reliability was investigated by random piled-up fault injection by emulation in the Programming Logic (PL) configuration bits of the APSoC, achieving 80.5% of reliability under Single-Bit-Upset (SBU) where both critical Silent Data Corruptions (SDCs) and time-outs were considered. Regarding the multiple faults, the proposed CNN reliability exponentially decreases with the number of piled-up faults. Hence, the proposed CNN reliability must be increased by using hardening techniques during the design flow.
17

Design of a soft-error robust microprocessor / Projeto de um Microprocessador Robusto a Soft Errors

Bastos, Rodrigo Possamai January 2006 (has links)
O avanço das tecnologias de circuitos integrados (CIs) levanta importantes questões relacionadas à confiabilidade e à robustez de sistemas eletrônicos. A diminuição da geometria dos transistores, a redução dos níveis de tensão, as menores capacitâncias e portanto menores correntes e cargas para alimentar os circuitos, além das freqüências de relógio elevadas, têm tornado os CIs mais vulneráveis a falhas, especialmente àquelas causadas por ruído elétrico ou por efeitos induzidos pela radiação. Os efeitos induzidos pela radiação conhecidos como Soft Single Event Effects (Soft SEEs) podem ser classificados em: Single Event Upsets (SEUs) diretos em nós de elementos de armazenagem que resultam em inversões de bits; e pulsos transientes Single Event Transients (SETs) em qualquer nó do circuito. Especialmente SETs em circuitos combinacionais podem se propagar até os elementos de armazenagem e podem ser capturados. Estas errôneas armazenagens podem também serem chamadas de SEUs indiretos. Falhas como SETs e SEUs podem provocar erros em operações funcionais de um CI. Os conhecidos Soft Errors (SEs) são caracterizados por valores armazenados erradamente em elementos de memória durante o uso do CI. SEs podem produzir sérias conseqüências em aplicações de CIs devido à sua natureza não permanente e não recorrente. Por essas razões, mecanismos de proteção para evitar SEs através de técnicas de tolerância a falhas, no mínimo em um nível de abstração do projeto, são atualmente fundamentais para melhorar a confiabilidade de sistemas. Neste trabalho de dissertação, uma versão tolerante a falhas de um microprocessador 8-bits de produção em massa da família M68HC11 foi projetada. A arquitetura é capaz de tolerar SETs e SEUs. Baseado nas técnicas de Redundância Modular Tripla (TMR) e Redundância no Tempo (TR), um esquema de proteção foi projetado e implementado em alto nível no microprocessador alvo usando apenas portas lógicas padrões. O esquema projetado preserva as características da arquitetura padrão de tal forma que a reusabilidade das aplicações do microprocessador é garantida. Um típico fluxo de projeto de circuitos integrados foi desenvolvido através de ferramentas de CAD comerciais. Testes funcionais e injeções de falhas através da simulação de execuções de benchmarks foram realizados como um teste de verificação do projeto. Além disto, detalhes do projeto do circuito integrado tolerante a falhas e resultados em área, performance e potência foram comparados com uma versão não protegida do microprocessador. A área do core aumentou 102,64 % para proteger o circuito alvo contra SETs e SEUs. A performance foi degrada em 12,73 % e o consumo de potência cresceu cerca de 49 % para um conjunto de benchmarks. A área resultante do chip robusto foi aproximadamente 5,707 mm². / The advance of the IC technologies raises important issues related to the reliability and robustness of electronic systems. The transistor scale by shrinking its geometry, the voltage reduction, the lesser capacitances and therefore smaller currents and charges to supply the circuits, besides the higher clock frequencies, have made the IC more vulnerable to faults, especially those faults caused by electrical noise or radiationinduced effects. The radiation-induced effects known as Soft Single Event Effects (Soft SEEs) can be classified into: direct Single Event Upsets (SEUs) at nodes of storage elements that result in bit flips; and Single Event Transient (SET) pulses at any circuit node. Especially SETs on combinational circuits might propagate itself up to the storage elements and might be captured. These erroneous storages can be also called indirect SEUs. Faults like SETs and SEUs can provoke errors in functional operations of an IC. The known Soft Errors (SEs) are characterized by values stored wrongly on memory elements during the use of the IC. They can make serious consequences in IC applications due to their non-permanent and non-recurring nature. By these reasons, protection mechanisms to avoid SEs by using fault-tolerance techniques, at least in one abstraction level of the design, are currently fundamental to improve the reliability of systems. In this dissertation work, a fault-tolerant IC version of a mass-produced 8-bit microprocessor from the M68HC11 family was designed. It is able to tolerate SETs and SEUs. Based on the Triple Modular Redundancy (TMR) and Time Redundancy (TR) fault-tolerance techniques, a protection scheme was designed and implemented at high level in the target microprocessor by using only standard logic gates. The designed scheme preserves the standard-architecture characteristics in such way that the reusability of microprocessor applications is guaranteed. A typical IC design flow was developed by means of commercial CAD tools. Functional testing and fault injection simulations through benchmark executions were performed as a design verification testing. Furthermore, fault-tolerant IC design issues and results in area, performance and power were compared with a non-protected microprocessor version. The core area increased by 102.64 % to protect the target circuit against SETs and SEUs. The performance was degraded in 12.73 % and the power consumption grew around 49 % for a set of benchmarks. The resulting area of the robust chip was approximately 5.707 mm².
18

Designing fault tolerant NoCs to improve reliability on SoCs / Projeto de NoCs tolerantes a falhas para o aumento da confiabilidade em SoCs

Frantz, Arthur Pereira January 2007 (has links)
Com a redução das dimensões dos dispositivos nas tecnologias sub-micrônicas foi possível um grande aumento no número de IP cores integrados em um mesmo chip e consequentemente novas arquiteturas de comunicação são usadas bucando atingir os requisitos de desempenho e potência. As redes intra-chip (Networks-on-Chip) foram propostas como uma plataforma alternativa de comunicação capaz de prover interconexões e comunicação entre os cores de um mesmo chip, tratando questões como desempenho, consumo de energia e reusabilidade para grandes sistemas integrados. Por outro lado, a mesma evolução tecnológica dos processos nanométricos reduziu drasticamente a confiabilidade de circuitos integrados, tornando dispositivos e interconexões mais sensíveis a novos tipos de falhas. Erros podem ser gerados por variações no processo de fabricação ou mesmo pela susceptibilidade do projeto, quando este opera em um ambiente hostil. Na comunicação de NoCs as duas principais fontes de erros são falhas de crosstalk e soft errors. No passado, se assumia que interconexões não poderiam ser afetadas por soft errors, por não possuirem circuitos seqüenciais. Porém, quando NoCs são usadas, buffers e circuitos seqüenciais estão presentes nos roteadores e, consequentemente, podem ocorrer soft errors entre a fonte e o destino da comunicação, provocando erros. Técnicas de tolerância a falhas, que tem sido aplicadas em circuitos em geral, podem ser usadas para proteger roteadores contra bit-flips. Neste cenário, este trabalho inicia com a avaliação dos efeitos de soft errors e falhas de crosstalk em uma arquitetura de NoC, através de simulação de injeção de falhas, analisando detalhadamente o impacto de tais falhas no roteador. Os resultados mostram que os efeitos dessas falhas na comunicação do SoC podem ser desastrosos, levando a perda de pacotes e travamento ou indisponibilidade do sistema. Então é proposta e avaliada a aplicação de um conjunto de técnicas de tolerância a falhas em roteadores, possibilitando diminuir os soft errors e falhas de crosstalk no nível de hardware. Estas técnicas propostas foram baseadas em códigos de correção de erros e redundância de hardware. Resultados experimentais mostram que estas técnicas podem obter zero erros com 50% a menos de overhead de área, quando comparadas com a duplicação simples. Entretanto, algumas dessas técnicas têm um grande consumo de potência, pois toda essas técnicas são baseadas na adição de hardware redundante. Considerando que as técnicas de proteção baseadas em software também impõe um considerável overhead na comunicação devido à retransmissão, é proposto o uso de técnicas mistas de hardware e software, que podem oferecer um nível de proteção satisfatório, baseado na análise do ambiente onde o sistema irá operar (soft error rate), fatores relativos ao projeto e fabricação (variações de atraso em interconexões, pontos susceptíveis a crosstalk), a probabilidade de uma falha gerar um erro em um roteador, a carga de comunicação e os limites de potência e energia suportados. / As the technology scales down into deep sub-micron domain, more IP cores are integrated in the same die and new communication architectures are used to meet performance and power constraints. Networks-on-Chip have been proposed as an alternative communication platform capable of providing interconnections and communication among onchip cores, handling performance, energy consumption and reusability issues for large integrated systems. However, the same advances to nanometric technologies have significantly reduced reliability in mass-produced integrated circuits, increasing the sensitivity of devices and interconnects to new types of failures. Variations at the fabrication process or even the susceptibility of a design under a hostile environment might generate errors. In NoC communications the two major sources of errors are crosstalk faults and soft errors. In the past, it was assumed that connections cannot be affected by soft errors because there was no sequential circuit involved. However, when NoCs are used, buffers and sequential circuits are present in the routers, consequently, soft errors can occur between the communication source and destination provoking errors. Fault tolerant techniques that once have been applied in integrated circuits in general can be used to protect routers against bit-flips. In this scenario, this work starts evaluating the effects of soft errors and crosstalk faults in a NoC architecture by performing fault injection simulations, where it has been accurate analyzed the impact of such faults over the switch service. The results show that the effect of those faults in the SoC communication can be disastrous, leading to loss of packets and system crash or unavailability. Then it proposes and evaluates a set of fault tolerant techniques applied at routers able to mitigate soft errors and crosstalk faults at the hardware level. Such proposed techniques were based on error correcting codes and hardware redundancy. Experimental results show that using the proposed techniques one can obtain zero errors with up to 50% of savings in the area overhead when compared to simple duplication. However some of these techniques are very power consuming because all the tolerance is based on adding redundant hardware. Considering that softwarebased mitigation techniques also impose a considerable communication overhead due to retransmission, we then propose the use of mixed hardware-software techniques, that can develop a suitable protection scheme driven by the analysis of the environment that the system will operate in (soft error rate), the design and fabrication factors (delay variations in interconnects, crosstalk enabling points), the probability of a fault generating an error in the router, the communication load and the allowed power or energy budget.
19

Timing vulnerability factor analysis in master-slave D flip-flops / Análise do fator de vulnerabilidade temporal em flip-flops mestre-escravo do tipo D

Zimpeck, Alexandra Lackmann January 2016 (has links)
O dimensionamento da tecnologia trouxe consequências indesejáveis para manter a taxa de crescimento exponencial e levanta questões importantes relacionadas com a confiabilidade e robustez dos sistemas eletrônicos. Atualmente, microprocessadores modernos de superpipeline normalmente contêm milhões de dispositivos com cargas nos nós cada vez menores. Esse fator faz com que os circuitos sejam mais sensíveis a variabilidade ambiental e aumenta a probabilidade de um erro transiente acontecer. Erros transientes em circuitos sequenciais ocorrem quando uma única partícula energizada deposita carga suficiente perto de uma região sensível. Flip-Flops mestreescravo são os circuitos sequencias mais utilizados em projeto VLSI para armazenamento de dados. Se um bit-flip ocorrer dentro deles, eles perdem a informação prévia armazenada e podem causar um funcionamento incorreto do sistema. A fim de proporcionar sistemas mais confiáveis que possam lidar com os efeitos da radiação, este trabalho analisa o Fator de Vulnerabilidade Temporal (Timing Vulnerability Factor - TVF) em algumas topologias de flip-flops mestre-escravo em estágios de pipeline sob diferentes condições de operação. A janela de tempo efetivo que o bit-flip ainda pode ser capturado pelo próximo estágio é definido com janela de vulnerabilidade (WOV). O TVF corresponde ao tempo que o flip-flop é vulnerável a erros transientes induzidos pela radiação de acordo com a WOV e a frequência de operação. A primeira etapa deste trabalho determina a dependência entre o TVF com a propagação de falhas até o próximo estágio através de uma lógica combinacional com diferentes atrasos de propagação e com diferentes modelos de tecnologia, incluindo também as versões de alto desempenho e baixo consumo. Todas as simulações foram feitas sob as condições normais pré-definidas nos arquivos de tecnologia. Como a variabilidade se manifesta com o aumento ou diminuição das especificações iniciais, onde o principal problema é a incerteza sobre o valor armazenado em circuitos sequenciais, a segunda etapa deste trabalho consiste em avaliar o impacto que os efeitos da variabilidade ambiental causam no TVF. Algumas simulações foram refeitas considerando variações na tensão de alimentação e na temperatura em diferentes topologias e configurações de flip-flops mestre-escravo. Para encontrar os melhores resultados, é necessário tentar diminuir os valores de TVF, pois isso significa que eles serão menos vulneráveis a bit-flips. Atrasos de propagação entre dois circuitos sequenciais e frequências de operação mais altas ajudam a reduzir o TVF. Além disso, estas informações podem ser facilmente integradas em ferramentas de EDA para ajudar a identificar os flip-flops mestre-escravo mais vulneráveis antes de mitigar ou substituí-los por aqueles tolerantes a radiação. / Technology scaling has brought undesirable issues to maintain the exponential growth rate and it raises important topics related to reliability and robustness of electronic systems. Currently, modern super pipelined microprocessors typically contain many millions of devices with ever decreasing load capacitances. This factor makes circuits more sensitive to environmental variations and it is increased the probability to induce a soft error. Soft errors in sequential circuits occur when a single energetic particle deposits enough charge near a sensitive node. Master-slave flip-flops are the most adopted sequential elements to work as registers in pipeline and finite state machines. If a bit-flip happens inside them, they lose the previous stored information and may cause an incorrect system operation. To provide reliable systems that can cope with radiation effects, this work analysis the Timing Vulnerability Factor (TVF) of some master-slave D flip-flops topologies in pipeline stages under different operating conditions. The effective time window, which the bit-flip can still be captured by the next stage, is defined as Window of Vulnerability (WOV). TVF corresponds to the time that a flip-flop is vulnerable to radiation-induced soft errors according to WOV and clock frequency. In the first step of this work, it is determined the dependence between the TVF with the fault propagation to the next stage through a combinational logic with different propagation delays and with different nanometer technological models, including also high performance and low power versions. All these simulations were made under the pre-defined nominal conditions in technology files. The variability manifests with an increase or decreases to initial specification, where the main problem is the uncertainty about the value stored in sequential. In this way, the second step of this work evaluates the impact that environmental variability effect causes in TVF. Some simulations were redone considering supply voltage and temperature variations in different master-slave D flip-flop topologies configurations. To achieve better results, it is necessary to try to decrease the TVF values to reduce the vulnerability to bit-flips. The propagation delay between two sequential elements and higher clock frequencies collaborates to reduce TVF values. Moreover, all the information can be easily integrated into Electronic Design Automation (EDA) tools to help identifying the most vulnerable master-slave flip-flops before mitigating or replacing them by radiation hardened ones.
20

Analysis and Mitigation of Multiple Radiation Induced Errors in Modern Circuits

Watkins, Adam 01 December 2016 (has links)
Due to technology scaling, the probability of a high energy radiation particle striking multiple transistors has continued to increase. This, in turn has created a need for new circuit designs that can tolerate multiple simultaneous errors. A common type of error in memory elements is the double node upset (DNU) which has continued to become more common. All existing DNU tolerant designs either suffer from high area and performance overhead, may lose the data stored in the element during clock gating due to high impedance states or are vulnerable to an error after a DNU occurs. In this dissertation, a novel latch design is proposed in which all nodes are capable of fully recovering their correct value after a single or double node upset, referred to as DNU robust. The proposed latch offers lower delay, power consumption and area requirements compared to existing DNU robust designs. Multiple simultaneous radiation induced errors are a current problem that must be studied in combinational logic. Typically, simulators are used early in the design phase which use netlists and rudimentary information of the process parameters to determine the error rate of a circuit. Existing simulators are able to accurately determine the effects when the problem space is limited to one error. However, existing methods do not provide accurate information when multiple concurrent errors occur due to inaccurate approximation of the glitch shape when multiple errors meet at a gate. To improve existing error simulation, a novel analytical methodology to determine the pulse shape when multiple simultaneous errors occur is proposed. Through extensive simulations, it is shown that the proposed methodology matches closely with HSPICE while providing a speedup of 15X. The analysis of the soft error rate of a circuit has continued to be a difficult problem due to the calculation of the logical effect on a pulse generated by a radiation particle. Common existing methods to determine logical effects use either exhaustive input pattern simulation or binary decision diagrams. The problem with both approaches is that simulation of the circuit can be intractably time consuming or can encounter memory blowup. To solve this issue, a simulation tool is proposed which employs partitioning to reduce the execution time and memory overhead. In addition, the tool integrates an accurate electrical masking model. Compared to existing simulation tools, the proposed tool can simulate circuits up to 90X faster.

Page generated in 0.0377 seconds