221

TRAPP: a tool for partitioning and placement of cells for the TRANCA methodology

Schermer, Paulo Armando January 1995 (has links)
This work proposes and evaluates a new algorithm for cell placement in circuits that use the TRANCA design methodology [REI 87]. The proposed algorithm performs placement by partitioning into n blocks, based on the concept of net balancing, and carries out a global pre-routing. Most placement-by-partitioning algorithms are based on the Kernighan-Lin [KER 70] and Fiduccia-Mattheyses [FID 82] heuristics with group migration. These algorithms use a min-cut function to reduce the number of nets crossing between the two partitions, which tends to produce saturated regions. Net balancing, in contrast, seeks an equilibrium in connection lengths to avoid creating saturated regions, reducing computation time and easing the routing stage. The work gives an overview of automatic synthesis, describes the most widely used design styles, and defines and analyzes the cell partitioning and placement problem. The main features of the TRANCA methodology are presented, and the partitioning and placement steps of the existing TRANCA synthesis tools are summarized so that their positive characteristics can be reused. To ground the concepts used in the development of the algorithm, the most relevant placement methods are presented, with emphasis on those based on partitioning, and some existing heuristics are described. The concepts behind the algorithm are then detailed: the algorithm essentially distributes connections using a congestion map of the circuit, which characterizes a global pre-routing. The congestion map is built over the partitions generated for the circuit and, in addition to it, the net paths are described over a model defined to control net crossings. After the concepts are defined, the environment created for the algorithm is presented. To validate the studied and proposed concepts, a prototype called TRAPP (TRAnsparent Placement by Partitioning) and a placement viewer called CIPPATO were implemented. Finally, results obtained with the prototype and an evaluation of its behavior are presented, along with alternative implementations and directions for future work.
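As background for the partitioning step described above, the following is a minimal sketch of a greedy bipartitioning pass whose cost combines the classic net-cut metric with a simple balance penalty. The function names, the penalty weight, and the balance tolerance are illustrative assumptions; this is not the TRAPP algorithm itself, which additionally relies on a congestion map and a net-path model.

```python
def cut_nets(nets, side):
    """Count nets whose cells span both partitions (the classic min-cut metric)."""
    return sum(1 for net in nets if len({side[c] for c in net}) > 1)

def fm_pass(cells, nets, side, balance_tol=0.1):
    """One greedy pass: move cells one at a time if the move lowers the cost.

    side maps cell -> 0/1.  Cost = cut nets + a penalty when the two
    partitions drift too far apart in size (a crude balance constraint).
    """
    def cost():
        size0 = sum(1 for c in cells if side[c] == 0)
        imbalance = abs(2 * size0 - len(cells)) / max(1, len(cells))
        return cut_nets(nets, side) + (0 if imbalance <= balance_tol else 2 * imbalance)

    locked = set()
    for _ in range(len(cells)):
        best, best_gain = None, 0
        for c in cells:
            if c in locked:
                continue
            before = cost()
            side[c] ^= 1                 # tentatively move the cell
            gain = before - cost()
            side[c] ^= 1                 # undo the move
            if gain > best_gain:
                best, best_gain = c, gain
        if best is None:
            break                        # no improving move remains
        side[best] ^= 1                  # commit the best move and lock the cell
        locked.add(best)
    return side

# Tiny usage example: 4 cells, 3 two-pin nets, starting from a poor split.
cells = ["a", "b", "c", "d"]
nets = [("a", "b"), ("b", "c"), ("c", "d")]
side = {"a": 0, "b": 1, "c": 0, "d": 1}
print(fm_pass(cells, nets, side))        # -> {'a': 0, 'b': 0, 'c': 1, 'd': 1}: one cut net instead of three
```

A real Fiduccia-Mattheyses pass additionally accepts temporarily negative-gain moves and keeps the best prefix of the move sequence; the sketch omits that to stay short.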
222

Radiation Hardened Clock Design

January 2015 (has links)
abstract: Clock generation and distribution are essential to CMOS microchips, providing synchronization to external devices and between internal sequential logic. Clocks in microprocessors are highly vulnerable to single event effects, and designing reliable, energy-efficient clock networks for mission critical applications is a major challenge. This dissertation studies the basics of radiation hardening, the essentials of clock design, and the impact of particle strikes on clocks in detail, and presents design techniques for hardening complete clock systems in digital ICs. Since the sequential elements play a key role in deciding the robustness of any clocking strategy, hardened-by-design implementations of triple-mode redundant (TMR) pulse clocked latches and physical design methodologies for using TMR master-slave flip-flops in application specific ICs (ASICs) are proposed. A novel temporal pulse clocked latch design for low power radiation hardened applications is also proposed. Techniques for designing custom RHBD clock distribution networks (clock spines) and ASIC clock trees for a radiation hardened microprocessor using standard CAD tools are presented. A framework is provided for analyzing the vulnerabilities of clock trees in general and for studying the parameters that contribute the most to a tree's failure, including the impact on the latches it controls. This framework is then used to design an integrated clocking scheme, based on a temporally redundant clock tree and pulse clocked flip-flops, that is robust to single event transients (SETs) and single event upsets (SEUs). Subsequently, the design of robust clock delay lines for use in double data rate (DDRx) memory applications is studied in detail, and several modules of the proposed radiation hardened all-digital delay locked loop are designed and studied. Many of the circuits proposed in this body of work have been implemented and tested on a standard low-power 90-nm process. / Dissertation/Thesis / Doctoral Dissertation Electrical Engineering 2015
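Since the robustness argument above rests on redundant copies of a sequential element out-voting a struck copy, a small behavioral model helps make the mechanism concrete. The sketch below models 2-of-3 majority voting for a TMR storage element and for temporally offset sampling; the function names, the sample spacing delta, and the toy signal are illustrative assumptions, not circuits from the dissertation.

```python
def majority(a: int, b: int, c: int) -> int:
    """2-of-3 majority vote: a single upset copy is out-voted."""
    return (a & b) | (b & c) | (a & c)

def tmr_flip_flop(d: int, upset_mask=(0, 0, 0)) -> int:
    """Three redundant latch copies capture d; upset_mask flips struck copies."""
    copies = [d ^ u for u in upset_mask]
    return majority(*copies)

def temporal_sample(signal, t: int, delta: int) -> int:
    """Temporal redundancy: sample the same node at t, t+delta, t+2*delta and
    vote, so a transient (SET) narrower than delta is filtered out."""
    return majority(signal(t), signal(t + delta), signal(t + 2 * delta))

# A single upset copy does not corrupt the stored value:
assert tmr_flip_flop(1, upset_mask=(0, 1, 0)) == 1
# A glitch narrower than delta is caught by at most one sample and rejected:
glitchy = lambda t: 1 if 10 <= t < 12 else 0   # 2-unit-wide transient
assert temporal_sample(glitchy, t=8, delta=3) == 0
```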
223

Integrated CMOS-based Low Power Electrochemical Impedance Spectroscopy for Biomedical Applications

January 2016 (has links)
abstract: This thesis presents the design of a portable, low-power Electrochemical Impedance Spectroscopy (EIS) system that can be used for biomedical applications such as tear, blood, or other body-fluid diagnosis. Two design methodologies are explained: (a) a discrete-component-based portable low-power EIS system and (b) an integrated CMOS-based portable low-power EIS system. Both EIS systems were tested in a laboratory environment and their characterization results are compared. The advantages and disadvantages of the integrated EIS system relative to the discrete-component-based system are presented, including experimental data. The specifications of both EIS systems are compared with commercially available non-portable EIS workstations; the designed systems are handheld and very low-cost relative to these commercial workstations. / Dissertation/Thesis / Masters Thesis Electrical Engineering 2016
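As a complement to the system description above, the following sketch shows the core computation any EIS instrument performs: excite the cell with a small sine at each frequency, digitize voltage and current, and extract the complex impedance with a single-frequency (lock-in style) correlation. The sampling rate, the test fixture values, and the function names are illustrative assumptions, not parameters of the thesis hardware.

```python
import numpy as np

def single_tone_impedance(v, i, f, fs):
    """Return complex impedance Z = V/I at frequency f from sampled v(t), i(t)."""
    t = np.arange(len(v)) / fs
    ref = np.exp(-2j * np.pi * f * t)          # complex reference tone at f
    V = 2 * np.mean(v * ref)                   # complex amplitude of v at f
    I = 2 * np.mean(i * ref)                   # complex amplitude of i at f
    return V / I

# Usage example: a series R-C cell (R = 1 kOhm, C = 1 uF) measured at 100 Hz.
fs, f, R, C = 100_000, 100.0, 1e3, 1e-6
t = np.arange(int(fs * 0.1)) / fs              # 0.1 s = 10 full cycles at 100 Hz
Z_true = R + 1 / (2j * np.pi * f * C)
i = 1e-3 * np.sin(2 * np.pi * f * t)           # 1 mA excitation current
v = 1e-3 * np.abs(Z_true) * np.sin(2 * np.pi * f * t + np.angle(Z_true))
print(single_tone_impedance(v, i, f, fs), Z_true)   # the two values agree
```

A full spectrum is obtained by repeating this measurement over a sweep of frequencies.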
224

Energy and Quality-Aware Multimedia Signal Processing

January 2012 (has links)
abstract: Today's mobile devices have to support computation-intensive multimedia applications with a limited energy budget. In this dissertation, we present architecture-level and algorithm-level techniques that reduce the energy consumption of these devices with minimal impact on system quality. First, we present novel techniques to mitigate the effects of SRAM memory failures in JPEG2000 implementations operating at scaled voltages. We investigate error control coding schemes and propose an unequal error protection scheme tailored for JPEG2000 that reduces overhead without affecting the performance. Furthermore, we propose algorithm-specific techniques for error compensation that exploit the fact that in JPEG2000 the discrete wavelet transform outputs have larger values for low frequency subband coefficients and smaller values for high frequency subband coefficients. Next, we present the use of voltage overscaling to reduce the data-path power consumption of JPEG codecs. We propose an algorithm-specific technique which exploits the characteristics of the quantized coefficients after zig-zag scan to mitigate errors introduced by aggressive voltage scaling. Third, we investigate the effect of reducing dynamic range for datapath energy reduction. We analyze the effect of truncation error and propose a scheme that estimates the mean value of the truncation error during the pre-computation stage and compensates for this error. Such a scheme is very effective for reducing the noise power in applications that are dominated by additions and multiplications, such as FIR filtering and transform computation. We also present a novel sum of absolute difference (SAD) scheme that is based on most significant bit truncation. The proposed scheme exploits the fact that most of the absolute difference (AD) calculations result in small values, and that most of the large AD values do not contribute to the SAD values of the blocks that are selected. Such a scheme is highly effective in reducing the energy consumption of motion estimation and intra-prediction kernels in video codecs. Finally, we present several hybrid energy-saving techniques based on combinations of voltage scaling, computation reduction and dynamic range reduction that further reduce the energy consumption while keeping the performance degradation very low. For instance, a combination of computation reduction and dynamic range reduction for the Discrete Cosine Transform shows, on average, a 33% to 46% reduction in energy consumption while incurring only a 0.5 dB to 1.5 dB loss in PSNR. / Dissertation/Thesis / Ph.D. Electrical Engineering 2012
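One possible reading of the truncated-SAD idea above is sketched below: because most absolute differences are small, each per-pixel difference can be clamped to a few low-order bits (dropping the MSBs) without usually changing which candidate block wins. The bit width k and the saturation behavior are assumptions for illustration, not the dissertation's exact scheme.

```python
def sad_truncated(block_a, block_b, k=4):
    """SAD where each |a - b| is saturated to k bits (values above 2**k - 1 clip)."""
    limit = (1 << k) - 1
    return sum(min(abs(a - b), limit) for a, b in zip(block_a, block_b))

def sad_exact(block_a, block_b):
    """Reference full-precision SAD."""
    return sum(abs(a - b) for a, b in zip(block_a, block_b))

# Usage: candidate ranking is usually preserved even though the SAD values differ.
cur   = [10, 12, 11, 13,  9, 10, 12, 11]
cand1 = [11, 12, 10, 14,  9, 11, 12, 10]   # close match  -> small ADs, nothing clips
cand2 = [40, 80,  5, 90, 10, 60, 30, 70]   # poor match   -> large ADs clip at 15
print(sad_exact(cur, cand1), sad_truncated(cur, cand1))   # 5   5
print(sad_exact(cur, cand2), sad_truncated(cur, cand2))   # 309 97 (still the loser)
```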
225

High performance and low cost hardware architectures for motion estimation in digital videos

Porto, Marcelo January 2008 (has links)
The evolution of information and communication technologies has pushed the development of several communication media. These media, video in particular, need a large bandwidth to be transmitted or a large digital storage capacity. Many multimedia signals, however, show a high degree of information redundancy. Using compression techniques it is possible to reduce the amount of coded information by one or two orders of magnitude while keeping a satisfactory visual quality. One of these compression techniques searches for similarity between neighboring frames of a scene, identifying the temporal redundancy between them. This technique is called motion estimation, and it is a very efficient method for compression. However, the computational complexity of motion estimation requires high performance algorithms in hardware when used for real-time compression of high resolution videos. This dissertation presents a comprehensive investigation of motion estimation algorithms targeting hardware implementation. All the investigated algorithms were first developed in C and submitted to a series of tests to evaluate quality and computational cost. The algorithms were applied to ten video samples used by the scientific community for evaluation in real applications. The evaluation showed that fast algorithms can carry out the motion estimation process efficiently, producing good results in terms of vector quality, computational effort and performance. Based on the analysis of these results, the Diamond Search algorithm was chosen for hardware design, with two different levels of pixel subsampling, 2:1 and 4:1. The architectures for the Diamond Search algorithm, with pixel subsampling of 2:1 and 4:1, were described in VHDL and synthesized to Xilinx Virtex-4 FPGAs and to standard cells in TSMC 0.18 μm technology. The developed architectures exceed the performance needed to process HDTV 1080p video in real time at 30 frames per second, and they require few hardware resources after synthesis to FPGA and ASIC. Keywords: video compression, motion estimation, VLSI design.
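The Diamond Search algorithm chosen above can be summarized in a few lines. The sketch below is a plain software reference model with a generic SAD cost and block size; the dissertation's VHDL architectures additionally apply 2:1 and 4:1 pixel subsampling, which is omitted here.

```python
# Large and small diamond search patterns (offsets relative to the current center).
LDSP = [(0, 0), (2, 0), (-2, 0), (0, 2), (0, -2), (1, 1), (1, -1), (-1, 1), (-1, -1)]
SDSP = [(0, 0), (1, 0), (-1, 0), (0, 1), (0, -1)]

def sad(cur, ref, bx, by, dx, dy, n=8):
    """SAD between the n x n block at (bx, by) in cur and its shifted copy in ref."""
    total = 0
    for y in range(n):
        for x in range(n):
            ry, rx = by + y + dy, bx + x + dx
            if not (0 <= ry < len(ref) and 0 <= rx < len(ref[0])):
                return float("inf")          # candidate falls outside the frame
            total += abs(cur[by + y][bx + x] - ref[ry][rx])
    return total

def diamond_search(cur, ref, bx, by, n=8):
    """Return the motion vector (dx, dy) found by large/small diamond refinement."""
    cx, cy = 0, 0
    while True:
        costs = {(cx + dx, cy + dy): sad(cur, ref, bx, by, cx + dx, cy + dy, n)
                 for dx, dy in LDSP}
        best = min(costs, key=costs.get)
        if best == (cx, cy):                 # center wins: switch to the small pattern
            costs = {(cx + dx, cy + dy): sad(cur, ref, bx, by, cx + dx, cy + dy, n)
                     for dx, dy in SDSP}
            return min(costs, key=costs.get)
        cx, cy = best                        # otherwise recenter and repeat
```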
226

Floorplan-Aware High Performance NoC Design

Roca Pérez, Antoni 20 November 2012 (has links)
Current multi-core architectures such as chip multiprocessors (CMPs) and multiprocessor system-on-chip (MPSoC) solutions have adopted networks-on-chip (NoCs) as the preferred element for interconnecting the various components of these systems. CMP and MPSoC manufacturers have adopted simple NoCs, generally with a mesh or ring topology, since these are sufficient to satisfy the needs of current systems. However, as system requirements (low latency and high throughput) become more demanding, such simple networks cease to be a realistic solution, and the research community has proposed and analyzed more complex NoCs. These solutions are nevertheless harder to implement, especially the long links, making such complex topologies too costly or even unfeasible. In this thesis we present a design methodology that minimizes the loss of network performance caused by its physical implementation. The main problems encountered when implementing a NoC are the switches and the long links. In this thesis the switch is made modular, that is, built as a union of smaller modules; in our case the modules are identical, and each module is able to arbitrate, switch, and store the messages that reach it. We then relax the placement of these modules on the chip, allowing modules of the same switch to be distributed across the die. This design methodology has been applied to different scenarios. First, we introduced our modular switch into NoCs with well-known topologies such as the 2D mesh; the results show how the modularity and the distribution of the switch reduce the latency and the power consumption of the network. Second, we used our design methodology to implement a distributed crossbar... / Roca Pérez, A. (2012). Floorplan-Aware High Performance NoC Design [Unpublished doctoral thesis]. Universitat Politècnica de València. https://doi.org/10.4995/Thesis/10251/17844 / Palancia
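As a point of reference for the mesh topologies mentioned above, here is a minimal model of dimension-order (XY) routing in a 2D-mesh NoC together with a crude zero-load latency estimate. The per-hop delays and function names are illustrative assumptions and are unrelated to the thesis's modular, distributed switch.

```python
def xy_route(src, dst):
    """Return the list of (x, y) routers an XY-routed packet traverses."""
    x, y = src
    path = [(x, y)]
    while x != dst[0]:                    # first travel along X ...
        x += 1 if dst[0] > x else -1
        path.append((x, y))
    while y != dst[1]:                    # ... then along Y (deadlock-free order)
        y += 1 if dst[1] > y else -1
        path.append((x, y))
    return path

def zero_load_latency(src, dst, router_delay=3, link_delay=1):
    """Crude zero-load latency in cycles: every hop pays one router plus one link."""
    hops = len(xy_route(src, dst)) - 1
    return hops * (router_delay + link_delay) + router_delay

print(xy_route((0, 0), (2, 3)))           # [(0,0), (1,0), (2,0), (2,1), (2,2), (2,3)]
print(zero_load_latency((0, 0), (2, 3)))  # 5 hops -> 23 cycles with these assumptions
```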
227

On low power and circuit parameter independent tests, and a new method of test response compaction

Howard, Joseph Michael 01 December 2010 (has links)
Testing an integrated circuit once it has been manufactured is required in order to identify faulty and fault-free circuits. As the complexity of integrated circuits increases, so does the difficulty of creating efficient and high quality tests that can detect the variety of defect types that can occur throughout the manufacturing process. Three issues facing manufacturing test are the power consumed during testing, addressing different types of faults, and test data volume. With regard to the power consumed during testing, abnormal switching activity, far above that seen in functional operation, may occur due to the testing technique of scan insertion. While scan insertion greatly simplifies test generation for sequential circuits, it may lead to excessive switching activity due to the loading and unloading of scan data and when the scan cells are updated using functional clocks. This can potentially damage the circuit due to excessive heat, or inadvertently fail a good circuit due to current supply demands beyond design specifications. Stuck-at tests detect lines that are shorted to either the power supply or ground. Open faults are broken connections within the circuit. Some open faults may not be detected by tests generated for stuck-at faults, so tests may need to be generated specifically to detect them. The voltage on an open node is determined by circuit parameters, and due to the feature size of the circuit it may not be possible to determine these parameters, making it very difficult or impossible to generate tests for open faults. Automated test equipment is used to apply test stimuli and observe the output response. The output response is compared to the known fault-free response in order to determine whether the circuit is faulty or fault-free; the automated test equipment must therefore store the test stimuli and the fault-free responses in memory. With increased integrated circuit complexity, the number of inputs, outputs, and faults increases, increasing the overall data required for testing. Automated test equipment is very expensive, with cost proportional to the memory required to store the test stimuli and fault-free output responses, and simply replacing it is not cost effective. These issues in the manufacturing test of integrated circuits are addressed in this dissertation. First, a method to reduce power consumption in circuits which incorporate data volume reduction techniques is proposed. Second, a test generation technique for open faults which does not require knowledge of circuit parameters is proposed. Third, a technique to further reduce output data volume in circuits which already incorporate output response compaction techniques is proposed. Experimental results for the three techniques show their effectiveness.
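For background on the response-compaction problem mentioned above, the sketch below models a conventional multiple-input signature register (MISR), the standard way output responses are squeezed into a short signature before comparison with the fault-free value. The register width, tap positions, and test vectors are illustrative assumptions, and this is explicitly not the new compaction method the dissertation proposes.

```python
def misr_step(state, inputs, taps):
    """One clock of an n-bit MISR: shift with tapped feedback, then XOR in outputs."""
    feedback = 0
    for t in taps:
        feedback ^= state[t]
    new_state = [feedback] + state[:-1]          # shift right, feedback into bit 0
    return [s ^ i for s, i in zip(new_state, inputs)]

def compact_responses(response_stream, n=8, taps=(7, 5, 4, 3)):
    """Compact a list of n-bit output vectors into a single n-bit signature."""
    state = [0] * n
    for vec in response_stream:
        state = misr_step(state, vec, taps)
    return state

# Usage: a single flipped bit in the response stream changes the final signature.
good = [[1, 0, 1, 1, 0, 0, 1, 0], [0, 1, 1, 0, 1, 0, 0, 1]]
bad  = [[1, 0, 1, 1, 0, 0, 1, 0], [0, 1, 1, 0, 1, 0, 0, 0]]   # last bit flipped
print(compact_responses(good))
print(compact_responses(bad))
```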
228

VLSI algorithms and architectures for non-binary-LDPC decoding

Lacruz Jucht, Jesús Omar 04 November 2016 (has links)
This thesis studies the design of low-complexity soft-decision Non-Binary Low-Density Parity-Check (NB-LDPC) decoding algorithms and their corresponding hardware architectures, suitable for decoding high-rate codes at high throughput (hundreds of Mbps and Gbps). In the first part of the thesis the main aspects of NB-LDPC codes are analyzed, including a study of the main bottlenecks of conventional soft-decision decoding algorithms (Q-ary Sum of Products (QSPA), Extended Min-Sum (EMS), Min-Max and Trellis-Extended Min-Sum (T-EMS)) and their corresponding hardware architectures. Despite the limitations of the T-EMS algorithm (high complexity in the Check Node (CN) processor, wiring congestion due to the high number of messages exchanged between processors, and the inability to implement decoders over high-order Galois fields because of the resulting decoder complexity), it was selected as the starting point for this thesis due to its capability to reach high throughput. Taking into account the identified limitations of the T-EMS algorithm, the second part of the thesis includes six papers with the results of research carried out to mitigate the T-EMS disadvantages, offering solutions that reduce the area and the latency and increase the throughput compared to previous proposals from the literature without sacrificing coding gain. Specifically, five low-complexity decoding algorithms are proposed, introducing simplifications in different parts of the decoding process. Besides, five complete decoder architectures are designed and implemented in a 90nm Complementary Metal-Oxide-Semiconductor (CMOS) technology. The results show a throughput higher than 1 Gbps with an area of less than 10 mm², an increase in throughput of 120% and a reduction in area of 53% compared to previous implementations of T-EMS, for the (837,726) NB-LDPC code over GF(32). The proposed decoders reduce the CN area, the latency, the wiring between the CN and the Variable Node (VN) processor, and the number of storage elements required in the decoder. Considering that these proposals improve both area and speed, the efficiency parameter (Mbps / million NAND gates) is increased by almost five times compared to other proposals from the literature. The improvements in terms of area make it possible to implement NB-LDPC decoders over high-order fields, which had not been possible until now due to the high complexity of the decoders previously proposed in the literature. Therefore, we present the first post-place-and-route results for high-rate codes over fields higher than GF(32). For example, for the (1536,1344) NB-LDPC code over GF(64) the throughput is 1259 Mbps with an area of 28.90 mm². In addition, a decoder architecture implemented on a Field Programmable Gate Array (FPGA) device achieves 630 Mbps for the high-rate (2304,2048) NB-LDPC code over GF(16). To the best knowledge of the author, these results are the highest presented in the literature for similar codes implemented in the same technologies. / Lacruz Jucht, JO. (2016). VLSI algorithms and architectures for non-binary-LDPC decoding [Unpublished doctoral thesis]. Universitat Politècnica de València. https://doi.org/10.4995/Thesis/10251/73266 / TESIS
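To give a concrete feel for the non-binary arithmetic these decoders manipulate, the sketch below evaluates GF(4) parity checks of the kind every NB-LDPC decoding iteration must eventually satisfy (an all-zero syndrome is the usual stopping criterion). The parity-check matrix and vector are toy values, and the sketch says nothing about the T-EMS check-node architecture itself.

```python
def gf_add(a, b):
    """Addition in GF(2^m) is bitwise XOR."""
    return a ^ b

# GF(4) multiplication table; the elements {0, 1, a, a^2} are coded as 0..3.
MUL = [[0, 0, 0, 0],
       [0, 1, 2, 3],
       [0, 2, 3, 1],
       [0, 3, 1, 2]]

def syndrome(H, c):
    """Return the GF(4) syndrome H*c; all zeros means c satisfies every check."""
    s = []
    for row in H:
        acc = 0
        for h, x in zip(row, c):
            acc = gf_add(acc, MUL[h][x])     # accumulate h_ij * c_j over GF(4)
        s.append(acc)
    return s

# Toy 2x4 parity-check matrix over GF(4) and a vector that satisfies it.
H = [[1, 2, 1, 0],
     [0, 1, 3, 1]]
c = [3, 1, 1, 2]
print(syndrome(H, c))                        # [0, 0] -> valid codeword, decoding stops
```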
229

Applications of Physical Unclonable Functions on ASICs and FPGAs

Usmani, Mohammad 04 April 2018 (has links)
With the ever-increasing demand for security in embedded systems and wireless sensor networks, we require integrating security primitives for authentication in these devices. One such primitive is known as a Physically Unclonable Function (PUF). This entity can be used to provide security at a low cost, as the key or digital signature can be generated by dedicating a small part of the silicon die to these primitives, which produce a fingerprint unique to each device. This fingerprint produced by a PUF is called its response. The response of a PUF depends upon the process variation that occurs during manufacturing. In embedded systems, and especially in wireless sensor networks, there is a need to secure the data collected from the sensors. To tackle this problem, we propose the use of SRAM-based PUFs to detect the temperature of the system. This is done by using the PUF response to generate temperature-based keys, which act as proofs of the temperature of the system. In SRAM PUFs, it is experimentally determined that at varying temperatures there is a shift in the response of the cells from zero to one and vice versa. This variation can be exploited to generate random but repeatable keys at different temperatures. To evaluate our approach, we first analyze the key metrics of a PUF, namely reliability and uniqueness. In order to test the idea of using the PUF as a temperature-based key generator, we collect data from a total of ten SRAM chips at fixed temperature steps. We first calculate the reliability, which is related to the bit error rate, an important parameter with respect to error correction, at various temperatures to verify the stability of the responses. We then identify the temperature of the system using a temperature sensor and encode the key offset by the PUF response at that temperature using BCH codes. This key-temperature pair can then be used to establish secure communication between the nodes. Thus, this scheme helps in establishing secure keys, as the generation has an extra variable to produce confusion. We also developed a novel PUF for Xilinx FPGAs and evaluated its quality metrics; it is very compact and has high uniqueness and reliability. We implement two different PUF configurations to allow per-device selection of the best PUFs, reducing the area and power required for key generation, and we evaluate the temperature response of this PUF, showing improvement in the response through per-device selection.
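The two quality metrics named above, reliability and uniqueness, are both fractional Hamming distance computations, sketched below. The bit strings and the number of chips are toy values; in the evaluation described here they would come from SRAM power-up dumps collected at each temperature step.

```python
def frac_hd(a, b):
    """Fractional Hamming distance between two equal-length bit strings."""
    return sum(x != y for x, y in zip(a, b)) / len(a)

def reliability(reference, remeasurements):
    """100% minus the average intra-chip HD with respect to a reference response."""
    avg_hd = sum(frac_hd(reference, r) for r in remeasurements) / len(remeasurements)
    return 100.0 * (1.0 - avg_hd)

def uniqueness(responses):
    """Average pairwise inter-chip HD (the ideal value is 50%)."""
    pairs, total = 0, 0.0
    for i in range(len(responses)):
        for j in range(i + 1, len(responses)):
            total += frac_hd(responses[i], responses[j])
            pairs += 1
    return 100.0 * total / pairs

chip_a_ref = "1011001110001011"
chip_a_hot = "1011001010001011"                  # one bit flipped at high temperature
chip_b_ref = "0100110010001011"
print(reliability(chip_a_ref, [chip_a_hot]))     # 93.75 for this toy data
print(uniqueness([chip_a_ref, chip_b_ref]))      # 50.0; ideally near 50% across many chips
```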
230

A Study on Controlling Power Supply Ramp-Up Time in SRAM PUFs

Ramanna, Harshavardhan 29 October 2019 (has links)
With growing connectivity in the modern era, the risk of encrypted data stored in hardware being exposed to third-party adversaries is higher than ever. The security of encrypted data depends on the secrecy of the stored key. Conventional methods of storing keys in Non-Volatile Memory have been shown to be susceptible to physical attacks. Physically Unclonable Functions provide a unique alternative to conventional key storage. SRAM PUFs utilize inherent process variation caused during manufacturing to derive secret keys from the power-up values of SRAM memory cells. This thesis analyzes the effect of supply ramp-up times on the reliability of SRAM PUFs. We use SPICE simulations as the platform to observe the effect of supply ramp times at the circuit level using carefully controlled supply voltages during power-up. We also measure the effect of supply ramp times on commercially available SRAM ICs by performing reliability and uniqueness measurements on two commercial SRAM models. Finally, a hardware implementation is proposed in a commercial 16nm FinFET technology to establish the design flow for taping out a custom SRAM IC with separated peripheral and core power supplies that would allow for experimental evaluation of sequenced power supplies on the SRAM PUF.
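As one way to connect the reliability measurements described above to key generation, the sketch below enrolls a golden SRAM-PUF response by bitwise majority voting over repeated power-ups and then reports the bit error rate of a later dump, for example one taken at a different supply ramp time. The trial count, the vote rule, and the toy bit patterns are assumptions for illustration, not the measurement procedure used in the thesis.

```python
def enroll(power_up_dumps):
    """Bitwise majority vote across repeated power-up dumps (lists of 0/1 bits)."""
    n_trials = len(power_up_dumps)
    n_bits = len(power_up_dumps[0])
    golden = []
    for bit in range(n_bits):
        ones = sum(dump[bit] for dump in power_up_dumps)
        golden.append(1 if ones * 2 >= n_trials else 0)
    return golden

def bit_error_rate(golden, dump):
    """Fraction of bits in a later dump that disagree with the enrolled response."""
    return sum(g != d for g, d in zip(golden, dump)) / len(golden)

# Usage: three enrollment dumps at a nominal ramp time, one test dump at a
# faster ramp that flips two bits.
dumps = [[1, 0, 1, 1, 0, 0, 1, 0],
         [1, 0, 1, 1, 0, 1, 1, 0],
         [1, 0, 1, 1, 0, 0, 1, 0]]
golden = enroll(dumps)                      # -> [1, 0, 1, 1, 0, 0, 1, 0]
fast_ramp = [1, 1, 1, 1, 0, 0, 0, 0]
print(bit_error_rate(golden, fast_ramp))    # 0.25 with these toy values
```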
