221

TRAPP: a tool for partitioning and placement of cells for the TRANCA methodology

Schermer, Paulo Armando January 1995 (has links)
This work proposes and evaluates a new algorithm for cell placement in circuits that use the TRANCA design methodology [REI 87]. The proposed algorithm performs placement by partitioning into n blocks, based on the concept of net balancing, and carries out a global pre-routing. Most placement-by-partitioning algorithms are based on the Kernighan-Lin [KER 70] and Fiduccia-Mattheyses [FID 82] heuristics with group migration. These algorithms use a min-cut function to reduce the number of nets crossing between the two partitions, which tends to produce saturated regions. Net balancing, in contrast, seeks an equilibrium in connection lengths to avoid creating saturated regions, reducing computation time and easing the routing stage. The work gives an overview of automatic synthesis, describes the most widely used design styles, and defines and analyzes the cell partitioning and placement problem. The main features of the TRANCA methodology are presented, and the partitioning and placement steps of the existing TRANCA synthesis tools are summarized so that their positive characteristics can be reused. To ground the concepts used in the development of the algorithm, the most relevant placement methods are presented, with emphasis on those based on partitioning, and some existing heuristics are described. The concepts behind the algorithm are then detailed: the algorithm essentially distributes connections using a congestion map of the circuit, which characterizes a global pre-routing. The congestion map is built over the partitions generated for the circuit and, in addition to it, the net paths are described over a model defined to control net crossings. After the concepts are defined, the environment created for the algorithm is presented. To validate the studied and proposed concepts, a prototype called TRAPP (TRAnsparent Placement by Partitioning) and a placement viewer called CIPPATO were implemented. Finally, results obtained with the prototype and an evaluation of its behavior are presented, along with alternative implementations and directions for future work.
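As background for the partitioning step described above, the following is a minimal sketch of a greedy bipartitioning pass whose cost combines the classic net-cut metric with a simple balance penalty. The function names, the penalty weight, and the balance tolerance are illustrative assumptions; this is not the TRAPP algorithm itself, which additionally relies on a congestion map and a net-path model.

```python
def cut_nets(nets, side):
    """Count nets whose cells span both partitions (the classic min-cut metric)."""
    return sum(1 for net in nets if len({side[c] for c in net}) > 1)

def fm_pass(cells, nets, side, balance_tol=0.1):
    """One greedy pass: move cells one at a time if the move lowers the cost.

    side maps cell -> 0/1.  Cost = cut nets + a penalty when the two
    partitions drift too far apart in size (a crude balance constraint).
    """
    def cost():
        size0 = sum(1 for c in cells if side[c] == 0)
        imbalance = abs(2 * size0 - len(cells)) / max(1, len(cells))
        return cut_nets(nets, side) + (0 if imbalance <= balance_tol else 2 * imbalance)

    locked = set()
    for _ in range(len(cells)):
        best, best_gain = None, 0
        for c in cells:
            if c in locked:
                continue
            before = cost()
            side[c] ^= 1                 # tentatively move the cell
            gain = before - cost()
            side[c] ^= 1                 # undo the move
            if gain > best_gain:
                best, best_gain = c, gain
        if best is None:
            break                        # no improving move remains
        side[best] ^= 1                  # commit the best move and lock the cell
        locked.add(best)
    return side

# Tiny usage example: 4 cells, 3 two-pin nets, starting from a poor split.
cells = ["a", "b", "c", "d"]
nets = [("a", "b"), ("b", "c"), ("c", "d")]
side = {"a": 0, "b": 1, "c": 0, "d": 1}
print(fm_pass(cells, nets, side))        # -> {'a': 0, 'b': 0, 'c': 1, 'd': 1}: one cut net instead of three
```

A real Fiduccia-Mattheyses pass additionally accepts temporarily negative-gain moves and keeps the best prefix of the move sequence; the sketch omits that to stay short.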
222

Radiation Hardened Clock Design

January 2015 (has links)
abstract: Clock generation and distribution are essential to CMOS microchips, providing synchronization to external devices and between internal sequential logic. Clocks in microprocessors are highly vulnerable to single event effects, and designing reliable, energy-efficient clock networks for mission critical applications is a major challenge. This dissertation studies the basics of radiation hardening, the essentials of clock design, and the impact of particle strikes on clocks in detail, and presents design techniques for hardening complete clock systems in digital ICs. Since the sequential elements play a key role in deciding the robustness of any clocking strategy, hardened-by-design implementations of triple-mode redundant (TMR) pulse clocked latches and physical design methodologies for using TMR master-slave flip-flops in application specific ICs (ASICs) are proposed. A novel temporal pulse clocked latch design for low power radiation hardened applications is also proposed. Techniques for designing custom RHBD clock distribution networks (clock spines) and ASIC clock trees for a radiation hardened microprocessor using standard CAD tools are presented. A framework is provided for analyzing the vulnerabilities of clock trees in general and for studying the parameters that contribute the most to a tree's failure, including the impact on the latches it controls. This framework is then used to design an integrated clocking scheme, based on a temporally redundant clock tree and pulse clocked flip-flops, that is robust to single event transients (SETs) and single event upsets (SEUs). Subsequently, the design of robust clock delay lines for use in double data rate (DDRx) memory applications is studied in detail, and several modules of the proposed radiation hardened all-digital delay locked loop are designed and studied. Many of the circuits proposed in this body of work have been implemented and tested on a standard low-power 90-nm process. / Dissertation/Thesis / Doctoral Dissertation Electrical Engineering 2015
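Since the robustness argument above rests on redundant copies of a sequential element out-voting a struck copy, a small behavioral model helps make the mechanism concrete. The sketch below models 2-of-3 majority voting for a TMR storage element and for temporally offset sampling; the function names, the sample spacing delta, and the toy signal are illustrative assumptions, not circuits from the dissertation.

```python
def majority(a: int, b: int, c: int) -> int:
    """2-of-3 majority vote: a single upset copy is out-voted."""
    return (a & b) | (b & c) | (a & c)

def tmr_flip_flop(d: int, upset_mask=(0, 0, 0)) -> int:
    """Three redundant latch copies capture d; upset_mask flips struck copies."""
    copies = [d ^ u for u in upset_mask]
    return majority(*copies)

def temporal_sample(signal, t: int, delta: int) -> int:
    """Temporal redundancy: sample the same node at t, t+delta, t+2*delta and
    vote, so a transient (SET) narrower than delta is filtered out."""
    return majority(signal(t), signal(t + delta), signal(t + 2 * delta))

# A single upset copy does not corrupt the stored value:
assert tmr_flip_flop(1, upset_mask=(0, 1, 0)) == 1
# A glitch narrower than delta is caught by at most one sample and rejected:
glitchy = lambda t: 1 if 10 <= t < 12 else 0   # 2-unit-wide transient
assert temporal_sample(glitchy, t=8, delta=3) == 0
```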
223

Integrated CMOS-based Low Power Electrochemical Impedance Spectroscopy for Biomedical Applications

January 2016 (has links)
abstract: This thesis presents the design of a portable, low-power Electrochemical Impedance Spectroscopy (EIS) system that can be used for biomedical applications such as tear, blood, or other body-fluid diagnosis. Two design methodologies are explained: (a) a discrete-component-based portable low-power EIS system and (b) an integrated CMOS-based portable low-power EIS system. Both EIS systems were tested in a laboratory environment and their characterization results are compared. The advantages and disadvantages of the integrated EIS system relative to the discrete-component-based system are presented, including experimental data. The specifications of both EIS systems are compared with commercially available non-portable EIS workstations; the designed systems are handheld and very low-cost relative to these commercial workstations. / Dissertation/Thesis / Masters Thesis Electrical Engineering 2016
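As a complement to the system description above, the following sketch shows the core computation any EIS instrument performs: excite the cell with a small sine at each frequency, digitize voltage and current, and extract the complex impedance with a single-frequency (lock-in style) correlation. The sampling rate, the test fixture values, and the function names are illustrative assumptions, not parameters of the thesis hardware.

```python
import numpy as np

def single_tone_impedance(v, i, f, fs):
    """Return complex impedance Z = V/I at frequency f from sampled v(t), i(t)."""
    t = np.arange(len(v)) / fs
    ref = np.exp(-2j * np.pi * f * t)          # complex reference tone at f
    V = 2 * np.mean(v * ref)                   # complex amplitude of v at f
    I = 2 * np.mean(i * ref)                   # complex amplitude of i at f
    return V / I

# Usage example: a series R-C cell (R = 1 kOhm, C = 1 uF) measured at 100 Hz.
fs, f, R, C = 100_000, 100.0, 1e3, 1e-6
t = np.arange(int(fs * 0.1)) / fs              # 0.1 s = 10 full cycles at 100 Hz
Z_true = R + 1 / (2j * np.pi * f * C)
i = 1e-3 * np.sin(2 * np.pi * f * t)           # 1 mA excitation current
v = 1e-3 * np.abs(Z_true) * np.sin(2 * np.pi * f * t + np.angle(Z_true))
print(single_tone_impedance(v, i, f, fs), Z_true)   # the two values agree
```

A full spectrum is obtained by repeating this measurement over a sweep of frequencies.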
224

Energy and Quality-Aware Multimedia Signal Processing

January 2012 (has links)
abstract: Today's mobile devices have to support computation-intensive multimedia applications with a limited energy budget. In this dissertation, we present architecture-level and algorithm-level techniques that reduce the energy consumption of these devices with minimal impact on system quality. First, we present novel techniques to mitigate the effects of SRAM memory failures in JPEG2000 implementations operating at scaled voltages. We investigate error control coding schemes and propose an unequal error protection scheme tailored for JPEG2000 that reduces overhead without affecting the performance. Furthermore, we propose algorithm-specific techniques for error compensation that exploit the fact that in JPEG2000 the discrete wavelet transform outputs have larger values for low frequency subband coefficients and smaller values for high frequency subband coefficients. Next, we present the use of voltage overscaling to reduce the data-path power consumption of JPEG codecs. We propose an algorithm-specific technique which exploits the characteristics of the quantized coefficients after zig-zag scan to mitigate errors introduced by aggressive voltage scaling. Third, we investigate the effect of reducing dynamic range for datapath energy reduction. We analyze the effect of truncation error and propose a scheme that estimates the mean value of the truncation error during the pre-computation stage and compensates for this error. Such a scheme is very effective for reducing the noise power in applications that are dominated by additions and multiplications, such as FIR filtering and transform computation. We also present a novel sum of absolute difference (SAD) scheme that is based on most significant bit truncation. The proposed scheme exploits the fact that most of the absolute difference (AD) calculations result in small values, and that most of the large AD values do not contribute to the SAD values of the blocks that are selected. Such a scheme is highly effective in reducing the energy consumption of motion estimation and intra-prediction kernels in video codecs. Finally, we present several hybrid energy-saving techniques based on combinations of voltage scaling, computation reduction and dynamic range reduction that further reduce the energy consumption while keeping the performance degradation very low. For instance, a combination of computation reduction and dynamic range reduction for the Discrete Cosine Transform shows, on average, a 33% to 46% reduction in energy consumption while incurring only a 0.5 dB to 1.5 dB loss in PSNR. / Dissertation/Thesis / Ph.D. Electrical Engineering 2012
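One possible reading of the truncated-SAD idea above is sketched below: because most absolute differences are small, each per-pixel difference can be clamped to a few low-order bits (dropping the MSBs) without usually changing which candidate block wins. The bit width k and the saturation behavior are assumptions for illustration, not the dissertation's exact scheme.

```python
def sad_truncated(block_a, block_b, k=4):
    """SAD where each |a - b| is saturated to k bits (values above 2**k - 1 clip)."""
    limit = (1 << k) - 1
    return sum(min(abs(a - b), limit) for a, b in zip(block_a, block_b))

def sad_exact(block_a, block_b):
    """Reference full-precision SAD."""
    return sum(abs(a - b) for a, b in zip(block_a, block_b))

# Usage: candidate ranking is usually preserved even though the SAD values differ.
cur   = [10, 12, 11, 13,  9, 10, 12, 11]
cand1 = [11, 12, 10, 14,  9, 11, 12, 10]   # close match  -> small ADs, nothing clips
cand2 = [40, 80,  5, 90, 10, 60, 30, 70]   # poor match   -> large ADs clip at 15
print(sad_exact(cur, cand1), sad_truncated(cur, cand1))   # 5   5
print(sad_exact(cur, cand2), sad_truncated(cur, cand2))   # 309 97 (still the loser)
```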
225

High performance and low cost hardware architectures for motion estimation in digital videos

Porto, Marcelo January 2008 (has links)
The evolution of information and communication technologies has pushed the development of several communication media. These media, video in particular, need a large bandwidth to be transmitted or a large digital storage capacity. Many multimedia signals, however, show a high degree of information redundancy. Using compression techniques it is possible to reduce the amount of coded information by one or two orders of magnitude while keeping a satisfactory visual quality. One of these compression techniques searches for similarity between neighboring frames of a scene, identifying the temporal redundancy between them. This technique is called motion estimation, and it is a very efficient method for compression. However, the computational complexity of motion estimation requires high performance algorithms in hardware when used for real-time compression of high resolution videos. This dissertation presents a comprehensive investigation of motion estimation algorithms targeting hardware implementation. All the investigated algorithms were first developed in C and submitted to a series of tests to evaluate quality and computational cost. The algorithms were applied to ten video samples used by the scientific community for evaluation in real applications. The evaluation showed that fast algorithms can carry out the motion estimation process efficiently, producing good results in terms of vector quality, computational effort and performance. Based on the analysis of these results, the Diamond Search algorithm was chosen for hardware design, with two different levels of pixel subsampling, 2:1 and 4:1. The architectures for the Diamond Search algorithm, with pixel subsampling of 2:1 and 4:1, were described in VHDL and synthesized to Xilinx Virtex-4 FPGAs and to standard cells in TSMC 0.18 μm technology. The developed architectures exceed the performance needed to process HDTV 1080p video in real time at 30 frames per second, and they require few hardware resources after synthesis to FPGA and ASIC. Keywords: video compression, motion estimation, VLSI design.
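The Diamond Search algorithm chosen above can be summarized in a few lines. The sketch below is a plain software reference model with a generic SAD cost and block size; the dissertation's VHDL architectures additionally apply 2:1 and 4:1 pixel subsampling, which is omitted here.

```python
# Large and small diamond search patterns (offsets relative to the current center).
LDSP = [(0, 0), (2, 0), (-2, 0), (0, 2), (0, -2), (1, 1), (1, -1), (-1, 1), (-1, -1)]
SDSP = [(0, 0), (1, 0), (-1, 0), (0, 1), (0, -1)]

def sad(cur, ref, bx, by, dx, dy, n=8):
    """SAD between the n x n block at (bx, by) in cur and its shifted copy in ref."""
    total = 0
    for y in range(n):
        for x in range(n):
            ry, rx = by + y + dy, bx + x + dx
            if not (0 <= ry < len(ref) and 0 <= rx < len(ref[0])):
                return float("inf")          # candidate falls outside the frame
            total += abs(cur[by + y][bx + x] - ref[ry][rx])
    return total

def diamond_search(cur, ref, bx, by, n=8):
    """Return the motion vector (dx, dy) found by large/small diamond refinement."""
    cx, cy = 0, 0
    while True:
        costs = {(cx + dx, cy + dy): sad(cur, ref, bx, by, cx + dx, cy + dy, n)
                 for dx, dy in LDSP}
        best = min(costs, key=costs.get)
        if best == (cx, cy):                 # center wins: switch to the small pattern
            costs = {(cx + dx, cy + dy): sad(cur, ref, bx, by, cx + dx, cy + dy, n)
                     for dx, dy in SDSP}
            return min(costs, key=costs.get)
        cx, cy = best                        # otherwise recenter and repeat
```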
226

Floorplan-Aware High Performance NoC Design

Roca Pérez, Antoni 20 November 2012 (has links)
Current multi-core architectures such as chip multiprocessors (CMPs) and multiprocessor system-on-chip (MPSoC) solutions have adopted networks-on-chip (NoCs) as the preferred element for interconnecting the various components of these systems. CMP and MPSoC manufacturers have adopted simple NoCs, generally with a mesh or ring topology, since these are sufficient to satisfy the needs of current systems. However, as system requirements (low latency and high throughput) become more demanding, such simple networks cease to be a realistic solution, and the research community has proposed and analyzed more complex NoCs. These solutions are nevertheless harder to implement, especially the long links, making such complex topologies too costly or even unfeasible. In this thesis we present a design methodology that minimizes the loss of network performance caused by its physical implementation. The main problems encountered when implementing a NoC are the switches and the long links. In this thesis the switch is made modular, that is, built as a union of smaller modules; in our case the modules are identical, and each module is able to arbitrate, switch, and store the messages that reach it. We then relax the placement of these modules on the chip, allowing modules of the same switch to be distributed across the die. This design methodology has been applied to different scenarios. First, we introduced our modular switch into NoCs with well-known topologies such as the 2D mesh; the results show how the modularity and the distribution of the switch reduce the latency and the power consumption of the network. Second, we used our design methodology to implement a distributed crossbar... / Roca Pérez, A. (2012). Floorplan-Aware High Performance NoC Design [Unpublished doctoral thesis]. Universitat Politècnica de València. https://doi.org/10.4995/Thesis/10251/17844 / Palancia
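As a point of reference for the mesh topologies mentioned above, here is a minimal model of dimension-order (XY) routing in a 2D-mesh NoC together with a crude zero-load latency estimate. The per-hop delays and function names are illustrative assumptions and are unrelated to the thesis's modular, distributed switch.

```python
def xy_route(src, dst):
    """Return the list of (x, y) routers an XY-routed packet traverses."""
    x, y = src
    path = [(x, y)]
    while x != dst[0]:                    # first travel along X ...
        x += 1 if dst[0] > x else -1
        path.append((x, y))
    while y != dst[1]:                    # ... then along Y (deadlock-free order)
        y += 1 if dst[1] > y else -1
        path.append((x, y))
    return path

def zero_load_latency(src, dst, router_delay=3, link_delay=1):
    """Crude zero-load latency in cycles: every hop pays one router plus one link."""
    hops = len(xy_route(src, dst)) - 1
    return hops * (router_delay + link_delay) + router_delay

print(xy_route((0, 0), (2, 3)))           # [(0,0), (1,0), (2,0), (2,1), (2,2), (2,3)]
print(zero_load_latency((0, 0), (2, 3)))  # 5 hops -> 23 cycles with these assumptions
```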
227

On low power and circuit parameter independent tests, and a new method of test response compaction

Howard, Joseph Michael 01 December 2010 (has links)
Testing an integrated circuit once it has been manufactured is required in order to identify faulty and fault-free circuits. As the complexity of integrated circuits increases, so does the difficulty of creating efficient and high quality tests that can detect the variety of defect types that can occur throughout the manufacturing process. Three issues facing manufacturing test are the power consumed during testing, addressing different types of faults, and test data volume. With regard to the power consumed during testing, abnormal switching activity, far above that seen in functional operation, may occur due to the testing technique of scan insertion. While scan insertion greatly simplifies test generation for sequential circuits, it may lead to excessive switching activity due to the loading and unloading of scan data and when the scan cells are updated using functional clocks. This can potentially damage the circuit due to excessive heat, or inadvertently fail a good circuit due to current supply demands beyond design specifications. Stuck-at tests detect lines that are shorted to either the power supply or ground. Open faults are broken connections within the circuit. Some open faults may not be detected by tests generated for stuck-at faults, so tests may need to be generated specifically to detect them. The voltage on an open node is determined by circuit parameters, and due to the feature size of the circuit it may not be possible to determine these parameters, making it very difficult or impossible to generate tests for open faults. Automated test equipment is used to apply test stimuli and observe the output response. The output response is compared to the known fault-free response in order to determine whether the circuit is faulty or fault-free; the automated test equipment must therefore store the test stimuli and the fault-free responses in memory. With increased integrated circuit complexity, the number of inputs, outputs, and faults increases, increasing the overall data required for testing. Automated test equipment is very expensive, with cost proportional to the memory required to store the test stimuli and fault-free output responses, and simply replacing it is not cost effective. These issues in the manufacturing test of integrated circuits are addressed in this dissertation. First, a method to reduce power consumption in circuits which incorporate data volume reduction techniques is proposed. Second, a test generation technique for open faults which does not require knowledge of circuit parameters is proposed. Third, a technique to further reduce output data volume in circuits which already incorporate output response compaction techniques is proposed. Experimental results for the three techniques show their effectiveness.
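For background on the response-compaction problem mentioned above, the sketch below models a conventional multiple-input signature register (MISR), the standard way output responses are squeezed into a short signature before comparison with the fault-free value. The register width, tap positions, and test vectors are illustrative assumptions, and this is explicitly not the new compaction method the dissertation proposes.

```python
def misr_step(state, inputs, taps):
    """One clock of an n-bit MISR: shift with tapped feedback, then XOR in outputs."""
    feedback = 0
    for t in taps:
        feedback ^= state[t]
    new_state = [feedback] + state[:-1]          # shift right, feedback into bit 0
    return [s ^ i for s, i in zip(new_state, inputs)]

def compact_responses(response_stream, n=8, taps=(7, 5, 4, 3)):
    """Compact a list of n-bit output vectors into a single n-bit signature."""
    state = [0] * n
    for vec in response_stream:
        state = misr_step(state, vec, taps)
    return state

# Usage: a single flipped bit in the response stream changes the final signature.
good = [[1, 0, 1, 1, 0, 0, 1, 0], [0, 1, 1, 0, 1, 0, 0, 1]]
bad  = [[1, 0, 1, 1, 0, 0, 1, 0], [0, 1, 1, 0, 1, 0, 0, 0]]   # last bit flipped
print(compact_responses(good))
print(compact_responses(bad))
```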
228

VLSI algorithms and architectures for non-binary-LDPC decoding

Lacruz Jucht, Jesús Omar 04 November 2016 (has links)
This thesis studies the design of low-complexity soft-decision Non-Binary Low-Density Parity-Check (NB-LDPC) decoding algorithms and their corresponding hardware architectures, suitable for decoding high-rate codes at high throughput (hundreds of Mbps and Gbps). In the first part of the thesis the main aspects of NB-LDPC codes are analyzed, including a study of the main bottlenecks of conventional soft-decision decoding algorithms (Q-ary Sum of Products (QSPA), Extended Min-Sum (EMS), Min-Max and Trellis-Extended Min-Sum (T-EMS)) and their corresponding hardware architectures. Despite the limitations of the T-EMS algorithm (high complexity in the Check Node (CN) processor, wiring congestion due to the high number of messages exchanged between processors, and the inability to implement decoders over high-order Galois fields because of the resulting decoder complexity), it was selected as the starting point for this thesis due to its capability to reach high throughput. Taking into account the identified limitations of the T-EMS algorithm, the second part of the thesis includes six papers with the results of research carried out to mitigate the T-EMS disadvantages, offering solutions that reduce the area and the latency and increase the throughput compared to previous proposals from the literature without sacrificing coding gain. Specifically, five low-complexity decoding algorithms are proposed, introducing simplifications in different parts of the decoding process. Besides, five complete decoder architectures are designed and implemented in a 90nm Complementary Metal-Oxide-Semiconductor (CMOS) technology. The results show a throughput higher than 1 Gbps with an area of less than 10 mm², an increase in throughput of 120% and a reduction in area of 53% compared to previous implementations of T-EMS, for the (837,726) NB-LDPC code over GF(32). The proposed decoders reduce the CN area, the latency, the wiring between the CN and the Variable Node (VN) processor, and the number of storage elements required in the decoder. Considering that these proposals improve both area and speed, the efficiency parameter (Mbps / million NAND gates) is increased by almost five times compared to other proposals from the literature. The improvements in terms of area make it possible to implement NB-LDPC decoders over high-order fields, which had not been possible until now due to the high complexity of the decoders previously proposed in the literature. Therefore, we present the first post-place-and-route results for high-rate codes over fields higher than GF(32). For example, for the (1536,1344) NB-LDPC code over GF(64) the throughput is 1259 Mbps with an area of 28.90 mm². In addition, a decoder architecture implemented on a Field Programmable Gate Array (FPGA) device achieves 630 Mbps for the high-rate (2304,2048) NB-LDPC code over GF(16). To the best knowledge of the author, these results are the highest presented in the literature for similar codes implemented in the same technologies. / Lacruz Jucht, JO. (2016). VLSI algorithms and architectures for non-binary-LDPC decoding [Unpublished doctoral thesis]. Universitat Politècnica de València. https://doi.org/10.4995/Thesis/10251/73266 / TESIS
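To give a concrete feel for the non-binary arithmetic these decoders manipulate, the sketch below evaluates GF(4) parity checks of the kind every NB-LDPC decoding iteration must eventually satisfy (an all-zero syndrome is the usual stopping criterion). The parity-check matrix and vector are toy values, and the sketch says nothing about the T-EMS check-node architecture itself.

```python
def gf_add(a, b):
    """Addition in GF(2^m) is bitwise XOR."""
    return a ^ b

# GF(4) multiplication table; the elements {0, 1, a, a^2} are coded as 0..3.
MUL = [[0, 0, 0, 0],
       [0, 1, 2, 3],
       [0, 2, 3, 1],
       [0, 3, 1, 2]]

def syndrome(H, c):
    """Return the GF(4) syndrome H*c; all zeros means c satisfies every check."""
    s = []
    for row in H:
        acc = 0
        for h, x in zip(row, c):
            acc = gf_add(acc, MUL[h][x])     # accumulate h_ij * c_j over GF(4)
        s.append(acc)
    return s

# Toy 2x4 parity-check matrix over GF(4) and a vector that satisfies it.
H = [[1, 2, 1, 0],
     [0, 1, 3, 1]]
c = [3, 1, 1, 2]
print(syndrome(H, c))                        # [0, 0] -> valid codeword, decoding stops
```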
229

Applications of Physical Unclonable Functions on ASICs and FPGAs

Usmani, Mohammad 04 April 2018 (has links)
With the ever-increasing demand for security in embedded systems and wireless sensor networks, we require integrating security primitives for authentication in these devices. One such primitive is known as a Physically Unclonable Function (PUF). This entity can be used to provide security at a low cost, as the key or digital signature can be generated by dedicating a small part of the silicon die to these primitives, which produce a fingerprint unique to each device. This fingerprint produced by a PUF is called its response. The response of a PUF depends upon the process variation that occurs during manufacturing. In embedded systems, and especially in wireless sensor networks, there is a need to secure the data collected from the sensors. To tackle this problem, we propose the use of SRAM-based PUFs to detect the temperature of the system. This is done by using the PUF response to generate temperature-based keys, which act as proofs of the temperature of the system. In SRAM PUFs, it is experimentally determined that at varying temperatures there is a shift in the response of the cells from zero to one and vice versa. This variation can be exploited to generate random but repeatable keys at different temperatures. To evaluate our approach, we first analyze the key metrics of a PUF, namely reliability and uniqueness. In order to test the idea of using the PUF as a temperature-based key generator, we collect data from a total of ten SRAM chips at fixed temperature steps. We first calculate the reliability, which is related to the bit error rate, an important parameter with respect to error correction, at various temperatures to verify the stability of the responses. We then identify the temperature of the system using a temperature sensor and encode the key offset by the PUF response at that temperature using BCH codes. This key-temperature pair can then be used to establish secure communication between the nodes. Thus, this scheme helps in establishing secure keys, as the generation has an extra variable to produce confusion. We also developed a novel PUF for Xilinx FPGAs and evaluated its quality metrics; it is very compact and has high uniqueness and reliability. We implement two different PUF configurations to allow per-device selection of the best PUFs, reducing the area and power required for key generation, and we evaluate the temperature response of this PUF, showing improvement in the response through per-device selection.
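The two quality metrics named above, reliability and uniqueness, are both fractional Hamming distance computations, sketched below. The bit strings and the number of chips are toy values; in the evaluation described here they would come from SRAM power-up dumps collected at each temperature step.

```python
def frac_hd(a, b):
    """Fractional Hamming distance between two equal-length bit strings."""
    return sum(x != y for x, y in zip(a, b)) / len(a)

def reliability(reference, remeasurements):
    """100% minus the average intra-chip HD with respect to a reference response."""
    avg_hd = sum(frac_hd(reference, r) for r in remeasurements) / len(remeasurements)
    return 100.0 * (1.0 - avg_hd)

def uniqueness(responses):
    """Average pairwise inter-chip HD (the ideal value is 50%)."""
    pairs, total = 0, 0.0
    for i in range(len(responses)):
        for j in range(i + 1, len(responses)):
            total += frac_hd(responses[i], responses[j])
            pairs += 1
    return 100.0 * total / pairs

chip_a_ref = "1011001110001011"
chip_a_hot = "1011001010001011"                  # one bit flipped at high temperature
chip_b_ref = "0100110010001011"
print(reliability(chip_a_ref, [chip_a_hot]))     # 93.75 for this toy data
print(uniqueness([chip_a_ref, chip_b_ref]))      # 50.0; ideally near 50% across many chips
```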
230

A Study on Controlling Power Supply Ramp-Up Time in SRAM PUFs

Ramanna, Harshavardhan 29 October 2019 (has links)
With growing connectivity in the modern era, the risk of encrypted data stored in hardware being exposed to third-party adversaries is higher than ever. The security of encrypted data depends on the secrecy of the stored key. Conventional methods of storing keys in Non-Volatile Memory have been shown to be susceptible to physical attacks. Physically Unclonable Functions provide a unique alternative to conventional key storage. SRAM PUFs utilize inherent process variation caused during manufacturing to derive secret keys from the power-up values of SRAM memory cells. This thesis analyzes the effect of supply ramp-up times on the reliability of SRAM PUFs. We use SPICE simulations as the platform to observe the effect of supply ramp times at the circuit level using carefully controlled supply voltages during power-up. We also measure the effect of supply ramp times on commercially available SRAM ICs by performing reliability and uniqueness measurements on two commercial SRAM models. Finally, a hardware implementation is proposed in a commercial 16nm FinFET technology to establish the design flow for taping out a custom SRAM IC with separated peripheral and core power supplies that would allow for experimental evaluation of sequenced power supplies on the SRAM PUF.
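As one way to connect the reliability measurements described above to key generation, the sketch below enrolls a golden SRAM-PUF response by bitwise majority voting over repeated power-ups and then reports the bit error rate of a later dump, for example one taken at a different supply ramp time. The trial count, the vote rule, and the toy bit patterns are assumptions for illustration, not the measurement procedure used in the thesis.

```python
def enroll(power_up_dumps):
    """Bitwise majority vote across repeated power-up dumps (lists of 0/1 bits)."""
    n_trials = len(power_up_dumps)
    n_bits = len(power_up_dumps[0])
    golden = []
    for bit in range(n_bits):
        ones = sum(dump[bit] for dump in power_up_dumps)
        golden.append(1 if ones * 2 >= n_trials else 0)
    return golden

def bit_error_rate(golden, dump):
    """Fraction of bits in a later dump that disagree with the enrolled response."""
    return sum(g != d for g, d in zip(golden, dump)) / len(golden)

# Usage: three enrollment dumps at a nominal ramp time, one test dump at a
# faster ramp that flips two bits.
dumps = [[1, 0, 1, 1, 0, 0, 1, 0],
         [1, 0, 1, 1, 0, 1, 1, 0],
         [1, 0, 1, 1, 0, 0, 1, 0]]
golden = enroll(dumps)                      # -> [1, 0, 1, 1, 0, 0, 1, 0]
fast_ramp = [1, 1, 1, 1, 0, 0, 0, 0]
print(bit_error_rate(golden, fast_ramp))    # 0.25 with these toy values
```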
