Spelling suggestions: "subject:"blocking"" "subject:"locking""
21 |
Low-Power Clocking and Circuit Techniques for Leakage and Process Variation CompensationHansson, Martin January 2008 (has links)
Over the last four decades the integrated circuit industry has evolved in a tremendous pace. This success has been driven by the scaling of device sizes leading to higher and higher integration capability, which have enabled more functionality and higher performance. The impressive evolution of modern high-performance microprocessors have resulted in chips with over a billion transistors as well as multi-GHz clock frequencies. As the silicon integrated circuit industry moves further into the nanometer regime, scaling of device sizes is still predicted to continue at least into the near future. However, there are a number of challenges to overcome to be able to continue the increase of integration at the same pace. Three of the major challenges are increasing power dissipation due to clocking of synchronous circuit, increasing leakage currents causing growing static power dissipation and reduced circuit robustness, and finally increasing spread in circuit parameters due to physical limitations in the manufacturing process. This thesis presents a number of circuit techniques that aims to help in all three of the mentioned challenges.Power dissipation related to the clock generation and distribution is identified as the dominating contributor of the total active power dissipation for multi-GHz systems. As the complexity and size of synchronous systems continues to increase, clock power will also increase. This makes novel power reduction techniques absolutely crucial in future VLSI design. In this thesis an energy recovering clocking technique aimed at reducing the total chip clock power is presented. Based on theoretical analysis the technique is shown to enable considerable clock power savings. Moreover, the impact of the proposed technique on conventional flip-flop topologies is studied. Measurements on an experimental chip design proves the technique, and shows more than 56% lower clock power compared to conventional clock distribution techniques at clock frequencies up to 1.76 GHz.Static leakage power dissipation is a considerable contributor to the total power dissipation. This power is dissipated even for circuits that are idle and not contributing to the operation. Hence, with increasing number of transistors on each chip, circuit techniques which reduce the static leakage currents are necessary. In this thesis a technique is discussed which reduces the static leakage current in a microcode ROM resulting in 30% reduction of the leakage power with no area or performance penalty.Apart from increasing static power dissipation the increasing leakage currents also impact the robustness constraints of the circuits. This is important for regenerative circuits like flip-flops and latches where a changed state due to leakage will lead to loss of functionality. This is a serious issue especially for high-performance dynamic circuits, which are attractive in order to limit the clock load in the design. However, with the increasing leakage the robustness of dynamic circuits reduces dramatically. To improve the leakage robustness for sub-90 nm low clock load dynamic flip-flops, a novel keeper technique is proposed. The proposed keeper utilizes a scalable and simple leakage compensation technique, which is implemented on a reconfigurable flip-flop. At normal clock frequencies the flip-flop is configured in dynamic mode, and reduces the clock power by 25% due to the lower clock load. During any low-frequency operation, the flip-flop is configured as a static flip-flop retaining full functional robustness.As scaling continues further towards the fundamental atomistic limits, several challenges arise for continuing industrial device integration. Large inaccuracies in lithography process, impurities in manufacturing, and reduced control of dopant levels during implantation all cause increasing statistical spread of performance, power, and robustness of the devices. In order to compensate the impact of the increasingly large process variations on latches and flip-flops, a reconfigurable keeper technique is presented in this thesis. In contrast to the traditional design for worst-case process corners, a variable keeper circuit is utilized. The proposed reconfigurable keeper preserves the robustness of storage nodes across the process corners without degrading the overall chip performance.
|
22 |
Low-Power Multi-GHz Circuit Techniques for On-chip ClockingHansson, Martin January 2006 (has links)
The impressive evolution of modern high-performance microprocessors have resulted in chips with over one billion transistors as well as multi-GHz clock frequencies. As the silicon integrated circuit industry moves further into the nanometer regime, three of the main challenges to overcome in order for continuing CMOS technology scaling are; growing standby power dissipation, increasing variations in process parameters, and increasing power dissipation due to growing clock load and circuit complexity. This thesis addresses all three of these future scaling challenges with the overall focus on reducing the total clock-power for low-power, multi-GHz VLSI circuits. Power-dissipation related to the clock generation and distribution is identified as the dominating contributor of the total active power dissipation. This makes novel power reduction techniques crucial in future VLSI design. This thesis describes a new energy-recovering clocking technique aimed at reducing the total chip clock-power. The proposed technique consumes 2.3x lower clock-power compared to conventional clocking at a clock frequency of 1.56 GHz. Apart from increasing power dissipation due to leakage also the robustness constraints for circuits are impacted by the increasing leakage. To improve the leakage robustness for sub-90 nm low clock load dynamic flip-flops a novel keeper technique is proposed. The proposed keeper utilizes a scalable and simple leakage compensation technique. During any low frequency operation, the flip-flop is configured as a static flip-flop with increased functional robustness. In order to compensate the impact of the increasingly large process variations on latches and flip-flops, a reconfigurable keeper technique is presented in this thesis. In contrast to the traditional design for worst-case process corners, a variable keeper circuit is utilized. The proposed reconfigurable keeper preserves the robustness of storage nodes across the process corners without degrading the overall chip performance. / Report code: LiU-TEK-LIC-2006:21.
|
23 |
GALS design methodology based on pausible clockingFan, Xin 22 April 2014 (has links)
Globally Asynchronous Locally Synchronous (GALS) Design ist eine Lösung zur Skalierbarkeit und Modularität für die SoC-Integration. Heutzutage ist GALS-Design weit in der Industrie angewendet. Die meisten GALS-Systeme basieren auf Dual-Clock-FIFOs für die Kommunikation Zwischen Taktdomänen. Um Leistungsverluste aufgrund der Synchronisationslatenzzeit zu vermindern, müssen die On-Chip-FIFOs ausreichend groß sein. Dies führt jedoch oft zu erheblichen Kosten-Hardware. Effiziente GALS- Lösungen sind daher vonnöten. Diese Arbeit berichtet unsere neuesten Fortschritte in GALS Design, das auf der Pausierenden Taktung basiert. Kritische Designthemen in Bezug auf Synchronisation-szuverlässigkeit bzw. Kommunikationsfähigkeit sind systematisch und analytisch un-tersucht. Ein lose gekoppeltes GALS Data-Link-Design wird vorgeschlagen. Es unter-stützt metastabilitätsfreie Synchronisation für Sub-Takt-Baum Verzögerungen. Außer-dem unterstützt es kontinuierliche Datenübertragung für High-Throughput-Kommuni-kation. Die Rosten hinsichtlich Energie verbrauch und Chipfläche sind marginal. GALS Design ist eingesetzt, um digitales On-Chip Umschaltrauschen zu verringern. Plesiochron Taktung mit balanciertem Leistungsverbrauch zwischen GALS Blöcken wird insbesondere untersucht. Für M Taktbereiche wird eine Reduzierung um 20lgM dB für die spektralen Spitzen des Versorgungsstroms bei der Takt-Grundfrequenz theoretisch hergcleitet. Im Vergleich zu den bestehenden synchronen Lösungen, geben diese Methode eine Alternative, um das digitale schaltrauschen effektiv zu senken. Schließlich wurde die entwickelte GALS Design Methodik schon bei reale Chip-Implementierungen angewendet. Zwei komplizierte industriell relevante Test-Chips, Lighthouse und Moonrake, wurden entworfen und mit State-Of-The-Art-Technologien hergestellt. Die experimentellen Ergebnisse bzw. / Globally asynchronous locally synchronous (GALS) design presents a solution of scalability and modularity to SoC integration. Today, it has been widely applied in the industry. Most of the GALS systems are based on dual-clock FIFOs for clock domain crossing. To avoid performance loss due to synchronization latency, the on-chip FIFOs need to be sufficiently large. This, however, often leads to considerable hardware costs. Efficient design solutions of GALS are therefore in great demand. This thesis reports our latest progress in GALS design bases on pausible clocking. Critical design issues on synchronization reliability and communication performance are studied systematically and analytically. A loosely-coupled GALS data-link design is proposed. It supports metastability-free synchronization for sub-cycle clock-tree delay, and accommodates continuous data transfer for high-throughput communication. Only marginal costs of power and silicon area are required. GALS design has been employed to cope with the on-chip digital switching noise in our work. Plesiochronous clocking with power-consumption balance between GALS blocks is in particular explored. Given M clock domains, a reduction of 20lgM dB on the spectral peaks of supply current at the fundamental clock frequency is theoretically derived. In comparison with the existing synchronous design solutions, it thus presents an alternative to effective attenuation of digital switching noise. The developed GALS design methodology has been applied to chip implementation. Two complicated industry-relevant test chips, named Lighthouse and Moonrake, were designed and fabricated using state-of-the-art technologies. The experimental results as well as the on-chip measurements are reported here in detail. We expect that, our work will contribute to the practical applications of GALS design based on pausible clocking in the industry.
|
24 |
Energy and Transient Power Minimization During Behavioral SynthesisMohanty, Saraju P 17 October 2003 (has links)
The proliferation of portable systems and mobile computing platforms has increased the need for the design of low power consuming integrated circuits. The increase in chip density and clock frequencies due to technology advances has made low power design a critical issue. Low power design is further driven by several other factors such as thermal considerations and environmental concerns. In low-power design for battery driven portable applications, the reduction of peak power, peak power differential, average power and energy are equally important. In this dissertation, we propose a framework for the reduction of these parameters through datapath scheduling at behavioral level. Several ILP based and heuristic based scheduling schemes are developed for datapath synthesis assuming : (i) single supply voltage and single frequency (SVSF), (ii) multiple supply voltages and dynamic frequency clocking (MVDFC), and (iii) multiple supply voltages and multicycling (MVMC). The scheduling schemes attempt to minimize : (i) energy, (ii) energy delay product, (iii) peak power, (iv) simultaneous peak power and average power, (v) simultaneous peak power, average power, peak power differential and energy, and (vi) power fluctuation.
A new parameter called "Cycle Power Function" (CPF) is defined which captures the transient power characteristics as the equally weighted sum of normalized mean cycle power and normalized mean cycle differential power. Minimizing this parameter using multiple supply voltages and dynamic frequency clocking results in the reduction of both energy and transient power. The cycle differential power can be modeled as either the absolute deviation from the average power or as the cycle-to-cycle power gradient. The switching activity information is obtained from behavioral simulations. Power fluctuation is modeled as the cycle-to-cycle power gradient and to reduce fluctuation the mean power gradient (MPG) is minimized. The power models take into consideration the effect of switching activity on the power consumption of the functional units.
Experimental results for selected high-level synthesis benchmark circuits under different constraints indicate that significant reductions in power, energy and energy delay product can be obtained and that the MVDFC and MVMC schemes yield better power reduction compared to the SVSF scheme. Several application specific VLSI circuits were designed and implemented for digital watermarking of images. Digital watermarking is the process that embeds data called a watermark into a multimedia object such that the watermark can be detected or extracted later to make an assertion about the object. A class of VLSI architectures were proposed for various watermarking algorithms : (i) spatial domain invisible-robust watermarking scheme, (ii) spatial domain invisible-fragile watermarking scheme, (iii) spatial domain visible watermarking scheme, (iv) DCT domain invisible-robust watermarking scheme, and (v) DCT domain visible watermarking scheme. Prototype implementation of (i), (ii) and (iii) are given. The hardware modules can be incorporated in a "JPEG encoder" or in a "digital still camera".
|
25 |
Energy and transient power minimization during behavioral synthesis [electronic resource] / by Saraju P Mohanty.Mohanty, Saraju P. January 2003 (has links)
Includes vita. / Title from PDF of title page. / Document formatted into pages; contains 289 pages. / Thesis (Ph.D.)--University of South Florida, 2003. / Includes bibliographical references. / Text (Electronic thesis) in PDF format. / ABSTRACT: The proliferation of portable systems and mobile computing platforms has increased the need for the design of low power consuming integrated circuits. The increase in chip density and clock frequencies due to technology advances has made low power design a critical issue. Low power design is further driven by several other factors such as thermal considerations and environmental concerns. In low-power design for battery driven portable applications, the reduction of peak power, peak power differential, average power and energy are equally important. In this dissertation, we propose a framework for the reduction of these parameters through datapath scheduling at behavioral level. Several ILP based and heuristic based scheduling schemes are developed for datapath synthesis assuming : (i) single supply voltage and single frequency (SVSF), (ii) multiple supply voltages and dynamic frequency clocking (MVDFC), and (iii) multiple supply voltages and multicycling (MVMC). / ABSTRACT: The scheduling schemes attempt to minimize : (i) energy, (ii) energy delay product, (iii) peak power, (iv) simultaneous peak power and average power, (v) simultaneous peak power, average power, peak power differential and energy, and (vi) power fluctuation. A new parameter called "Cycle Power Function" CPF) is defined which captures the transient power characteristics as the equally weighted sum of normalized mean cycle power and normalized mean cycle differential power. Minimizing this parameter using multiple supply voltages and dynamic frequency clocking results in the reduction of both energy and transient power. The cycle differential power can be modeled as either the absolute deviation from the average power or as the cycle-to-cycle power gradient. The switching activity information is obtained from behavioral simulations. Power fluctuation is modeled as the cycle-to-cycle power gradient and to reduce fluctuation the mean power gradient MPG is minimized. / ABSTRACT: The power models take into consideration the effect of switching activity on the power consumption of the functional units. Experimental results for selected high-level synthesis benchmark circuits under different constraints indicate that significant reductions in power, energy and energy delay product can be obtained and that the MVDFC and MVMC schemes yield better power reduction compared to the SVSF scheme. Several application specific VLSI circuits were designed and implemented for digital watermarking of images. Digital watermarking is the process that embeds data called a watermark into a multimedia object such that the watermark can be detected or extracted later to make an assertion about the object. / ABSTRACT: A class of VLSI architectures were proposed for various watermarking algorithms : (i) spatial domain invisible-robust watermarking scheme, (ii) spatial domain invisible-fragile watermarking scheme, (iii) spatial domain visible watermarking scheme, (iv) DCT domain invisible-robust watermarking scheme, and (v) DCT domain visible watermarking scheme. Prototype implementation of (i), (ii) and (iii) are given. The hardware modules can be incorporated in a "JPEG encoder" or in a "digital still camera". / System requirements: World Wide Web browser and PDF reader. / Mode of access: World Wide Web.
|
26 |
An On-Chip Memory for Testing of High-Speed Mixed-Signal CircuitsOmar, Omar Jaber January 2013 (has links)
Mixed-signal processing systems especially data converters can be reliably tested at high frequencies using on-chip testing schemes based on memory. In this thesis, an on-chip testing strategy based on shift registers/memory (2 k bits) has been proposed for digital-to-analog converters (DACs) operating at 5 GHz. The proposed design uses word length of 8 bits in order to test DAC at high speed of 5 GHz. The proposed testing strategy has been designed in standard 65 nm CMOS technology with additional requirement of 1-V supply. This design has been implemented using Cadence IC design environment. The additional advantage of the proposed testing strategy is that it requires lower number of I/O pins and avoids the large number of high speed I/O pads. It therefore also solves the problem of the bandwidth limitation that is associated with I/O transmission paths. The design of the on-chip tester based on memory contains no analog block and is implemented entirely in digital domain. In the proposed design, low frequency of 1 MHz has been used outside the chip to load the data into the memory during the write mode. During the read mode, the frequency of 625 MHz is used to read the data from the memory. A multiplexing system is used to reuse the stored data during read mode to test the intended functionality and performance. In order to convert the parallel data into serial data at high frequency at the memory output, serializer has been used. By using the frequencies of 1.25 GHz and 2.5 GHz, the serializer speeds up the data from the lower frequency of 625 MHz to the highest frequency of 5 GHz in order to test DAC at 5 GHz.
|
27 |
Proposta de sistema para registro eletrônico de ponto com gerenciamento remotoAna Paula Müller Giancoli 31 January 2011 (has links)
O presente trabalho propõe uma arquitetura de sistema para registro eletrônico de ponto baseado em software livre e capaz de atender os principais requisitos extraídos da portaria 1510 do Ministério do Trabalho. Essa arquitetura utiliza o sistema operacional Linux, a linguagem de programação Python, o framework Web Plone e o servidor de aplicações Zope, a fim de proporcionar, entre outros benefícios, a segurança, o acesso ao código fonte da aplicação e a independência de fornecedores. A validação é obtida por meio de testes práticos realizados em protótipo que adota os elementos do sistema. Os resultados satisfatórios obtidos nesses testes indicam que a aludida arquitetura é adequada para a aplicação em questão. / This study aims at proposing a system architecture for electronic clocking in and out based on free software to fulfill the main requirements extracted from the 1510 regulation from the Ministry of Labor. This architecture uses the Linux operating system; the Python programming language; the Web Plone framework and the Zope application server to provide, among other benefits, the security, the access to application source code, and the independence from suppliers. The validation is obtained though practical tests on prototypes that adopt the elements of the system. The satisfactory results obtained in these tests indicate that the aforementioned architecture is suitable for the application in question.
|
28 |
APPLICATIONS OF 4-STATE NANOMAGNETIC LOGIC USING MULTIFERROIC NANOMAGNETS POSSESSING BIAXIAL MAGNETOCRYSTALLINE ANISOTROPY AND EXPERIMENTS ON 2-STATE MULTIFERROIC NANOMAGNETIC LOGICD'Souza, Noel 01 January 2014 (has links)
Nanomagnetic logic, incorporating logic bits in the magnetization orientations of single-domain nanomagnets, has garnered attention as an alternative to transistor-based logic due to its non-volatility and unprecedented energy-efficiency. The energy efficiency of this scheme is determined by the method used to flip the magnetization orientations of the nanomagnets in response to one or more inputs and produce the desired output. Unfortunately, the large dissipative losses that occur when nanomagnets are switched with a magnetic field or spin-transfer-torque inhibit the promised energy-efficiency. Another technique offering superior energy efficiency, “straintronics”, involves the application of a voltage to a piezoelectric layer to generate a strain which is transferred to an elastically coupled magnetrostrictive layer, causing magnetization rotation. The functionality of this scheme can be enhanced further by introducing magnetocrystalline anisotropy in the magnetostrictive layer, thereby generating four stable magnetization states (instead of the two stable directions produced by shape anisotropy in ellipsoidal nanomagnets). Numerical simulations were performed to implement a low-power universal logic gate (NOR) using such 4-state magnetostrictive/piezoelectric nanomagnets (Ni/PZT) by clocking the piezoelectric layer with a small electrostatic potential (~0.2 V) to switch the magnetization of the magnetic layer. Unidirectional and reliable logic propagation in this system was also demonstrated theoretically. Besides doubling the logic density (4-state versus 2-state) for logic applications, these four-state nanomagnets can be exploited for higher order applications such as image reconstruction and recognition in the presence of noise, associative memory and neuromorphic computing. Experimental work in strain-based switching has been limited to magnets that are multi-domain or magnets where strain moves domain walls. In this work, we also demonstrate strain-based switching in 2-state single-domain ellipsoidal magnetostrictive nanomagnets of lateral dimensions ~200 nm fabricated on a piezoelectric substrate (PMN-PT) and studied using Magnetic Force Microscopy (MFM). A nanomagnetic Boolean NOT gate and unidirectional bit information propagation through a finite chain of dipole-coupled nanomagnets are also shown through strain-based "clocking". This is the first experimental demonstration of strain-based switching in nanomagnets and clocking of nanomagnetic logic (Boolean NOT gate), as well as logic propagation in an array of nanomagnets.
|
29 |
Etude de la synchronisation et de la stabilité d’un réseau d’oscillateurs non linéaires. Application à la conception d’un système d’horlogerie distribuée pour un System-on-Chip (projet HODISS). / Study of the synchronization and the stability of a network of non-linear oscillators. Application to the design of a clock distribution system for a System-on-Chip (HODISS Project).Akre, Niamba Jean-Michel 11 January 2013 (has links)
Le projet HODISS dans le cadre duquel s'effectue nos travaux adresse la problématique de la synchronisation globale des systèmes complexes sur puce (System-on-Chip ou SOCs, par exemple un multiprocesseur monolithique). Les approches classiques de distribution d'horloges étant devenues de plus en plus obsolètes à cause de l'augmentation de la fréquence d'horloge, l'accroissement des temps de propagation, l'accroissement de la complexité des circuits et les incertitudes de fabrication, les concepteurs s’intéressent (pour contourner ces difficultés) à d'autres techniques basées entre autres sur les oscillateurs distribués. La difficulté majeure de cette dernière approche réside dans la capacité d’assurer le synchronisme global du système. Nous proposons un système d'horlogerie distribuée basé sur un réseau d’oscillateurs couplés en phase. Pour synchroniser ces oscillateurs, chacun d'eux est en fait une boucle à verrouillage de phase qui permet ainsi d'assurer un couplage en phase avec les oscillateurs des zones voisines. Nous analysons la stabilité de l'état synchrone dans des réseaux cartésiens identiques de boucles à verrouillage de phase entièrement numériques (ADPLLs). Sous certaines conditions, on montre que l'ensemble du réseau peut synchroniser à la fois en phase et en fréquence. Un aspect majeur de cette étude réside dans le fait que, en l'absence d'une horloge de référence absolue, le filtre de boucle dans chaque ADPLL est piloté par les fronts montants irréguliers de l'oscillateur local et, par conséquent, n'est pas régi par les mêmes équations d'état selon que l'horloge locale est avancée ou retardée par rapport au signal considéré comme référence. Sous des hypothèses simples, ces réseaux d'ADPLLs dits "auto-échantillonnés" peuvent être décrits comme des systèmes linéaires par morceaux dont la stabilité est notoirement difficile à établir. L'une des principales contributions que nous présentons est la définition de règles de conception simples qui doivent être satisfaites sur les coefficients de chaque filtre de boucle afin d'obtenir une synchronisation dans un réseau cartésien de taille quelconque. Les simulations transitoires indiquent que cette condition nécessaire de synchronisation peut également être suffisante pour une classe particulière d'ADPLLs "auto-échantillonnés". / The HODISS project, context in which this work is achieved, addresses the problem of global synchronization of complex systems-on-chip (SOCs, such as a monolithic multiprocessor). Since the traditional approaches of clock distribution are less used due to the increase of the clock frequency, increased delay, increased circuit complexity and uncertainties of manufacture, designers are interested (to circumvent these difficulties) to other techniques based among others on distributed synchronous clocks. The main difficulty of this latter approach is the ability to ensure the overall system synchronization. We propose a clock distribution system based on a network of phase-coupled oscillators. To synchronize these oscillators, each is in fact a phase-locked loop which allows to ensure a phase coupling with the nearest neighboring oscillators. We analyze the stability of the synchronized state in Cartesian networks of identical all-digital phase-locked loops (ADPLLs). Under certain conditions, we show that the entire network may synchronize both in phase and frequency. A key aspect of this study lies in the fact that, in the absence of an absolute reference clock, the loop-filter in each ADPLL is operated on the irregular rising edges of the local oscillator and consequently, does not use the same operands depending on whether the local clock is leading or lagging with respect to the signal considered as reference. Under simple assumptions, these networks of so-called “self-sampled” all-digital phase-locked-loops (SS-ADPLLs) can be described as piecewise-linear systems, the stability of which is notoriously difficult to establish. One of the main contributions presented here is the definition of simple design rules that must be satisfied by the coefficients of each loop-filter in order to achieve synchronization in a Cartesian network of arbitrary size. Transient simulations indicate that this necessary synchronization condition may also be sufficient for a specific class of SS-ADPLLs.
|
Page generated in 0.0442 seconds