Global ETD Search

1	Design, Implementation and Application of a Digital Signal Processor Li, Tsung-Ken 25 July 2005 This thesis discusses the implementation of a digital signal processor (DSP), including the DSP core and the peripheral interfaces. The DSP core includes three parallel computational units (arithmetic/logic unit, multiplier/accumulator, and barrel shifter), two independent data address generators, and a powerful program sequencer. The I/O design provides two kinds of interfaces: serial ports and direct memory access (DMA) ports. The DMA supports two modes: full memory mode and host mode. To reduce the power consumed by instruction memory accesses, we add an instruction buffer for nested loops: the instructions in a loop are fetched only once and then placed in the instruction buffer for use in subsequent iterations. The DSP implementation has passed verification in both front-end synthesis with Synopsys Design Compiler and back-end post-layout simulation with Nanosim. Furthermore, benchmark DSP application programs such as FFT, FIR, and DCT have been executed on the implemented DSP core. Digital Signal Processor Low Power Design
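The loop instruction buffer described in the abstract above can be illustrated with a toy fetch counter. This is only a sketch under stated assumptions: a single (non-nested) loop, a buffer that fills in fetch order, and hypothetical names (`memory_fetches`, `loop_start`, `loop_end`, `buffer_size`) that do not come from the thesis.

```python
def memory_fetches(trace, loop_start, loop_end, buffer_size):
    """Count instruction-memory reads for a program-counter trace.

    Instructions whose address lies in [loop_start, loop_end] are copied
    into a small buffer the first time they are fetched; later iterations
    read them from the buffer instead of instruction memory.
    All parameter names are illustrative assumptions.
    """
    buffered = set()
    fetches = 0
    for pc in trace:
        in_loop = loop_start <= pc <= loop_end
        if in_loop and pc in buffered:
            continue  # served from the instruction buffer, no memory access
        fetches += 1  # read from instruction memory
        if in_loop and len(buffered) < buffer_size:
            buffered.add(pc)
    return fetches


# Two setup instructions, a 3-instruction loop run 10 times, one exit instruction.
trace = [0, 1] + [2, 3, 4] * 10 + [5]
with_buffer = memory_fetches(trace, loop_start=2, loop_end=4, buffer_size=8)
without_buffer = memory_fetches(trace, loop_start=2, loop_end=4, buffer_size=0)
```

In this toy trace the buffer removes every repeated loop fetch; how much power that saves in the actual design depends on details the abstract does not give.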
2	Design and analysis of an integrated low-power ultra-wideband receiver Lu, Ivan Siu-Chuang, Computer Science & Engineering, Faculty of Engineering, UNSW January 2006 This thesis documents the design and analysis of a low-power integrated ultra-wideband (UWB) receiver that is well suited for use in medium- to low-rate, location-aware communication systems. For the first time, this receiver design explores and exploits the unique properties of UWB pulse technology. By exploiting the low emission power limit and pulse-based communication, RF circuits have been designed with reduced linearity to achieve low-power operation and better circuit performance. The receiver design in this thesis follows a top-down approach, which begins by focusing on UWB-specific issues such as signal characteristics, modulation schemes, potential advantages, and design challenges. Next, different receiver architectures are evaluated in terms of their circuit complexity, power consumption, and levels of integration. The impact of various analog non-idealities on the performance of UWB systems is also analysed in detail. After evaluating the performance of UWB systems operating with non-linear front-ends, the use of pulse doublets is introduced, for the first time, to mitigate nonlinearity-induced distortion. Simulation results demonstrate that under non-linear operating conditions, significant BER improvements can be achieved by using filtering, pulse doublet, and direct sequence spread spectrum techniques. When ADC quantization effects are included in the receiver, analysis shows that quantization noise dominates distortion-induced BER degradation when two- or three-bit ADCs are employed. Consequently, reduced front-end linearity requirements can be tolerated in exchange for improvements in the more critical circuit parameters of the UWB receiver.
By adopting the sub-linear circuit design approach, a direct-conversion receiver prototype is implemented in a 0.5 µm SOS CMOS technology according to specifications determined from system-level Simulink simulations. This highly integrated receiver prototype contains a low-noise amplifier, a 4-GHz frequency synthesizer, mixers, baseband amplifiers and filters, and 2-GSps two-bit analog-to-digital converters. The receiver prototype consumes 75 mW of power, the lowest amount for reported UWB receivers operating in the 3.1 to 10.6 GHz band. Complete end-to-end simulations of the system are performed in Simulink, revealing an achievable BER of approximately 8x10e-4. Finally, a novel 79-µW 5.6-GHz CMOS frequency divider with on-chip temperature and process compensation has been designed. The divider, designed in a 0.25 µm SOS-CMOS technology, occupies 35 × 25 µm² and achieves an operating frequency of 5.6 GHz while consuming 79 µW at a supply voltage of 0.8 V. The power efficiency of 143 GHz/mW is one of the highest achieved among conventional CMOS dividers. When combined with a simple and effective compensation submodule, the proposed divider is shown to achieve process- and temperature-insensitive operation in a 5-GHz UNII band frequency synthesizer. Ultra wideband linearity tradeoff system modelling low power design
3	Fused floating-point arithmetic for application specific processors Min, Jae Hong 25 February 2014 Floating-point arithmetic units are used in modern computers for 2D/3D graphics and scientific applications because they offer a wider dynamic range than a fixed-point number system with the same word length. However, a floating-point arithmetic unit has larger area, power consumption, and latency than a fixed-point arithmetic unit. This has become a major issue in modern low-power processors because of their limited power and performance margins. Therefore, fused architectures have been developed to improve floating-point operations. This dissertation introduces new improved fused architectures for add-subtract, sum-of-squares, and magnitude operations for graphics, scientific, and signal processing applications. A low-power dual-path fused floating-point add-subtract unit is introduced and compared with previous fused add-subtract units such as the single-path and the high-speed dual-path fused add-subtract units. The high-speed dual-path fused add-subtract unit has lower latency than the single-path unit at the cost of high power consumption. To reduce the power consumption, an alternative dual-path architecture is applied to the fused add-subtract unit. The significand addition, subtraction, and rounding are performed after the far/close path. The power consumption of the proposed design is lower than that of the high-speed dual-path fused add-subtract unit at a cost in latency; however, the proposed fused unit is faster than the single-path fused unit. High-performance and low-power floating-point fused architectures for a two-term sum-of-squares computation are introduced and compared with discrete units. The fused architectures include pre/post-alignment, partial carry-sum width, and enhanced rounding.
The fused floating-point sum-of-squares units with post-alignment, a 26-bit partial carry-sum width, and an enhanced rounding system have lower power consumption, area, and latency than discrete parallel dot-product and sum-of-squares units. Hardware tradeoffs between the fused designs are presented in terms of power consumption, area, and latency. For example, the enhanced rounding processing reduces latency at a moderate cost in power consumption and area. A new type of fused architecture for magnitude computation with lower power consumption, area, and latency than conventional discrete floating-point units is proposed. Compared with the discrete parallel magnitude unit realized with conventional floating-point squarers, an adder, and a square-root unit, the fused floating-point magnitude unit has less area, latency, and power consumption. The new design includes new designs for enhanced exponent, compound add/round, and normalization units. In addition, a pipelined structure for the fused magnitude unit is shown. Floating point arithmetic Digital arithmetic Low power design
4	Energy and Reliability in Future NOC Interconnected CMPS Kim, Hyungjun 16 December 2013 In this dissertation, I explore energy and reliability in future NoC (Network-on-Chip) interconnected CMPs (chip multiprocessors), as they have become first-order constraints in future CMP design. In the first part, we target the root cause of network energy consumption through techniques that reduce link and router-level switching activity. We specifically focus on memory subsystem traffic, as it comprises the bulk of NoC load in a CMP. By transmitting only the flits that contain words predicted to be useful by a novel spatial locality predictor, our scheme seeks to reduce network activity. We aim to further lower NoC energy consumption through microarchitectural mechanisms that inhibit datapath switching activity caused by unused words in individual flits. Using simulation-based performance studies and detailed energy models based on synthesized router designs and different link wire types, we show that (a) the prediction mechanism achieves very high accuracy, with an average rate of false-unused prediction of just 2.5%; (b) the combined NoC energy savings enabled by the predictor and microarchitectural support are 36% on average and up to 57% in the best case; and (c) there is no system performance penalty as a result of this technique. In the second part, we present a method for dynamic voltage/frequency scaling of networks-on-chip and last level caches in CMP designs, where the shared resources form a single voltage/frequency domain. We develop a new technique for monitoring and control and validate it by running PARSEC benchmarks through full system simulations. These techniques reduce energy-delay product by 46% compared to state-of-the-art prior work.
In the third part, we develop critical path models for HCI- and NBTI-induced wear assuming stress caused under realistic workload conditions, and apply them to the interconnect microarchitecture. A key finding from this modeling is that, counter to prevailing wisdom, wearout in the CMP on-chip interconnect is correlated with a lack of load observed in the NoC routers, rather than high load. We then develop a novel wearout-decelerating scheme in which routers under low load have their wearout-sensitive components exercised without significantly impacting the router’s cycle time, pipeline depth, and area or power consumption. We subsequently show that the proposed design yields a 13.8∼65× increase in CMP lifetime. Chip-Multiprocessor Network-on-Chip Low Power Design Reliability
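The flit-filtering idea in the first part of the abstract above (send only flits that carry words predicted to be useful) can be sketched as follows. The cache-line and flit sizes, the bit-mask encoding, and the function name `flits_to_send` are assumptions for illustration; the abstract does not specify them, and the predictor itself is not modelled.

```python
def flits_to_send(used_words_mask, words_per_line=16, words_per_flit=4):
    """Return indices of flits holding at least one predicted-useful word.

    used_words_mask: bit i is 1 if word i of the cache line is predicted
    to be used (hypothetical encoding; the predictor is not modelled).
    """
    flit_bits = (1 << words_per_flit) - 1
    selected = []
    for flit in range(words_per_line // words_per_flit):
        # Extract the prediction bits for the words carried by this flit.
        flit_mask = (used_words_mask >> (flit * words_per_flit)) & flit_bits
        if flit_mask:
            selected.append(flit)
    return selected


# Words 0, 4 and 5 predicted useful: only flits 0 and 1 need to be sent.
example = flits_to_send(0b0000_0000_0011_0001)
```

A false-unused prediction in this model would correspond to a needed word whose bit is 0, which is why the abstract reports that rate separately.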
5	Smart Antennas at Handsets for the 3G Wideband CDMA Systems and Adaptive Low-Power Rake Combining Schemes Kim, Suk Won 06 August 2002 Smart antenna technology is a promising means to overcome signal impairments in wireless personal communications. When spatial signal processing achieved through smart antennas is combined with temporal signal processing, the space-time processing can mitigate interference and multipath to yield higher network capacity, coverage, and quality. In this dissertation, we propose a dual smart antenna system incorporated into handsets for third-generation wireless personal communication systems, in which the two antennas are separated by a quarter wavelength (3.5 cm). We examine the effectiveness of a dual smart antenna system with diversity and adaptive combining schemes and propose a new combining scheme called hybrid combining. The proposed hybrid combiner combines diversity combiner and adaptive combiner outputs using maximal ratio combining (MRC). Since the diversity combining and adaptive combining schemes exhibit somewhat opposite and complementary characteristics, the proposed hybrid combining scheme aims to exploit the advantages of both. To model dual antenna signals, we consider three channel models: the loosely correlated fading channel model (LCFCM), the spatially correlated fading channel model (SCFCM), and the envelope correlated fading channel model (ECFCM). Each antenna signal is assumed to have independent Rayleigh fading in the LCFCM. In the SCFCM, each antenna signal is subject to the same Rayleigh fading but differs in phase due to a non-zero angle of arrival (AOA). The LCFCM and the SCFCM are useful for evaluating the upper and lower bounds of the system performance. To model the actual channel of dual antenna signals, which lies between these two channel models, the ECFCM is considered.
In this model, two Rayleigh fading antenna signals for each multipath are assumed to have an envelope correlation and a phase difference due to a non-zero AOA. To obtain the channel profile, we adopted not only the geometrically based single bounce (GBSB) circular and elliptical models, but also the International Telecommunication Union (ITU) channel model. In this dissertation, we also propose a new generalized selection combining (GSC) method called minimum selection GSC (MS-GSC) and an adaptive rake combining scheme to reduce the power consumption of mobile rake receivers. The proposed MS-GSC selects the minimum number of branches that keeps the combined SNR larger than a given threshold. The proposed adaptive rake combining scheme, which dynamically determines the threshold values, is applicable to the three GSC methods: the absolute threshold GSC, the normalized threshold GSC, and the proposed MS-GSC. Through simulation, we estimated the effectiveness of the proposed scheme for a mobile rake receiver in a wideband CDMA system. We also suggest a new power control strategy to maximize the benefit of the proposed adaptive scheme. / Ph. D. Hybrid Combining Adaptive Rake Combiner Low-Power Design Smart Antennas
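The MS-GSC rule stated in the abstract above (use the fewest branches that keep the combined SNR above a threshold) can be sketched as a greedy selection. This is a minimal sketch, not the dissertation's implementation: it assumes linear-scale branch SNRs, assumes the selected branches are combined with MRC so that the combined SNR is the sum of the branch SNRs, and always keeps at least one branch. The name `ms_gsc_select` is hypothetical.

```python
def ms_gsc_select(branch_snrs, threshold):
    """Pick the fewest rake branches whose combined SNR exceeds threshold.

    branch_snrs: per-branch SNRs on a linear (not dB) scale.
    Assumes MRC over the selected branches, so SNRs add.
    Returns (selected branch indices, combined SNR). If the threshold
    cannot be met, every branch is selected.
    """
    # Strongest branches first, so the count needed is as small as possible.
    order = sorted(range(len(branch_snrs)),
                   key=lambda i: branch_snrs[i], reverse=True)
    selected, combined = [], 0.0
    for i in order:
        if selected and combined > threshold:
            break  # threshold already met with the branches chosen so far
        selected.append(i)
        combined += branch_snrs[i]
    return selected, combined


chosen, snr = ms_gsc_select([4.0, 1.0, 2.5, 0.5], threshold=6.0)
```

Branches left unselected need not be processed, which is where a power saving in the rake receiver could come from; the adaptive choice of the threshold is a separate mechanism that this sketch does not cover.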
6	Low power design implementation of a signal acquisition module Thakur, Ravi Bhushan January 1900 Master of Science / Department of Electrical and Computer Engineering / Don M. Gruenbacher / As semiconductor technologies advance, the smallest feature sizes that can be fabricated get smaller. This has led to the development of high-density FPGAs capable of supporting high clock speeds, which allows the implementation of larger, more complex designs on a single chip. Over the past decade the technology market has shifted toward mobile devices, with low power consumption at or near the top of design considerations. By reducing power consumption in FPGAs we can achieve greater reliability, lower cooling cost, simpler power supply and delivery, and longer battery life. In this thesis, FPGA technology is discussed for the design and commercial implementation of low power systems as compared to ASICs or microprocessors, and a few techniques are suggested for lowering power consumption in FPGA designs. The objective of this research is to implement some of these approaches and attempt to design a low power signal acquisition module. Designing for low power consumption without compromising performance requires a power-efficient FPGA architecture and good design practices to leverage the architectural features. With various power conservation techniques suggested for every stage of the FPGA design flow, the following approach was used in the design process implementation: switching activity is addressed at design entry, and synthesis-level and software tools are used to obtain an initial estimate of, and to optimize, the design’s power consumption. Finally, the device is chosen based on features that will enhance the optimization achieved in the previous stages; it is configured, and real-time board-level power measurements are made to verify the implementation’s efficacy. Low power design FPGA Verilog Cyclone II
7	Process variation aware low power buffer design Lok, Mario Chichun 26 October 2010 In many digital designs there is a need to use multi-stage tapered buffers to drive large capacitive loads. These buffers contribute a significant percentage of overall power. In this thesis, we propose two novel tunable buffer designs that enable power reduction in the presence of process variation. A strategy to derive the optimal buffer size and the optimal tuning rule in the post-silicon phase is developed. By comparing several tunable buffer circuit topologies, we also demonstrate the tradeoffs in tunable buffer topology selection as a function of switching activity, timing requirements, and the magnitude of process variations. Using HSPICE simulations based on the high-performance 32 nm ASU Predictive Model, we show that up to 30% average power reduction can be achieved for an SRAM word-line decoder while maintaining the same timing yield. Low power design Adaptive circuit Statistical sizing Tunable circuit Adaptable optimization
8	Guarded Evaluation: An Algorithm for Dynamic Power Reduction in FPGAs Ravishankar, Chirag January 2012 Guarded evaluation is a power reduction technique that involves identifying sub-circuits (within a larger circuit) whose inputs can be held constant (guarded) at specific times during circuit operation, thereby reducing switching activity and lowering dynamic power. The concept is rooted in the property that under certain conditions, some signals within digital designs are not "observable" at design outputs, making the circuitry that generates such signals a candidate for guarding. Guarded evaluation has been demonstrated successfully for custom ASICs; in this work, we apply the technique to FPGAs. In ASICs, guarded evaluation entails adding hardware to the design, increasing silicon area and cost. Here, we apply the technique in a way that imposes minimal area overhead by leveraging existing unused circuitry within the FPGA. The LUT functionality is modified to incorporate the guards and reduce toggle rates. The primary challenge in guarded evaluation is determining the specific conditions under which a sub-circuit's inputs can be held constant without impacting the larger circuit's functional correctness. We propose a simple solution to this problem based on discovering gating inputs using "non-inverting paths" and trimming inputs using "partial non-inverting paths" in the circuit's AND-Inverter graph representation. Experimental results show that guarded evaluation can reduce switching activity by as much as 32% for FPGAs with 6-LUT architectures and 25% for 4-LUT architectures, on average, and can reduce power consumption in the FPGA interconnect by 29% for 6-LUTs and 27% for 4-LUTs. Clustered architectures with four LUTs per cluster and ten LUTs per cluster produced the best power reduction results. We implement guarded evaluation at various stages of the FPGA CAD flow and analyze the reductions.
We implement the algorithm as post-technology-mapping, post-packing, and post-placement optimizations. Guarded evaluation as a post-technology-mapping algorithm inserted the largest number of guards and hence achieved the highest activity and interconnect power reduction. However, guarding signals come at the cost of increased fanout and stress on routing resources. Packing and placement provide the algorithm with additional information about the circuit, which is leveraged to insert high-quality guards with minimal impact on routing. Experimental results show that the post-packing and post-placement methods achieve reductions comparable to post-mapping with considerably less impact on the critical path delay and routability of the circuit. Field-programmable gate arrays Power optimization low-power design logic synthesis technology mapping Electrical and Computer Engineering
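The observability argument behind guarded evaluation in the abstract above can be illustrated with a toy model. It is a sketch under simplifying assumptions, not the thesis's algorithm: the output is taken to be `sel AND f(x)`, so `f` is unobservable whenever `sel` is 0, and the guard simply holds the inputs of `f` at their previous value in that case. The function name `input_toggles` and the sample format are hypothetical.

```python
def input_toggles(samples, guarded):
    """Count bit toggles on the inputs of a sub-circuit f.

    samples: list of (sel, x) pairs for a circuit whose output is
    sel AND f(x). When sel == 0 the value of f(x) cannot reach the
    output, so a guard may hold f's inputs at their previous value.
    """
    toggles = 0
    held = samples[0][1]  # value currently seen at f's inputs
    for sel, x in samples[1:]:
        # With guarding, f's inputs only change while f is observable.
        seen = held if (guarded and sel == 0) else x
        toggles += bin(seen ^ held).count("1")
        held = seen
    return toggles


samples = [(1, 0b0000), (0, 0b1111), (0, 0b0000), (1, 0b0001)]
unguarded = input_toggles(samples, guarded=False)
guarded = input_toggles(samples, guarded=True)
```

The circuit output is the same in both runs because the inputs only differ while `sel` is 0. Finding such guarding conditions automatically in a real netlist is the hard part that the abstract addresses with non-inverting paths in the AND-Inverter graph.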
9	Design of a High-Speed CMOS Comparator Shar, Ahmad January 2007 This master's thesis describes the design of a high-speed latched comparator with 6-bit resolution, a full-scale voltage of 1.6 V, and a sampling frequency of 250 MHz. The comparator is designed in a 0.35 µm CMOS process with a supply voltage of 3.3 V. The comparator is designed for a time-interleaved bandpass sigma-delta ADC. Due to the nature of the target application, it should be possible to turn off the components to avoid static power consumption. The comparator in this design implements the turn-off technique when it is not in use. The settling time of the comparator is less than half the clock cycle, which means it does not affect the functionality of the bandpass sigma-delta ADC in terms of speed. The simulation results are obtained in the Cadence environment. The results show that the comparator has 6-bit resolution and a power consumption of 4.13 mW at the worst-case frequency of 250 MHz. It fulfills all the performance requirements, most of them with large margins. Comparator CMOS comparator Sigma-delta ADC Low power design High-speed. Electrical engineering Elektroteknik
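The figures quoted in the abstract above imply a few derived quantities, worked out here as a rough check. This assumes that "6-bit resolution" means resolving one least significant bit of a 6-bit quantizer over the full-scale range, and that the reported power is drawn continuously at the sampling rate; the symbols are introduced here and are not from the thesis.

```latex
\begin{align*}
V_{\mathrm{LSB}} &= \frac{V_{\mathrm{FS}}}{2^{N}} = \frac{1.6\ \mathrm{V}}{2^{6}} = 25\ \mathrm{mV} \\
T_{\mathrm{clk}} &= \frac{1}{f_s} = \frac{1}{250\ \mathrm{MHz}} = 4\ \mathrm{ns},
\qquad \frac{T_{\mathrm{clk}}}{2} = 2\ \mathrm{ns} \\
E_{\mathrm{cycle}} &= \frac{P}{f_s} = \frac{4.13\ \mathrm{mW}}{250\ \mathrm{MHz}} \approx 16.5\ \mathrm{pJ}
\end{align*}
```

Under these assumptions the comparator must resolve input differences of about 25 mV and settle in under 2 ns, at roughly 16.5 pJ per clock cycle.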
