Global ETD Search

341	Parallel Multiplier Designs for the Galois/Counter Mode of Operation Patel, Pujan January 2008 (has links) The Galois/Counter Mode of Operation (GCM), recently standardized by NIST, simultaneously authenticates and encrypts data at speeds not previously possible for both software and hardware implementations. In GCM, data integrity is achieved by chaining Galois field multiplication operations while a symmetric key block cipher such as the Advanced Encryption Standard (AES), is used to meet goals of confidentiality. Area optimization in a number of proposed high throughput GCM designs have been approached through implementing efficient composite Sboxes for AES. Not as much work has been done in reducing area requirements of the Galois multiplication operation in the GCM which consists of up to 30% of the overall area using a bruteforce approach. Current pipelined implementations of GCM also have large key change latencies which potentially reduce the average throughput expected under traditional internet traffic conditions. This thesis aims to address these issues by presenting area efficient parallel multiplier designs for the GCM and provide an approach for achieving low latency key changes. The widely known Karatsuba parallel multiplier (KA) and the recently proposed Fan-Hasan multiplier (FH) were designed for the GCM and implemented on ASIC and FPGA architectures. This is the first time these multipliers have been compared with a practical implementation, and the FH multiplier showed note worthy improvements over the KA multiplier in terms of delay with similar area requirements. Using the composite Sbox, ASIC designs of GCM implemented with subquadratic multipliers are shown to have an area savings of up to 18%, without affecting the throughput, against designs using the brute force Mastrovito multiplier. For low delay LUT Sbox designs in GCM, although the subquadratic multipliers are a part of the critical path, implementations with the FH multiplier showed the highest efficiency in terms of area resources and throughput over all other designs. FPGA results similarly showed a significant reduction in the number of slices using subquadratic multipliers, and the highest throughput to date for FPGA implementations of GCM was also achieved. The proposed reduced latency key change design, which supports all key types of AES, showed a 20% improvement in average throughput over other GCM designs that do not use the same techniques. The GCM implementations provided in this thesis provide some of the most area efficient, yet high throughput designs to date. GCM AES VLSI FPGA Karatsuba Multiplier Fan-Hasan Multiplier Electrical and Computer Engineering
342	CAD Techniques for Robust FPGA Design Under Variability Kumar, Akhilesh January 2010 (has links) The imperfections in the semiconductor fabrication process and uncertainty in operating environment of VLSI circuits have emerged as critical challenges for the semiconductor industry. These are generally termed as process and environment variations, which lead to uncertainty in performance and unreliable operation of the circuits. These problems have been further aggravated in scaled nanometer technologies due to increased process variations and reduced operating voltage. Several techniques have been proposed recently for designing digital VLSI circuits under variability. However, most of them have targeted ASICs and custom designs. The flexibility of reconfiguration and unknown end application in FPGAs make design under variability different for FPGAs compared to ASICs and custom designs, and the techniques proposed for ASICs and custom designs cannot be directly applied to FPGAs. An important design consideration is to minimize the modifications in architecture and circuit to reduce the cost of changing the existing FPGA architecture and circuit. The focus of this work can be divided into three principal categories, which are, improving timing yield under process variations, improving power yield under process variations and improving the voltage profile in the FPGA power grid. The work on timing yield improvement proposes routing architecture enhancements along with CAD techniques to improve the timing yield of FPGA designs. The work on power yield improvement for FPGAs selects a low power dual-Vdd FPGA design as the baseline FPGA architecture for developing power yield enhancement techniques. It proposes CAD techniques to improve the power yield of FPGAs. A mathematical programming technique is proposed to determine the parameters of the buffers in the interconnect such as the sizes of the transistors and threshold voltage of the transistors, all within constraints, such that the leakage variability is minimized under delay constraints. Two CAD techniques are investigated and proposed to improve the supply voltage profile of the power grids in FPGAs. The first technique is a place and route technique and the second technique is a logic clustering technique to reduce IR-drops and spatial variation of supply voltage in the power grid. FPGA Variability Power Delay CAD Statistical IR-drop VLSI Electrical and Computer Engineering
343	Sleepy Stack: a New Approach to Low Power VLSI and Memory Park, Jun Cheol 19 July 2005 (has links) New low power solutions for Very Large Scale Integration (VLSI) are proposed. Especially, we focus on leakage power reduction. Although neglected at 0.18u technology and above, leakage power is nearly equal to dynamic power consumption in nanoscale technology, e.g., 0.07u. We present a novel circuit structure, we call it sleepy stack, which is a combination of two well-known low-leakage techniques: the forced stack and sleep transistor techniques. Unlike the forced stack technique, the sleepy stack technique can utilize high-Vth transistors without incurring a large delay increase. Also, unlike the sleep transistor technique, the sleepy stack technique can retain exact logic state while achieving similar leakage power savings. In short, our sleepy stack structure achieves ultra-low leakage power consumption while retaining logic state. We apply the sleepy stack technique to both generic logic circuits as well as SRAM. At 0.07u technology, the sleepy stack logic circuits achieves up to 200X leakage reduction compared the forced stack technique with small (under 7%) delay variations and 51~118% area overheads. The sleepy stack SRAM cell with 1.5xVth achieves 5X leakage reduction with 32% delay increase or 2.49X leakage reduction without delay increase compared to the high-Vth SRAM cell. As such, the sleepy stack technique can be applicable to a design that requires ultra-low leakage power with quick response time while paying area and delay cost. We also propose a new low power architectural technique named Low-Power Pipelined Cache (LPPC). Although a conventional pipelined cache is mainly used to reduce cache access time, we lower supply voltage of cache using LPPC to save dynamic power. We achieve 20.43% processor dynamic energy savings with 4.14% execution cycle increase using 2-stage low-Vdd LPPC. Furthermore, we apply LPPC to the sleepy stack SRAM. The sleepy stack pipelined SRAM achieves 17X leakage power reduction while increasing execution time by 4% on average. Although this combined technique increases active power consumption by 33%, this technique is well suited for the system that spends most of its time in sleep mode. Low-power VLSI Pipelining Leakage power Low voltage systems
344	IMPACT OF DYNAMIC VOLTAGE SCALING (DVS) ON CIRCUIT OPTIMIZATION Esquit Hernandez, Carlos A. 16 January 2010 (has links) Circuit designers perform optimization procedures targeting speed and power during the design of a circuit. Gate sizing can be applied to optimize for speed, while Dual-VT and Dynamic Voltage Scaling (DVS) can be applied to optimize for leakage and dynamic power, respectively. Both gate sizing and Dual-VT are design-time techniques, which are applied to the circuit at a fixed voltage. On the other hand, DVS is a run-time technique and implies that the circuit will be operating at a different voltage than that used during the optimization phase at design-time. After some analysis, the risk of non-critical paths becoming critical paths at run-time is detected under these circumstances. The following questions arise: 1) should we take DVS into account during the optimization phase? 2) Does DVS impose any restrictions while performing design-time circuit optimizations?. This thesis is a case study of applying DVS to a circuit that has been optimized for speed and power, and aims at answering the previous two questions. We used a 45-nm CMOS design kit and flow. Synthesis, placement and routing, and timing analysis were applied to the benchmark circuit ISCAS?85 c432. Logical Effort and Dual-VT algorithms were implemented and applied to the circuit to optimize for speed and leakage power, respectively. Optimizations were run for the circuit operating at different voltages. Finally, the impact of DVS on circuit optimization was studied based on HSPICE simulations sweeping the supply voltage for each optimization. The results showed that DVS had no impact on gate sizing optimizations, but it did on Dual-VT optimizations. It is shown that we should not optimize at an arbitrary voltage. Moreover, simulations showed that Dual-VT optimizations should be performed at the lowest voltage that DVS is intended to operate, otherwise non-critical paths will become critical paths at run-time. VLSI DVS dynamic voltage scaling circuit optimization logical effort dual-vt
345	Lattice reduction for MIMO detection: from theoretical analysis to hardware realization Gestner, Brian Joseph 04 April 2011 (has links) The objective of the dissertation research is to understand the complex interaction between the algorithm and hardware aspects of symbol detection that is enhanced by lattice reduction (LR) preprocessing for wireless MIMO communication systems. The motivation for this work stems from the need to improve the bit-error-rate performance of conventional, low-complexity detectors while simultaneously exhibiting considerably reduced complexity when compared to the optimal method, maximum likelihood detection. Specifically, we first develop an understanding of the complex Lenstra-Lenstra-Lovász (CLLL) LR algorithm from a hardware perspective. This understanding leads to both algorithm modifications that reduce the required complexity and hardware architectures that are specifically optimized for the CLLL algorithm. Finally, we integrate this knowledge with an understanding of LR-aided MIMO symbol detection in a highly-correlated wireless environment, resulting in a joint LR/symbol detection algorithm that maps seamlessly to hardware. Hence, this dissertation forms the foundation for the adoption of lattice reduction algorithms in practical, high-throughput wireless MIMO communications systems. MIMO Wireless Lattice reduction VLSI Hardware FPGA MIMO systems Wireless communication systems Algorithms
346	Asynchronous Wrapper for Globally Asynchronous Locally Synchronous Systems / Asynkron wrapper för globalt asynkrona lokalt synkrona system Manbo, Olof January 2002 (has links) <p>This thesis is investigating the new globally asynchronous locally synchronous (GALS) technology for integrated circuits. Different types of asynchronous wrappers are tested and a new wrapper design is presented. It also investigates the possibility to use VHDL for asynchronous simulation and synthesis. The conclusions are that the GALS technology is possible to use but that it needs new synthesis tools, because todays tools are designed for synchronous technology.</p> Electronics GALS asynchronous wrapper electronics VHDL FPGA VLSI Elektronik Electronics Elektronik
347	A Fully Integrated High-Temperature, High-Voltage, BCD-on-SOI Voltage Regulator McCue, Benjamin Matthew 01 May 2010 (has links) Developments in automotive (particularly hybrid electric vehicles), aerospace, and energy production industries over the recent years have led to expanding research interest in integrated circuit (IC) design toward high-temperature applications. A high-voltage, high-temperature SOI process allows for circuit design to expand into these extreme environment applications. Nearly all electronic devices require a reliable supply voltage capable of operating under various input voltages and load currents. These input voltages and load currents can be either DC or time-varying signals. In this work, a stable supply voltage for embedded circuit functions is generated on chip via a voltage regulator circuit producing a stable 5-V output voltage. Although applications of this voltage regulator are not limited to gate driver circuits, this regulator was developed to meet the demands of a gate driver IC. The voltage regulator must provide reliable output voltage over an input range from 10 V to 30 V, a temperature range of −50 ºC to 200 ºC, and output loads from 0 mA to 200 mA. Additionally, low power stand-by operation is provided to help reduce heat generation and thus lower operating junction temperature. This regulator is based on the LM723 Zener reference voltage regulator which allows stable performance over temperature (provided proper design of the temperature compensation scheme). This circuit topology and the SOI silicon process allow for reliable operation under all application demands. The designed voltage regulator has been successfully tested from −50 ºC to 200 ºC while demonstrating an output voltage variation of less than 25 mV under the full range of input voltage. Line regulation tests from 10 V to 35 V show a 3.7-ppm/V supply sensitivity. With the use of a high-temperature ceramic output capacitor, a 5-nsec edge, 0 to 220 mA, 1-µsec pulse width load current induced only a 55 mV drop in regulator output voltage. In the targeted application, load current pulse widths will be much shorter, thereby improving the load transient performance. Full temperature and input voltage range tests reveal the no-load supply current draw is within 330 µA while still providing an excess of 200 mA of load current upon demand. voltage regulator high temperature high voltage silicon on insulator gate driver LM723
348	A METHODOLOGY OF SPICE SIMULATION TO EXTRACT SRAM SETUP AND HOLD TIMING PARAMETERS BASED ON DFF DELAY DEGRADATION Zhang, Xiaowei 01 January 2015 (has links) SRAM is a significant component in high speed computer design, which serves mainly as high speed storage elements like register files in microprocessors, or the interface like multiple-level caches between high speed processing elements and low speed peripherals. One method to design the SRAM is to use commercial memory compiler. Such compiler can generate different density/speed SRAM designs with single/dual/multiple ports to fulfill design purpose. There are discrepancy of the SRAM timing parameters between extracted layout netlist SPICE simulation vs. equation-based Liberty file (.lib) by a commercial memory compiler. This compiler takes spec values as its input and uses them as the starting points to generate the timing tables/matrices in the .lib. Originally large spec values are given to guarantee design correctness. While such spec values are usually too pessimistic when comparing with the results from extracted layout SPICE simulation, which serves as the “golden” rule. Besides, there is no margin information built-in such .lib generated by this compiler. A new methodology is proposed to get accurate spec values for the input of this compiler to generate more realistic matrices in .lib, which will benefit during the integration of the SRAM IP and timing analysis. SRAM Timing Parameters SPICE Liberty File DFF
349	Statistical methods for rapid system evaluation under transient and permanent faults Mirkhani, Shahrzad 10 February 2015 (has links) Traditional solutions for test and reliability do not scale well for modern designs with their size and complexity increasing with every technology generation. Therefore, in order to meet time-to-market requirements as well as acceptable product quality, it is imperative that new methodologies be developed for quickly evaluating a system in the presence of faults. In this research, statistical methods have been employed and implemented to 1) estimate the stuck-at fault coverage of a test sequence and evaluate the given test vector set without the need for complete fault simulation, and 2) analyze design vulnerabilities in the presence of radiation-based (soft) errors. Experimental results show that these statistical techniques can evaluate a system under test orders of magnitude faster than state-of-the-art methods with a small margin of error. In this dissertation, I have introduced novel methodologies that utilize the information from fault-free simulation and partial fault simulation to predict the fault coverage of a long sequence of test vectors for a design under test. These methodologies are practical for functional testing of complex designs under a long sequence of test vectors. Industry is currently seeking efficient solutions for this challenging problem. The last part of this dissertation discusses a statistical methodology for a detailed vulnerability analysis of systems under soft errors. This methodology works orders of magnitude faster than traditional fault injection. In addition, it is shown that the vulnerability factors calculated by this method are closer to complete fault injection (which is the ideal way of soft error vulnerability analysis), compared to statistical fault injection. Performing such a fast soft error vulnerability analysis is very cruicial for companies that design and build safety-critical systems. / text Functional fault grading VLSI testing Fault coverage estimation Soft error Vulnerability factors
350	Mechanical stress and circuit aging aware VLSI CAD Chakraborty, Ashutosh 09 February 2011 (has links) With the gradual advance of the state-of-the-art VLSI manufacturing technology into the sub-45nm regime, engineering a reliable, high performance VLSI chip with economically attractive yield in accordance with Moore's law of scaling and integration has become extremely difficult. Some of the most serious challenges that make this task difficult are: a) the delay of a transistor is strongly dependent on process induced mechanical stress around it, b) the reliability of devices is affected by several aging mechanisms like Negative Bias Temperature Instability (NBTI), hot carrier injection (HCI), etc and c) the delay and reliability of any device are strongly related to lithographically drawn geometry of various features on wafer. These three challenges are the main focus of this dissertation. High performance fabrication processes routinely use embedded silicon-germanium (eSiGe) technology that imparts compressive mechanical stress to PMOS devices. In this work, cell level timing models considering flexibility to modulate active area to change mechanical stress, were proposed and exploited to perform timing optimization during circuit placement phase. Analysis of key physical synthesis optimization steps such as gate sizing and repeater insertion was done to understand and exploit mechanical stress to significantly improve delay of interconnect and device dominated circuits. Regarding circuit reliability, the proposed work is focused on reducing the clock skew degradation due to NBTI effect specially due to the use of clock gating technique for achieving low power operation. In addition, we also target the detrimental impact of burn-in testing on NBTI. The problem is identified and a runtime technique to reduce clock skew increase was proposed. For designs with predictable clock gating activities, a zero overhead design time technique was proposed to reduce clock skew increase over time. The concept of using minimum degradation input vector during static burn-in testing is proposed to reduce the impact of burn-in testing on parametric yield. Delay and reliability strongly depend on dimension of various features on the wafer such as gate oxide thickness, channel length and contact position. Increased variability of these dimensions can severely restrict ability to analyze or optimize a design considering mechanical stress and circuit reliability. One key technique to control physical variability is to move towards regular fabrics. However, to make implementation on regular fabrics attractive, high quality physical design tools need to be developed. This dissertation proposes a new circuit placement algorithm to place a design on a structured ASIC platform with strict site and clock constraints and excellent overall wirelength. An algorithm for reducing the clock and leakage power dissipation of a structured ASIC by reducing spine usage is then proposed to allow lower power dissipation of designs implemented using structured ASICs. / text NBTI Negative bias temperature instability Stress Mobility Reliability CAD VLSI Mechanical stress Strain Aging PBTI

Search results