Global ETD Search

51	Design and Analysis of Modular Architectures for an RNS to Mixed Radix Conversion Multi-processor Shivashankar, Nithin 27 October 2014 (has links) No description available. Computer Engineering Residue Number System Mixed Radix RNS to Mixed Radix Conversion Multi-processor FPGA parallelization pipelining Modular Inverse
52	Implementation of Pipelined Bit-parallel Adders Wei, Lan January 2003 (has links) Bit-parallel addition can be performed using a number of adder structures with different area and latency. However, the power consumption of different adder structures is not well studied. Further, the effect of pipelining adders to increase the throughput is not well studied. In this thesis four different adders are described, implemented in VHDL and compared after synthesis. The results give a general idea of the time-delay-power tradeoffs between the adder structures. Pipelining is shown to be a good technique for increasing the circuit speed. Electronics Adder Ripple carry adder Carry look-ahead adder Carry select adder Carry save adder Pipelining Power consumption Elektronik Electronics Elektronik
53	Implementation of Pipelined Bit-parallel Adders Wei, Lan January 2003 (has links) Bit-parallel addition can be performed using a number of adder structures with different area and latency. However, the power consumption of different adder structures is not well studied. Further, the effect of pipelining adders to increase the throughput is not well studied. In this thesis four different adders are described, implemented in VHDL and compared after synthesis. The results give a general idea of the time-delay-power tradeoffs between the adder structures. Pipelining is shown to be a good technique for increasing the circuit speed. Electronics Adder Ripple carry adder Carry look-ahead adder Carry select adder Carry save adder Pipelining Power consumption Elektronik Electronics Elektronik
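As a rough illustration of two of the adder structures named in the keywords above, the following Python sketch models a ripple carry adder and a carry save adder at the bit level. It is not taken from the thesis, which works in VHDL; the function names and the default width are assumptions made for this example only.

```python
from typing import Tuple


def ripple_carry_add(a: int, b: int, width: int = 8) -> int:
    """Add two unsigned integers bit by bit, passing the carry serially."""
    carry = 0
    result = 0
    for i in range(width):
        x = (a >> i) & 1
        y = (b >> i) & 1
        s = x ^ y ^ carry                     # full-adder sum bit
        carry = (x & y) | (carry & (x ^ y))   # full-adder carry out
        result |= s << i
    return result | (carry << width)          # final carry becomes the top bit


def carry_save_add(a: int, b: int, c: int) -> Tuple[int, int]:
    """Reduce three operands to a (sum, carry) pair without carry propagation."""
    partial_sum = a ^ b ^ c
    carries = ((a & b) | (b & c) | (a & c)) << 1
    return partial_sum, carries


total = ripple_carry_add(200, 100)
partial_sum, carries = carry_save_add(5, 6, 7)
```

The ripple carry version has a carry chain as long as the word, which is the kind of long path that pipelining cuts into stages. The carry save version has no such chain, but its two outputs still need one final carry-propagating addition to produce a single result.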
54	Throughput-Centric Wave-Pipelined Interconnect Circuits for Gigascale Integration Deodhar, Vinita Vasant 31 October 2005 (has links) The central thesis of this research is that VLSI interconnect design strategies should shift from using global wires that can support only a single binary transition during the latency of the line to global wires that can sustain multiple bits traveling simultaneously along the length of the line. It is shown in this thesis that such throughput-centric multibit transmission can be achieved by wave-pipelining the interconnects using repeaters. A holistic analysis of wave-pipelined interconnect circuits, along with the full-custom optimization of these circuits, is performed in this research. With the help of models and methodologies developed in this thesis, the design rules for repeater insertion are crafted to simultaneously optimize performance, power, and area of VLSI global interconnect networks through a simultaneous application of voltage scaling and wire sizing. A qualitative analysis of latency, throughput, signal integrity, power dissipation, and area is performed that compares the results of design optimizations in this work to those of conventional global interconnect circuits. The objective of this thesis is to study the circuit- and system-level opportunities of voltage scaling, wire sizing, and repeater insertion in wave-pipelined global interconnect networks that are implemented in deep submicron technologies. Digital integrated circuits Electronic circuits High throughput Interconnect Low-power Metal oxide semiconductors Wave-pipelining Complementary Integrated circuits Digital electronics
55	Wave Component Sampling Method For High Performance Pipelined Circuits Sever, Refik 01 September 2011 (has links) (PDF) In all previous pipelining methods, such as conventional pipelining, wave pipelining, and mesochronous pipelining, a data wave propagating through the combinational circuit is sampled whenever it arrives at a synchronization stage. In this study, a new wave-pipelining methodology named the Wave Component Sampling Method (WCSM) is proposed. In this method, only the components of a wave whose maximum and minimum delay difference exceeds the tolerable value are sampled, and the other components continue to propagate through the circuit. Therefore, the total number of registers required for synchronization decreases significantly. To demonstrate the effectiveness of the proposed WCSM, an 8x8 bit carry save adder (CSA) multiplier is implemented using 0.18 µm CMOS technology. A generic transmission gate logic block with optimized output delay variation depending on the input pattern is designed and used in all of the sub-blocks of the multiplier. Post-layout simulation results show that this multiplier can operate at a speed of 3 GHz using only 70 latches.
Compared with the mesochronous pipelining scheme, the number of registers is decreased by 41% and the total power of the chip is decreased by 9.5% without any performance loss. An ultra-high-speed fully pipelined CSA multiplier with an operating frequency of 5 GHz is also implemented with WCSM. The number of registers is decreased by 45%, and the power consumption of the circuit is decreased by 18.4%, compared with conventional or mesochronous pipelining methods. WCSM is also applied to different multiplier structures employing Booth encoders, Wallace trees, and carry look-ahead adders. Comparing the fully pipelined 8x8 bit WCSM multiplier with the conventionally pipelined multiplier, the number of registers in the implementation of the Booth encoder, Wallace tree, and carry look-ahead adder is decreased by 30%, 51%, and 62%, respectively.
56	Spill Code Minimization And Buffer And Code Size Aware Instruction Scheduling Techniques Nagarakatte, Santosh G 08 1900 (has links) Instruction scheduling and software pipelining are important compilation techniques which reorder instructions in a program to exploit instruction level parallelism. They are essential for enhancing instruction level parallelism in architectures such as Very Long Instruction Word (VLIW) and tiled processors. This thesis addresses two important problems in the context of these instruction reordering techniques. The first problem is for general purpose applications and architectures, while the second is for media and graphics applications for tiled and multi-core architectures. The first problem deals with software pipelining, which is an instruction scheduling technique that overlaps instructions from multiple iterations. Software pipelining increases the register pressure and hence it may be required to introduce spill instructions. In this thesis, we model the problem of register allocation with optimal spill code generation and scheduling in software pipelined loops as a 0-1 integer linear program. By minimizing the amount of spill code produced, the formulation ensures that the initiation interval (II) between successive iterations of the loop is not increased unnecessarily. Experimental results show that our formulation performs better than the existing heuristics by preventing an increase in the II and also generating less spill code on average among loops extracted from Perfect Club and SPEC benchmarks. The second major contribution of the thesis deals with the code size aware scheduling of stream programs. Large scale synchronous dataflow graphs (SDFs) and StreamIt have emerged as powerful programming models for high performance streaming applications. In these models, a program is represented as a dataflow graph where each node represents an autonomous filter and the edges represent the channels through which the nodes communicate.
In constructing static schedules for programs in these models, it is important to optimize both the execution-time buffer requirements of the data channels and the space required to store the encoded schedule. Earlier approaches have either given priority to one of the requirements or proposed ad-hoc methods for generating schedules with good trade-offs. In this thesis, we propose a genetic algorithm framework based on non-dominated sorting for generating serial schedules which have a good trade-off between code size and buffer requirement. We extend the framework to generate software pipelined schedules for tiled architectures. From our experiments, we observe that the genetic algorithm framework generates schedules with good trade-offs and performs better than the earlier approaches. Compilers Parallel Processing Instruction Scheduling Software Pipelining Spill Code Scheduling Stream Programs - Scheduling Genetic Algorithms Integer Linear Programming Software Pipelined Loops Register Allocation Computer Science
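To make the link between spill code and the initiation interval concrete, the sketch below computes the standard resource-constrained lower bound on the II of a software pipelined loop. This is the textbook bound, not the 0-1 integer linear program from the thesis; the resource names, operation counts, and the assumption that every operation occupies its unit for one cycle are hypothetical.

```python
from math import ceil
from typing import Dict


def resource_mii(op_counts: Dict[str, int], unit_counts: Dict[str, int]) -> int:
    """Lower bound on the II: the busiest resource class sets the pace.

    Assumes each operation occupies one unit of its class for one cycle.
    """
    return max(ceil(op_counts[r] / unit_counts[r]) for r in op_counts)


# Hypothetical loop body and machine description.
loop_ops = {"alu": 6, "mem": 3, "mul": 1}
units = {"alu": 2, "mem": 1, "mul": 1}

base_ii = resource_mii(loop_ops, units)

# Spilling one value adds a store and a load, both memory operations.
spilled_ops = dict(loop_ops, mem=loop_ops["mem"] + 2)
spilled_ii = resource_mii(spilled_ops, units)
```

With these made-up numbers the bound rises from 3 to 5 cycles once the two spill operations are added, which shows why a formulation that minimizes spill code can keep the II from growing.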
57	Chemnitzer Linux-Tage 2012 Schöner, Axel, Meier, Wilhelm, Kubieziel, Jens, Berger, Uwe, Götz, Sebastian, Leuthäuser, Max, Piechnick, Christian, Reimann, Jan, Richly, Sebastian, Schroeter, Julia, Wilke, Claas, Aßmann, Uwe, Schütz, Georg, Kastrup, David, Lang, Jens, Luithardt, Wolfram, Gachet, Daniel, Nasrallah, Olivier, Kölbel, Cornelius, König, Harald, Wachtler, Axel, Wunsch, Jörg, Vorwerk, Matthias, Knopper, Klaus, Meier, Wilhelm, Kramer, Frederik, Jamous, Naoum 20 April 2012 (has links) (PDF) The Chemnitzer Linux-Tage ("Chemnitz Linux Days") is a conference that deals with Linux and open source software. In 2012, 104 talks and workshops were given. This volume contains detailed papers for 14 main lectures and abstracts of 90 further talks. Linux free software electronic funds transfer authentication performance measurement parallel processing open source pipelining ddc:004
58	Efficient Resource Usage Modelling Ramanan, V Janaki 04 1900 (has links) (PDF) No description available. Compilers Resource Usage Models Group Automation Model Dynamic Collision Matrix Software Pipelining Group Automation Approach Resource Usage Modelling Instruction Scheduling Method Computer Science
59	Poloautomatizovaný návrh vysoce výkonných číslicových obvodů s Xilinx FPGA / Semi-automated Design of High-performance Digital Circuits with Xilinx FPGAs Houška, David January 2021 (has links) This master's thesis deals with the design of sequential digital circuits with a view to delay optimization. It describes two techniques commonly used in optimization: retiming is covered briefly, while greater attention is paid to pipelining. In the practical part, a form of abstraction of sequential digital circuits based on directed acyclic graphs was developed. The circuit is thereby moved to a level at which it is easier to transform. The thesis also presents a tool for semi-automated optimization, using the pipelining technique, of digital circuits developed in the Xilinx ISE Design Suite environment.
60	Throughput Constrained and Area Optimized Dataflow Synthesis for FPGAs Sun, Hua 21 February 2008 (has links) (PDF) Although high-level synthesis has been researched for many years, synthesizing minimum hardware implementations under a throughput constraint for computationally intensive algorithms remains a challenge. In this thesis, three important techniques are studied carefully and applied in an integrated way to meet this challenging synthesis requirement. The first is pipeline scheduling, which generates a pipelined schedule that meets the throughput requirement. The second is module selection, which decides the most appropriate circuit module for each operation. The third is resource sharing, which reuses a circuit module by sharing it between multiple operations. This work shows that combining module selection and resource sharing while performing pipeline scheduling can significantly reduce the hardware area, by either using slower, more area-efficient circuit modules or by time-multiplexing faster, larger circuit modules, while meeting the throughput constraint. The results of this work show that the combined approach can generate on average 43% smaller hardware than possible when a single technique (resource sharing or module selection) is applied. There are four major contributions of this work. First, given a fixed throughput constraint, it explores all feasible frequency and data introduction interval design points that meet this throughput constraint. This enlarged pipelining design space exploration results in hardware architectures superior to those of previous pipeline synthesis work because of the larger space. Second, the module selection algorithm in this work considers different module architectures, as well as different pipelining options for each architecture. This not only addresses the unique architecture of most FPGA circuit modules, it also performs retiming at the high-level synthesis level.
Third, this work proposes a novel approach that integrates the three inter-related synthesis techniques of pipeline scheduling, module selection and resource sharing. To the author's best knowledge, this is the first attempt to do this. The integrated approach is able to identify more efficient hardware implementations than when only one or two of the three techniques are applied. Fourth, this work proposes and implements several algorithms that explore the combined pipeline scheduling, module selection and resource sharing design space, and identify the most efficient hardware architecture under the synthesis constraint. These algorithms explore the combined design space in different ways, which represent different trade-offs between algorithm execution time and the size of the explored design space. high-level synthesis pipelining design space pipeline scheduling circuit module characterization module selection retiming resource sharing design space exploration Electrical and Computer Engineering
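The first contribution above, exploring frequency and data introduction interval (DII) pairs under a fixed throughput constraint, can be sketched as a simple enumeration. The sketch assumes that throughput equals clock frequency divided by DII, with DII counted in clock cycles; the function name, parameters, and numbers are hypothetical and are not the thesis's algorithm.

```python
from typing import List, Tuple


def feasible_design_points(
    required_throughput: int, max_frequency: int, max_dii: int
) -> List[Tuple[int, int]]:
    """List (DII, minimum clock frequency) pairs that meet the throughput.

    Assumes throughput = frequency / DII, so a design accepting new data
    every `dii` cycles needs a clock of at least required_throughput * dii.
    """
    points = []
    for dii in range(1, max_dii + 1):
        min_frequency = required_throughput * dii
        if min_frequency <= max_frequency:   # the clock must be achievable
            points.append((dii, min_frequency))
    return points


# Hypothetical target: 50 M samples/s on a device limited to 200 MHz.
points = feasible_design_points(50_000_000, 200_000_000, max_dii=8)
```

Under these assumptions a larger DII leaves more cycles in which a circuit module can be shared between operations, at the cost of a faster clock, so each feasible pair is a different starting point for module selection and resource sharing.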
