1 |
Application of parallel processing techniques to routing for VLSI design
Sagar, V. K. January 1990
No description available.
|
2 |
Logic synthesis and technology mapping using genetic algorithms
Zhuang, Nan January 1998
No description available.
|
3 |
A NOVEL MULTIPLIER USING MODIFIED SHIFT AND ADD ALGORITHM
Mohammad, Sakib 01 September 2021
The binary multiplier is a staple of digital circuit design, used in microprocessors, DSP applications and elsewhere. Here, we discuss the design of a novel multiplier that employs modified shift-and-add logic to multiply two n-bit unsigned binary numbers. We modified the shift-and-add algorithm, using a barrel shifter and a multiplexer to generate the partial products. We also found a way to reduce the number of partial products, so that fewer terms remain to be added once all of them have been generated. An array of carry-save adders (CSAs) adds the partial products. With these arrangements, we aim to reduce delay and make the design as efficient as possible. As an example, we show the multiplication of two 16-bit numbers; however, the design can easily be scaled up or down to suit the environment in which the multiplier is used.
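The abstract does not spell out the modification, so the following is only a minimal sketch of the textbook technique it builds on: shifted partial products reduced by carry-save addition. The function names and the choice to skip zero multiplier bits are illustrative assumptions, not the author's design.

```python
def partial_products(a, b, n=16):
    """Textbook shift-and-add: one shifted copy of a per set bit of b.

    Skipping zero bits is an assumption made here for illustration; it is
    not claimed to be the reduction method used in the thesis.
    """
    mask = (1 << n) - 1
    a &= mask
    b &= mask
    return [a << i for i in range(n) if (b >> i) & 1]


def carry_save_add(x, y, z):
    """3:2 compressor: three operands in, a sum word and a carry word out."""
    s = x ^ y ^ z                                # bitwise sum, carries ignored
    c = ((x & y) | (x & z) | (y & z)) << 1       # carries, moved up one bit
    return s, c


def multiply(a, b, n=16):
    """Multiply two n-bit unsigned integers via carry-save reduction."""
    terms = partial_products(a, b, n)
    # Each pass replaces three terms with two, so the list shrinks by one.
    while len(terms) > 2:
        x, y, z = terms.pop(), terms.pop(), terms.pop()
        terms.extend(carry_save_add(x, y, z))
    # A single carry-propagating addition finishes the job.
    return sum(terms)
```

The point of the carry-save step is that no carry ripples between bit positions until the final addition, which is why such adder arrays are attractive for reducing delay.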
|
4 |
Setting CMOS environment for VLSI design
Chung, Chih-Ping January 1989
No description available.
|
5 |
Asynchronous spike event coding scheme for programmable analogue arrays and its computational applications
Gouveia, Luiz Carlos Paiva January 2012
This work presents the definition, design and evaluation of a novel method for interconnecting the computational elements, commonly known as Configurable Analogue Blocks (CABs), of a programmable analogue array. The method is proposed as a total or partial replacement for conventional interconnection methods, whose scalability is seriously limited. In this method, named the Asynchronous Spike Event Coding (ASEC) scheme, analogue signals from CAB outputs are encoded as time instants (spike events) that depend on the activity of those signals, and are transmitted asynchronously using the Address Event Representation (AER) protocol. Power dissipation depends on input signal activity, and no spike events are generated when the input signal is constant. On-line, programmable computation is intrinsic to the ASEC scheme and is performed without additional hardware. This ability of the communication scheme to perform computation enhances the computational power of the programmable analogue array. The design methodology and a CMOS implementation of the scheme are presented, together with test results from prototype integrated circuits (ICs).
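The abstract does not give the coding rule itself. As a rough illustration of activity-dependent spike coding, the sketch below assumes a delta-style encoder: an event is emitted each time the input drifts by a fixed step `delta` from a tracked value, so a constant input yields no events. The encoder rule, the `(index, sign)` event format and all names are assumptions; AER channel addressing is omitted.

```python
def encode(samples, delta):
    """Emit (sample_index, +1 or -1) events when the input moves by delta.

    Assumed delta-style rule, for illustration only; it is not taken from
    the thesis.
    """
    events = []
    tracked = samples[0]
    for i, x in enumerate(samples):
        while x - tracked >= delta:      # input rose by at least one step
            tracked += delta
            events.append((i, +1))
        while tracked - x >= delta:      # input fell by at least one step
            tracked -= delta
            events.append((i, -1))
    return events


def decode(events, start, length, delta):
    """Rebuild a staircase approximation by accumulating the events."""
    steps = [0] * length
    for i, sign in events:
        steps[i] += sign
    out, value = [], start
    for s in steps:
        value += s * delta
        out.append(value)
    return out
```

Under this assumed rule the event rate tracks signal activity, which matches the abstract's statement that a constant input produces no spike events.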
|
6 |
Trapp : uma ferramenta para particionamento/posicionamento de celulas para metodologia tranca / A trapp tool for partitioning/placement of methodology tranca's cells
Schermer, Paulo Armando January 1995
This work proposes and evaluates a new algorithm for placing the cells of circuits designed with the TRANCA [REI 87] methodology. The proposed algorithm performs placement by partitioning into n blocks, based on the concept of net balancing, and in doing so carries out a global pre-routing. Most partitioning-based placement algorithms rely on the Kernighan-Lin [KER 70] and Fiduccia-Mattheyses [FID 82] heuristics with group migration. These algorithms use a min-cut function to reduce the number of nets crossing between the two partitions, which produces saturated regions. Net balancing instead seeks an even distribution of connection lengths so that saturated regions are avoided, which reduces computation time and eases the routing step. An overview of automatic synthesis is given. The most widely used design styles are described, and the cell partitioning and placement problems are defined and analysed. The main features of the TRANCA methodology are presented, and the TRANCA synthesis tools are summarized, with emphasis on the partitioning and placement steps of each, so that their strengths can be reused. To ground the concepts behind the algorithm, the most relevant placement methods are presented, with particular attention to those based on partitioning, and some existing heuristics are described. The concepts used to develop the algorithm are then described. The algorithm essentially distributes the connections using a congestion map of the circuit, which amounts to a global pre-routing. The congestion map is built over the partitions generated in the circuit.
In addition to the congestion map, net paths are described on a model defined to control net crossings. Once these concepts are defined, the environment created for the algorithm is presented. To validate the concepts studied and those proposed, a prototype called TRAPP (TRAnsparent Placement by Partitioning) was implemented, together with a placement viewer called CIPPATO. Finally, some results from the prototype and an evaluation of its behaviour are presented, and alternative implementations and directions for future work are proposed.
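To make the min-cut objective mentioned above concrete, here is a small sketch of the quantity that Kernighan-Lin and Fiduccia-Mattheyses style heuristics try to reduce: the number of nets crossing a two-way partition, and the gain from moving one cell. It does not implement TRAPP's net-balancing algorithm or its congestion map; the data layout (nets as tuples of cell names, `side` as a dict) is an assumption.

```python
def cut_size(nets, side):
    """Count nets that have cells on both sides of a two-way partition.

    nets: iterable of tuples of cell names.
    side: dict mapping each cell name to 0 or 1.
    """
    return sum(1 for net in nets if len({side[c] for c in net}) > 1)


def move_gain(nets, side, cell):
    """Reduction in cut size if `cell` is moved to the other side."""
    flipped = dict(side)
    flipped[cell] = 1 - side[cell]
    return cut_size(nets, side) - cut_size(nets, flipped)


if __name__ == "__main__":
    nets = [("a", "b"), ("a", "c")]
    side = {"a": 0, "b": 1, "c": 1}
    print(cut_size(nets, side), move_gain(nets, side, "a"))
```

A pure min-cut pass keeps moving the cell with the best gain; the abstract's argument is that optimizing only this count tends to concentrate wiring in saturated regions, which is what net balancing is meant to avoid.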
|
7 |
Generic low power reconfigurable distributed arithmetic processor
Liu, Zhenyu January 2009
Higher performance, lower cost, ever smaller integrated circuit components, and higher chip packaging density are ongoing goals of the microelectronics and computer industries. As these goals are achieved, however, power consumption and flexibility are increasingly becoming bottlenecks that new technology in Very Large-Scale Integrated (VLSI) design must address. Modern systems need more energy to deliver the computational capability that growing requirements demand, and these requirements drive changes of standards not only in audio and video broadcasting but also in communication, such as wireless connections and network protocols. High flexibility and low power consumption pull in opposite directions, yet combining them in one system is the ultimate goal of designers. This dissertation presents a generic, domain-specific, low-power reconfigurable processor for the distributed arithmetic algorithm. The processor achieves high efficiency in area, power and delay, approaching the performance of an ASIC design while retaining the flexibility of programmable platforms. The architecture supports the typical distributed arithmetic algorithms found in most still-picture compression and video-conferencing standards, and can also implement other distributed arithmetic algorithms found in digital signal processing, telecommunication protocols and automatic control. The processor includes a simple, reconfigurable, low-power control unit with good performance in area, power and timing. Because the architecture is generic, it is applicable to any small or medium-sized finite state machine used as a control unit to implement complex system behaviour, as found in almost all engineering disciplines.
Furthermore, to map target applications efficiently onto the proposed architecture, a new algorithm is introduced that searches for the best set of common shared terms, keeping the area and power consumption of the implementation low. A software implementation of this algorithm is presented; it can be used not only for the architecture proposed in this dissertation but for any implementation of adder-based distributed arithmetic algorithms. In addition, several low-power design techniques are applied in the architecture, such as an unsymmetrical design style that includes unsymmetrical interconnection arrangement, unsymmetrical PTB selection and unsymmetrical mapping of basic computing units. All these design techniques yield substantial power savings, and it is believed that they can be extended to other low-power designs and architectures. The processor presented in this dissertation can be used to implement complex, high-performance distributed arithmetic algorithms for communication and image processing applications at low cost in area and power compared with traditional methods.
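For readers unfamiliar with distributed arithmetic, the sketch below shows the classic lookup-table form: an inner product with fixed coefficients is computed bit-serially, one table lookup per input bit position, with no multipliers. Note that this is the LUT-based variant, whereas the dissertation targets adder-based distributed arithmetic; the function names and the unsigned-input restriction are assumptions made for the illustration.

```python
def build_lut(coeffs):
    """Table entry `addr` holds the sum of coefficients whose bit is set in addr."""
    k = len(coeffs)
    return [
        sum(c for j, c in enumerate(coeffs) if (addr >> j) & 1)
        for addr in range(1 << k)
    ]


def da_inner_product(coeffs, xs, bits=8):
    """Compute sum(c * x) for unsigned `bits`-wide inputs without multiplying.

    For each bit position b, bit b of every input forms a table address;
    the looked-up partial sum is weighted by 2**b and accumulated.
    """
    lut = build_lut(coeffs)
    acc = 0
    for b in range(bits):
        addr = 0
        for j, x in enumerate(xs):
            addr |= ((x >> b) & 1) << j
        acc += lut[addr] << b
    return acc
```

The table grows as 2^k for k coefficients, which is one common motivation for adder-based variants that share common terms instead of storing every combination.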
|
8 |
Design and Analysis of High-Speed Arithmetic Components
Juang, Tso-Bing 11 December 2004
In this dissertation, the design and analysis of several fast arithmetic components are presented. Our contributions focus on fast CORDIC rotation architectures and multipliers. For CORDIC, we propose a fast rotation architecture that halves the average number of rotations. Furthermore, a new parallel CORDIC rotation algorithm and architecture (called para-CORDIC) is proposed that leads to smaller area and delay than the conventional CORDIC algorithm and previous works. In the design of the multiplier generator, a delay-efficient algorithm performs the partial-product summation and the final addition during the synthesis of fast parallel multipliers based on a standard-cell library or other full-custom circuit components. In the field of fixed-width multiplier design, a lower-error fixed-width carry-free multiplier with low-cost compensation circuits is proposed that has smaller average absolute errors and variances than previous methods.
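As background for the CORDIC contributions, the following sketch shows the conventional iterative rotation that they are compared against, not the fast or para-CORDIC variants proposed in the dissertation. It uses floating point for readability, whereas hardware would use fixed-point shifts and adds; the function name and iteration count are assumptions.

```python
import math


def cordic_rotate(x, y, angle, iterations=24):
    """Rotate (x, y) by `angle` radians using conventional CORDIC.

    Each step rotates by +/- atan(2**-i), which in hardware needs only
    shifts and adds. Converges for |angle| up to roughly 1.74 rad.
    """
    # Accumulated scaling of the micro-rotations, removed at the end.
    gain = 1.0
    for i in range(iterations):
        gain *= math.sqrt(1.0 + 2.0 ** (-2 * i))

    z = angle                       # residual angle still to be rotated
    for i in range(iterations):
        d = 1.0 if z >= 0 else -1.0
        shift = 2.0 ** (-i)
        x, y = x - d * y * shift, y + d * x * shift
        z -= d * math.atan(shift)
    return x / gain, y / gain
```

Every call performs all `iterations` micro-rotations regardless of the angle, which is the per-rotation cost that faster architectures aim to cut.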
|
9 |
Arquiteturas de alto desempenho e baixo custo em hardware para a estimação de movimento em vídeos digitais / High performance and low cost hardware architectures for digital videos motion estimation
Porto, Marcelo January 2008
The evolution of information and communication technologies (ICT) has encouraged the use of a variety of communication media. Among these media, video in particular needs a large bandwidth to be transmitted or a large storage capacity to be kept. An analysis of the signals in a multimedia communication shows, however, a high degree of information redundancy. Compression techniques can reduce the amount of coded information by one to two orders of magnitude while keeping satisfactory visual quality. One such technique searches for similarity between neighbouring frames of a scene, identifying the temporal redundancy between them. This technique, called motion estimation, is very effective, but its computational cost is high, so real-time compression of high-resolution video requires efficient algorithms implemented in hardware. This dissertation presents an investigation of motion estimation algorithms targeting hardware implementation. All the algorithms were first developed in C and submitted to a range of tests to evaluate performance and computational cost. The algorithms were applied to ten video samples used by the scientific community, to evaluate them in real applications. The evaluation showed that fast algorithms can carry out motion estimation efficiently, with good results in vector quality, computational effort and performance.
Based on the analysis of these results, the Diamond Search algorithm was chosen for hardware implementation, with two levels of pixel subsampling: 2:1 and 4:1. The architectures for the Diamond Search algorithm, with pixel subsampling of 2:1 and 4:1, were described in VHDL and synthesized for Xilinx Virtex-4 FPGAs and for standard cells in TSMC 0.18 μm technology. The results show that the architectures exceed the performance needed to process HDTV 1080p video in real time at 30 frames per second, and that they consume few hardware resources after synthesis for FPGA and ASIC.
Keywords: Video compression, motion estimation, VLSI design.
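The sketch below illustrates Diamond Search block matching with a subsampled sum of absolute differences (SAD). It is a plain software rendering of the general algorithm, not the VHDL architectures described above. The subsampling pattern is an assumption (`step_x=2` for 2:1, `step_x=2, step_y=2` for 4:1), as are the frame layout (lists of rows) and all names.

```python
# Large and small diamond patterns; the centre comes first so it wins ties.
LARGE = [(0, 0), (0, -2), (0, 2), (-2, 0), (2, 0),
         (-1, -1), (-1, 1), (1, -1), (1, 1)]
SMALL = [(0, 0), (0, -1), (0, 1), (-1, 0), (1, 0)]


def sad(cur, ref, bx, by, dx, dy, size, step_x=1, step_y=1):
    """SAD between the current block and the reference block shifted by (dx, dy).

    step_x / step_y > 1 skip pixels (assumed subsampling pattern).
    """
    total = 0
    for y in range(0, size, step_y):
        for x in range(0, size, step_x):
            total += abs(cur[by + y][bx + x] - ref[by + dy + y][bx + dx + x])
    return total


def diamond_search(cur, ref, bx, by, size, step_x=1, step_y=1):
    """Return the motion vector (dx, dy) found by Diamond Search."""
    height, width = len(ref), len(ref[0])

    def cost(mv):
        dx, dy = mv
        x0, y0 = bx + dx, by + dy
        if x0 < 0 or y0 < 0 or x0 + size > width or y0 + size > height:
            return float("inf")          # candidate block leaves the frame
        return sad(cur, ref, bx, by, dx, dy, size, step_x, step_y)

    center = (0, 0)
    while True:
        candidates = [(center[0] + ox, center[1] + oy) for ox, oy in LARGE]
        best = min(candidates, key=cost)
        if best == center:               # minimum at the centre: refine
            break
        center = best
    candidates = [(center[0] + ox, center[1] + oy) for ox, oy in SMALL]
    return min(candidates, key=cost)
```

Subsampling shrinks the number of pixel differences per candidate, which is where much of the hardware saving would be expected to come from; the search pattern itself is unchanged.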
|