461 |
Diseño de decodificadores de altas prestaciones para código LDPC
Angarita Preciado, Fabián Enrique 02 September 2013
This thesis investigates decoding algorithms for low-density parity-check (LDPC) codes and architectures for their hardware implementation. The work focuses on message-passing algorithms for structured codes, which are included in several communications standards.
First, the performance of the existing Sum-Product and Min-Sum algorithms, and of the main variants of the latter (scaled Min-Sum and offset Min-Sum), was evaluated. In addition, a finite-precision analysis was carried out using the LDPC codes of the IEEE 802.3an, IEEE 802.11n and IEEE 802.16e standards. Two algorithms based on Min-Sum were then proposed, called integer Min-Sum and modified Min-Sum with correction. Their complexity is lower than that of the algorithms studied previously, and they also allow an efficient hardware implementation. Furthermore, different update schedules for the decoding algorithms were studied: flooding, horizontal layers (layered) and vertical layers (shuffled). A new interleaved vertical-layer schedule (x-shuffled) was proposed that improves decoding throughput.
Following the algorithmic study, hardware implementations were produced with different architectures for the evaluated and proposed algorithms and update schedules. Most of the implemented algorithms require computing the first two minima, so the hardware architectures for this computation were studied first and a new, lower-complexity architecture was proposed. Second, the hardware performance of the different algorithms was compared using the reference architectures: fully parallel and memory-based partially parallel. Two architectures aimed at high speed were also proposed, both implemented with the Sum-Product algorithm. The first is a modification of the Sliced Message-Passing architecture that reduces implementation area; the second is an architecture specific to the proposed x-shuffled update schedule that reaches very high decoding throughput. Finally, the proposed algorithms were implemented with the layered architecture, yielding efficient hardware implementations with low area and very high decoding throughput. These last implementations achieve a better throughput-to-area ratio than existing implementations in the literature.
Finally, the behaviour of the studied decoding algorithms was evaluated in the low-error-rate region, where performance usually degrades because an error floor appears. To this end, a hardware simulator was implemented on FPGA devices. The data rate achieved by the designed hardware simulator is higher than that of other simulators reported in the literature. In the low-error-rate region, the proposed modified Min-Sum with correction algorithm behaves better than the other algorithms evaluated, lowering the error floor by several orders of magnitude. / Angarita Preciado, FE. (2013). Diseño de decodificadores de altas prestaciones para código LDPC [Doctoral thesis]. Universitat Politècnica de València. https://doi.org/10.4995/Thesis/10251/31646
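As an illustration of why the abstract above singles out the computation of the first two minima, the following is a minimal sketch of a textbook Min-Sum check-node update. It is not the architecture or the modified algorithms proposed in the thesis. The function name `check_node_update` and the parameters `scale` and `offset` (standing in for the scaled and offset Min-Sum variants) are assumptions made for this example.

```python
def check_node_update(llrs, scale=1.0, offset=0.0):
    """Textbook Min-Sum check-node update (illustrative sketch only).

    llrs   : incoming variable-to-check messages (log-likelihood ratios).
    scale  : hypothetical normalization factor (scaled Min-Sum when < 1).
    offset : hypothetical correction term (offset Min-Sum when > 0).
    Returns one outgoing message per edge.
    """
    min1 = min2 = float("inf")  # smallest and second-smallest magnitudes
    idx1 = -1                   # position of the smallest magnitude
    sign_product = 1            # product of the signs of all inputs
    for i, value in enumerate(llrs):
        magnitude = abs(value)
        if value < 0:
            sign_product = -sign_product
        if magnitude < min1:
            min1, min2, idx1 = magnitude, min1, i
        elif magnitude < min2:
            min2 = magnitude

    messages = []
    for i, value in enumerate(llrs):
        # Each edge excludes its own input: the edge that supplied min1
        # receives min2, every other edge receives min1.
        magnitude = min2 if i == idx1 else min1
        magnitude = max(scale * magnitude - offset, 0.0)
        # Dividing the total sign product by this edge's own sign.
        sign = sign_product if value >= 0 else -sign_product
        messages.append(sign * magnitude)
    return messages
```

Because every outgoing magnitude is either the smallest or the second-smallest input magnitude, a check node only needs those two values, the index of the smallest one, and the overall sign, which is what makes a dedicated two-minimum finder a natural hardware building block.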
|
462 |
A SINDy Hardware Accelerator For Efficient System Identification On Edge Devices
Gallagher, Michael Sean 01 March 2024
The SINDy (Sparse Identification of Non-linear Dynamics) algorithm is a method of turning a set of data representing non-linear dynamics into a much smaller set of equations composed of non-linear functions summed together. This provides a human-readable system model that represents the dynamic system analyzed. The SINDy algorithm is important for a variety of applications, including high-precision industrial and robotic applications. A hardware accelerator was designed to decrease the time spent doing calculations. This thesis proposes an efficient hardware accelerator approach for a broad range of applications that use SINDy and similar system identification algorithms. The accelerator leverages systolic arrays both for integrated neural network models and for other numerical solvers. The novel and efficient reuse of similar processing elements gives this approach a minimal footprint, so that it could be added to microcontroller devices or implemented on lower-cost FPGA devices. Our proposed approach also allows the designer to offload calculations from controller nodes onto edge devices, and it requires less communication from those edge devices to the controller because of the reduced equation space.
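To make the idea of reducing data to a small set of summed non-linear terms concrete, here is a minimal pure-Python sketch of sequentially thresholded least squares, the sparse regression step commonly associated with SINDy. It says nothing about how the accelerator in the thesis is organized. The function names (`solve_linear`, `least_squares`, `stlsq`), the `threshold` value and the candidate library used in the usage note are assumptions chosen for this example.

```python
def solve_linear(a, b):
    """Solve the square system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):  # back substitution
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def least_squares(theta, target):
    """Least-squares fit through the normal equations (theta^T theta) w = theta^T target."""
    cols = len(theta[0])
    gram = [[sum(row[i] * row[j] for row in theta) for j in range(cols)]
            for i in range(cols)]
    rhs = [sum(row[i] * t for row, t in zip(theta, target)) for i in range(cols)]
    return solve_linear(gram, rhs)


def stlsq(theta, target, threshold=0.1, max_iter=10):
    """Sequentially thresholded least squares (illustrative sketch).

    theta  : rows of candidate-function values, one row per sample.
    target : measured time derivative for each sample.
    Returns one coefficient per candidate function; pruned terms are 0.0.
    """
    n_terms = len(theta[0])
    active = list(range(n_terms))  # indices of terms still in the model
    coeffs = [0.0] * n_terms
    for _ in range(max_iter):
        if not active:
            break
        sub = [[row[j] for j in active] for row in theta]
        fitted = least_squares(sub, target)
        coeffs = [0.0] * n_terms
        for j, c in zip(active, fitted):
            coeffs[j] = c
        kept = [j for j in active if abs(coeffs[j]) >= threshold]
        if kept == active:  # nothing pruned: the sparse model is stable
            break
        for j in active:
            if j not in kept:
                coeffs[j] = 0.0
        active = kept
    return coeffs
```

For instance, with samples of a hypothetical system dx/dt = -2x + 0.5x^3 and the candidate library [1, x, x^2, x^3], the fit keeps only the x and x^3 terms, so a device would only need to communicate two coefficients instead of the raw samples.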
|
463 |
VLSI Implementation of a Run-time Reconfigurable Custom Computing Integrated Circuit
Musgrove, Mark D. 07 November 1996
The growth of high performance computing to date can largely be attributed to continuing breakthroughs in materials and manufacturing. In order to increase computing capacity beyond these physical bounds, new computing paradigms must be developed that make more efficient use of existing manufacturing technologies. Custom Computing Machines (CCMs) are an emerging class of computers that offer promising possibilities for future high-performance computational needs. With the increasing popularity of the run-time reconfigurable (RTR) concept in the CCM community, questions have arisen as to what computational device should be at the heart of an RTR platform. Currently the preferred device, and really the only practical device, has been the RAM-based Field-Programmable Gate Array (FPGA).
Unfortunately, for applications that require high performance, FPGAs are limited by their narrow data path and small computational density. The Colt integrated circuit has been designed from the start to be the computational processing element in an RTR platform. Colt is an RTR data-flow processor array with a coarse-grain architecture (16-bit data path). This thesis covers the VLSI implementation and verification of the Colt integrated circuit, including the approach and methods necessary to make a functionally working integrated circuit. / Master of Science
|
464 |
Un outil pour la spécification de matériel et la génération de modèles exécutables
Babkine, Philippe-André 03 1900
Thesis digitized by the Direction des bibliothèques of the Université de Montréal. / This thesis presents a tool and a methodology for the rapid development of structured models used to verify the functionality of VLSI digital systems.
The model of a digital system is built by describing the behaviour of the system as observed at its interface with the outside world. Timing diagrams represent the basic operations of the system interface and make explicit the temporal relationships between the events that make up each operation. This allows the timing aspect of the operation to be described precisely. Its functional aspect is captured by annotating the events of a timing diagram with procedures and functions.
Composing the basic timing diagrams with hierarchical operators completes the description of the system's behaviour as observable at its interface. Ideally, the annotated hierarchical timing diagrams are captured graphically. A description language for these diagrams is proposed to serve as an intermediate form between the capture tools and the applications that use the diagrams.
One of these applications is addressed: the generation of executable system models in a hardware description language from their description as annotated hierarchical timing diagrams. The algorithmic aspect of this simulation is discussed, and an example illustrating the modelling concepts is developed.
Modelling by hierarchically composing the basic operations of a system, on which the temporal and functional aspects are captured, allows models to be developed quickly and accurately. The ability to execute these models makes it possible to verify the behaviour of a system before development of its hardware implementation even begins. Potential problems in the system are thus identified at a stage where they are less costly to fix.
|
465 |
Implementación en hardware de sistemas de alta fiabilidad basados en metodologías estocásticas
Canals Guinand, Vicente José 27 July 2012
Today's society increasingly demands computationally intensive applications that must be implemented in an energy-efficient way. This forces the semiconductor industry to sustain the continuous progression of CMOS technology. However, experts predict that the end of the era of CMOS progression is approaching, since CMOS technology is expected to reach its limit around 2020. When it reaches the point known as the "Red Brick Wall", physical, technological and economic limitations will make it unviable to continue along this path. This has led both public and private institutions, over the last decade, to invest in the development of alternative technological solutions such as nanotechnology (nanotubes, nanowires, graphene-based technologies, etc.). In this thesis we propose an alternative solution for tackling some computationally demanding problems. This solution uses current CMOS technology but replaces the classical form of computing developed by Von Neumann with unconventional forms of computing. This is the case of computing based on pulsed logic and, in particular, of stochastic computing, which increases the reliability and the parallelism of digital systems.
This thesis presents the development and evaluation of a complete set of stochastic computing blocks implemented with classical digital elements. From these blocks, several computationally efficient methodologies are proposed that allow some massive computing problems to be addressed much more efficiently than with classical digital electronics. In particular, the study focuses on problems related to the field of pattern recognition.
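For readers unfamiliar with stochastic computing, the following minimal sketch shows the standard unipolar encoding, in which a value in [0, 1] is represented by the fraction of ones in a random bit-stream and a single AND gate multiplies two independent streams. It is a software illustration only and does not reproduce the blocks developed in the thesis. The function names, the stream length and the use of Python's `random` module as the bit source are assumptions made for this example.

```python
import random


def to_stream(probability, length, rng):
    """Encode a value in [0, 1] as a bit-stream whose bits are 1 with that probability."""
    return [1 if rng.random() < probability else 0 for _ in range(length)]


def from_stream(bits):
    """Decode a unipolar bit-stream: the value is the fraction of ones."""
    return sum(bits) / len(bits)


def stochastic_multiply(a, b, length=10000, seed=0):
    """Multiply two values in [0, 1] with a bitwise AND of independent streams.

    For independent streams, P(x AND y = 1) = P(x = 1) * P(y = 1), so the
    decoded output approximates a * b. Accuracy improves with stream length.
    """
    rng = random.Random(seed)  # hypothetical software stand-in for a hardware random source
    stream_a = to_stream(a, length, rng)
    stream_b = to_stream(b, length, rng)
    product = [x & y for x, y in zip(stream_a, stream_b)]
    return from_stream(product)
```

Calling `stochastic_multiply(0.5, 0.4, length=100000)` returns a value close to 0.2. The result is approximate by construction, and a single flipped bit changes it only slightly, which is one common argument for the fault tolerance of this style of computation.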
|
466 |
VLSI αρχιτεκτονική χαμηλής κατανάλωσης για συγχρονισμό σε Multi-band UWB ασύρματα δίκτυα
Πούλος, Αθανάσιος 30 July 2007
The library (ΒΥΠ) holds a printed copy of the thesis in the doctoral dissertation stacks on the ground floor of its building. / UWB (Ultra Wide-Band) digital systems enable wireless transmission at very high rates. Because of the large bandwidth, the channel introduces multiple reflections that carry a large share of the useful energy of the transmitted signal. The receiver's ability to capture as much of this useful energy as possible affects overall system performance. The use of orthogonal frequency-division multiplexing (OFDM) modulation, which in this case (UWB) is combined with multi-band transmission, simplifies management of the overall frequency spectrum. However, OFDM modulation is particularly sensitive to both inter-symbol interference (ISI) and inter-carrier interference (ICI), owing to the strongly dispersive nature of the channel and to any deviations between the transmitter and receiver oscillators. This calls for sophisticated time and frequency synchronization algorithms between transmitter and receiver for proper operation.
In this thesis, suitable algorithms for the problems above will be selected; they must meet the specifications of the international standard 802.15.3a, which is still being drafted. Optimal VLSI architectures will then be developed, targeting both low implementation cost and low power consumption. / This project studies low-power VLSI architectures for synchronization algorithms in multi-band UWB wireless systems. The main issues are the timing and frequency synchronization algorithms.
|
467 |
Conception d'un circuit integre arbitre de bus de communication multiprotocoles : ABC M
Barone, Dante Augusto Couto January 1984
The study of several parallel communication buses for multi-microprocessor use (the SM 90 bus, MULTIBUS and VME), together with their associated arbitration techniques, led us to examine the compatibility of the SM 90's integrated bus arbiter, the ABC 90 (which offers the most powerful functionality), with the other bus types (MULTIBUS and VME). The first stage of the study proposes using the ABC 90 as the bus-allocation unit in different architectural configurations through the addition of discrete components. The second stage proposes an integrated multi-protocol bus arbiter circuit, starting from the ABC 90 specifications and incorporating the results of the first proposal. Both proposals were validated by simulation.
|
470 |
Timing-Driven Routing in VLSI Physical Design Under Uncertainty
Samanta, Radhamanjari January 2013
The multi-net Global Routing Problem (GRP) in VLSI physical design is the problem of routing a set of nets subject to limited resources and delay constraints. Various state-of-the-art routers are available, but their main focus is to optimize wire length and minimize overflow. However, optimizing wire length does not necessarily satisfy the timing constraints at the sink nodes. Also, in modern nanometer-scale VLSI processes, process variations must be considered to ensure reasonable yield at the fab. In this work, we try to find a fundamental strategy to address the timing-driven Steiner tree construction (i.e., routing) problem subject to congestion constraints and process variation.
For congestion mitigation, the gradient-based concurrent approach (over all nets) of Erzin et al., rather than the traditional (sequential) rip-up and reroute, is adopted in order to propagate the timing/delay-driven property of the Steiner tree candidates. The existing sequential rip-up and reroute methods meet the overflow constraint locally but cannot propagate the timing constraint, which is non-local in nature. We build on this approach to accommodate the variation-aware statistical delay/timing requirements.
To further reduce congestion, the cost function of the tree generation method is updated by adding a history-based congestion penalty to the base cost (delay). Iterative use of the timing-driven Steiner tree construction method and the history-based tree construction procedure generates a diverse pool of candidate Steiner trees for each net. The gradient algorithm picks one tree for each net from the pool such that congestion is efficiently controlled.
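The idea of adding a history-based congestion penalty to a delay-based cost can be sketched on a much simpler problem than the one treated in the thesis: a single two-pin connection routed with Dijkstra's algorithm instead of a Steiner tree with statistical delays. The sketch below is only meant to show how an accumulated penalty steers later routes away from overused edges. All names (`route`, `update_history`, `edge_key`), the `weight` and `increment` parameters and the capacity model are assumptions made for this example.

```python
import heapq


def edge_key(u, v):
    """Undirected edge identifier, independent of traversal direction."""
    return tuple(sorted((u, v)))


def route(graph, source, sink, history, weight=1.0):
    """Shortest path where each edge costs delay + weight * history penalty.

    graph   : {node: [(neighbour, delay), ...]}
    history : {edge_key: accumulated congestion penalty}
    Returns the list of nodes from source to sink, or None if unreachable.
    """
    dist = {source: 0.0}
    prev = {}
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float("inf")):
            continue  # stale heap entry
        if u == sink:
            break
        for v, delay in graph.get(u, []):
            cost = delay + weight * history.get(edge_key(u, v), 0.0)
            candidate = d + cost
            if candidate < dist.get(v, float("inf")):
                dist[v] = candidate
                prev[v] = u
                heapq.heappush(heap, (candidate, v))
    if sink not in dist:
        return None
    path = [sink]
    while path[-1] != source:
        path.append(prev[path[-1]])
    return path[::-1]


def update_history(history, usage, capacity, increment=1.0):
    """Raise the penalty of every edge whose usage exceeds its capacity."""
    for edge, used in usage.items():
        if used > capacity:
            key = edge_key(*edge)
            history[key] = history.get(key, 0.0) + increment
```

In a small graph with a fast route A-B-D and a slower route A-C-D, the first call to `route` picks A-B-D; once `update_history` has penalized an overused edge A-B, the next call returns A-C-D even though its raw delay is higher. Repeating this loop is one simple way to obtain several distinct candidate routes per net.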
As technology scales down, process variation makes process-dependent parameters such as resistance and capacitance non-deterministic. As a result, Statistical Static Timing Analysis (SSTA) has replaced traditional static timing analysis in nanometer-scale VLSI processes. However, this poses a challenge for the max/min-plus algebra of the Dijkstra-like approximation algorithm that builds the Steiner trees. A new approach based on the distance between distributions for finding the maximum/minimum at the nodes is presented in this thesis. Under this metric, the approximation algorithm for variation-aware, timing-driven, congestion-constrained routing is shown to be provably tight and one order of magnitude faster than existing approaches (which are not tight) such as MVERT.
The results (mean value) of our variation-aware router are quite close to the mean of several thousand Monte Carlo simulations of the deterministic router, i.e., the results converge in mean. Therefore, instead of running so many deterministic Monte Carlo simulations, we can generate an average design with a probability distribution reasonably close to that of the actual behaviour of the design by running the proposed statistical router only once, at a small fraction of the computational effort involved in physical design in nano-regime VLSI.
The above approximation algorithm is extended to local routing, especially non-Manhattan lambda routing which is increasingly being allowed by the recent VLSI technology nodes. Here also, we can meet delay driven constraints better and keep related wire lengths reasonable.
|