Global ETD Search

91	Elastic circuits in FPGA Silva, Thiago de Oliveira January 2017 The advance of microelectronics has brought increased density to integrated circuits, allowing high-complexity functions to be implemented in smaller silicon areas. As a side effect of this large-scale integration, wire latencies have become a larger fraction of a design's data propagation delay, turning timing closure into a challenging task that often demands several iterations among design phases. By reviewing Latency-Insensitive theory, this work explores the Elastic Design methodology in synchronous circuits, with the objective of addressing the impact of increased wire latency on the integrated circuit design flow without requiring a major paradigm change for designers. To exemplify the elasticization process, the educational Neander microprocessor architecture is implemented synchronously and then turned into an Elastic Circuit by using a latency-insensitive protocol for the data transfers between the design's computational processes. Both designs are validated on an FPGA platform using well-established synchronous design tools and flow. The timing and area comparison between the designs demonstrates that the Elastic version can offer performance advantages for more complex systems at the price of increased area. These results show that the Elastic Design methodology is a good candidate for designing complex integrated circuits without costly iterations between design phases. The methodology also reuses the widely adopted synchronous design tools, resulting in a cost-effective alternative for designers. Microelectronics Digital circuits Elastic Circuits Asynchronous Circuits Synchronous Circuits ASIC FPGA IC Design Methodology Digital IC
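The latency-insensitive protocol mentioned in this abstract can be illustrated with a small cycle-level model. The sketch below is not taken from the thesis: it assumes a generic valid/ready handshake, a two-slot buffer per stage, and hypothetical names (`ElasticBuffer`, `run`, `consumer_ready`). It only shows the property that makes a design elastic: adding buffer stages (extra wire latency) or stalling the consumer changes how many cycles a transfer takes, but not which data arrive or in what order.

```python
from collections import deque


class ElasticBuffer:
    """Two-slot FIFO with a valid/ready handshake (illustrative model only)."""

    def __init__(self, capacity=2):
        self.capacity = capacity
        self.slots = deque()

    def valid(self):
        # The buffer offers data downstream whenever it holds a token.
        return len(self.slots) > 0

    def ready(self):
        # The buffer accepts data from upstream whenever it has a free slot.
        return len(self.slots) < self.capacity


def run(tokens, stages, consumer_ready):
    """Send tokens through `stages` buffers; return (received, cycles)."""
    chain = [ElasticBuffer() for _ in range(stages)]
    pending = deque(tokens)
    received = []
    cycle = 0
    while len(received) < len(tokens):
        # Decide every transfer from the state at the start of the cycle,
        # as clocked registers would.
        src_valid = [bool(pending)] + [b.valid() for b in chain]
        dst_ready = [b.ready() for b in chain] + [consumer_ready(cycle)]
        fire = [v and r for v, r in zip(src_valid, dst_ready)]
        # Apply transfers from the consumer side back to the producer.
        for link in range(stages, -1, -1):
            if not fire[link]:
                continue
            if link == 0:
                item = pending.popleft()
            else:
                item = chain[link - 1].slots.popleft()
            if link == stages:
                received.append(item)
            else:
                chain[link].slots.append(item)
        cycle += 1
    return received, cycle
```

Calling `run(list(range(10)), 4, lambda c: c % 3 != 0)` models four added stages and a consumer that stalls every third cycle; the received list matches the input, only the cycle count grows.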
92	Arquitetura para inversão de matrizes usando circuito divisor eficiente baseado no algoritmo Goldschmidt Marques, Pedro Luís Carneiro 05 December 2016 Matrix inversion is required in several signal processing applications. Among them, adaptive filtering based on the Affine Projection algorithm includes a matrix inversion, which adds high computational complexity. There are several algorithms for matrix inversion, and their complexity depends on the size of the matrix, which varies with the target application. This dissertation proposes a dedicated hardware implementation of the analytical matrix inversion algorithm. This algorithm is the most appropriate for a 2x2 matrix, which is a suitable size for implementing the Affine Projection algorithm in several practical applications. In the matrix inversion block, the divider circuit is the component that adds the highest computational complexity. Among the division algorithms in the literature, those based on functional iteration are considered the fastest, because they can take advantage of high-speed multipliers to converge quadratically to a result. Of these, the Newton-Raphson and Goldschmidt algorithms are the most widely used. However, the Goldschmidt algorithm has been preferred in applications that demand high processing speed because, unlike the Newton-Raphson algorithm, in which the multiplications depend on each other, its multiplications are performed in parallel. This work proposes the hardware implementation of an efficient divider circuit based on the Goldschmidt algorithm. The divider uses a radix-4 multiplier from the literature, which makes it more efficient in terms of power dissipation than a divider that uses the multiplier provided by the synthesis tool. The proposed divider extends the range of operand values by using the Q7.8 format, which allows values between -127.99609375 and +127.99609375, whereas the original Goldschmidt divider supports only a narrow range of values between 1 and 2. The main results show that using the proposed efficient Goldschmidt divider lowers the power dissipation of the matrix inverter circuit, which makes it attractive for a future hardware implementation of the complete Affine Projection architecture. Engineering
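To make the abstract's description concrete, the following is a minimal floating-point sketch of textbook Goldschmidt division, a Q7.8 conversion helper, and the analytical 2x2 inverse. It is not the dissertation's circuit: the function names, the fixed iteration count, and the symmetric saturation at +/-32767 (chosen to match the quoted range) are assumptions, and the sketch does not model how the proposed divider widens the operand range beyond [1, 2).

```python
def goldschmidt_divide(n, d, iterations=5):
    """Approximate n / d for a divisor already scaled into [1, 2)."""
    if not 1.0 <= d < 2.0:
        raise ValueError("this sketch expects the divisor in [1, 2)")
    for _ in range(iterations):
        f = 2.0 - d  # correction factor for this iteration
        n = n * f    # these two products do not depend on each other,
        d = d * f    # so hardware can compute them in parallel
    # d has been driven towards 1, so n now approximates the quotient.
    return n


def to_q7_8(x):
    """Quantize to a 16-bit word with 8 fractional bits (assumed saturation)."""
    raw = round(x * 256)
    return max(-32767, min(32767, raw))


def from_q7_8(raw):
    """Convert a Q7.8 word back to a real value."""
    return raw / 256


def invert_2x2(a, b, c, d):
    """Analytical inverse of [[a, b], [c, d]] using one reciprocal."""
    det = a * d - b * c
    r = goldschmidt_divide(1.0, det)  # 1 / det, with det assumed in [1, 2)
    return [[d * r, -b * r], [-c * r, a * r]]
```

The largest Q7.8 word, 32767, corresponds to 32767 / 256 = 127.99609375, the upper limit quoted in the abstract. Convergence of this basic iteration is quadratic but starts slowly when the divisor is close to 2, so a fixed count of five iterations is only adequate for divisors well inside the interval.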
93	Energy Efficient Hardware Design of Neural Networks January 2018 Hardware implementation of deep neural networks is gaining significant importance. Deep neural networks are mathematical models that use learning algorithms inspired by the brain. Numerous deep learning algorithms such as multi-layer perceptrons (MLP) have demonstrated human-level recognition accuracy in image and speech classification tasks. These networks are built from multiple layers of processing elements called neurons, with connections between them called synapses. Hence, they involve operations that exhibit a high level of parallelism, making them computationally and memory intensive. Constrained by computing resources and memory, most applications require a neural network that uses less energy. Energy-efficient implementation of these computationally intensive algorithms on neuromorphic hardware demands many architectural optimizations. One of these optimizations is reducing the network size through compression, and several studies have investigated compression by introducing element-wise or row-/column-/block-wise sparsity via pruning and regularization. Additionally, numerous recent works have concentrated on reducing the precision of activations and weights, with some reducing it to a single bit. However, combining various sparsity structures with binarized or very-low-precision (2-3 bit) neural networks has not been comprehensively explored. Output activations in these deep neural network algorithms are usually non-binary, making it difficult to exploit sparsity. On the other hand, biologically realistic models like spiking neural networks (SNN) closely mimic the operations in biological nervous systems and explore new avenues for brain-like cognitive computing. These networks deal with binary spikes, and they can exploit the input-dependent sparsity or redundancy to dynamically scale the amount of computation, in turn leading to energy-efficient hardware implementation. This work discusses a configurable spiking neuromorphic architecture that supports multiple hidden layers by exploiting hardware reuse. It also presents design techniques for minimum-area/-energy DNN hardware with minimal degradation in accuracy. Area, performance and energy results for the DNN and SNN hardware are reported for the MNIST dataset. The neuromorphic hardware designed for the SNN algorithm in 28nm CMOS demonstrates high classification accuracy (>98% on MNIST) and low energy (51.4 - 773 nJ per classification). The optimized DNN hardware designed in 40nm CMOS, which combines 8X structured compression and 3-bit weight precision, showed 98.4% accuracy at 33 nJ per classification. / Dissertation/Thesis / Masters Thesis Electrical Engineering 2018 Engineering Computer engineering Accelerators ASIC Energy efficient Hardware design Neural Networks Spiking neural networks
94	The Hybrid Architecture Parallel Fast Fourier Transform (HAPFFT) Palmer, Joseph M. 16 June 2005 The FFT is an efficient algorithm for computing the DFT. It drastically reduces the cost of implementing the DFT on digital computing systems. Nevertheless, the FFT is still computationally intensive, and continued technological advances in computing demand larger and faster implementations of this algorithm. Past attempts at producing small, high-performance FFT implementations have focused on custom hardware (ASICs and FPGAs). Ultimately, the most efficient have been single-chip, streaming I/O, pipelined FFT architectures. These architectures increase computational concurrency through the use of hardware pipelining. Streaming I/O, pipelined FFT architectures are capable of accepting a single data sample every clock cycle. In principle, the maximum clock frequency of such a circuit is limited only by its critical delay path. The delay of the critical path may be decreased by adding pipeline registers, but this solution gives diminishing returns. Thus, the streaming I/O, pipelined FFT is ultimately limited in the maximum performance it can provide. Attempts have been made to map the Parallel FFT algorithm to custom hardware. Yet the Parallel FFT was formulated and optimized to execute on a machine with multiple identical processing elements. When executed on such a machine, the FFT incurs a large communication cost. Therefore, a direct mapping of the Parallel FFT to custom hardware results in a circuit with complex control and global data movement. This thesis proposes the Hybrid Architecture Parallel FFT (HAPFFT) as an alternative. The HAPFFT is an improved formulation for building Parallel FFT custom hardware modules. It provides improved performance, efficient resource utilization, and reduced design time. The HAPFFT is modular in nature. It includes a custom front-end parallel processing unit which produces intermediate results. The intermediate results are sent to multiple independent FFT modules. These independent modules form the back-end of the HAPFFT and are generic, meaning that any preexisting FFT architecture may be used. With P back-end modules, a speedup of P is achieved in comparison to an FFT composed of a single module. Furthermore, the HAPFFT defines the front-end processing unit as a function of P. It hides the high communication costs typically seen in Parallel FFTs. Reductions in control complexity, memory demands, and logic resources are achieved. An extraordinary result of the HAPFFT formulation is sublinear area-time growth, a phenomenon that is equivalently called superlinear speedup. This thesis subsequently uses the term superlinear speedup to refer to the HAPFFT's outstanding speedup behavior. A further benefit of the HAPFFT formulation is reduced design time. Because the HAPFFT defines only the front-end module, and because the back-end parallel modules may be composed of any preexisting FFT modules, total design time for a HAPFFT is greatly reduced. fft parallel fft fast fourier transform asic fpga field programmable gate arrays architecture Electrical and Computer Engineering
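The front-end/back-end split described above can be illustrated with the standard decimation-in-frequency identity, in which a P-point butterfly and a twiddle multiplication turn one N-point transform into P independent (N/P)-point transforms. The abstract does not state that the HAPFFT uses exactly this decomposition, so the sketch below is an assumption, and the names `dft`, `front_end`, and `parallel_fft` are hypothetical. A naive DFT stands in for "any preexisting FFT module".

```python
import cmath


def dft(x):
    """Reference O(N^2) DFT, standing in for any existing FFT module."""
    n = len(x)
    return [
        sum(x[t] * cmath.exp(-2j * cmath.pi * t * k / n) for t in range(n))
        for k in range(n)
    ]


def front_end(x, p):
    """Split an N-point problem into p independent (N/p)-point streams."""
    n = len(x)
    m = n // p
    streams = []
    for s in range(p):
        stream = []
        for i in range(m):
            # p-point butterfly across samples spaced m apart ...
            acc = sum(
                x[i + m * q] * cmath.exp(-2j * cmath.pi * q * s / p)
                for q in range(p)
            )
            # ... followed by a twiddle-factor multiplication.
            stream.append(acc * cmath.exp(-2j * cmath.pi * i * s / n))
        streams.append(stream)
    return streams


def parallel_fft(x, p, back_end=dft):
    """Compute the DFT of x using p independent back-end transforms."""
    n = len(x)
    if n % p:
        raise ValueError("length must be a multiple of p")
    out = [0j] * n
    for s, stream in enumerate(front_end(x, p)):
        # Each back-end module processes its own stream; no data is
        # exchanged between modules after the front end.
        for r, value in enumerate(back_end(stream)):
            out[p * r + s] = value  # stream s yields outputs s, s+p, s+2p, ...
    return out
```

In this decomposition all inter-stream data movement is confined to the front end, which is consistent with the abstract's point that the back-end modules can be arbitrary existing FFT cores.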
95	Why it hurts to exercise: a study of sex, acid sensing ion channels, and fatigue metabolites in the onset of muscle pain Gregory, Nicholas Scott 01 May 2015 Exercise has numerous health benefits. Yet exercise can exacerbate pain for individuals with chronic musculoskeletal pain conditions such as myofascial pain syndrome (MPS) and fibromyalgia (FM). The exacerbation is out of proportion to the activity performed and lasts for long periods of time even after the cessation of activity. This pain acts as a barrier to healthy exercise and physical rehabilitation, which, when applied consistently, are effective treatments for MPS and FM, two diseases that produce substantial suffering and disability. The goal of the proposed studies is to determine the underlying peripheral mechanisms that contribute to enhanced pain following exercise. A better understanding of these mechanisms will lead to better pain management and prevention for these diseases. Previous data show that two hours of running wheel activity lowers the threshold necessary to induce muscle pain by acidic saline injection, producing robust pain behaviors to normally innocuous stimuli. Muscle activity that produces fatigue is associated with extracellular increases in protons, lactate, and ATP. These fatigue metabolites can directly activate muscle nociceptors and, when combined, produce a potentiated effect. Acid sensing ion channels (ASICs) are non-selective cation channels that open in response to increased proton concentrations, a response that is enhanced when lactate binds at a separate location. Ionotropic purinergic receptors (P2X) similarly produce an inward current in response to elevated ATP. Evidence suggests certain ASIC and P2X subtypes are capable of a physical interaction that allows ASIC activation at lower proton concentrations in the presence of ATP. This suggests that ATP, lactate, and protons released during exercise could activate ASIC and P2X receptors on muscle nociceptors, exciting the nociceptors and sensitizing them to subsequent muscle insult. However, the limitations of these experiments leave several gaps. First, the running wheel task fails to produce measurable increases in fatigue metabolites, possibly because fatigue was minimal (10%) or because metabolite levels quickly return to baseline. Further, the running wheel task depends on central nervous system (CNS) activity and volitional running, which may introduce confounding factors upstream of muscle activation and result in large variation in the rate and duration of running. Second, it is unclear whether ASICs are necessary for the development of mechanical hyperalgesia induced by muscle activity, nor is it understood which ASIC subtypes might be required for such an effect. Finally, the molecules necessary for the induction of mechanical hyperalgesia after exercise are not known. Protons, lactate, and ATP have been suggested, but it is not known whether these compounds are themselves sufficient or whether they interact in an additive or synergistic manner. We address these concerns by developing an electrically stimulated muscle fatigue paradigm that reliably fatigues a single muscle independent of the CNS, allowing for metabolite measurement during muscle activity and in vivo study of molecular mechanisms of muscle pain in the peripheral tissue. We then use genetic and pharmacologic approaches to test the role of ASIC subtypes in the development of mechanical hyperalgesia after exercise. Finally, we test the effectiveness of by-products of muscle activity in recapitulating the effects of the exercise-enhanced pain model. ASIC Fatigue Ion Channels Muscle Pain Sex Differences Neuroscience and Neurobiology
96	Effects of overexpressing ASIC2a and ASIC3 in transgenic mice Costa, Vivian 01 July 2009 Acid-sensing ion channels (ASICs) are proton-gated cation channels expressed throughout the nervous system. These channels are activated by acidic pH conditions within an attainable physiologic range. The specific function of these channels has proven to be elusive, but it is clear that they are involved in various neuronal processes, both in the central nervous system and in the periphery. To further study the functions of these channels in an animal model system, transgenic animals were generated that overexpress individual ASIC subunits: ASIC2a and ASIC3. Transgenic proteins were detectable in brain and peripheral nervous tissue, and each had differential effects on acid-gated current properties in cultured neurons. Transgenes included N-terminal epitope tags to distinguish them from endogenous ASICs, and expression was driven by a pan-neuronal promoter. Mechanical and thermal sensory behaviors were tested in the transgenic mice, but no effect was observed. The most interesting effect of overexpressing ASIC3 was impaired conditioned fear behavior in the transgenic animals, with no effect on unconditioned fear. ASIC3 transgenic mice behave like ASIC1a knockout mice in conditioned fear behaviors. Transgenic ASIC3 interacts with endogenous ASIC1 and is likely altering the subunit composition of ASIC channels in the brain without abolishing proton-gated currents, as happens in the ASIC1a knockout. Overexpressing these two ASIC subunits in transgenic animals has produced tools that may be used to further study the functions of these channels. While this is still an artificial setting for studying ASIC functions, it nonetheless provides an in vivo method to study the effects of altering subunit composition in a whole animal and its behavioral effects, as well as in vivo expression of transgenes that can be studied biochemically. It is hoped that studying localization in the transgenic mice will afford a better understanding of the localization and function of endogenous channels without the limitations of generating antibodies against endogenous mouse ASIC proteins, work that is still in progress. acid-sensing ion channel ASIC fear conditioning mechanosensation transgenic Neuroscience and Neurobiology
97	High Level Synthesis Of An Image Processing Algorithm For Cancer Detection Bilhanan, Anuleka 29 March 2004 There is a crucial need for real-time detection and diagnosis in digital mammography. To date, most computer-aided analysis applications are software driven and normally require long processing times. Digital filtering is often the initial stage in processing mammograms for both automated detection and tissue characterization, which relies on Fourier analysis. The main objective of this research is to lay the groundwork for converting software-driven mammography applications to hardware implementations by using Application-Specific Integrated Circuits (ASICs). The long-term goal is to increase processing speed. This research focuses on achieving the main objective by using one specific mammographic image processing application for demonstration purposes. ASICs offer high performance at the price of high development costs and are suitable for real-time diagnosis. In this research, we develop a behavioral VHDL model of a specific filtering algorithm. Automatic Design Instantiation System (AUDI), a high-level synthesis tool, is used to automatically synthesize an RTL design from the model. A floating-point behavioral component library is developed to support the synthesis of the filtering algorithm. The work shows that the hardware output is identical to the software-driven output when considering eight-bit accuracy and shows only rounding errors at higher storage capacities. behavioral synthesis AUDI ASIC Fourier transform deconvolution American Studies Arts and Humanities
98	Spectromètre-autocorrélateur numérique spatialisable pour l'instrument FIRST-HIFI Ravera, Laurent 28 October 1999 This thesis is part of an R&T (research and technology) program whose objective is to space-qualify digital autocorrelation spectrometry techniques, in particular for HIFI, the heterodyne instrument of ESA's FIRST (Herschel) space mission. Scheduled for launch in 2007, FIRST will be the first submillimeter observatory in space. The constraints of spaceflight in terms of mass, volume and power consumption required major developments. An optimized spectrometer architecture was designed around highly integrated digital ASICs. It comprises three main elements. 1) An analog subsystem selects eight 250 MHz sub-bands from the input spectral band (4-8 GHz). 2) Each analog signal is digitized on 2 bits / 3 levels by a BiCMOS ASIC clocked at 550 MHz. 3) Digital correlation modules, also clocked at 550 MHz, compute 1024 correlation coefficients on 28 bits. They can be cascaded or used in parallel to favor either resolution or bandwidth. A correlation module consists of full-custom gallium arsenide ASICs developed in the Vitesse Hgaas4 0.5 µm technology and CMOS ASICs developed from the AMS 0.6 µm standard-cell library. The former compute the correlation products at high frequency; the latter accumulate the results at a lower frequency and handle data acquisition. A 4x180 MHz prototype spectrometer was integrated and tested. Laboratory and telescope tests validated the chosen architecture and identified several critical parameters, such as the shape of the spectral band and the data acquisition format. We then designed and optimized a correlator for the FIRST-HIFI instrument. A demonstration model (2x250 MHz) of the spectrometer, based on 1024-channel correlators, is currently under development and will be tested at the end of 1999. Digital correlator Spectrometer Radio astronomy ASIC
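The signal chain outlined in this abstract (coarse quantization, accumulation of lag products, then a spectrum derived from the correlation coefficients) can be sketched in a few lines. This is a generic illustration of an autocorrelation spectrometer, not the thesis design: the threshold of 0.5, the 64 lags, and all function names are assumptions chosen to keep the example small, whereas the hardware described computes 1024 coefficients.

```python
import math


def quantize_3_level(x, threshold=0.5):
    """Map a sample to -1, 0 or +1 (3 levels, representable on 2 bits)."""
    if x > threshold:
        return 1
    if x < -threshold:
        return -1
    return 0


def autocorrelate(samples, lags):
    """Average the lag products samples[t] * samples[t + k] for each lag k."""
    n = len(samples)
    return [
        sum(samples[t] * samples[t + k] for t in range(n - k)) / (n - k)
        for k in range(lags)
    ]


def power_spectrum(r):
    """Cosine transform of the autocorrelation (Wiener-Khinchin relation)."""
    lags = len(r)
    return [
        r[0] + 2 * sum(
            r[k] * math.cos(math.pi * j * k / lags) for k in range(1, lags)
        )
        for j in range(lags)
    ]
```

With `lags` coefficients, spectral bin `j` corresponds to a frequency of `j / (2 * lags)` cycles per sample, so a tone with a period of 8 samples is expected near bin 16 when 64 lags are used. Coarse quantization adds harmonics and a known sensitivity loss, which a real instrument would correct for; the sketch omits that step.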
99	Canaux ioniques, douleur et analgésie - Effets analgésiques du blocage d'ASIC1a par la Psalmotoxine 1 Mazzuca, Michel 21 December 2007 Studying the molecular mechanisms involved in pain allows us to identify the players with major roles and thus to define excellent therapeutic targets precisely. Ion channels are at the origin of our perception of pain: they are responsible for carrying nociceptive messages, and we have shown that they can also modulate the nociceptive message by activating the opioid system. Acid Sensing Ion Channels (ASICs) are directly involved in the nociceptive perception of tissue acidosis and inflammation. The ASIC3 subunits involved in these perceptions are not sensitive to Psalmotoxin 1 (PcTx1); PcTx1 is a specific blocker of homomeric ASIC1a channels. These ASIC1a channels are expressed in the central nervous system, notably in neurons of the dorsal horn of the spinal cord. Injecting PcTx1 into the cerebrospinal fluid of mice and rats induces effective analgesia in various models of thermal, mechanical, inflammatory and neuropathic pain. The analgesia induced by specific blockade of ASIC1a channels results from stimulation of the μ and δ opioid receptors. These receptors are activated by Met-enkephalin, which is released in large amounts when ASIC1a channels are inhibited. Although the released Met-enkephalin induces tolerance, it does not produce locomotor impairment as morphine does. We have thus shown that blocking ASIC1a in the central nervous system causes a release of Met-enkephalin which, by activating the μ and δ receptors, produces potent analgesia without inducing some of the adverse effects of morphine, a reference analgesic in common clinical use. Pain analgesia ion channels ASIC PcTx1 opioid
100	Electronique d'acquisition d'une gamma-caméra Gaglione, R. 03 November 2005 This thesis work is part of a collaboration between the Application et Valorisation des Interactions Rayonnement-Matière group and the company Hamamatsu to study dedicated, highly integrated electronics for an H8500 multianode photomultiplier. Thanks to their small dead zone and multianode configuration, these photomultipliers improve the performance of gamma cameras used in particular for breast cancer screening (scintimammography). After a specification was drawn up from tests carried out on these photomultiplier tubes, dedicated acquisition electronics are proposed. They consist of a multi-gain current preamplifier, a switched integrator and a ramp analog-to-digital converter. The whole chain is self-triggered on the signal. Several multichannel prototypes of these electronics were built, and their design and test results are presented. Physics gamma camera multianode photomultiplier preamplifier integrator noise mixed-signal ASIC ramp ADC
