Global ETD Search

71	Διερεύνηση επιδόσεων αρχιτεκτονικών υλικού-λογισμικού για εφαρμογές ψηφιακής επεξεργασίας σε FPGA [Performance investigation of hardware-software architectures for digital processing applications on FPGA] Ρώσση, Μαρία-Ευγενία 20 July 2012 Field-programmable gate arrays (FPGAs) are an important technology that allows circuit designers to produce special-purpose integrated circuits in a short time. Their most important characteristics are their architecture, the ability to design them with computer-aided tools, their low power consumption, and the short time needed to reprogram them. FPGAs are well suited to digital filtering applications. The density of these programmable devices is such that the very large number of arithmetic operations that digital filtering requires can be implemented on a single device. The advantages of FPGAs for implementing digital filters include higher sampling rates than traditional DSP chips, lower cost than a moderate ASIC (Application Specific Integrated Circuit) for high-volume applications, and greater flexibility than any alternative approach to implementing FIR filters. Most importantly, they are programmed in-system and can be reprogrammed to implement different filtering functions. The aim of this thesis is to combine VLSI and digital signal processing techniques and, through an understanding of computer architecture, to create a useful application. To that end: a) a FIR filter was developed in a hardware description language, b) the filter was implemented on an FPGA, c) the filter was integrated into an embedded system and connected to the data bus of a processor, and d) the filter was controlled by the processor through a high-level programming language.
The filter was written in VHDL using structural methods, and the system was simulated in ModelSim. The Xilinx ISE Project Navigator was used to check the code and to program the Spartan-3E Starter Board FPGA. The ISE tools PlanAhead and ChipScope Pro were also used to check the operation of the circuit on the FPGA. The circuit was finally integrated into an embedded system using the Xilinx Embedded Development Kit (EDK), and its operation was verified by programming the MicroBlaze processor. In addition, the filter was tested with different FIR coefficient sets obtained from different window functions, and the "ideal" values produced by MATLAB were compared with those produced by the filter. Finally, the power (dynamic and static) consumed by the circuit on the FPGA was measured using XPower Analyzer.
FIR filters 621.395 Field-programmable gate array (FPGA) Spartan 3E starter board Xilinx platform studio Xilinx software development kit
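The comparison described at the end of entry 71 (ideal software values against the values produced by a fixed-point hardware filter) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the Hamming window, the 15 taps, the normalised cutoff of 0.1, the 12 fractional bits, and the test signal are all hypothetical choices, not parameters taken from the thesis.

```python
import math

def hamming_lowpass(num_taps, cutoff):
    """Windowed-sinc low-pass FIR coefficients (Hamming window).

    cutoff is the normalised cutoff in cycles/sample (0 < cutoff < 0.5).
    """
    mid = (num_taps - 1) / 2
    taps = []
    for n in range(num_taps):
        k = n - mid
        # Ideal (infinite) low-pass impulse response, sampled at offset k
        ideal = 2 * cutoff if k == 0 else math.sin(2 * math.pi * cutoff * k) / (math.pi * k)
        window = 0.54 - 0.46 * math.cos(2 * math.pi * n / (num_taps - 1))
        taps.append(ideal * window)
    gain = sum(taps)
    return [t / gain for t in taps]  # normalise to unity DC gain

def quantize(taps, frac_bits):
    """Round coefficients to integers with frac_bits fractional bits."""
    scale = 1 << frac_bits
    return [round(t * scale) for t in taps]

def fir(samples, taps):
    """Direct-form FIR: y[n] = sum_k taps[k] * x[n - k], zero initial state."""
    out = []
    for n in range(len(samples)):
        acc = 0
        for k, c in enumerate(taps):
            if n - k >= 0:
                acc += c * samples[n - k]
        out.append(acc)
    return out

FRAC_BITS = 12  # hypothetical coefficient word length
taps = hamming_lowpass(num_taps=15, cutoff=0.1)
q_taps = quantize(taps, FRAC_BITS)

# Hypothetical integer test signal: a slow sine plus a fast sine
x = [round(100 * math.sin(2 * math.pi * 0.02 * n) + 50 * math.sin(2 * math.pi * 0.4 * n))
     for n in range(64)]

reference = fir(x, taps)                                  # floating-point "ideal" output
fixed = [y / (1 << FRAC_BITS) for y in fir(x, q_taps)]    # integer datapath, rescaled
max_err = max(abs(a - b) for a, b in zip(reference, fixed))
print(f"largest deviation from the floating-point reference: {max_err:.4f}")
```

Swapping the window expression for another window function changes the coefficient set while the rest of the comparison stays the same, which mirrors the experiment the abstract describes.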
72	Análise do uso de redundância em circuitos gerados por síntese de alto nível para FPGA programado por SRAM sob falhas transientes [Analysis of the use of redundancy in circuits generated by high-level synthesis for SRAM-programmed FPGAs under transient faults] Santos, André Flores dos January 2017 This work studies and analyses the susceptibility to radiation effects of circuit designs generated by a High-Level Synthesis tool for FPGAs (Field Programmable Gate Arrays), that is, programmable circuits and Systems-on-Chip (SoCs). An emulation-based fault injector that uses the ICAP (Internal Configuration Access Port) located inside the FPGA makes it possible to inject single or accumulated faults of the SEU (Single Event Upset) type, defined as disturbances that can affect the correct operation of the device through the inversion of a bit by a charged particle. SEUs belong to the class of SEEs (Single Event Effects), which can occur when high-energy particles from space and from the sun (cosmic and solar rays) penetrate the Earth's atmosphere and collide with nitrogen and oxygen atoms, producing secondary particles, most of them neutrons. In this context, in addition to analysing the susceptibility of designs generated by a High-Level Synthesis tool, it becomes relevant to study redundancy techniques such as TMR (Triple Modular Redundancy) for error detection and correction, and to compare them with unprotected designs in order to assess reliability. The results show that under single fault injection the designs with TMR are effective. Under accumulated fault injection, the multi-channel design showed better reliability than both the unprotected design and the design with single-channel redundancy, tolerating a greater number of faults before its operation was compromised.
Microelectronics FPGA Field-Programmable Gate Arrays (FPGAs) Triple Modular Redundancy (TMR) Single Event Upset (SEU) Single Event Effects (SEEs) System-on-Chips (SoCs)
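The difference between single and accumulated upsets under TMR can be illustrated with a small Python model. It is a deliberately simplified sketch: it votes on replicated data words, whereas the thesis injects faults into the FPGA configuration memory through the ICAP. The names `flip_bit` and `vote` and the 8-bit `golden` value are hypothetical.

```python
def flip_bit(word, bit):
    """Emulate a single event upset: invert one bit of a stored word."""
    return word ^ (1 << bit)

def vote(a, b, c):
    """Bitwise 2-of-3 majority voter, as used at the output of a TMR scheme."""
    return (a & b) | (a & c) | (b & c)

golden = 0b1011_0010            # hypothetical fault-free value
replicas = [golden, golden, golden]

# Single fault: one upset in one replica is outvoted by the other two.
replicas[0] = flip_bit(replicas[0], 3)
single_fault_ok = vote(*replicas) == golden

# Accumulated faults: a second upset on the same bit of another replica
# turns the wrong value into the majority, so the voter output is wrong.
replicas[1] = flip_bit(replicas[1], 3)
accumulated_same_bit_ok = vote(*replicas) == golden

# Upsets on different bits of different replicas are still masked,
# because each bit position keeps two correct copies.
spread = [flip_bit(golden, 0), flip_bit(golden, 5), golden]
different_bits_ok = vote(*spread) == golden

print(single_fault_ok, accumulated_same_bit_ok, different_bits_ok)
```

The model shows why accumulated fault injection is a harsher test than single fault injection: TMR only fails once two replicas disagree with the correct value at the same position.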
74	Ferramentas e metodologias de desenvolvimento para sistemas parcialmente reconfiguráveis. / Development tools and methodologies for partial reconfigurable systems. Filippo Valiante Filho 19 May 2008 Some types of FPGA (Field Programmable Gate Array) can be partially reconfigured at run time, forming a Partial Reconfigurable System (PRS), whose use brings several advantages, including cost reduction. One factor limiting wider use of PRSs is the difficulty of accessing and using appropriate development tools. This work covers PRSs, their applications, and an analysis of the existing development tools. It then focuses on improving one of these tools, PARBIT, by developing a graphical user interface (GUI) and updating its development methodology. The design methodologies supported by the FPGA manufacturer are also presented. The methodologies are validated through the design of a PRS.
Reconfigurable architecture CAD FPGA circuits Integrated circuits Reconfigurable computing Microelectronics CAD for FPGAs FPGA Partial reconfigurable system Reconfigurable logic
75	Embebed wavelet image reconstruction in parallel computation hardware Guevara Escobedo, Jorge January 2016 In this thesis an algorithm is demonstrated for the reconstruction of hard-field tomography images through localised block areas, obtained in parallel and from a multiresolution framework. Block areas are subsequently tiled to assemble the full-size image. Because it preserves its compact support after ramp filtering, the wavelet transform has received much attention as a promising way to reduce radiation dose in medical imaging, through the reconstruction of essentially localised regions. In this work, this characteristic is exploited with the aim of reducing the time and complexity of the standard reconstruction algorithm. Because the block images are reconstructed independently, with a geometry that completely covers the reconstructed frame as a single output image, the individual blocks can be reconstructed in parallel and their performance evaluated on a reconfigurable multiprocessor hardware system (i.e. an FPGA). Projection data from a simulated Radon Transform (RT) were obtained at 180 evenly spaced angles. To define every relevant block area within the sinogram, the forward RT was performed over template phantoms representing block frames. Reconstruction was then performed in a domain extending beyond the block frame limits, to allow calibration overlaps when fitting adjacent block images. The 256 by 256 Shepp-Logan phantom was used to test the methodology of both the parallel multiresolution and parallel block reconstruction generalisations. It is shown that reconstructing a single block image in a 3-scale multiresolution framework is around 48 times faster than the standard methodology. Assuming a parallel implementation, this implies that the reconstruction time of a single tile should be very closely related to the reconstruction time of the full-size, full-resolution image. 621.36
76	The realization of signal processing methods and their hardware implementation over multi-carrier modulation using FPGA technology : validation and implementation of multi-carrier modulation on FPGA, and signal processing of the channel estimation techniques and filter bank architectures for DWT using HDL coding for mobile and wireless applications Migdadi, Hassan Saleh Okleh January 2015 The first part of this thesis presents the design, validation, and implementation of an Orthogonal Frequency Division Multiplexing (OFDM) transmitter and receiver on a Cyclone II FPGA chip using the DSP Builder and Quartus II high-level design tools. The resources allocated by the model, in terms of logic elements (LEs) including combinational functions and logic registers, have been investigated and addressed. The results show that the basic OFDM transceiver allocates about 14% (6% at the transmitter and 8% at the receiver) of the available LE resources on an Altera Cyclone II EP2C35F672C6 FPGA chip, largely taken up by the FFT, IFFT and soft decision encoder. Secondly, a new wavelet-based OFDM system with FDPP-DA-based channel estimation is proposed as a reliable ECG Patient Monitoring System (ECG-PMS), a personal wireless telemedicine application. The system performance for different mother wavelets has been investigated. The effects of AWGN and multipath Rayleigh fading channels have also been studied in the analysis. The performances of FDPP-DA-based and HDPP-DA-based channel estimation are compared for both DFT-based OFDM and wavelet-based OFDM systems. The system model was studied using MATLAB, in which the average BER was addressed for randomized data. The main error differences between the reconstructed and original ECG signals, which reflect the quality of the received ECG signals, are established. Finally, a DA-based architecture for the 1-D iDWT/DWT based on an OFDM model is implemented for an ECG-PMS wireless telemedicine application.
In the portable wireless body transmitter unit at the patient site, a fully Serial-DA-based scheme for the iDWT is realized to support higher hardware utilization and lower power consumption, whereas a fully Parallel-DA-based scheme for the DWT is applied at the base unit at the hospital site to support a higher throughput. Behavioural-level HDL models of the proposed system were developed and simulated to confirm their correctness. After simulation, the design models were synthesised and implemented on the target FPGA to validate them. 621.382
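The core of the OFDM transmitter and receiver in entry 76 (an IFFT at the transmitter, an FFT at the receiver) can be sketched in plain Python. This is a minimal illustration over an ideal channel, assuming QPSK mapping, 8 subcarriers, and a 2-sample cyclic prefix; none of these values come from the thesis, and channel estimation is left out entirely. A naive DFT stands in for the FFT/IFFT cores.

```python
import cmath

def dft(values, inverse=False):
    """Naive O(N^2) DFT; the inverse transform includes the 1/N scaling."""
    n = len(values)
    sign = 1 if inverse else -1
    out = []
    for k in range(n):
        acc = sum(v * cmath.exp(sign * 2j * cmath.pi * k * m / n)
                  for m, v in enumerate(values))
        out.append(acc / n if inverse else acc)
    return out

# Hypothetical Gray-coded QPSK constellation: bit pair -> complex symbol
QPSK = {(0, 0): 1 + 1j, (0, 1): -1 + 1j, (1, 1): -1 - 1j, (1, 0): 1 - 1j}

def modulate(bits, cp_len):
    """Map bit pairs to subcarriers, apply the inverse DFT, prepend a cyclic prefix."""
    symbols = [QPSK[(bits[i], bits[i + 1])] for i in range(0, len(bits), 2)]
    time = dft(symbols, inverse=True)
    return time[len(time) - cp_len:] + time

def demodulate(samples, cp_len):
    """Drop the cyclic prefix, apply the DFT, pick the nearest constellation point."""
    freq = dft(samples[cp_len:])
    bits = []
    for s in freq:
        bits.extend(min(QPSK, key=lambda pair: abs(QPSK[pair] - s)))
    return bits

bits = [0, 0, 0, 1, 1, 1, 1, 0, 1, 0, 0, 0, 1, 1, 0, 1]  # 16 bits -> 8 subcarriers
tx = modulate(bits, cp_len=2)
rx_bits = demodulate(tx, cp_len=2)
print(rx_bits == bits)
```

In a hardware design the two `dft` calls correspond to the IFFT and FFT blocks that the abstract identifies as the main consumers of logic elements.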
77	Scalable, Memory-Intensive Scientific Computing on Field Programmable Gate Arrays Mirza, Salma 01 January 2010 Cache-based, general-purpose CPUs perform at a small fraction of their maximum floating-point performance when executing memory-intensive simulations, such as those required for many scientific computing problems. This is due to the memory bottleneck encountered with large arrays that must be stored in dynamic RAM. A system of FPGAs with sufficient memory bandwidth, clocked at only hundreds of MHz, can outperform a CPU clocked at GHz in terms of floating-point performance. An FPGA core designed for a target performance that does not unnecessarily exceed the memory-imposed bottleneck can then be distributed, along with multiple memory interfaces, into a scalable architecture that overcomes the bandwidth limitation of a single interface. Interconnected cores can work together to solve a scientific computing problem and exploit a bandwidth that is the sum of the bandwidth available from all of their connected memory interfaces. The implementation demonstrates this concept of scalability with two memory interfaces through the use of available FPGA prototyping platforms. The FPGAs operate at 133 MHz, which is twenty-one times slower than an AMD Phenom X4 processor operating at 2.8 GHz, and the system of two FPGAs performs eight times slower than the processor for the example problem of sparse matrix-vector multiplication (SMVM) in heat transfer. However, the system is demonstrated to be scalable, with a run time that decreases linearly with respect to the available memory bandwidth. The floating-point performance of a single-board implementation is 12 GFlops, which doubles to 24 GFlops for a two-board implementation, for a gather or scatter operation on matrices of varying sizes.
Scientific Computation on FPGAs Accelerating Scientific Computation Sparse Matrix Vector Multiplications Memory-Intensive Computation Reconfigurable Computing Electrical engineering
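Sparse matrix-vector multiplication, the example problem in entry 77, is memory bound because every nonzero requires an indexed read (a gather) from the input vector. The Python sketch below shows the operation using the compressed sparse row (CSR) layout; CSR is an assumption here, since the abstract does not name a storage format, and the 5-point tridiagonal matrix is a hypothetical stand-in for a 1-D heat-conduction stencil.

```python
def to_csr(dense):
    """Convert a dense row-major matrix to CSR arrays (values, col_idx, row_ptr)."""
    values, col_idx, row_ptr = [], [], [0]
    for row in dense:
        for j, v in enumerate(row):
            if v != 0:
                values.append(v)
                col_idx.append(j)
        row_ptr.append(len(values))  # index one past the last nonzero of this row
    return values, col_idx, row_ptr

def spmv(values, col_idx, row_ptr, x):
    """Compute y = A @ x for a CSR matrix A."""
    y = []
    for i in range(len(row_ptr) - 1):
        acc = 0.0
        for p in range(row_ptr[i], row_ptr[i + 1]):
            acc += values[p] * x[col_idx[p]]  # gather: indexed read of x
        y.append(acc)
    return y

# Hypothetical 1-D heat-conduction stencil: 2 on the diagonal, -1 beside it
n = 5
A = [[2.0 if i == j else -1.0 if abs(i - j) == 1 else 0.0 for j in range(n)]
     for i in range(n)]
x = [1.0, 2.0, 3.0, 4.0, 5.0]

values, col_idx, row_ptr = to_csr(A)
y = spmv(values, col_idx, row_ptr, x)
print(y)
```

Each multiply-add in the inner loop reads one value, one column index, and one vector element, so the achievable rate is set by memory bandwidth rather than clock frequency, which is the premise of the scalable multi-interface architecture.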
79	Design, Analysis, and Applications of Approximate Arithmetic Modules Ullah, Salim 06 April 2022 From the initial computing machines, Colossus of 1943 and ENIAC of 1945, to modern high-performance data centers and the Internet of Things (IoT), four design goals, i.e., high performance, energy efficiency, resource utilization, and ease of programmability, have remained a beacon of development for the computing industry. During this period, the computing industry has exploited the advantages of technology scaling and microarchitectural enhancements to achieve these goals. However, with the end of Dennard scaling, these techniques have diminishing energy and performance advantages. Therefore, it is necessary to explore alternative techniques for satisfying the computational and energy requirements of modern applications. Towards this end, one promising technique is analyzing and surrendering the strict notion of correctness in various layers of the computation stack. Most modern applications across the computing spectrum, from data centers to IoT devices, interact with and analyze real-world data and make decisions accordingly. These applications are broadly classified as Recognition, Mining, and Synthesis (RMS). Instead of producing a single golden answer, these applications produce several feasible answers. These applications possess an inherent error-resilience to the inexactness of processed data and corresponding operations. Utilizing these applications' inherent error-resilience, the paradigm of Approximate Computing relaxes the strict notion of computation correctness to realize high-performance and energy-efficient systems with acceptable quality outputs. The prior works on circuit-level approximations have mainly focused on Application-specific Integrated Circuits (ASICs). However, ASIC-based solutions suffer from long time-to-market and high-cost development cycles. 
These limitations of ASICs can be overcome by utilizing the reconfigurable nature of Field Programmable Gate Arrays (FPGAs). However, due to architectural differences between ASICs and FPGAs, the utilization of ASIC-based approximation techniques for FPGA-based systems does not result in proportional performance and energy gains. Therefore, to exploit the principles of approximate computing for FPGA-based hardware accelerators for error-resilient applications, FPGA-optimized approximation techniques are required. Further, most state-of-the-art approximate arithmetic operators do not have a generic approximation methodology to implement new approximate designs for an application's changing accuracy and performance requirements. These works also lack a methodology where a machine learning model can be used to correlate an approximate operator with its impact on the output quality of an application. This thesis focuses on these research challenges by designing and exploring FPGA-optimized logic-based approximate arithmetic operators. Because multiplication is one of the most computationally complex and frequently used arithmetic operations in modern applications, such as Artificial Neural Networks (ANNs), most of the approximation techniques proposed in this thesis target it. The primary focus of the work is to provide a framework for generating FPGA-optimized approximate arithmetic operators and efficient techniques to explore approximate operators for implementing hardware accelerators for error-resilient applications. Towards this end, we first present various designs of resource-optimized, high-performance, and energy-efficient accurate multipliers. 
Although modern FPGAs host high-performance DSP blocks to perform multiplication and other arithmetic operations, our analysis and results show that the orthogonal approach of having resource-efficient and high-performance multipliers is necessary for implementing high-performance accelerators. Due to the differences in the type of data processed by various applications, the thesis presents individual designs for unsigned, signed, and constant multipliers. Compared to the multiplier IPs provided by the FPGA Synthesis tool, our proposed designs provide significant performance gains. We then explore the designed accurate multipliers and provide a library of approximate unsigned/signed multipliers. The proposed approximations target the reduction in the total utilized resources, critical path delay, and energy consumption of the multipliers. We have explored various statistical error metrics to characterize the approximation-induced accuracy degradation of the approximate multipliers. We have also utilized the designed multipliers in various error-resilient applications to evaluate their impact on applications' output quality and performance. Based on our analysis of the designed approximate multipliers, we identify the need for a framework to design application-specific approximate arithmetic operators. An application-specific approximate arithmetic operator intends to implement only the logic that can satisfy the application's overall output accuracy and performance constraints. Towards this end, we present a generic design methodology for implementing FPGA-based application-specific approximate arithmetic operators from their accurate implementations according to the applications' accuracy and performance requirements. In this regard, we utilize various machine learning models to identify feasible approximate arithmetic configurations for various applications. 
We also utilize different machine learning models and optimization techniques to efficiently explore the large design space of individual operators and their utilization in various applications. In this thesis, we have used the proposed methodology to design approximate adders and multipliers. This thesis also explores other layers of the computation stack (cross-layer) for possible approximations to satisfy an application's accuracy and performance requirements. Towards this end, we first present a low bit-width and highly accurate quantization scheme for pre-trained Deep Neural Networks (DNNs). The proposed quantization scheme does not require re-training (fine-tuning the parameters) after quantization. We also present a resource-efficient FPGA-based multiplier that utilizes our proposed quantization scheme. Finally, we present a framework to allow the intelligent exploration and highly accurate identification of the feasible design points in the large design space enabled by cross-layer approximations. The proposed framework utilizes a novel Polynomial Regression (PR)-based method to model approximate arithmetic operators. The PR-based representation enables machine learning models to better correlate an approximate operator's coefficients with their impact on an application's output quality.

Contents:
1. Introduction
  1.1 Inherent Error Resilience of Applications
  1.2 Approximate Computing Paradigm
    1.2.1 Software Layer Approximation
    1.2.2 Architecture Layer Approximation
    1.2.3 Circuit Layer Approximation
  1.3 Problem Statement
  1.4 Focus of the Thesis
  1.5 Key Contributions and Thesis Overview
2. Preliminaries
  2.1 Xilinx FPGA Slice Structure
  2.2 Multiplication Algorithms
    2.2.1 Baugh-Wooley’s Multiplication Algorithm
    2.2.2 Booth’s Multiplication Algorithm
    2.2.3 Sign Extension for Booth’s Multiplier
  2.3 Statistical Error Metrics
  2.4 Design Space Exploration and Optimization Techniques
    2.4.1 Genetic Algorithm
    2.4.2 Bayesian Optimization
  2.5 Artificial Neural Networks
3. Accurate Multipliers
  3.1 Introduction
  3.2 Related Work
  3.3 Unsigned Multiplier Architecture
  3.4 Motivation for Signed Multipliers
  3.5 Baugh-Wooley’s Multiplier
  3.6 Booth’s Algorithm-based Signed Multipliers
    3.6.1 Booth-Mult Design
    3.6.2 Booth-Opt Design
    3.6.3 Booth-Par Design
  3.7 Constant Multipliers
  3.8 Results and Discussion
    3.8.1 Experimental Setup and Tool Flow
    3.8.2 Performance comparison of the proposed accurate unsigned multiplier
    3.8.3 Performance comparison of the proposed accurate signed multiplier with the state-of-the-art accurate multipliers
    3.8.4 Performance comparison of the proposed constant multiplier with the state-of-the-art accurate multipliers
  3.9 Conclusion
4. Approximate Multipliers
  4.1 Introduction
  4.2 Related Work
  4.3 Unsigned Approximate Multipliers
    4.3.1 Approximate 4 × 4 Multiplier (Approx-1)
    4.3.2 Approximate 4 × 4 Multiplier (Approx-2)
    4.3.3 Approximate 4 × 4 Multiplier (Approx-3)
  4.4 Designing Higher Order Approximate Unsigned Multipliers
    4.4.1 Accurate Adders for Implementing 8 × 8 Approximate Multipliers from 4 × 4 Approximate Multipliers
    4.4.2 Approximate Adders for Implementing Higher-order Approximate Multipliers
  4.5 Approximate Signed Multipliers (Booth-Approx)
  4.6 Results and Discussion
    4.6.1 Experimental Setup and Tool Flow
    4.6.2 Evaluation of the Proposed Approximate Unsigned Multipliers
    4.6.3 Evaluation of the Proposed Approximate Signed Multiplier
  4.7 Conclusion
5. Designing Application-specific Approximate Operators
  5.1 Introduction
  5.2 Related Work
  5.3 Modeling Approximate Arithmetic Operators
    5.3.1 Accurate Multiplier Design
    5.3.2 Approximation Methodology
    5.3.3 Approximate Adders
  5.4 DSE for FPGA-based Approximate Operators Synthesis
    5.4.1 DSE using Bayesian Optimization
    5.4.2 MOEA-based Optimization
    5.4.3 Machine Learning Models for DSE
  5.5 Results and Discussion
    5.5.1 Experimental Setup and Tool Flow
    5.5.2 Accuracy-Performance Analysis of Approximate Adders
    5.5.3 Accuracy-Performance Analysis of Approximate Multipliers
    5.5.4 AppAxO MBO
    5.5.5 ML Modeling
    5.5.6 DSE using ML Models
    5.5.7 Proposed Approximate Operators
  5.6 Conclusion
6. Quantization of Pre-trained Deep Neural Networks
  6.1 Introduction
  6.2 Related Work
    6.2.1 Commonly Used Quantization Techniques
  6.3 Proposed Quantization Techniques
    6.3.1 L2L: Log_2_Lead Quantization
    6.3.2 ALigN: Adaptive Log_2_Lead Quantization
    6.3.3 Quantitative Analysis of the Proposed Quantization Schemes
    6.3.4 Proposed Quantization Technique-based Multiplier
  6.4 Results and Discussion
    6.4.1 Experimental Setup and Tool Flow
    6.4.2 Image Classification
    6.4.3 Semantic Segmentation
    6.4.4 Hardware Implementation Results
  6.5 Conclusion
7. A Framework for Cross-layer Approximations
  7.1 Introduction
  7.2 Related Work
  7.3 Error-analysis of approximate arithmetic units
    7.3.1 Application Independent Error-analysis of Approximate Multipliers
    7.3.2 Application Specific Error Analysis
  7.4 Accelerator Performance Estimation
  7.5 DSE Methodology
  7.6 Results and Discussion
    7.6.1 Experimental Setup and Tool Flow
    7.6.2 Behavioral Analysis
    7.6.3 Accelerator Performance Estimation
    7.6.4 DSE Performance
  7.7 Conclusion
8. Conclusions and Future Work
info:eu-repo/classification/ddc/004 ddc:004
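
The record above refers to approximate multipliers and to statistical error metrics without showing either. As a rough illustration only, the following Python sketch builds a toy 4 × 4 unsigned multiplier that drops the partial-product bits in the lowest columns, then measures its error exhaustively. The truncation scheme, the function names `truncated_multiply` and `error_metrics`, and the choice of metrics are assumptions made for this example; they are not the Approx-1/2/3 designs or the evaluation flow of the thesis.

```python
from itertools import product

BITS = 4  # operand width of the toy multiplier (assumption for this sketch)


def truncated_multiply(a, b, dropped_columns=2, bits=BITS):
    """Unsigned multiply that ignores partial-product bits in the lowest columns.

    Hypothetical approximation: a partial product a_i * b_j is kept only if
    its column index i + j is at least `dropped_columns`.
    """
    result = 0
    for i in range(bits):
        for j in range(bits):
            bit = ((a >> i) & 1) & ((b >> j) & 1)
            if bit and i + j >= dropped_columns:
                result += 1 << (i + j)
    return result


def error_metrics(approx, bits=BITS):
    """Exhaustively compare approx(a, b) with a * b over all operand pairs."""
    errors = [abs(a * b - approx(a, b))
              for a, b in product(range(1 << bits), repeat=2)]
    return {
        "max_error": max(errors),
        "mean_absolute_error": sum(errors) / len(errors),
        "error_rate": sum(e != 0 for e in errors) / len(errors),
    }


if __name__ == "__main__":
    print(error_metrics(truncated_multiply))
```

Exhaustive evaluation is feasible here because a 4 × 4 multiplier has only 256 input pairs; for wider operators a sampled or analytical error model would be needed instead.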
80	Filtragem digital e reconstrução de sinais em frequência intermediária usando FPGA [Digital filtering and reconstruction of intermediate-frequency signals using FPGA] CORDEIRO, Patrício Elvis Sousa 05 April 2011 Submitted by Samira Prince (prince@ufpa.br) on 2012-05-04T14:48:55Z No. of bitstreams: 1 Dissertacao_FiltragemDigitalReconstrucao.pdf: 3558449 bytes, checksum: 10f6cea2d74402e2d9d3ebe5bf45e8fe (MD5) / Approved for entry into archive by Samira Prince (prince@ufpa.br) on 2012-05-04T14:50:00Z (GMT) / Made available in DSpace on 2012-05-04T14:50:00Z (GMT). Previous issue date: 2011 / This work deals with the filtering and reconstruction of intermediate-frequency signals using an FPGA. Digital signal processing algorithms are developed and implemented, from the design of the printed circuit board through its assembly and testing. The dissertation presents a brief study of the sampling and reconstruction of signals in general. Special attention is given to the sampling of band-pass signals and to the analysis of practical issues in the reconstruction of intermediate-frequency signals. Two signal reconstruction systems based on digital signal processing, more specifically on resampling in the discrete domain, are presented and analyzed. 
The theory of assembly and soldering processes for electronic boards is also described, in order to define a methodology for the design, assembly, and soldering of such boards. This methodology is applied to the design and manufacture of a prototype digital filtering module for cellular telephony repeaters. The design, implemented on an FPGA, is based on the two systems mentioned above. Finally, results of digital filtering and intermediate-frequency signal reconstruction experiments with the developed prototype are presented. Signal processing; Digital filters (Mathematics); Digital modulation; Prototype; Telecommunications
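
The abstract gives special attention to the sampling of band-pass signals. As a hedged illustration of the textbook condition for alias-free uniform sampling of a band occupying [f_low, f_high], the Python sketch below lists the admissible sampling-rate intervals, 2·f_high/n ≤ fs ≤ 2·f_low/(n − 1), for each integer n from 1 up to floor(f_high / bandwidth). The function name `bandpass_sampling_ranges` and the example band edges are assumptions chosen for this example; they are not taken from the dissertation.

```python
import math


def bandpass_sampling_ranges(f_low, f_high):
    """Return (n, fs_min, fs_max) tuples of alias-free sampling-rate intervals.

    For each integer n with 1 <= n <= floor(f_high / bandwidth), the rate fs
    must satisfy 2 * f_high / n <= fs <= 2 * f_low / (n - 1).
    n = 1 is ordinary Nyquist-rate sampling, with no upper bound.
    """
    if not 0 <= f_low < f_high:
        raise ValueError("expected 0 <= f_low < f_high")
    bandwidth = f_high - f_low
    n_max = math.floor(f_high / bandwidth)
    ranges = []
    for n in range(1, n_max + 1):
        fs_min = 2.0 * f_high / n
        fs_max = math.inf if n == 1 else 2.0 * f_low / (n - 1)
        ranges.append((n, fs_min, fs_max))
    return ranges


if __name__ == "__main__":
    # Hypothetical intermediate-frequency band from 20 MHz to 25 MHz.
    for n, fs_min, fs_max in bandpass_sampling_ranges(20.0, 25.0):
        print(f"n={n}: {fs_min:.3f} MHz <= fs <= {fs_max:.3f} MHz")
```

In practice a rate near the middle of an interval is usually preferred over its edges, since the interval width is the margin available for clock error and filter transition bands.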
