111 |
Digital Image Processing Algorithms Research Based on FPGA
Xu, Haifeng January 2011
The development of TV systems in America shows that digital television and digital broadcasting are the road ahead. Digital television is now prevalent in China, and the government is promoting its adoption. However, for economic reasons, analog television will keep its place in the TV market for a long time. Because the broadcasting system has not yet been reformed, we should not only make use of the existing analog system but also improve its picture quality. With the rapid development of high-end television and the research and application of digital television techniques, the flaws caused by interlaced scanning in traditional analog television, such as color crawling, flicker, and blurred or jagged edges on fast-moving objects, are increasingly obvious. The conversion from interlaced scan to progressive scan, known as de-interlacing, is therefore an important part of current television production. Many TV sets now on the market are based on digital processing technology and use various digital methods to process interlaced, low-field-rate video data, including de-interlacing and field rate conversion. The digital processing chip is the heart of these new TV sets and the source of their improved visual quality. To meet the requirements of real-time television signal processing, most of these chips have developed novel hardware architectures or data processing algorithms. So far, the algorithms that deliver the best quality are based on motion compensation, which inevitably involves motion detection and motion estimation despite the high computation cost. In video processing chips, the performance and complexity of the motion estimation algorithm have a direct impact on the speed, area, and power consumption of the chip.
Motion estimation also determines the efficiency of the coding algorithms in video compression. This thesis proposes a Down-sampled Diamond NTSS (DSD-NTSS) algorithm based on the New Three Step Search (NTSS) algorithm, taking both the performance and the complexity of motion estimation algorithms into consideration. The proposed DSD-NTSS algorithm exploits the similarity of neighboring pixels in the same image and down-samples the pixels in the reference blocks in a decussate pattern to reduce the computation cost. Experimental results show that DSD-NTSS offers a better trade-off between performance and complexity: it halves the computation cost of NTSS at equivalent image quality. Compared with Four Step Search (FSS), Diamond Search (DS), Three Step Search (TSS), and some other fast search algorithms, DSD-NTSS is generally better in both performance and complexity. This thesis focuses on a novel reduced-computation motion estimation algorithm for video post-processing systems and studies the FPGA design of such a system.
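The down-sampling idea in this abstract can be illustrated with a small block-matching sketch. This is not the thesis's DSD-NTSS implementation: it assumes that the "decussate pattern" is a checkerboard (pixels where x + y is even), it uses the sum of absolute differences (SAD) as the matching cost, and it evaluates a hand-picked candidate list rather than the NTSS search pattern. All function names are hypothetical.

```python
def sad_full(cur, ref, bx, by, dx, dy, n=16):
    """SAD over every pixel of the n x n block at (bx, by), displaced by (dx, dy)."""
    return sum(
        abs(cur[by + y][bx + x] - ref[by + dy + y][bx + dx + x])
        for y in range(n)
        for x in range(n)
    )


def sad_decussate(cur, ref, bx, by, dx, dy, n=16):
    """SAD over a checkerboard subset (assumed pattern): half of the pixels."""
    return sum(
        abs(cur[by + y][bx + x] - ref[by + dy + y][bx + dx + x])
        for y in range(n)
        for x in range(n)
        if (x + y) % 2 == 0
    )


def best_vector(cur, ref, bx, by, candidates, cost, n=16):
    """Return the candidate (dx, dy) with the lowest matching cost."""
    return min(candidates, key=lambda v: cost(cur, ref, bx, by, v[0], v[1], n))


def make_shifted_pair(size=24, dx=2, dy=1):
    """Synthetic test frames: cur is ref displaced by the motion vector (dx, dy)."""
    ref = [[(7 * x + 13 * y) % 256 for x in range(size)] for y in range(size)]
    cur = [[ref[y + dy][x + dx] for x in range(size - dx)] for y in range(size - dy)]
    return cur, ref


if __name__ == "__main__":
    cur, ref = make_shifted_pair()
    candidates = [(0, 0), (2, 1), (-1, 0)]
    print(best_vector(cur, ref, 4, 4, candidates, sad_full, 8))
    print(best_vector(cur, ref, 4, 4, candidates, sad_decussate, 8))
```

The checkerboard cost visits half of the pixels per candidate, which is consistent with the halved computation cost reported in the abstract, provided neighboring pixels are similar enough that the skipped ones carry little extra information.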
|
112 |
Classification of road side material using convolutional neural network and a proposed implementation of the network through Zedboard Zynq 7000 FPGA
Rahman, Tanvir 12 1900
Indiana University-Purdue University Indianapolis (IUPUI) / In recent years, Convolutional Neural Networks (CNNs) have become the state-of-the-art
method for object detection and classification in the field of machine learning
and artificial intelligence. In contrast to a fully connected network, each neuron of a
convolutional layer of a CNN is connected to fewer selected neurons from the previous
layers, and kernels of a CNN share the same weights and biases across the same input layer
dimension. These features allow CNN architectures to have fewer parameters which in
turn reduces calculation complexity and allows the network to be implemented in low
power hardware. The accuracy of a CNN depends mostly on the number of images
used to train the network, which requires a hundred thousand to a million images.
Therefore, a reduced training alternative called transfer learning is used, which takes
advantage of features from a pre-trained network and applies these features to the new
problem of interest. This research has successfully developed a new CNN based on
the pre-trained CIFAR-10 network and has used transfer learning on a new problem
to classify road edges. Two network sizes were tested: 32 and 16 Neuron inputs with
239 labeled Google street view images on a single CPU. The result of the training
gives 52.8% and 35.2% accuracy respectively for 250 test images. In the second part
of the research, a High Level Synthesis (HLS) hardware model of the network with 16
Neuron inputs is created for the Zynq 7000 FPGA. The resulting circuit has 34%
average FPGA utilization and 2.47 Watt power consumption. Recommendations to
improve the classification accuracy with a deeper network and ways to fit the improved
network on the FPGA are also mentioned at the end of the work.
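The claim that weight sharing and local connectivity reduce the parameter count can be checked with simple arithmetic. The sketch below compares one fully connected layer with one convolutional layer on a 32 x 32 x 3 input (the CIFAR-10 image size). The layer sizes (32 output units, 32 filters of 5 x 5) are illustrative assumptions, not the architecture used in the thesis.

```python
def dense_params(in_h, in_w, in_c, out_units):
    """Weights plus biases when every output unit sees every input value."""
    return in_h * in_w * in_c * out_units + out_units


def conv_params(kernel, in_c, out_c):
    """Weights plus biases when each filter is shared across all positions."""
    return kernel * kernel * in_c * out_c + out_c


if __name__ == "__main__":
    # Illustrative layer sizes only (assumed, not taken from the thesis).
    print("dense:", dense_params(32, 32, 3, 32))
    print("conv: ", conv_params(5, 3, 32))
```

The convolutional count does not depend on the image height or width, which is why the same layer remains small enough to consider for low-power hardware as input resolution grows.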
|
113 |
Design of an FPGA-based Array Formatter for Casa Phase-Tilt Radar System
Krishnamurthy, Akilesh 01 January 2011
Weather monitoring and forecasting systems have advanced rapidly in recent years. However, one of the main challenges these systems face is poor coverage of the lower atmosphere due to the earth's curvature. The Engineering Research Center for the Collaborative Adaptive Sensing of the Atmosphere (CASA) has developed a dense network of small low-power radars to improve the coverage of weather sensing systems. The traditional mechanically scanned antennas used in these radars are now being replaced with high-performance electronically scanned phased arrays. Phased-array radars, however, require a large number of active microwave components to scan electronically in both the azimuth and elevation planes, which significantly increases the cost of the entire radar system. To address this issue, CASA has designed a "Phase-Tilt" radar that scans electronically in azimuth and mechanically in elevation. One of the core components of this system is the phased-array controller, or Array Formatter. The Array Formatter is a Field Programmable Gate Array (FPGA)-based master controller that translates user commands from a computer into control and timing signals for the radar system. The objective of this thesis is to design and test an FPGA-based Array Formatter for CASA's Phase-Tilt radar system.
|
114 |
FPGA Accelerated Digital Image Correlation For Clamping Force Measurement
Csuvarszki, János Csanád January 2023
Digital image correlation is a contactless optical method used for displacement and strain measurement which has become increasingly popular in the field of experimental mechanics. A specialized use case for the algorithm is to measure the clamping force in bolted joints, a crucial metric when considering the longevity and reliability of the constructs. However, to measure the clamping force in real time, the digital image correlation has to be carried out rapidly, as the tightening of the bolts can happen in milliseconds. One approach to increase the speed of the process is hardware acceleration. This thesis presents and evaluates multiple variations of a Field Programmable Gate Array (FPGA)-accelerated digital image correlation framework. The goal of the project is to accelerate the image correlation to sufficient speeds so it can be used for highly dynamic and continuous tightenings, which can take 20 to 200 ms and 200 to 1000 ms or more to finish, respectively. A baseline implementation was created based on an innovative digital image correlation framework. Strain calculation was altered for the specialized use of clamping force determination. Afterward, different parts of the framework were selected and optimized for hardware acceleration. The parts include both preprocessing and correlation steps. The targets for acceleration were optimized using techniques such as quantization and pipelining. The accelerators were created using high-level synthesis, and the resulting implementations utilize both the processor and FPGA parts of a Zynq-7000 system-on-chip. Results show that all accelerators reduce the total execution time of the framework to varying degrees.
Accelerators targeting the preprocessing parts, such as Gaussian and B-spline filtering, proved to be the most effective in speeding up the process, achieving 1.56 and 1.12 times speedups for the fixed-point versions and 1.2 and 1.07 times speedups for the double-precision floating-point versions, respectively. A combined version containing multiple accelerators resulted in a 1.9 times average speedup. It can be concluded that the presented approach is not fast enough for all highly dynamic tightening processes, as the fastest execution time achieved is above 100 ms, but it could be used for continuous tightening depending on the constructs. / Digital image correlation (DIC) is a contactless optical method for measuring displacement and strain that has become increasingly popular in experimental mechanics. One use of the algorithm is to measure the clamping force in bolted joints, a decisive factor for the durability and reliability of structures. To measure clamping force in real time, however, DIC must be performed very quickly, since the tightening process can take place within milliseconds. One way to increase the speed is hardware acceleration. This thesis presents and evaluates several variants of a Field Programmable Gate Array (FPGA)-accelerated DIC framework. The thesis aims to accelerate the image correlation enough for it to be used for dynamic and continuous tightenings, which take 20 to 200 ms and 200 to 1000 ms or more, respectively. A reference implementation was created based on an innovative DIC framework. The strain calculation was adapted for the special case of determining clamping force. Different parts of the framework were then selected and optimized for hardware acceleration. The selected parts include both preprocessing and correlation steps. The parts chosen for acceleration were optimized using techniques such as quantization and pipelining.
The accelerators were created using high-level synthesis, and the resulting implementations use both the processor and the FPGA of a Zynq-7000 system-on-chip. The results show that all accelerators reduce the framework's total execution time to varying degrees. Accelerators targeting preprocessing, such as Gaussian and B-spline filtering, proved to be the most effective, yielding 1.56 and 1.12 times faster execution for fixed point and 1.2 and 1.07 times faster execution for double floating point, respectively. A combined version containing several accelerators yielded a 1.9 times faster average execution time. The conclusion is that the presented method is not fast enough for all dynamic tightening processes, since the fastest execution time achieved is above 100 ms. The method could, however, be used for continuous tightenings, depending on the construction.
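Quantization of a filter to fixed point, one of the optimization techniques named in the abstract, can be sketched as follows. This is a generic one-dimensional illustration under assumed parameters (a 5-tap Gaussian kernel, 8 fractional bits); the thesis's actual kernel sizes, word lengths, and HLS code are not given in the abstract, and all names here are hypothetical.

```python
import math


def gaussian_kernel(radius, sigma):
    """Normalized 1-D Gaussian taps in floating point."""
    taps = [math.exp(-(i * i) / (2.0 * sigma * sigma)) for i in range(-radius, radius + 1)]
    total = sum(taps)
    return [t / total for t in taps]


def to_fixed(values, frac_bits):
    """Quantize each value to an integer with frac_bits fractional bits."""
    scale = 1 << frac_bits
    return [round(v * scale) for v in values]


def filter_fixed(signal, kernel_q, frac_bits):
    """Integer-only convolution over the valid region, rescaled by a right shift."""
    radius = len(kernel_q) // 2
    out = []
    for i in range(radius, len(signal) - radius):
        acc = sum(kernel_q[j] * signal[i - radius + j] for j in range(len(kernel_q)))
        out.append(acc >> frac_bits)
    return out


if __name__ == "__main__":
    kernel_q = to_fixed(gaussian_kernel(2, 1.0), 8)  # assumed: 5 taps, 8 fractional bits
    print(kernel_q)
    print(filter_fixed([100] * 12, kernel_q, 8))
```

Integer multiply-accumulate followed by a shift maps onto FPGA arithmetic resources more cheaply than double-precision floating point, at the cost of a small rounding error in each output sample.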
|
115 |
Modular Multi-Signal Tracking Pulse Descriptor Word (PDW) Generator With Field Programmable Gate Array (FPGA) Implementation
Pelan, Justin Darrell 26 August 2016
No description available.
|
116 |
Hardware Design And Certification Aspects Of A Field Programmable Gate Array-Based Terrain Database Integrity Monitor For A Synthetic Vision System
Kakkeroda, Anupriya 18 December 2004
No description available.
|
117 |
A Cost-Efficient Digital ESN Architecture on FPGA
Gan, Victor Ming 01 September 2020
Echo State Network (ESN) is a recently developed machine-learning paradigm whose processing capabilities rely on the dynamical behavior of recurrent neural networks (RNNs). It outperforms traditional RNNs in nonlinear system identification and temporal information processing. In this thesis, we design and implement ESNs on a field-programmable gate array (FPGA) and explore the full capacity of its digital signal processors (DSPs) to target low-cost and low-power applications. We propose a cost-optimized and scalable ESN architecture on FPGA, which exploits Xilinx DSP48E1 units to cut down the need for configurable logic blocks (CLBs). The proposed work includes a linear combination processor with negligible deployment of CLBs, as well as a high-accuracy non-linear function approximator, both with the help of only 9 DSP units in each neuron. The architecture is verified with the classical NARMA dataset and with a symbol detection task for an orthogonal frequency division multiplexing (OFDM) system on a wireless communication testbed. In the worst-case scenario, our proposed architecture delivers a matching bit error rate (BER) compared to its corresponding software ESN implementation. The performance difference between the hardware and software approaches is less than 6.5%. The testbed system is built on a software-defined radio (SDR) platform, showing that our work is capable of processing real-world data. / Master of Science / Machine learning is the study of computer algorithms that improve by learning through experience. Machine learning currently thrives because it opens up promising opportunities for solving problems that are difficult to deal with using conventional methods. Echo state network (ESN), a recently developed machine-learning paradigm, has shown extraordinary effectiveness in a wide variety of applications, especially in nonlinear system identification and temporal information processing.
Despite this, ESN is still computationally expensive on battery-driven and cost-sensitive devices. A fast and power-saving computer for ESN is badly needed. In this thesis, we design and implement an ESN computational architecture on a field-programmable gate array (FPGA). FPGAs allow designers to build highly flexible customized hardware with rapid development time. Our design further explores the full capacity of the digital signal processors (DSPs) on Xilinx FPGAs to target low-cost and low-power applications. The proposed cost-optimized and scalable ESN architecture exploits Xilinx DSP48E1 units to cut down the need for configurable logic blocks (CLBs). The work includes a linear combination processor with negligible deployment of CLBs and a high-accuracy non-linear function approximator, both with the help of only 9 DSP units in each neuron. The architecture is verified with the classical NARMA dataset and with a symbol detection task for an orthogonal frequency division multiplexing (OFDM) system in a wireless communication testbed. In the worst-case scenario, our proposed architecture delivers a matching bit error rate (BER) compared to its corresponding software ESN implementation. The performance difference between the hardware and software approaches is less than 6.5%. The testbed system is built on a software-defined radio (SDR) platform, showing that our work is capable of processing real-world data.
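For readers unfamiliar with ESNs, the textbook reservoir update combines a linear combination of inputs and previous state with a non-linear function, the two pieces the abstract maps onto DSP units. The sketch below is a plain software illustration and not the thesis's hardware design: the tanh non-linearity, the reservoir size, and the row-sum weight scaling (a simple sufficient condition for a fading memory of past states) are assumptions, and all names are hypothetical.

```python
import math
import random


def make_reservoir(n, scale=0.9, seed=0):
    """Random input and recurrent weights; each recurrent row has absolute sum `scale`."""
    rng = random.Random(seed)
    w_in = [rng.uniform(-1.0, 1.0) for _ in range(n)]
    w = []
    for _ in range(n):
        row = [rng.uniform(-1.0, 1.0) for _ in range(n)]
        norm = sum(abs(v) for v in row)
        w.append([scale * v / norm for v in row])
    return w_in, w


def step(state, u, w_in, w):
    """One update: x[i] <- tanh(w_in[i] * u + sum_j w[i][j] * x[j])."""
    n = len(state)
    return [
        math.tanh(w_in[i] * u + sum(w[i][j] * state[j] for j in range(n)))
        for i in range(n)
    ]


def run(inputs, w_in, w, initial=None):
    """Drive the reservoir with a scalar input sequence and return the final state."""
    state = list(initial) if initial is not None else [0.0] * len(w_in)
    for u in inputs:
        state = step(state, u, w_in, w)
    return state


if __name__ == "__main__":
    w_in, w = make_reservoir(8)
    rng = random.Random(1)
    inputs = [rng.uniform(0.0, 0.5) for _ in range(200)]
    print(run(inputs, w_in, w))
```

Only a linear readout on top of these states is trained in an ESN, which keeps the recurrent weights fixed and is part of what makes the model attractive for fixed hardware.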
|
118 |
Optimizing Reservoir Computing Architecture for Dynamic Spectrum Sensing Applications
Sharma, Gauri 25 April 2024
Spectrum sensing in wireless communications serves as a crucial binary classification tool in cognitive radios, facilitating the detection of available radio spectrum for secondary users, especially in scenarios with high Signal-to-Noise Ratio (SNR). Liquid State Machines (LSMs), which emulate spiking neural networks like the ones in the human brain, prove to be highly effective for real-time data monitoring in such temporal tasks. The inherent advantages of LSM-based recurrent neural networks, such as low complexity, high power efficiency, and accuracy, surpass those of traditional deep learning and conventional spectrum sensing methods. The architecture of the liquid state machine processor and its training methods are crucial for the performance of an LSM accelerator. This thesis presents one such LSM-based accelerator that explores novel architectural improvements for LSM hardware. By adopting triplet-based Spike-Timing-Dependent Plasticity (STDP) and various spike encoding schemes on the spectrum dataset within the LSM, we investigate the advantages offered by these proposed techniques compared to traditional LSM models on the FPGA. FPGA boards, known for their power efficiency and low latency, are well suited for time-critical machine learning applications. The thesis explores these novel onboard learning methods, shares the results of the suggested architectural changes, explains the trade-offs involved, and explores how the improved LSM model's accuracy can benefit different classification tasks. Additionally, we outline future research directions aimed at further enhancing the accuracy of these models. / Master of Science / Machine Learning (ML) and Artificial Intelligence (AI) have significantly shaped various applications in recent years. One notable domain experiencing substantial positive impact is spectrum sensing within wireless communications, particularly in cognitive radios.
In light of spectrum scarcity and the underutilization of RF spectrum, accurately classifying spectrum as occupied or unoccupied becomes crucial for enabling secondary users to use available resources efficiently. Liquid State Machines (LSMs), made of spiking neural networks resembling the human brain, prove effective in real-time data monitoring for this classification task. By exploiting temporal operations, LSM accelerators and processors provide higher-performance and more accurate spectrum monitoring than conventional spectrum sensing methods.
The architecture of the liquid state machine processor and its training and learning methods play a pivotal role in the performance of an LSM accelerator. This thesis delves into various architectural enhancements aimed at spectrum classification using a liquid state machine accelerator implemented on an FPGA board. FPGA boards, known for their power efficiency and low latency, are well suited for time-critical machine learning applications. The thesis explores onboard learning methods, such as employing a targeted encoder and incorporating Triplet Spike Timing-Dependent Plasticity (Triplet STDP) in the learning reservoir. These enhancements are proposed to improve accuracy over conventional LSM models. The discussion concludes by presenting results of the architectural implementations, highlighting trade-offs, and shedding light on avenues for further enhancing the accuracy of conventional liquid state machine-based models.
|
119 |
FPGA Reservoir Computing Networks for Dynamic Spectrum Sensing
Shears, Osaze Yahya 14 June 2022
The rise of 5G and beyond systems has fuelled research in merging machine learning with wireless communications to achieve cognitive radios. However, the portability and limited power supply of radio frequency devices limit engineers' ability to combine them with powerful predictive models. This hinders the ability to support advanced 5G applications such as device-to-device (D2D) communication and dynamic spectrum sharing (DSS). This challenge has inspired a wave of research in energy-efficient machine learning hardware with low computational and area overhead. In particular, hardware implementations of the delayed feedback reservoir (DFR) model show promising results for meeting these constraints while achieving high accuracy in cognitive radio applications. This thesis answers two research questions surrounding the applicability of FPGA DFR systems for DSS. First, can a DFR network implemented on an FPGA run faster and with lower power than a purely software approach? Second, can the system be implemented efficiently on an edge device running at less than 10 watts?
Two systems are proposed that prove FPGA DFRs can achieve these feats: a mixed-signal circuit, followed by a high-level synthesis circuit. The implementations execute up to 58 times faster and operate at more than 90% lower power than the software models. Furthermore, the lowest recorded average power of 0.130 watts shows that these approaches meet typical edge device constraints. When validated on the NARMA10 benchmark, the systems achieve a normalized error of 0.21 compared to state-of-the-art error values of 0.15. In a DSS task, the systems are able to predict spectrum occupancy with up to 0.87 AUC in high-noise, multiple input, multiple output (MIMO) antenna configurations compared to 0.99 AUC in other works. At the end of this thesis, the trade-offs between the approaches are analyzed, and future directions for advancing this study are proposed. / Master of Science / The rise of 5G and beyond systems has fuelled research in merging machine learning with wireless communications to achieve cognitive radios. However, the portability and limited power supply of radio frequency devices limit engineers' ability to combine them with powerful predictive models. This hinders the ability to support advanced 5G and internet-of-things (IoT) applications. This challenge has inspired a wave of research in energy-efficient machine learning hardware with low computational and area overhead. In particular, hardware implementations of a low-complexity neural network model, called the delayed feedback reservoir, show promising results for meeting these constraints while achieving high accuracy in cognitive radio applications. This thesis answers two research questions surrounding the applicability of field-programmable gate array (FPGA) delayed feedback reservoir systems for wireless communication applications. First, can this network implemented on an FPGA run faster and with lower power than a purely software approach?
Second, can the network be implemented efficiently on an edge device running at less than 10 watts? Two systems are proposed that prove the FPGA networks can achieve these feats. The systems demonstrate lower power consumption and latency than the software models. Additionally, the systems maintain high accuracy on traditional neural network benchmarks and wireless communications tasks. The second implementation is further demonstrated in a software-defined radio architecture. At the end of this thesis, the trade-offs between the approaches are analyzed, and future directions for advancing this study are proposed.
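The NARMA10 benchmark mentioned in this abstract is a tenth-order nonlinear autoregressive series in its commonly used form, y(t+1) = 0.3 y(t) + 0.05 y(t) (y(t) + ... + y(t-9)) + 1.5 u(t-9) u(t) + 0.1, with inputs u drawn uniformly from [0, 0.5]. The sketch below generates the series and scores predictions with a normalized root-mean-square error. The abstract does not define its "normalized error", so the NRMSE formula here is an assumption, and the function names are hypothetical.

```python
import math
import random


def narma10(n, seed=0):
    """Generate n input samples u and the corresponding NARMA10 targets y."""
    rng = random.Random(seed)
    u = [rng.uniform(0.0, 0.5) for _ in range(n)]
    y = [0.0] * n  # the first ten targets stay at zero as the initial condition
    for t in range(9, n - 1):
        window = sum(y[t - i] for i in range(10))
        y[t + 1] = 0.3 * y[t] + 0.05 * y[t] * window + 1.5 * u[t - 9] * u[t] + 0.1
    return u, y


def nrmse(predictions, targets):
    """Root-mean-square error divided by the standard deviation of the targets."""
    n = len(targets)
    mean = sum(targets) / n
    variance = sum((t - mean) ** 2 for t in targets) / n
    mse = sum((p - t) ** 2 for p, t in zip(predictions, targets)) / n
    return math.sqrt(mse / variance)


if __name__ == "__main__":
    u, y = narma10(300)
    baseline = [sum(y) / len(y)] * len(y)  # a predictor that always outputs the mean
    print(nrmse(baseline, y))
```

Under this definition a predictor that always outputs the target mean scores 1.0, so values such as 0.21 or 0.15 indicate how much of the target variation a model captures beyond that trivial baseline.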
|
120 |
UTILIZATION OF FIELD PROGRAMMABLE GATE ARRAYS AND DIGITAL SIGNAL PROCESSING MICROPROCESSORS IN AN ADVANCED PC TT&C SATCOM SYSTEM
Meyers, Tom 10 1900
International Telemetering Conference Proceedings / October 25-28, 1999 / Riviera Hotel and Convention Center, Las Vegas, Nevada / L-3 Communications Telemetry & Instrumentation (L-3 T&I) has developed an advanced
IBM PC-AT Telemetry, Tracking, and Commanding (TT&C) SATCOM system based on
the utilization of Field Programmable Gate Array / Digital Signal Processing (FPGA/DSP)
microprocessors. This system includes up-link, down-link, and range processing sections.
Physically, the system consists of one IF Transceiver and two or more FPGA/DSP
microprocessor boards called Advanced Processing Microprocessors (APMs). The form
factor of these PWBs is compliant with full length, full height IBM PC PCI bus cards. This
paper describes the features and functionality of an advanced Telemetry, Tracking, and
Commanding Processing System (TT&CPS) based on the implementation of FPGA and
DSP microprocessors.
The high-level functional attributes of the TT&CPS are depicted in Figure 1. There are
four main functional blocks: the IF Transceiver, the Down-Link Processing Section, the
Up-Link Processing Section, and the Range Processor. The analog/IF circuitry in the IF
Transceiver card interfaces between the 68–72 MHz (70 MHz, nominal) IF I/O signals and
the Up-Link and Down-Link Processing Section's DSP equipment. The down-link portion
of the IF Transceiver card has two user-selected input ports. From the selected input, the
signal is processed through selectable bandwidth limiting, gain control, Doppler correction
(optional), quadrature down-conversion to zero hertz (baseband), selectable baseband
filtering, and precision Analog-to-Digital (A/D) conversion. The up-link portion of the IF
Transceiver card takes I/Q digital data from the APM performing the up-link processing
functions. This baseband I/Q digital data is Digital-to-Analog (D/A) converted, filtered,
quadrature up-converted to 68–72 MHz, up-link Doppler corrected (optional), output level
detected and level controlled, and sent to a two-position output selector switch.
The down-link portion of the TT&CPS provides main carrier linear PM or BPSK or QPSK
demodulation and can also, in composite linear PM demodulation mode, receive and
demodulate FSK and/or BPSK subcarriers and ranging signals. The demodulators use
symbol timing loops and bit decision circuits (matched filters) to perform the bit
synchronization function. Several decoding algorithms, including differential, de-interleaving,
Viterbi, and Reed-Solomon, are available for the down-link telemetry.
Command format checking and CRC status are also available on FSK-demodulated data.
Direct carrier BPSK/QPSK demodulation has decoding and frame synchronization
capabilities. Because of the modular construction of the firmware and the use of FPGAs
and DSPs, the system can be loaded with only the functions in use, lowering initial setup
time while increasing overall system capability. To support a particular function, the card
is downloaded with an “image,” which programs the FPGAs and DSPs at initialization.
The user can change configurations by simply downloading a new set of instructions to the
FPGA/DSP on the fly to keep the ground station running with minimal downtime. The
flexibility of the design minimizes spare board costs, while achieving greater
programmability at the end-user location.
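Of the down-link decoding options listed above, differential decoding is the simplest to illustrate. The sketch below assumes one common convention, in which each transmitted bit is the XOR of the data bit and the previous transmitted bit, with a reference bit of 0; the paper does not state which convention the TT&CPS uses, and the function names are hypothetical.

```python
def diff_encode(bits, reference=0):
    """Transmit the XOR of each data bit with the previously transmitted bit."""
    out = []
    previous = reference
    for b in bits:
        previous ^= b
        out.append(previous)
    return out


def diff_decode(symbols, reference=0):
    """Recover each data bit as the XOR of consecutive received bits."""
    out = []
    previous = reference
    for s in symbols:
        out.append(s ^ previous)
        previous = s
    return out


if __name__ == "__main__":
    data = [1, 0, 1, 1, 0, 0, 1]
    sent = diff_encode(data)
    inverted = [1 - s for s in sent]  # models a 180-degree BPSK phase ambiguity
    print(diff_decode(sent))
    print(diff_decode(inverted))
```

Because the data is carried in bit transitions rather than absolute levels, a receiver that locks onto the carrier with inverted polarity still recovers every bit after the first.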
|