Global ETD Search

271	Rapid Prototyping of an FPGA-Based Video Processing System Shi, Zhun 20 June 2016 (has links) Computer vision technology can be seen in a variety of applications ranging from mobile phones to autonomous vehicles. Many computer vision applications, such as drones and autonomous vehicles, require real-time processing capability in order to communicate with the control unit and send commands in real time. Besides real-time processing capability, it is crucial to keep power consumption low in order to extend the battery life of not only mobile devices, but also drones and autonomous vehicles. FPGAs are desirable platforms that can provide high-performance and low-power solutions for real-time video processing. As hardware designs are typically more time consuming than equivalent software designs, this thesis proposes a rapid prototyping flow for FPGA-based video processing system design that takes advantage of the high-performance AXI interface and a high-level synthesis tool, Vivado HLS. Vivado HLS provides the convenience of automatically synthesizing a software implementation into a hardware implementation. But the tool is far from perfect, and users still need embedded hardware knowledge and experience in order to accomplish a successful design. In order to effectively create a stream-type video processing system as well as to utilize the fastest memory on an FPGA, a sliding window memory architecture is proposed. This memory architecture can be applied to a series of video processing algorithms while the latency between an input pixel and an output pixel is minimized. Compared with other works, this optimized memory architecture offers better performance and lower resource usage. Its reconfigurability also provides better adaptability to other algorithms. 
In addition, this work includes performance and power analysis among an Intel CPU based design, an ARM based design, and an FPGA-based embedded design. / Master of Science Field programmable gate arrays Computer Vision Video Processing Rapid Prototyping High-Level Synthesis
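The sliding window memory architecture mentioned in the abstract above can be illustrated in software. The following is only a minimal sketch, not the thesis's design: it assumes a row-major pixel stream, a k x k window, and k-1 line buffers (which on an FPGA would typically map to block RAM, with the window itself held in registers). The function name `sliding_windows` and its parameters are hypothetical.

```python
from collections import deque


def sliding_windows(pixels, width, k=3):
    """Yield (row, col, window) for every full k x k window of a
    row-major pixel stream. Assumes k >= 2. Illustrative sketch only."""
    # k-1 line buffers keep the most recent k-1 rows (oldest row first).
    line_buffers = [[0] * width for _ in range(k - 1)]
    # The window is k shift registers, one per window row.
    window = [deque([0] * k, maxlen=k) for _ in range(k)]

    for i, px in enumerate(pixels):
        row, col = divmod(i, width)
        # Read one column of k pixels: k-1 from the buffers plus the new pixel.
        column = [buf[col] for buf in line_buffers] + [px]
        # Shift the buffers up by one row at this column position.
        for r in range(k - 2):
            line_buffers[r][col] = line_buffers[r + 1][col]
        line_buffers[k - 2][col] = px
        # Shift the new column into the window registers.
        for r in range(k):
            window[r].append(column[r])
        # A window is valid once k rows and k columns have been seen.
        if row >= k - 1 and col >= k - 1:
            yield row, col, [list(w) for w in window]


if __name__ == "__main__":
    image = list(range(16))  # a 4 x 4 test image streamed row by row
    for row, col, win in sliding_windows(image, width=4):
        print(row, col, win)
```

Each input pixel is read once and every window is available as soon as its last pixel arrives, which is the property that keeps the input-to-output latency small in a streaming design.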
272	A Zynq-based Cluster Cognitive Radio Rooks, Kurtis M. 25 July 2014 (has links) Traditional hardware radios provide very rigid solutions to radio problems. Intelligent software defined radios, also known as cognitive radios, provide flexibility and agility compared to hardware radio systems. Cognitive radios are well suited for radio applications in a changing radio frequency environment, such as dynamic spectrum access. In this thesis, a cognitive radio is demonstrated where the system self reconfigures to demodulate a detected waveform. The GNU Radio framework is used to provide basic software defined radio building blocks and is supplemented with FPGA accelerators. The use of GNU Radio compliant hardware interfaces allows for seamless hardware/software radio deployments. Dynamic resource mapping allows radio designers to operate at a layer of abstraction above the physical radio implementation. By establishing lower level abstraction layers, future researchers can focus on larger picture concepts such as learning algorithms and behavioral models for the cognitive engine. / Master of Science Cognitive radio networks Xilinx Zynq tFlow Field programmable gate arrays GNU Radio cluster
273	Methods for Securing the Integrity of FPGA Configurations Webb, James Braxton 18 October 2006 (has links) As Field Programmable Gate Arrays (FPGAs) continue to become integral parts of embedded systems, it is imperative to consider their security. While much of the research in this field is oriented toward the protection of the intellectual property contained in the FPGA's configuration, the protection of the design's integrity from malicious attack against the configuration is critical to the operation of the system. Methods for attacking the configuration include semi-invasive attacks, such as fault injection, and data tampering of incoming partial bitstreams. This thesis introduces methods for securing the integrity of an FPGA's configuration. The design and implementation are discussed for a system that consists of three parts. The first subsystem monitors the running configuration. The second subsystem authenticates partial bitstreams that may be used for repairing the configuration from malicious alterations during run-time. The third subsystem indicates if the system itself succumbs to a malicious attack. The system is implemented on-chip, allowing the FPGA to effectively secure itself from attack. / Master of Science authentication integrity fault-injection embedded system reconfigurable dynamic partial security Field programmable gate arrays configuration configurable
274	Improving Field-Programmable Gate Array Scaling Through Wire Emulation Fong, Ryan Joseph Lim 23 September 2004 (has links) Field-programmable gate arrays (FPGAs) are excellent devices for high-performance computing, system-on-chip realization, and rapid system prototyping. While FPGAs offer flexibility and performance, they continue to lag behind application specific integrated circuit (ASIC) performance and power consumption. As manufacturing technology improves and IC feature size decreases, FPGAs may further lag behind ASICs due to interconnection scalability issues. To improve FPGA scalability, this thesis proposes an architectural enhancement to improve global communications in large FPGAs, where chip-length programmable interconnects are slow. It is expected that this architectural enhancement, based on wire emulation techniques, can reduce chip-length communication latency and routing congestion. A prototype wire emulation system that uses FPGA self-reconfiguration as a non-traditional means of intra-FPGA communication is implemented and verified on a Xilinx Virtex-II XC2V1000 FPGA. Wire emulation benefits and impact to FPGA architecture are examined with quantitative and qualitative analysis. / Master of Science Xilinx wire Field programmable gate arrays scaling emulation ICAP self-reconfiguration Virtex-II
275	Enabling the use of Heterogeneous Computing for Bioinformatics Bijanapalli Chakri, Ramakrishna 02 October 2013 (has links) The huge amount of information in the encoded sequence of DNA and increasing interest in uncovering new discoveries have spurred interest in accelerating the DNA sequencing and alignment processes. Heterogeneous systems, which use different types of computational units, have gained prominence in high performance computing in recent years; however, the expertise in multiple domains and the skills required to program these systems are a hindrance to bioinformaticians in rapidly deploying their applications onto these heterogeneous systems. This work attempts to make a heterogeneous system, the Convey HC-1, with an x86-based host processor and FPGA-based co-processor, accessible to bioinformaticians. First, a highly efficient dynamic programming based Smith-Waterman kernel is implemented in hardware, which is able to achieve a peak throughput of 307.2 Giga Cell Updates per Second (GCUPS) on the Convey HC-1. A dynamic programming accelerator interface is provided to any application that uses Smith-Waterman. This implementation is also extended to General Purpose Graphics Processing Units (GP-GPUs), where it achieved a peak throughput of 9.89 GCUPS on an NVIDIA GTX580 GPU. Second, a well-known graphical programming tool, LabVIEW, is enabled as a programming tool for the Convey HC-1. A connection is established between the graphical interface and the Convey HC-1 to control and monitor the application running on the FPGA-based co-processor. / Master of Science Field programmable gate arrays Hardware Acceleration High Performance Computing DNA Alignment LabVIEW Heterogeneous Computing GP-GPUs
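For readers unfamiliar with the Smith-Waterman kernel and the GCUPS metric cited in the abstract above, a small software reference may help. This is a sketch of the textbook algorithm, not the hardware kernel from the thesis: the linear gap penalty and the scoring values are assumptions chosen for illustration, and the helper `gcups` is a hypothetical name.

```python
def smith_waterman_score(a, b, match=2, mismatch=-1, gap=-1):
    """Best local alignment score of sequences a and b.
    Linear gap penalty; scoring values are illustrative assumptions."""
    prev = [0] * (len(b) + 1)  # previous row of the DP matrix
    best = 0
    for ca in a:
        curr = [0]  # column 0 is always 0 for local alignment
        for j, cb in enumerate(b, start=1):
            diag = prev[j - 1] + (match if ca == cb else mismatch)
            up = prev[j] + gap        # gap in b
            left = curr[j - 1] + gap  # gap in a
            score = max(0, diag, up, left)  # 0 restarts the alignment
            curr.append(score)
            best = max(best, score)
        prev = curr
    return best


def gcups(len_a, len_b, seconds):
    """Giga cell updates per second: one cell per (i, j) pair."""
    return len_a * len_b / seconds / 1e9


if __name__ == "__main__":
    print(smith_waterman_score("GGACGTCC", "TTACGTAA"))
    print(gcups(1000, 1000, 0.5))
```

Every cell depends only on its left, upper, and upper-left neighbours, so all cells on one anti-diagonal can be computed in parallel; this is the property that hardware implementations generally exploit.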
276	Characterization of FPGA-based High Performance Computers Pimenta Pereira, Karl Savio 02 September 2011 (has links) As CPU clock frequencies plateau and the doubling of CPU cores per processor exacerbates the memory wall, hybrid core computing, utilizing CPUs augmented with FPGAs and/or GPUs, holds the promise of addressing high-performance computing demands, particularly with respect to performance, power and productivity. While traditional approaches to benchmarking high-performance computers, such as SPEC, took an architecture-based approach, they do not completely express the parallelism that exists in FPGA and GPU accelerators. This thesis follows an application-centric approach, comparing the sustained performance of two key computational idioms with respect to performance, power and productivity. Specifically, a complex, single precision, floating-point, 1D Fast Fourier Transform (FFT) and a Molecular Dynamics modeling application are implemented on state-of-the-art FPGA and GPU accelerators. As results show, FPGA floating-point FFT performance is highly sensitive to a mix of dedicated FPGA resources: DSP48E slices, block RAMs, and FPGA I/O banks in particular. Estimated results show that for the floating-point FFT benchmark on FPGAs, these resources are the performance limiting factor. Fixed-point FFTs are important in many high performance embedded applications. For an integer-point FFT, FPGAs exploit a flexible data path width to trade off circuit cost and speed of computation, improving performance and resource utilization. GPUs cannot fully take advantage of this, having a fixed data-width architecture. For the molecular dynamics application, FPGAs benefit from the flexibility in creating a custom, tightly-pipelined datapath, and a highly optimized memory subsystem of the accelerator. 
This can provide a 250-fold improvement over an optimized CPU implementation and 2-fold improvement over an optimized GPU implementation, along with massive power savings. Finally, to extract the maximum performance out of the FPGA, each implementation requires a balance between the formulation of the algorithm on the platform, the optimum use of available external memory bandwidth, and the availability of computational resources; at the expense of a greater programming effort. / Master of Science FFT molecular dynamics integer-point floating-point GPU HPC Field programmable gate arrays
277	FPGA-Based Accelerator Development for Non-Engineers Uliana, David Christopher 02 June 2014 (has links) In today's world of big-data computing, access to massive, complex data sets has reached an unprecedented level, and the task of intelligently processing such data into useful information has become a growing concern to the high-performance computing community. However, domain experts, who are the brains behind this processing, typically lack the skills required to build FPGA-based hardware accelerators ideal for their applications, as traditional development flows targeting such hardware require digital design expertise. This work proposes a usable, end-to-end accelerator development methodology that attempts to bridge this gap between domain-experts and the vast computational capacity of FPGA-based heterogeneous platforms. To accomplish this, two development flows were assembled, both targeting the Convey Hybrid-Core HC-1 heterogeneous platform and utilizing existing graphical design environments for design entry. Furthermore, incremental implementation techniques were applied to one of the flows to accelerate bitstream compilation, improving design productivity. The efficacy of these flows in extending FPGA-based acceleration to non-engineers in the life sciences was informally tested at two separate instances of an NSF-funded summer workshop, organized and hosted by the Virginia Bioinformatics Institute at Virginia Tech. In both workshops, groups of four or five non-engineer participants made significant modifications to a bare-bones Smith-Waterman accelerator, extending functionality and improving performance. / Master of Science Big-data HPC Field programmable gate arrays Heterogeneous Computing Life Sciences
278	Flexible and Lightweight Cryptographic Engines for Constrained Systems Gulcan, Ege 04 June 2015 (has links) There is a significant effort in building lightweight cryptographic operations, yet the proposed solutions are typically single purpose modules that can only provide a fixed functionality. However, flexibility is an important aspect of cryptographic designs where a module can perform multiple operations with different configurations. In this work, we combine flexibility with lightweight designs and propose two cryptographic engines based on the SIMON block cipher. The first proposed engine is the Flexible SIMON, which can execute all configurations of SIMON and thus enables adaptive security with variable key sizes. Our second proposed implementation is BitCryptor, a bit-serialized Compact Crypto Engine that can perform symmetric key encryption, hash computation and pseudo-random-number-generation. The implementation results on a Spartan-3 s50 FPGA show that the proposed engines occupy 90 and 95 slices respectively, which makes them more compact than the majority of their single purpose counterparts. Therefore, these engines are suitable cryptographic blocks for resource-constrained systems. / Master of Science Lightweight Cryptography Block Ciphers Flexible Architectures SIMON Field programmable gate arrays
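As background for the SIMON-based engines described above, the cipher's Feistel round can be written in a few lines. This sketch shows only the published SIMON round function with the word size `n` as a parameter; the key schedule is omitted, and nothing here reflects the bit-serial hardware structure of the thesis's engines. The function names are hypothetical.

```python
def rol(x, r, n):
    """Rotate the n-bit word x left by r bits."""
    mask = (1 << n) - 1
    return ((x << r) | (x >> (n - r))) & mask


def simon_f(x, n):
    """SIMON's non-linear function: (x <<< 1 & x <<< 8) ^ (x <<< 2)."""
    return (rol(x, 1, n) & rol(x, 8, n)) ^ rol(x, 2, n)


def simon_round(x, y, k, n=16):
    """One Feistel round on the word pair (x, y) with round key k."""
    return y ^ simon_f(x, n) ^ k, x


def simon_round_inverse(x, y, k, n=16):
    """Undo simon_round: recovers the pair that produced (x, y)."""
    return y, x ^ simon_f(y, n) ^ k


if __name__ == "__main__":
    state = simon_round(0x1234, 0xABCD, 0x0F0F)
    print([hex(w) for w in state])
    print([hex(w) for w in simon_round_inverse(*state, 0x0F0F)])
```

Because the round uses only rotations, AND, and XOR, the same datapath serves every word size, which is one reason a single engine can plausibly cover multiple SIMON configurations.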
279	Making Radios with GReasy: GNU Radio With FPGAs Made Easy Marlow, Ryan Lane 29 August 2014 (has links) Radio technology is rapidly evolving, and as processing capabilities and algorithms become more complex, the need for alternative compilation and user interface abstraction increases. Field Programmable Gate Array (FPGA) technology introduces unique reconfigurable hardware architectures that can aid in software defined radio (SDR) design. FPGAs have greater processing capability than traditional general purpose processors (GPP) found in desktop workstations. This work builds on an ongoing project, GReasy, that augments a Linux based open source SDR development platform, GNU Radio, with FPGA processing capabilities. By delegating processing intensive portions of a radio design to the Xilinx Zynq FPGA architecture, the domain of radios deployable by GNU Radio can be broadened. Xilinx Zynq integrates the FPGA fabric and CPU onto a single chip, which eliminates the need for a controlling host computer, thus providing a single, portable, low-power, embedded platform. This thesis presents a Zynq capable version of GNU Radio -- an open-source rapid radio deployment tool -- with an enhanced flow that utilizes the processing capability of FPGAs. This work features TFlow -- an FPGA back-end compilation accelerator for instant FPGA assembly. GReasy generates a description of the hardware components that is used by TFlow for the instant FPGA assembly. Once the FPGA is programmed with a design based on the description generated by GReasy, modules and the target hardware can be parameterized to realize an even larger class of applications and further solidify the concept of rapid assembly of software defined radios. / Master of Science Field programmable gate arrays GNU Radio Software radio Rapid Assembly Productivity
280	Exploits in Concurrency for Boolean Satisfiability Sohanghpurwala, Ali Asgar Ali Akbar 14 December 2018 (has links) Boolean Satisfiability (SAT) is a problem that holds great theoretical significance along with effective formulations that benefit many real-world applications. While the general problem is NP-complete, advanced solver algorithms and heuristics allow for fast solutions to many large industrial problems. In addition to SAT, many applications rely on generalizations of Satisfiability such as MaxSAT, and Satisfiability Modulo Theories (SMT). Much of the advancement in SAT solver performance has been in the realm of improved sequential solvers with advanced conflict resolution, learning mechanisms, and sophisticated heuristics. There have been some successful demonstrations of massively parallel and hardware-accelerated solvers for SAT, but these have failed to find their way into mainstream usage. This document first presents previous work in Hardware Acceleration of Satisfiability followed by an analysis of why these attempts failed to gain widespread acceptance. It then demonstrates an alternative, hardware-centric approach, based on distributed Stochastic Local Search (SLS) that is better suited to efficient hardware implementation. Then a parallel SLS/CDCL hybrid approach is proposed that is suitable for distributed search with minimal communication overhead while maintaining completeness. Finally the efficacy and flexibility of distributed local search is considered with an adaptation to Weighted Partial MaxSAT (WPMS) and a focused case study on converted Probabilistic Inference instances. / Ph. D. / The Boolean Satisfiability (SAT) problem is an important decision problem that asks whether there exists a solution that satisfies all given constraints over a set of variables that can assume values of either 0 or 1. 
Many real-world decision problems can be translated into SAT, and there exist efficient sequential solvers that can quickly resolve many such instances. Less progress has been made in efficiently scaling SAT solvers to modern multi-core systems and massively parallel hardware accelerators such as GPUs and Field Programmable Gate Arrays (FPGAs). This thesis explores different approaches to solving SAT based decision and optimization problems with the goal of increasing concurrency. Satisfiability Field programmable gate arrays SLS MaxSAT Parallel Local Search SAT
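To make the idea of Stochastic Local Search (SLS) for SAT concrete, here is a small WalkSAT-style solver. It is a generic, sequential textbook sketch and not the distributed or hardware-centric method of the dissertation; the clause encoding (signed integers, DIMACS style), the `noise` parameter, and the function name `walksat` are assumptions made for illustration.

```python
import random


def walksat(clauses, n_vars, max_flips=10000, noise=0.5, seed=0):
    """WalkSAT-style local search. Clauses are lists of non-zero ints:
    literal v means variable v is true, -v means it is false.
    Returns a list of booleans for variables 1..n_vars, or None."""
    rng = random.Random(seed)
    # Index 0 is unused so that variable v lives at assign[v].
    assign = [False] + [rng.random() < 0.5 for _ in range(n_vars)]

    def lit_true(lit):
        return assign[abs(lit)] == (lit > 0)

    def unsat_clauses():
        return [c for c in clauses if not any(lit_true(l) for l in c)]

    def cost_if_flipped(var):
        # Number of unsatisfied clauses after tentatively flipping var.
        assign[var] = not assign[var]
        cost = len(unsat_clauses())
        assign[var] = not assign[var]
        return cost

    for _ in range(max_flips):
        unsat = unsat_clauses()
        if not unsat:
            return assign[1:]
        clause = rng.choice(unsat)  # focus on one violated clause
        if rng.random() < noise:
            var = abs(rng.choice(clause))  # random walk step
        else:
            var = min((abs(l) for l in clause), key=cost_if_flipped)  # greedy step
        assign[var] = not assign[var]
    return None  # incomplete search: no proof of unsatisfiability


if __name__ == "__main__":
    # (x1 or x2) and (not x1 or x3) and (not x2 or not x3)
    print(walksat([[1, 2], [-1, 3], [-2, -3]], n_vars=3))
```

A search like this is incomplete: returning `None` does not show that the formula is unsatisfiable, which is why the abstract pairs SLS with CDCL when completeness is required.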
