Global ETD Search

171	Deep Learning Binary Neural Network on an FPGA Redkar, Shrutika 27 April 2017 In recent years, deep neural networks have attracted a great deal of attention in the fields of computer vision and artificial intelligence. A convolutional neural network exploits spatial correlations in an input image by performing convolution operations in local receptive fields. Compared with fully connected neural networks, convolutional neural networks have fewer weights and are faster to train. Much research has been conducted to further reduce the computational complexity and memory requirements of convolutional neural networks, to make them applicable to low-power embedded applications. This thesis focuses on a special class of convolutional neural network with only binary weights and activations, referred to as binary neural networks. Weights and activations for convolutional and fully connected layers are binarized to take only two values, +1 and -1. As a result, the computation and memory requirements are reduced significantly. The proposed binary neural network architecture has been implemented on an FPGA as a real-time, high-speed, low-power computer vision platform. Only on-chip memories are used in the FPGA design. The FPGA implementation is evaluated on the CIFAR-10 benchmark and achieves a processing speed of 332,164 images per second with a classification accuracy of about 86.06%. Real time Deep Learning Neural Networks FPGA
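To illustrate why restricting weights and activations to +1 and -1 cuts the computation so sharply, here is a minimal Python sketch of a binarized dot product. It is not taken from the thesis: the bit encoding (1 for +1, 0 for -1) and the names `pack_bits` and `binary_dot` are assumptions made for this example. Under that encoding, a multiply-accumulate over n elements reduces to an XNOR followed by a population count.

```python
def pack_bits(values):
    """Pack a sequence of +1/-1 values into an int; bit i is set when values[i] == +1."""
    word = 0
    for i, v in enumerate(values):
        if v not in (1, -1):
            raise ValueError("values must be +1 or -1")
        if v == 1:
            word |= 1 << i
    return word


def binary_dot(a_bits, w_bits, n):
    """Dot product of two n-element +1/-1 vectors given in packed form.

    Positions where the bits agree contribute +1 and positions where they
    differ contribute -1, so the result is 2 * (number of agreements) - n.
    """
    mask = (1 << n) - 1
    agree = ~(a_bits ^ w_bits) & mask  # XNOR, limited to the n valid bits
    matches = bin(agree).count("1")    # population count
    return 2 * matches - n


# Hypothetical example vectors, chosen only for illustration.
activations = [1, -1, 1, 1]
weights = [1, 1, -1, 1]
packed_result = binary_dot(pack_bits(activations), pack_bits(weights), len(weights))
reference = sum(a * w for a, w in zip(activations, weights))
print(packed_result, reference)
```

The packed result matches the ordinary multiply-and-sum reference. In hardware, the XNOR and population count map onto simple logic rather than multipliers.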
172	Real time implementation of SURF algorithm on FPGA platform Zhu, Sichao 30 April 2014 Too many traffic accidents are caused by drivers' failure to notice buildings, traffic signs and other objects. Video-based scene or object detection can enhance drivers' judgment by automatically detecting scenes and signs. Two popular recent video detection algorithms are background differentiation and feature-based object detection. Background differentiation is an efficient and fast way of observing a moving object against a relatively stationary background, which makes it easy to implement on a mobile platform with a high processing speed. Feature-based scene detection, such as Speeded Up Robust Features (SURF), is an appropriate way of detecting a specific scene accurately, with rotation and illumination invariance. By comparison, the computational expense of SURF is much higher, which limits the algorithm's use on real-time mobile platforms. In this thesis, I present two real-time tracking algorithms, differentiation-based and SURF-based scene detection systems, on an FPGA platform. The proposed hardware designs are able to process video of 800*600 resolution at 60 frames per second with a video clock rate of 40 MHz. real time. video SURF FPGA Implementation
173	Design and Implementation of Shen, Chen 14 January 2010 The multiple-input multiple-output (MIMO) technique in communication systems has been widely researched. Compared with single-input single-output (SISO) communication, its higher throughput and more efficient spectrum usage make it one of the most significant technologies in modern wireless communications. In a MIMO system, sphere detection is a fundamental part. The purpose of traditional sphere detection is to achieve maximum likelihood (ML) demodulation of the MIMO system. However, with the development of advanced forward error correction (FEC) techniques, such as convolutional, Turbo and LDPC codes, sphere detection algorithms that can provide soft information for the outer decoder have recently attracted more interest. Considering the computational complexity of generating the soft information, it is important to develop a high-speed VLSI architecture for MIMO detection. The first part of this thesis covers MIMO sphere detection algorithms. Two sphere detection algorithms are introduced: the depth-first Schnorr-Euchner (SE) algorithm, which generates the ML detection solution, and the breadth-first K-BEST algorithm, which generates only a near-ML detection solution but is more efficient to implement. Based on these algorithms, an improved near-ML algorithm is presented that has lower complexity and limited performance loss compared with traditional K-BEST algorithms. The second part focuses on the hardware design. A 4*4 16-QAM MIMO detection system which can generate both soft information and a hard-decision solution is designed and implemented on an FPGA. With its fully pipelined and parallel structure, it can achieve a throughput of 3.7 Gbps. In this part, the improved near-ML algorithm is implemented as a detector to generate both the hard output and the candidate list. 
Then, a soft information calculation block is designed to succeed the detector and produce the log-likelihood ratio (LLR) values for every bit as the soft output. soft-output decoding FPGA sphere detection MIMO
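The abstract does not say how the soft information block derives LLR values from the candidate list. A common choice is the max-log approximation, sketched below in Python as an assumption rather than as the thesis's method. The function name `max_log_llrs`, the candidate format (a bit tuple paired with its squared Euclidean distance), the sign convention (positive favours bit 0) and the clipping value are all hypothetical.

```python
def max_log_llrs(candidates, num_bits, noise_var=1.0, clip=8.0):
    """Max-log LLR per bit from a list of (bits, distance) candidates.

    bits is a tuple of 0/1 values and distance is the candidate's squared
    Euclidean distance to the received vector. For each bit position, the
    LLR is approximated as (d1 - d0) / noise_var, where d0 and d1 are the
    smallest distances among candidates with that bit equal to 0 and 1.
    If the list holds no candidate for one hypothesis, the LLR saturates
    at +clip or -clip.
    """
    if not candidates:
        raise ValueError("candidate list must not be empty")
    llrs = []
    for k in range(num_bits):
        d0 = min((d for bits, d in candidates if bits[k] == 0), default=None)
        d1 = min((d for bits, d in candidates if bits[k] == 1), default=None)
        if d0 is None:
            llr = -clip  # only bit = 1 was observed in the list
        elif d1 is None:
            llr = clip   # only bit = 0 was observed in the list
        else:
            llr = (d1 - d0) / noise_var
        llrs.append(max(-clip, min(clip, llr)))
    return llrs


# Hypothetical two-bit candidate list, for illustration only.
example = [((0, 1), 0.5), ((1, 1), 2.0), ((0, 1), 1.0)]
print(max_log_llrs(example, num_bits=2))
```

Clipping matters in list-based detectors because a short candidate list often contains no counter-hypothesis for some bits, and the true LLR cannot be estimated from the list alone.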
174	Feature detection in an indoor environment using Hardware Accelerators for time-efficient Monocular SLAM Vyas, Shivang 03 August 2015 In the field of robotics, Monocular Simultaneous Localization and Mapping (Monocular SLAM) has gained immense popularity, as it replaces large and costly sensors such as laser range finders with a single cheap camera. Additionally, the well-developed area of computer vision provides robust image processing algorithms which aid in developing feature detection techniques for the implementation of Monocular SLAM. Similarly, in the field of digital electronics and embedded systems, hardware acceleration using FPGAs has become quite popular. Hardware acceleration is based on the idea of offloading certain iterative algorithms from the processor and implementing them on a dedicated piece of hardware such as an ASIC or FPGA, to speed up execution and possibly reduce the net power consumption of the system. Good strides have been made in developing massively pipelined and resource-efficient hardware implementations of several image processing algorithms on FPGAs, which achieve a fairly decent speed-up of the processing time. In this thesis, we have developed a very simple algorithm for feature detection in an indoor environment by means of a single camera, based on the Canny Edge Detection and Hough Transform algorithms and using the OpenCV library, and proposed its integration with an existing feature initialization technique for a complete Monocular SLAM implementation. Following this, we have developed hardware accelerators for Canny Edge Detection and Hough Transform and compared the timing performance of the hardware implementation (using FPGAs) with a software implementation (using C++ and OpenCV). Canny Hardware Acceleration Hough SLAM FPGA
175	Academic Packing for Commercial FPGA Architectures Haroldsen, Travis D. 01 July 2017 With a few exceptions, academic packing algorithms for FPGAs are typically applied solely to theoretical architectures. This has allowed the algorithms to focus on the basic components of packing while abstracting away many of the details dictated by real hardware. As commercially available FPGAs have advanced, however, the academic algorithms and architectures have diverged significantly from their commercial counterparts. In this dissertation, the RapidSmith 2 framework is presented. This framework accurately reflects the architecture of Xilinx FPGAs and provides support for integrating custom tools into the commercial CAD tools. Using this framework, the RSVPack packing algorithm is implemented. The RSVPack algorithm can accept a design synthesized using the commercial Xilinx CAD tools, pack designs which make use of the many features of commercial FPGA architectures, and return the packed designs to the Xilinx CAD tools to be placed and routed. This enables researchers to isolate the packing portion of the flow and evaluate different packing techniques while allowing the high-quality commercial tools to perform the remainder of the flow. Integrating the RSVPack algorithm into the commercial flow shows that RSVPack produces packings which lead to circuits with minimum clock periods within 10%, on average, of those of circuits generated using the pure Xilinx flow. Included in this work is a novel table lookup-based algorithm which RSVPack uses to quickly determine the routability of a cluster. This algorithm performs 5 times faster on average than the current academic alternatives. Finally, using RSVPack, this dissertation explores various techniques for improving the quality of packing for Xilinx circuits. Together, these results demonstrate the potential for academic research into FPGA CAD tools for commercial architectures. 
FPGA packing Xilinx algorithms Electrical and Computer Engineering
176	Integração de unidades de memória flash em sistemas embutidos sobre plataformas FPGA [Integration of flash memory units in embedded systems on FPGA platforms] Basadre, Francisco Nuno Alves Orge January 2009 Integrated master's thesis. Electrical and Computer Engineering (Major in Telecommunications). Faculdade de Engenharia. Universidade do Porto. 2009 FPGA platforms Flash memories Embedded systems
177	Educational package based on the MIPS architecture for FPGA platforms Pereira, João Luís Silva Campos January 2009 Integrated master's thesis. Electrical and Computer Engineering (Major in Telecommunications). Faculdade de Engenharia. Universidade do Porto. 2009 Teaching and learning MIPS architecture FPGA platforms
178	Suporte em Linux para reconfiguração dinâmica de hardware [Linux support for dynamic hardware reconfiguration] Monteiro, Bruno Miguel da Silva January 2009 Integrated master's thesis. Electrical and Computer Engineering (Major in Telecommunications). Faculdade de Engenharia. Universidade do Porto. 2009 Reconfigurable systems FPGA platforms Linux - operating system
179	Biblioteca de módulos Verilog para interface de FPGAs com periféricos I/O [Library of Verilog modules for interfacing FPGAs with I/O peripherals] Machado, Ricardo Jorge dos Santos January 2010 Integrated master's thesis. Electrical and Computer Engineering (Telecommunications). Universidade do Porto. Faculdade de Engenharia. 2010 FPGA platforms
180	Diseño e implementación de la correlación y de la correntropía cruzada, utilizando FPGA [Design and implementation of cross-correlation and cross-correntropy using an FPGA] Rivera Serrano, Francisco Javier January 2017 Master of Engineering Sciences, Electrical Engineering / Correntropy is a nonlinear measure of similarity between two random variables. This thesis proposes a way of implementing correntropy using highly integrated digital devices called FPGAs (Field Programmable Gate Arrays), which allow information to be processed directly in hardware, achieving significant improvements in processing time. The objective of this thesis is the hardware design and implementation of cross-correlation and cross-correntropy using an FPGA. As far as could be established to date, there is previous work on implementing correlation, but not on implementing correntropy in the form proposed here. To allow comparison with the correntropy results, cross-correlation was also implemented on the same FPGA devices. On this basis, a design was developed that aims for the lowest possible latency in computing the correntropy, where latency is the delay between input and output needed to produce an expected result. The latency of an FPGA is assumed to be one to two orders of magnitude lower than that of a processor, which is demonstrated in this work. In this thesis, in order to implement the hardware on FPGA devices, a digital systems design methodology was developed, based on finite state machines, that clearly separates design from implementation and can be applied to complex, large-scale digital systems. To develop this thesis, the Nexys4 development board from Xilinx was chosen, which uses the VIVADO software tool. Within VIVADO, the hardware description language (HDL) used was SystemVerilog. 
The project was divided into two stages. The first covered the design and implementation of cross-correlation on an FPGA. The frequency-domain definition of correlation was used, which required modules that compute the Fourier transform of each input. The second stage covered the design and implementation of cross-correntropy itself on an FPGA. The design approach differs from that applied to correlation, because the definition of correntropy includes a Gaussian kernel. Both stages achieved the expected results: the outputs of the design implemented on the FPGA were identical to the outputs given by MATLAB for different types of input: sinusoidal signals of various kinds, since they are easier to implement and visualize; time series of electromagnetic signals from astronomy; and sleep spindle events in electroencephalogram (EEG) recordings. The work also confirms the lower latency, by at least one order of magnitude, of the outputs of the VIVADO tool compared with those obtained with MATLAB, with lower latencies obtained for correlation than for correntropy. Light curves Cross-correntropy Cross-correlation FPGA
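To make concrete why correntropy calls for a different design from correlation, the Python sketch below contrasts the two sample estimators at a single lag. It assumes the usual textbook definition of cross-correntropy as the mean of a Gaussian kernel applied to sample differences. The function names, the kernel width `sigma` and the time-domain form of the correlation are choices made for this example, not details from the thesis, which computes correlation in the frequency domain.

```python
import math


def gaussian_kernel(u, sigma):
    """Gaussian kernel with width sigma, evaluated at the difference u."""
    return math.exp(-u * u / (2.0 * sigma * sigma)) / (math.sqrt(2.0 * math.pi) * sigma)


def cross_correntropy(x, y, lag, sigma=1.0):
    """Sample cross-correntropy: mean of kernel(x[i] - y[i - lag]) over valid i."""
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    n = len(x)
    if not 0 <= lag < n:
        raise ValueError("lag must satisfy 0 <= lag < len(x)")
    total = sum(gaussian_kernel(x[i] - y[i - lag], sigma) for i in range(lag, n))
    return total / (n - lag)


def cross_correlation(x, y, lag):
    """Sample cross-correlation: mean of x[i] * y[i - lag] over valid i."""
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    n = len(x)
    if not 0 <= lag < n:
        raise ValueError("lag must satisfy 0 <= lag < len(x)")
    return sum(x[i] * y[i - lag] for i in range(lag, n)) / (n - lag)


# Hypothetical short signals, for illustration only.
x = [0.0, 1.0, 0.0, -1.0, 0.0, 1.0]
y = [1.0, 0.0, -1.0, 0.0, 1.0, 0.0]
for lag in range(3):
    print(lag, cross_correlation(x, y, lag), cross_correntropy(x, y, lag, sigma=0.5))
```

The correlation term is a plain product, which is why it can be moved to the frequency domain. The correntropy term passes each difference through an exponential, so it cannot be rewritten as a simple product of transforms.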
