Global ETD Search

41	Akcelerace šifrování přenosu síťových dat / Acceleration of Network Traffic Encryption Koranda, Karel January 2013 This thesis deals with the design of a hardware unit for accelerating the securing of network traffic within the Lawful Interception System developed as part of the Sec6Net project. The first aim of the thesis is to analyse the available security mechanisms commonly used for securing network traffic. Based on this analysis, the SSH protocol is chosen as the most suitable mechanism for the target system. Next, the thesis introduces possible variants of an acceleration unit for the SSH protocol. In addition, it presents a detailed design description and implementation of the variant based on the AES-GCM algorithm, which provides confidentiality, integrity and authentication of transmitted data. The implemented acceleration unit reaches a maximum throughput of 2.4 Gbps.
42	Hardware Accelerated Digital Image Stabilization in a Video Stream Pacura, Dávid January 2016 The aim of this thesis is to design a new technique for image stabilization that uses hardware acceleration through GPGPU. The technique enables real-time stabilization of video sequences, even for high-resolution video. This is needed to ease further processing in computer vision or in military applications. Because several programming models exist for GPGPU, the proposed stabilization algorithm is implemented in the three most widely used of them. Their performance and results are then compared and discussed.
43	Hardware acceleration of convolutional neural networks on FPGA Myrén, Adam January 2020 As machine learning algorithms evolve, they are seeing wider use in traditional signal processing applications. One such area is radio, where they improve signal identification algorithms. Given the large computational complexity of convolutional neural networks, it is important to use platforms that are as fast and energy efficient as possible. This thesis investigates hardware acceleration of convolutional neural networks on field programmable gate arrays, a type of reconfigurable integrated circuit. An existing toolflow, Haddoc2, is used and evaluated. This tool automates the mapping of a convolutional neural network from a high-level description in Caffe to a synthesisable hardware description in the VHDL hardware description language. Four models of different sizes are trained on the MNIST dataset, and accelerators for these at different bitwidths are generated and then simulated in a VHDL testbench. The resulting accuracies are tolerable for the target problem, and Haddoc2 can produce fast accelerators that would work well for smaller networks. Big networks were found to consume large amounts of resources in the field programmable gate array and are not feasible for a practical application. Treating the weights as constants makes the accelerators fast, since there is no memory bottleneck, but also less flexible, since a new set of weights would require re-synthesizing the design and reprogramming the field programmable gate array. Hardware acceleration Convolutional neural network Machine learning FPGA Other electrical engineering and electronics
44	Rendering av geodata med OpenGL / Rendering of geodata with OpenGL Ingelborn, Marcus January 2020 This study investigated whether or not it is worthwhile to implement hardware support, using OpenGL, for rendering geographic data. In this case, that meant creating map images with temporary objects positioned and drawn on them. The objects were constantly changing, and one frame could not be assumed to look the same as the next. To answer the question, a test environment at the company Saab and the open-source software Geotools were used. Action research was carried out in which a new rendering module for Geotools was implemented. The new rendering module and the previous one, from Geotools, were tested and their rendering times measured. The measurements were then analysed and compared using statistical methods. Rendering an image with the previous rendering module took on average between 481 and 495 ms, with a probability of 99.9%. With the new rendering module, rendering took on average between 145 and 150 ms with the same probability. Within a 99.9% confidence interval, the average rendering time decreased by between 333 and 347 ms for the newly developed module with hardware support. Computer and Information Sciences
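The interval figures in entry 44 can be illustrated with a minimal sketch of a normal-approximation confidence interval for a mean and for a difference of two means. The abstract does not say which statistical method the study used, so the method here is an assumption, and the render times below are synthetic values generated for illustration, not the study's measurements.

```python
import math
import random
from statistics import NormalDist, mean, stdev


def mean_confidence_interval(samples, confidence=0.999):
    """Normal-approximation confidence interval for the sample mean."""
    z = NormalDist().inv_cdf(0.5 + confidence / 2.0)
    half_width = z * stdev(samples) / math.sqrt(len(samples))
    centre = mean(samples)
    return centre - half_width, centre + half_width


def difference_confidence_interval(a, b, confidence=0.999):
    """Interval for mean(a) - mean(b), treating a and b as independent samples."""
    z = NormalDist().inv_cdf(0.5 + confidence / 2.0)
    std_err = math.sqrt(stdev(a) ** 2 / len(a) + stdev(b) ** 2 / len(b))
    diff = mean(a) - mean(b)
    return diff - z * std_err, diff + z * std_err


# Synthetic render times in milliseconds (hypothetical, for illustration only).
rng = random.Random(0)
old_times = [rng.gauss(488.0, 40.0) for _ in range(400)]
new_times = [rng.gauss(147.5, 15.0) for _ in range(400)]

old_interval = mean_confidence_interval(old_times)
new_interval = mean_confidence_interval(new_times)
saving_interval = difference_confidence_interval(old_times, new_times)
```

The interval for the difference is computed from the combined standard error rather than by subtracting the endpoints of the two separate intervals, which is why it can be narrower than a naive subtraction would suggest.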
45	Turbo Code Performance Analysis Using Hardware Acceleration Nordmark, Oskar January 2016 The upcoming 5G mobile communications system promises to enable use cases requiring ultra-reliable and low-latency communications. Researchers therefore require more detailed information about aspects such as channel coding performance at very low block error rates. The simulations needed to obtain such results are very time consuming, which poses a challenge to studying the problem. This thesis investigates the use of hardware acceleration for performing fast simulations of turbo code performance. Special interest is taken in investigating different methods for generating normally distributed noise based on pseudorandom number generator algorithms executed in DSPs. A comparison is also made of how well different simulator program structures utilize the hardware. Results show that even a simple program for utilizing parallel DSPs can achieve good usage of hardware accelerators and enable fast simulations. It is also shown that, for the studied process, the bottleneck is the conversion of hard bits to soft bits with the addition of normally distributed noise. The results indicate that noise generation methods which do not adhere to a true normal distribution can further speed up this process and yet yield simulation quality comparable to methods adhering to a true Gaussian distribution. Overall, it is shown that the proposed use of hardware acceleration, in combination with the DSP software simulator program, can generate results for turbo code performance at block error rates as low as 10⁻⁹ in a reasonable time frame. Turbo code hardware acceleration digital signal processors DSP simulation block error rate BLER pseudo random number generation PRNG 5G Electrical engineering and electronics
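As a rough illustration of the bottleneck step named in entry 45, the sketch below converts hard bits to noisy soft values with two noise generators: the Box-Muller transform, which yields true Gaussian samples, and a sum of twelve uniform variates, which only approximates a Gaussian. The abstract does not name the methods the thesis compared, so both generators, the BPSK-style bit mapping and all function names are assumptions chosen for illustration.

```python
import math
import random


def box_muller(rng):
    """One standard normal sample via the Box-Muller transform."""
    u1 = 1.0 - rng.random()  # shift into (0, 1] so log(u1) is defined
    u2 = rng.random()
    return math.sqrt(-2.0 * math.log(u1)) * math.cos(2.0 * math.pi * u2)


def sum_of_uniforms(rng):
    """Approximate normal sample: mean 0, variance 1, tails cut off at +/-6."""
    return sum(rng.random() for _ in range(12)) - 6.0


def hard_to_soft(bits, sigma, noise, rng):
    """Map bit 0 -> +1.0 and bit 1 -> -1.0, then add noise scaled by sigma."""
    return [(1.0 - 2.0 * b) + sigma * noise(rng) for b in bits]


rng = random.Random(1)
bits = [rng.randrange(2) for _ in range(8)]
soft_exact = hard_to_soft(bits, 0.5, box_muller, rng)
soft_approx = hard_to_soft(bits, 0.5, sum_of_uniforms, rng)
```

The approximate generator avoids the logarithm, square root and cosine, which is the kind of saving that can matter on a DSP. Its samples never exceed 6 in magnitude, so it cannot reproduce the far tails of a true Gaussian.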
46	Efficient FPGA SoC Processing Design for a Small UAV Radar Newmeyer, Luke Oliver 01 April 2018 Modern radar technology relies heavily on digital signal processing. As radar technology pushes the boundaries of miniaturization, computational systems must be developed to support the processing demand. One particular application for small radar technology is in modern drone systems. Many drone applications are currently inhibited by safety concerns about autonomous vehicles navigating shared airspace. Research in radar-based Detect and Avoid (DAA) attempts to address these concerns by using radar to detect nearby aircraft and choose an alternative flight path. Implementing radar on small Unmanned Air Vehicles (UAVs), however, requires a lightweight and power-efficient design. Likewise, the radar processing system must also be small and efficient. This thesis presents the design of the processing system for a small Frequency Modulated Continuous Wave (FMCW) phased array radar. The radar and processing system are designed to be lightweight and low-power in order to fly onboard a UAV weighing less than 25 kg. The radar algorithms for this design include a parallelized Fast Fourier Transform (FFT), cross correlation, and beamforming. Target detection algorithms are also implemented. All of the computation is performed in real time on a Xilinx Zynq 7010 System on Chip (SoC) processor utilizing both FPGA and CPU resources. The radar system (excluding antennas) has dimensions of 2.25 x 4 x 1.5 in³, weighs 120 g, and consumes 8 W of power, of which the processing system accounts for 2.6 W. The processing system performs over 652 million arithmetic operations per second and is capable of performing the full processing in real time. The radar has also been tested in several scenarios, both airborne on small UAVs and on the ground. Small UAVs have been detected at ranges of up to 350 m and larger aircraft at up to 800 m. This thesis describes the radar design architecture, the custom-designed radar hardware, and the FPGA-based processing implementations, and concludes with an evaluation of the system's effectiveness and performance. Efficient Computing Phased Array Radar FMCW Radar Digital Beamforming FPGA SoC Xilinx Zynq Heterogeneous Processing Hardware Acceleration Detect and Avoid Sense and Avoid Unmanned Air Vehicles Drone Technology Engineering
47	Hardware Acceleration of a Monte Carlo Simulation for Photodynamic Therapy Treatment Planning Lo, William Chun Yip 15 February 2010 Monte Carlo (MC) simulations are widely used in the field of medical biophysics, particularly for modelling light propagation in biological tissue. The iterative nature of MC simulations and their high computation time currently limit their use to solving the forward solution for a given set of source characteristics and tissue optical properties. However, applications such as photodynamic therapy treatment planning or image reconstruction in diffuse optical tomography require solving the inverse problem given a desired light dose distribution or absorber distribution, respectively. A faster means for performing MC simulations would enable the use of MC-based models for such tasks. In this thesis, a gold standard MC code called MCML was accelerated using two distinct hardware-based approaches, namely designing custom hardware on field-programmable gate arrays (FPGAs) and programming commodity graphics processing units (GPUs). Currently, the GPU-based approach is promising, offering approximately 1000-fold speedup with 4 GPUs compared to an Intel Xeon CPU. Photodynamic therapy Hardware acceleration Monte Carlo simulation MCML Treatment planning Graphics processing unit (GPU) Field programmable gate array (FPGA) 0760 0544 0752
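To make concrete the kind of inner loop that such a simulation repeats, here is a minimal sketch of two steps commonly used in Monte Carlo light transport: sampling a photon packet's free path length from an exponential distribution, and depositing a fraction of the packet weight at each interaction. The optical coefficients are hypothetical values, the function names are invented for this sketch, and the code is not taken from MCML or from the thesis.

```python
import math
import random


def sample_step(mu_t, rng):
    """Free path length s = -ln(xi) / mu_t, with xi uniform on (0, 1]."""
    xi = 1.0 - rng.random()  # shift into (0, 1] so log(xi) is defined
    return -math.log(xi) / mu_t


def absorbed_fraction(mu_a, mu_s, n_interactions):
    """Weight deposited after n interactions, each absorbing mu_a / mu_t of the rest."""
    albedo = mu_s / (mu_a + mu_s)
    return 1.0 - albedo ** n_interactions


# Hypothetical optical properties in 1/cm (illustrative values only).
mu_a, mu_s = 1.0, 9.0
rng = random.Random(2)
steps = [sample_step(mu_a + mu_s, rng) for _ in range(50000)]
mean_free_path = sum(steps) / len(steps)  # should approach 1 / (mu_a + mu_s)
```

Each photon packet draws its own random numbers and does not depend on any other packet, which is what makes this workload a natural fit for parallel hardware such as GPUs and FPGAs.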
49	Approche de conception haut-niveau pour l'accélération matérielle de calcul haute performance en finance / High-level approach for hardware acceleration of high-performance computing in finance Mena morales, Valentin 12 July 2017 The need for resources in High Performance Computing (HPC) is generally met by scaling up server farms, to the detriment of energy consumption. Accelerating HPC applications on heterogeneous platforms, such as FPGAs or GPUs, reduces the energy consumption of a deployed system and therefore offers a more attractive architectural compromise. However, it comes with a change of programming paradigm, and heterogeneous platforms are harder for software experts to master. This is most notably the case for developers of financial products in quantitative finance. Moreover, applications in this field evolve continually to adapt to legislative and competitive demands, which puts even more pressure on the programmability of acceleration solutions. In this context, the use of high-level design flows, such as High-Level Synthesis (HLS) for programming FPGA accelerators, is not enough. A domain-specific approach can help to meet performance requirements without impairing the programmability of accelerated applications. We propose in this thesis a high-level design approach that relies on OpenCL as a heterogeneous programming standard. More precisely, the approach builds on the implementation of OpenCL for FPGAs recently introduced by Altera. Four main contributions are made: (1) an initial study of the integration of hardware computing cores into a software library for quantitative finance (QuantLib), (2) an exploration of different architectures and their respective performance, as well as the design of a dedicated architecture for the pricing of American options and the evaluation of implied volatility, based on a high-level design flow, (3) a detailed characterization of an Altera OpenCL platform, from elementary operators, memory accesses and control overlays up to the communication links it is made of, (4) a proposed compilation flow specific to the quantitative finance domain, relying on the aforementioned characterization and on a description of the financial applications considered, namely option pricing. High-Level design OpenCL FPGA GPU Quantitative finance Hardware acceleration HPC HLS Prototyping 004
50	FPGA-based Speed Limit Sign Detection Tallawi, Reham 27 September 2017 This thesis presents a new hardware-accelerated approach that uses image processing and detection algorithms to implement a fast and robust traffic sign detection system, with a focus on speed limit sign detection. The proposed system targets reconfigurable integrated circuits, particularly Field Programmable Gate Array (FPGA) devices. This work proposes a fully parallelized and pipelined system architecture to exploit the high performance and flexibility of FPGA devices. The thesis is divided into two phases. The first phase is a software prototype implementation of the proposed system. The software system was designed and developed using C++ and the OpenCV library on a general-purpose CPU. The prototype is used to explore and investigate potential segmentation and detection algorithms that might be feasible to design and implement in hardware-accelerated environments. These algorithms include RGB colour conversion, colour segmentation through thresholding, noise reduction through a median filter, morphological operations through erosion and dilation, and sign detection through template matching. In the second phase, a hardware-based design of the system was developed using the same algorithms as the software design. The hardware design is composed of 20 image processing components each designed to xxx fully parallelized and pipelined xxx. The hardware implementation was developed using VHDL as the hardware description language, targeting a Xilinx Virtex-6 FPGA XC6VLX240T device. The development environment is Xilinx ISE® Design Suite version 14.3. A set of 20 test images of 640x480 pixels was used as the test data for the verification and testing of this work. The images were captured by a smartphone camera in various weather and lighting conditions. The software implementation delivered speed limit detection results with a success rate of 75%. The hardware implementation was only simulated, using Xilinx ISE Simulator (ISim), with an overall system latency of 12964 clock cycles. According to the Place and Route report, the maximum operating frequency for the proposed hardware design is 71.2 MHz. The design utilized only 2% of the slice registers, 4% of the slice Look-Up Tables (LUTs), and 11% of the block memory. The thesis concludes the work based on the software and hardware implementations and the performance analysis results. The conclusions chapter also provides recommendations and future work for possible extension of the project. VHDL FPGA traffic sign recognition FPGA traffic sign detection image processing VHDL hardware acceleration ddc:004 Computer science Traffic signs Image processing VHDL Field programmable gate array
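Three of the stages listed in entry 50 (thresholding, erosion and dilation) can be sketched on a tiny greyscale image as follows. The 3x3 square neighbourhood, the clipped border handling, the threshold level and the test image are all assumptions made for illustration. The thesis itself works on colour images in C++/OpenCV and VHDL, and its median filter and template matching stages are omitted here.

```python
def threshold(gray, level):
    """Binary segmentation: 1 where the pixel is at or above level, else 0."""
    return [[1 if p >= level else 0 for p in row] for row in gray]


def _morph(binary, combine):
    """Apply combine (min or max) over each pixel's 3x3 neighbourhood."""
    h, w = len(binary), len(binary[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            # Neighbourhood is clipped at the image border.
            window = [binary[ny][nx]
                      for ny in range(max(0, y - 1), min(h, y + 2))
                      for nx in range(max(0, x - 1), min(w, x + 2))]
            out[y][x] = combine(window)
    return out


def erode(binary):
    """A pixel survives only if its whole neighbourhood is set."""
    return _morph(binary, min)


def dilate(binary):
    """A pixel is set if any pixel in its neighbourhood is set."""
    return _morph(binary, max)


# 5x5 test image: a bright 3x3 block plus one isolated bright noise pixel.
gray = [[10] * 5 for _ in range(5)]
for y in range(1, 4):
    for x in range(1, 4):
        gray[y][x] = 200
gray[0][4] = 220  # isolated noise pixel

binary = threshold(gray, 128)
opened = dilate(erode(binary))  # erosion followed by dilation (opening)
```

Erosion followed by dilation removes the isolated pixel while restoring the 3x3 block. Every output pixel depends only on a small fixed neighbourhood of the input, which is what makes these operations suitable for a pipelined hardware implementation.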
