• Refine Query
  • Source
  • Publication year
  • to
  • Language
  • 40
  • 10
  • 3
  • 2
  • 1
  • 1
  • Tagged with
  • 71
  • 71
  • 37
  • 22
  • 16
  • 15
  • 14
  • 13
  • 13
  • 12
  • 12
  • 9
  • 9
  • 9
  • 8
  • About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
41

Návrh protokolu hardwarového akcelerátoru náročných výpočtů nad více jádry / A Hardware-acceleration Protocol Design for Demanding Computations over Multiple Cores

Bareš, Jan January 2018 (has links)
This work deals with design of communication protocol for data transmission between control computer and computing cores implemented on FPGA chips. The purpose of the communication is speeding the performance demanding software algorithms of non-stream data processing by their hardware computation on accelerating system. The work defines a terminology used for protocol design and analyses current solutions of given issue. After that the work designs structure of the accelerating system and communication protocol. In the main part the work describes the implementation of the protocol in VHDL language and the simulation of implemented modules. At the end of the work the aplication of designed solution is presented along with possible extension of this work.
42

Akcelerace šifrování přenosu síťových dat / Acceleration of Network Traffic Encryption

Koranda, Karel January 2013 (has links)
This thesis deals with the design of hardware unit used for acceleration of the process of securing network traffic within Lawful Interception System developed as a part of Sec6Net project. First aim of the thesis is the analysis of available security mechanisms commonly used for securing network traffic. Based on this analysis, SSH protocol is chosen as the most suitable mechanism for the target system. Next, the thesis aims at introduction of possible variations of acceleration unit for SSH protocol. In addition, the thesis presents a detailed design description and implementation of the unit variation based on AES-GCM algorithm, which provides confidentiality, integrity and authentication of transmitted data. The implemented acceleration unit reaches maximum throughput of 2,4 Gbps.
43

Hardware Accelerated Digital Image Stabilization in a Video Stream / Hardware Accelerated Digital Image Stabilization in a Video Stream

Pacura, Dávid January 2016 (has links)
Cílem této práce je návrh nové techniky pro stabilizaci obrazu za pomoci hardwarové akcelerace prostřednictvím GPGPU. Využití této techniky umožnuje stabilizaci videosekvencí v reálném čase i pro video ve vysokém rozlišení. Toho je zapotřebí pro ulehčení dalšího zpracování v počítačovém vidění nebo v armádních aplikacích. Z důvodu existence vícerých programovacích modelů pro GPGPU je navrhnutý stabilizační algoritmus implementován ve třech nejpoužívanějších z nich. Jejich výkon a výsledky jsou následně porovnány a diskutovány.
44

Hardware acceleration of convolutional neural networks on FPGA

Myrén, Adam January 2020 (has links)
With the evolution of machine learning algorithms they are seeing a wider use in traditional signal processing applications. One of these areas is in radios for improved signal identification algorithms. With the large computational complexity of convolutional neural networks, it is of importance to use platforms that are as fast and energy efficient as possible. This thesis investigates hardware acceleration of convolutional neural networks on field programmable gate arrays, an reconfigurable integrated circuit. An existing toolflow, Haddoc2, is used and evaluated. This tool automates the mapping of a convolutional neural network from a high level description in Caffe to a synthesisable hardware description in VHDL hardware description language. Four models of different sizes are trained on the MNIST dataset and accelerators for these at different bitwidths are generated and then simulated in a VHDL testbench. The resulting accuracies are tolerable for the target problem and Haddoc2 can produce fast accelerators that would work well for smaller networks. Big networks was found to consume large amounts of resources in the field programmable gate array and is not feasible for a practical application. The treatment of weights as constants makes the accelerators fast since there is no memory bottleneck but makes the accelerator less flexible since a new set of weights would require to re-synthesize the design and reprogramming the field programmable gate array.
45

Rendering av geodata med OpenGL

Ingelborn, Marcus January 2020 (has links)
Den här studien undersökte om det är lönsamt eller ej att implementera hård-varustöd, med hjälp av OpenGL, för rendering av geografisk data. I detta fall innebar det skapande av kartbilder med tillfälliga föremål positionerade och inritade. Föremålen var under konstant förändring och en bildruta kunde inte antas se likadan ut som nästa. För att besvara på frågan användes en testmiljö hos företaget Saab och den öppna programvaran Geotools. En aktionsforskning genomfördes där en ny ren-deringsmodul till Geotools implementerades. Den nya och den föregående ren-deringsmodulen, från Geotools, testades och deras renderingstider uppmättes. Därefter analyserades mätresultaten och jämfördes med statistiska metoder. Renderingstiden för en bild i den tidigare renderingsmodulen tog i snitt mellan 481 och 495 ms med sannolikhet på 99,9%. Renderingstiden utfördes i snitt med den nya renderingsmodulen på mellan 145 och 150 ms med samma sannolikhet. Inom ett konfidensintervall på 99,9% minskade snittrenderingstiden med mellan 333 och 347 ms för den nyutvecklade modulen med hårdvarustöd.
46

Turbo Code Performance Analysis Using Hardware Acceleration

Nordmark, Oskar January 2016 (has links)
The upcoming 5G mobile communications system promises to enable use cases requiring ultra-reliable and low latency communications. Researchers therefore require more detailed information about aspects such as channel coding performance at very low block error rates. The simulations needed to obtain such results are very time consuming and this poses achallenge to studying the problem. This thesis investigates the use of hardware acceleration for performing fast simulations of turbo code performance. Special interest is taken in investigating different methods for generating normally distributed noise based on pseudorandom number generator algorithms executed in DSP:s. A comparison is also done regarding how well different simulator program structures utilize the hardware. Results show that even a simple program for utilizing parallel DSP:s can achieve good usage of hardware accelerators and enable fast simulations. It is also shown that for the studied process the bottleneck is the conversion of hard bits to soft bits with addition of normally distributed noise. It is indicated that methods for noise generation which do not adhere to a true normal distribution can further speed up this process and yet yield simulation quality comparable to methods adhering to a true Gaussian distribution. Overall, it is show that the proposed use of hardware acceleration in combination with the DSP software simulator program can in a reasonable time frame generate results for turbo code performance at block error rates as low as 10−9.
47

Efficient FPGA SoC Processing Design for a Small UAV Radar

Newmeyer, Luke Oliver 01 April 2018 (has links)
Modern radar technology relies heavily on digital signal processing. As radar technology pushes the boundaries of miniaturization, computational systems must be developed to support the processing demand. One particular application for small radar technology is in modern drone systems. Many drone applications are currently inhibited by safety concerns of autonomous vehicles navigating shared airspace. Research in radar based Detect and Avoid (DAA) attempts to address these concerns by using radar to detect nearby aircraft and choosing an alternative flight path. Implementation of radar on small Unmanned Air Vehicles (UAV), however, requires a lightweight and power efficient design. Likewise, the radar processing system must also be small and efficient.This thesis presents the design of the processing system for a small Frequency Modulated Continuous Wave (FMCW) phased array radar. The radar and processing is designed to be light-weight and low-power in order to fly onboard a UAV less than 25 kg in weight. The radar algorithms for this design include a parallelized Fast Fourier Transform (FFT), cross correlation, and beamforming. Target detection algorithms are also implemented. All of the computation is performed in real-time on a Xilinx Zynq 7010 System on Chip (SoC) processor utilizing both FPGA and CPU resources.The radar system (excluding antennas) has dimensions of 2.25 x 4 x 1.5 in3, weighs 120 g, and consumes 8 W of power of which the processing system occupies 2.6 W. The processing system performs over 652 million arithmetic operations per second and is capable of performing the full processing in real-time. The radar has also been tested in several scenarios both airborne on small UAVs as well as on the ground. Small UAVs have been detected to ranges of 350 m and larger aircraft up to 800 m. This thesis will describe the radar design architecture, the custom designed radar hardware, the FPGA based processing implementations, and conclude with an evaluation of the system's effectiveness and performance.
48

Hardware Acceleration of a Monte Carlo Simulation for Photodynamic Therapy Treatment Planning

Lo, William Chun Yip 15 February 2010 (has links)
Monte Carlo (MC) simulations are widely used in the field of medical biophysics, particularly for modelling light propagation in biological tissue. The iterative nature of MC simulations and their high computation time currently limit their use to solving the forward solution for a given set of source characteristics and tissue optical properties. However, applications such as photodynamic therapy treatment planning or image reconstruction in diffuse optical tomography require solving the inverse problem given a desired light dose distribution or absorber distribution, respectively. A faster means for performing MC simulations would enable the use of MC-based models for such tasks. In this thesis, a gold standard MC code called MCML was accelerated using two distinct hardware-based approaches, namely designing custom hardware on field-programmable gate arrays (FPGAs) and programming commodity graphics processing units (GPUs). Currently, the GPU-based approach is promising, offering approximately 1000-fold speedup with 4 GPUs compared to an Intel Xeon CPU.
49

Hardware Acceleration of a Monte Carlo Simulation for Photodynamic Therapy Treatment Planning

Lo, William Chun Yip 15 February 2010 (has links)
Monte Carlo (MC) simulations are widely used in the field of medical biophysics, particularly for modelling light propagation in biological tissue. The iterative nature of MC simulations and their high computation time currently limit their use to solving the forward solution for a given set of source characteristics and tissue optical properties. However, applications such as photodynamic therapy treatment planning or image reconstruction in diffuse optical tomography require solving the inverse problem given a desired light dose distribution or absorber distribution, respectively. A faster means for performing MC simulations would enable the use of MC-based models for such tasks. In this thesis, a gold standard MC code called MCML was accelerated using two distinct hardware-based approaches, namely designing custom hardware on field-programmable gate arrays (FPGAs) and programming commodity graphics processing units (GPUs). Currently, the GPU-based approach is promising, offering approximately 1000-fold speedup with 4 GPUs compared to an Intel Xeon CPU.
50

Approche de conception haut-niveau pour l'accélération matérielle de calcul haute performance en finance / High-level approach for hardware acceleration of high-performance computing in finance

Mena morales, Valentin 12 July 2017 (has links)
Les applications de calcul haute-performance (HPC) nécessitent des capacités de calcul conséquentes, qui sont généralement atteintes à l'aide de fermes de serveurs au détriment de la consommation énergétique d'une telle solution. L'accélération d'applications sur des plateformes hétérogènes, comme par exemple des FPGA ou des GPU, permet de réduire la consommation énergétique et correspond donc à un compromis architectural plus séduisant. Elle s'accompagne cependant d'un changement de paradigme de programmation et les plateformes hétérogènes sont plus complexes à prendre en main pour des experts logiciels. C'est particulièrement le cas des développeurs de produits financiers en finance quantitative. De plus, les applications financières évoluent continuellement pour s'adapter aux demandes législatives et concurrentielles du domaine, ce qui renforce les contraintes de programmabilité de solutions d'accélérations. Dans ce contexte, l'utilisation de flots haut-niveaux tels que la synthèse haut-niveau (HLS) pour programmer des accélérateurs FPGA n'est pas suffisante. Une approche spécifique au domaine peut fournir une réponse à la demande en performance, sans que la programmabilité d'applications accélérées ne soit compromise.Nous proposons dans cette thèse une approche de conception haut-niveau reposant sur le standard de programmation hétérogène OpenCL. Cette approche repose notamment sur la nouvelle implémentation d'OpenCL pour FPGA introduite récemment par Altera. Quatre contributions principales sont apportées : (1) une étude initiale d'intégration de c'urs de calculs matériels à une librairie logicielle de calcul financier (QuantLib), (2) une exploration d'architectures et de leur performances respectives, ainsi que la conception d'une architecture dédiée pour l'évaluation d'option américaine et l'évaluation de volatilité implicite à partir d'un flot haut-niveau de conception, (3) la caractérisation détaillée d'une plateforme Altera OpenCL, des opérateurs élémentaires, des surcouches de contrôle et des liens de communication qui la compose, (4) une proposition d'un flot de compilation spécifique au domaine financier, reposant sur cette dernière caractérisation, ainsi que sur une description des applications financières considérées, à savoir l'évaluation d'options. / The need for resources in High Performance Computing (HPC) is generally met by scaling up server farms, to the detriment of the energy consumption of such a solution. Accelerating HPC application on heterogeneous platforms, such as FPGAs or GPUs, offers a better architectural compromise as they can reduce the energy consumption of a deployed system. Therefore, a change of programming paradigm is needed to support this heterogeneous acceleration, which trickles down to an increased level of programming complexity tackled by software experts. This is most notably the case for developers in quantitative finance. Applications in this field are constantly evolving and increasing in complexity to stay competitive and comply with legislative changes. This puts even more pressure on the programmability of acceleration solutions. In this context, the use of high-level development and design flows, such as High-Level Synthesis (HLS) for programming FPGAs, is not enough. A domain-specific approach can help to reach performance requirements, without impairing the programmability of accelerated applications.We propose in this thesis a high-level design approach that relies on OpenCL, as a heterogeneous programming standard. More precisely, a recent implementation of OpenCL for Altera FPGA is used. In this context, four main contributions are proposed in this thesis: (1) an initial study of the integration of hardware computing cores to a software library for quantitative finance (QuantLib), (2) an exploration of different architectures and their respective performances, as well as the design of a dedicated architecture for the pricing of American options and their implied volatility, based on a high-level design flow, (3) a detailed characterization of an Altera OpenCL platform, from elemental operators, memory accesses, control overlays, and up to the communication links it is made of, (4) a proposed compilation flow that is specific to the quantitative finance domain, and relying on the aforementioned characterization and on the description of the considered financial applications (option pricing).

Page generated in 0.0929 seconds