About
The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
221

Rychlý výpočet průsečíku paprsku s trojúhelníkem / Fast Ray-Triangle Intersection

Horák, František January 2013
This work introduces a few basic terms of analytic geometry. We describe several ray-triangle intersection algorithms and present some use-case examples. We discuss the capabilities of CUDA, optimization techniques for this architecture, and an implementation focused on the given issues. The ray-triangle intersection algorithms are tested and the results are discussed.
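As an illustration of the kind of computation the abstract refers to, here is a minimal sketch of one well-known ray-triangle test, the Möller-Trumbore algorithm. The abstract does not say which algorithms the thesis evaluates, so the choice of algorithm, the function name `ray_triangle`, and the `eps` tolerance are assumptions made for this example only.

```python
from typing import Optional, Sequence

Vec3 = Sequence[float]


def sub(a: Vec3, b: Vec3):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def dot(a: Vec3, b: Vec3) -> float:
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def cross(a: Vec3, b: Vec3):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def ray_triangle(origin: Vec3, direction: Vec3,
                 v0: Vec3, v1: Vec3, v2: Vec3,
                 eps: float = 1e-9) -> Optional[float]:
    """Return the ray parameter t of the hit point, or None on a miss.

    Hypothetical helper (not taken from the thesis): Möller-Trumbore test.
    """
    edge1 = sub(v1, v0)
    edge2 = sub(v2, v0)
    pvec = cross(direction, edge2)
    det = dot(edge1, pvec)
    if abs(det) < eps:
        return None  # ray is parallel to the triangle plane
    inv_det = 1.0 / det
    tvec = sub(origin, v0)
    u = dot(tvec, pvec) * inv_det  # first barycentric coordinate
    if u < 0.0 or u > 1.0:
        return None
    qvec = cross(tvec, edge1)
    v = dot(direction, qvec) * inv_det  # second barycentric coordinate
    if v < 0.0 or u + v > 1.0:
        return None
    t = dot(edge2, qvec) * inv_det
    return t if t > eps else None  # reject hits behind the ray origin


tri = ((0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0))
hit = ray_triangle((0.25, 0.25, -1.0), (0.0, 0.0, 1.0), *tri)
miss = ray_triangle((2.0, 2.0, -1.0), (0.0, 0.0, 1.0), *tri)
```

Each ray-triangle pair is tested independently of every other pair, which is why this kind of test is commonly mapped to one GPU thread per ray or per pair.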
222

Akcelerace částicových rojů PSO pomocí GPU / Particle Swarm Optimization on GPUs

Záň, Drahoslav January 2013
This thesis deals with PSO (Particle Swarm Optimization), a population-based stochastic optimization technique, and its acceleration. This simple but very effective technique is designed for solving difficult multidimensional problems in a wide range of applications. The aim of this work is to develop a parallel implementation of the algorithm with an emphasis on finding a solution faster. For this purpose, a graphics card (GPU) providing massive performance was chosen. To evaluate the benefits of the proposed implementation, CPU and GPU implementations were created for a problem derived from the well-known NP-hard Knapsack problem. The GPU application achieves an average speedup of 5 times and a maximum speedup of almost 10 times over the optimized CPU application on which it is based.
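For readers unfamiliar with PSO, the following is a minimal sequential sketch of the standard global-best update rule. It is not the thesis implementation: the objective (a sphere function rather than the Knapsack-derived problem), the function name `pso_minimize`, and the inertia and acceleration constants are all assumptions chosen for illustration.

```python
import random


def pso_minimize(objective, dim, bounds, n_particles=30, n_iters=200,
                 w=0.729, c1=1.49445, c2=1.49445, seed=0):
    """Global-best PSO sketch; all parameter values are illustrative assumptions."""
    rng = random.Random(seed)
    lo, hi = bounds
    pos = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]                 # best position seen by each particle
    pbest_val = [objective(p) for p in pos]
    g = min(range(n_particles), key=lambda i: pbest_val[i])
    gbest, gbest_val = pbest[g][:], pbest_val[g]  # best position seen by the swarm

    for _ in range(n_iters):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                # inertia + pull toward personal best + pull toward global best
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                # move and clamp to the search bounds
                pos[i][d] = min(hi, max(lo, pos[i][d] + vel[i][d]))
            val = objective(pos[i])
            if val < pbest_val[i]:
                pbest[i], pbest_val[i] = pos[i][:], val
                if val < gbest_val:
                    gbest, gbest_val = pos[i][:], val
    return gbest, gbest_val


def sphere(x):
    """Toy objective with its minimum of 0 at the origin."""
    return sum(xi * xi for xi in x)


best_pos, best_val = pso_minimize(sphere, dim=3, bounds=(-5.0, 5.0))
```

The per-particle velocity and position updates depend only on that particle's state and the shared global best, so a parallel version would plausibly assign one thread per particle or per dimension; how the thesis maps the work to the GPU is not stated in the abstract.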
223

Překladač jazyka C# do jazyka Nvidia CUDA / Programming CUDA with C#

Zajíc, Jiří January 2012
This master's thesis focuses on GPU-accelerated computation on NVIDIA graphics cards. CUDA technology is used and brought to the .NET platform. The problem is solved as a compiler from the C# programming language to the NVIDIA CUDA language, using the expressive attributes of the C# language while preserving the same semantics of actions. The application is implemented in C# and uses the open-source NRefactory library.
224

Simulace tekutin a plynů / Fluid Simulation

Štambachr, Jakub January 2011
This diploma thesis addresses the problem of liquid and gas simulation; in particular, it deals with computer simulation of the flow of viscous Newtonian liquids with a free surface. The main goal of this work is to create an efficient simulation model that utilizes the benefits of current GPU parallel architectures for general-purpose computing. I chose to implement Smoothed Particle Hydrodynamics, a Lagrangian particle-based method. A significant portion of this thesis consists of a speed analysis of the implemented algorithm, a comparison with other authors' achievements in the field, and a demonstration of the benefits brought by GPU involvement in the computation. As an output of the thesis I present an interactive computer program that allows for real-time simulation (and visualization) of water-like fluids.
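To make the Smoothed Particle Hydrodynamics idea concrete, here is a minimal sketch of the SPH density summation, in which each particle's density is a kernel-weighted sum over its neighbours. The abstract does not name a smoothing kernel; the widely used poly6 kernel, the brute-force neighbour loop, and the names `poly6` and `densities` are assumptions for this example.

```python
import math


def poly6(r: float, h: float) -> float:
    """Poly6 smoothing kernel in 3D for distance r >= 0 and support radius h."""
    if r > h:
        return 0.0
    return 315.0 / (64.0 * math.pi * h ** 9) * (h * h - r * r) ** 3


def densities(positions, mass: float, h: float):
    """Density at each particle: rho_i = sum_j m * W(|x_i - x_j|, h).

    Brute-force O(n^2) sketch; real-time solvers typically use a spatial
    grid to find neighbours (an assumption, not a claim about the thesis).
    """
    rho = []
    for pi in positions:
        total = 0.0
        for pj in positions:
            total += mass * poly6(math.dist(pi, pj), h)
        rho.append(total)
    return rho


rho = densities([(0.0, 0.0, 0.0), (0.5, 0.0, 0.0)], mass=1.0, h=1.0)
```

Because each particle's sum reads neighbour data but writes only its own density, the outer loop is a natural candidate for one GPU thread per particle.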
225

Konstrukce kD stromu na GPU / Building kD Tree on GPU

Bajza, Jakub January 2016
This term project addresses the construction of kD tree acceleration structures and the parallelization of this construction on the GPU. It begins by introducing the CUDA platform for parallel programming, describing both its general principles and the specific features used in this thesis. Next, it introduces acceleration structures for ray tracing; these structures are described, and the kD tree and its variants are presented in detail. The chosen kD tree variant is then analyzed, and the problems and issues of its parallel implementation are addressed. The implementation description gives a short account of the CPU variant and detailed specifications of the CUDA kernels. The testing section presents the results as a CPU vs. GPU comparison and evaluates how well the metric set in the design was fulfilled. The thesis closes with a summary of the achieved goals and results, followed by possible future improvements to the implementation.
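The recursive structure of a kD tree can be sketched with a simple median-split build over points. This is a deliberate simplification: ray-tracing kD trees usually partition triangles and often choose split planes with a cost heuristic, and the abstract does not say which variant the thesis uses. The `KDNode` class and `build_kd_tree` function are hypothetical names for this example.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

Point = Tuple[float, ...]


@dataclass
class KDNode:
    point: Point                      # point stored at this node (the median)
    axis: int                         # splitting axis used at this node
    left: Optional["KDNode"] = None   # points with a smaller coordinate on `axis`
    right: Optional["KDNode"] = None  # points with a larger or equal coordinate


def build_kd_tree(points: List[Point], depth: int = 0) -> Optional[KDNode]:
    """Median-split kD tree over points, cycling through the axes by depth."""
    if not points:
        return None
    axis = depth % len(points[0])
    ordered = sorted(points, key=lambda p: p[axis])
    mid = len(ordered) // 2
    return KDNode(
        point=ordered[mid],
        axis=axis,
        left=build_kd_tree(ordered[:mid], depth + 1),
        right=build_kd_tree(ordered[mid + 1:], depth + 1),
    )


root = build_kd_tree([(2, 3), (5, 4), (9, 6), (4, 7), (8, 1), (7, 2)])
```

The two recursive calls at each node work on disjoint subsets, which is one source of parallelism a GPU build can exploit; the upper levels of the tree, where there are few nodes, are typically the harder part to parallelize.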
226

Paralelizace ultrazvukových simulací na svazku grafických karet / Parallelisation of Ultrasound Simulations on Multi-GPU Clusters

Dujíček, Aleš January 2015
This work is part of the k-Wave project, a toolbox designed for time-domain ultrasound simulations in complex and heterogeneous media. The simulation functions are based on the k-space pseudospectral method. The goal of this work is to compute these simulations on graphics cards using local domain decomposition, which allows the simulations to run faster and on larger data grids. The main emphasis is on efficiency and scalability.
227

Deep Learning with Go

Derek Leigh Stinson (8812109) 08 May 2020
Current research in deep learning is primarily focused on using Python as a support language. Go, an emerging language with many benefits, including native support for concurrency, has seen a rise in adoption over the past few years. However, it is not widely used to develop learning models because of the lack of supporting libraries and frameworks for model development. This thesis explores the use of Go for developing neural network models in general and convolutional neural networks in particular. The proposed study is based on GoCuNets, a Go-CUDA implementation of neural network models. This implementation is then compared with ConvNetGo, a Go-CPU deep learning implementation that takes advantage of Go's built-in concurrency. The comparison shows a significant performance gain when using GoCuNets rather than ConvNetGo.
228

Hardwarebeschleunigung von Matrixberechnungen auf Basis von GPU Verarbeitung / Hardware Acceleration of Matrix Computations Based on GPU Processing

Götze, Johannes 02 July 2019
Matrix computations are ubiquitous in today's sound localization algorithms. For this reason, this thesis analyzes matrix computations and their possible implementation on embedded systems. To this end, the common acceleration technologies are analyzed: processors, graphics acceleration, and parallelization with the help of FPGAs. The results show that a graphics chip is able to accelerate such a matrix-vector multiplication compared with an implementation on a processor. An FPGA implementation, although its development effort is considerably higher than that of acceleration by a graphics chip, achieves a run time that a GPU cannot match.
229

Parallelizing Digital Signal Processing for GPU

Ekstam Ljusegren, Hannes; Jonsson, Hannes January 2020
Because of the increasing importance of signal processing in today's society, there is a need to easily experiment with new ways to process signals. Usually, fast digital signal processing is done with special-purpose hardware that is difficult to develop for. GPUs offer an alternative for fast digital signal processing. The work in this thesis is an analysis and implementation of a GPU version of a digital signal processing chain provided by SAAB. Through an iterative process of development and testing, a final implementation was achieved. Two benchmarks, each comprising 4.2 M test samples, were made to compare the CPU implementation with the GPU implementation. The benchmarks were run on three different platforms: a desktop computer, an NVIDIA Jetson AGX Xavier, and an NVIDIA Jetson TX2. The results show that the parallelized version can reach a throughput several orders of magnitude higher than that of the CPU implementation.
230

Effective and Accelerated Informative Frame Filtering in Colonoscopy Videos Using Graphic Processing Units

Karri, Venkata Praveen 08 1900
Colonoscopy is an endoscopic technique that allows a physician to inspect the mucosa of the human colon. Previous methods and software solutions for detecting informative frames in a colonoscopy video (a process called informative frame filtering, or IFF) have been largely ineffective in (1) covering the proper definition of an informative frame in the broadest sense and (2) striking an optimal balance between accuracy and speed of classification in both real-time and non-real-time medical procedures. In my thesis, I propose a more effective method and faster software solutions for IFF. The method is more effective because it introduces a heuristic classification algorithm derived from experimental analysis of typical colon features, which contributed to a 5-10% boost in various performance metrics for IFF. The software modules are faster because they incorporate sophisticated parallel-processing-oriented coding techniques on modern microprocessors. Two IFF modules were created, one for post-procedure use and the other for real-time use. Code optimizations through NVIDIA CUDA for GPU processing and/or CPU multi-threading concepts embedded in two significant microprocessor design philosophies (multi-core design and many-core design) resulted in a 5-fold acceleration for the post-procedure module and a 40-fold acceleration for the real-time module. Some innovative software modules, which are still in the testing phase, have recently been created to exploit the power of multiple GPUs together.
