Global ETD Search

401	GPU-Based Visualisation of Viewshed from Roads or Areas in a 3D Environment Christoph, Heilmair January 2016 Viewshed refers to the calculation and visualisation of what part of a terrain is visible from a given observer point. It is used within many fields, such as military planning or telecommunication tower placement. So far, no general fast methods exist for calculating the viewshed for multiple observers that may, for instance, represent a road within the terrain. Additionally, if the terrain contains overlapping structures such as man-made constructions like bridges, most current viewshed algorithms fail. This report describes two novel methods for viewshed calculation using multiple observers for terrain that may contain overlapping structures. The methods have been developed at Vricon in Linköping as a Master's Thesis project. Both methods are implemented using the graphics processing unit and the OpenGL graphics library, using a computer graphics approach. Results are presented in the form of figures and images, as well as running time tables using two different test setups. Lastly, possible future improvements are also discussed. The results show that the first method is a viable real-time solution and that the second method requires some additional work. viewshed gpu multiple observers observer graphics card efficient viewshed analysis road real-time real time
402	Étude de contraintes spatiales bas niveau appliquées à la vision par ordinateur / Study of low-level spatial constraints applied to computer vision Jodoin, Pierre-Marc January 2006 Thesis digitized by the Direction des bibliothèques of the Université de Montréal. Computer vision Spatial constraints GPU Segmentation Data fusion Optical flow Occlusion
403	Fast Parallel Machine Learning Algorithms for Large Datasets Using Graphic Processing Unit Li, Qi 30 November 2011 This dissertation deals with developing parallel processing algorithms for the Graphics Processing Unit (GPU) in order to solve machine learning problems for large datasets. In particular, it contributes to the development of fast GPU-based algorithms for calculating the distance (i.e. similarity, affinity, closeness) matrix. It also presents the algorithm and implementation of a fast parallel Support Vector Machine (SVM) using the GPU. These application tools are developed using Compute Unified Device Architecture (CUDA), which is a popular software framework for General Purpose Computing using GPU (GPGPU). Distance calculation is at the core of many machine learning algorithms because the closer the query is to some samples (i.e. observations, records, entries), the more likely the query belongs to the class of those samples. K-Nearest Neighbors Search (k-NNS) is a popular and powerful distance-based tool for solving classification problems. It is the prerequisite for training local-model-based classifiers. Fast distance calculation can significantly improve the speed of these classifiers, and GPUs are well suited to accelerating them. Several GPU-based sorting algorithms are also included to sort the distance matrix and find the k nearest neighbors. The speed of the sorting algorithms varies depending on the input sequences. The GPUKNN proposed in this dissertation uses the GPU-based distance computation algorithm and automatically picks the most suitable sorting algorithm according to the characteristics of the input datasets. Every machine learning tool has its own pros and cons. The advantage of SVM is its high classification accuracy. This makes SVM possibly the best classification tool.
However, as with many other machine learning algorithms, SVM's training phase slows down as the size of the input datasets increases. The GPU version of parallel SVM implemented in this dissertation, based on parallel Sequential Minimal Optimization (SMO), is proposed to reduce the time cost of both the training and predicting phases. This implementation of GPUSVM is original. It uses many parallel processing techniques to accelerate and minimize the kernel evaluations, which are considered the most time-consuming operations in SVM. Although the many-core architecture of the GPU performs best with data-level parallelism, multi-task processing (i.e. task-level parallelism) is also integrated into the application to improve the speed of tasks such as multiclass classification and cross-validation. Furthermore, the procedure of finding the worst violators is distributed to multiple blocks in the CUDA model. This reduces the time cost of each SMO iteration during the training phase. All of these violators are shared among different tasks in multiclass classification and cross-validation to reduce duplicate kernel computations. The results show that the achieved speedups of both the training and predicting phases range from one to three orders of magnitude compared to the state-of-the-art LIBSVM software on some well-known benchmarking datasets. SVM GPU CUDA Machine Learning Parallel Processing Computer Sciences Physical Sciences and Mathematics
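As an illustration of the distance-matrix approach to k-nearest-neighbour classification described in record 403, the following is a minimal CPU sketch in plain Python. It is not the dissertation's GPUKNN or its CUDA code; the function names, the toy data, and the use of a plain sort instead of an adaptively chosen GPU sorting algorithm are all assumptions made for illustration.

```python
from collections import Counter


def squared_distance_matrix(queries, samples):
    """Return D where D[i][j] is the squared Euclidean distance
    from queries[i] to samples[j]."""
    return [
        [sum((q - s) ** 2 for q, s in zip(query, sample)) for sample in samples]
        for query in queries
    ]


def knn_classify(queries, samples, labels, k):
    """Label each query by majority vote among its k nearest samples."""
    matrix = squared_distance_matrix(queries, samples)
    predictions = []
    for row in matrix:
        # Sort sample indices by distance and keep the k closest.
        nearest = sorted(range(len(row)), key=row.__getitem__)[:k]
        votes = Counter(labels[j] for j in nearest)
        predictions.append(votes.most_common(1)[0][0])
    return predictions


# Hypothetical toy data: two well-separated classes in the plane.
samples = [(0.0, 0.0), (0.1, 0.2), (1.0, 1.0), (0.9, 1.1)]
labels = ["a", "a", "b", "b"]
predictions = knn_classify([(0.05, 0.1), (1.0, 0.9)], samples, labels, k=3)
print(predictions)
```

Each row of the distance matrix is independent of the others, which is the property that makes this computation a natural fit for data-parallel hardware.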
404	Logiciel de génération de nombres aléatoires dans OpenCL / Software for random number generation in OpenCL Kemerchou, Nabil 08 1900 clRNG and clProbDist are two application programming interfaces (APIs) that we have developed for the generation of uniform and non-uniform random numbers, respectively, on parallel computing devices in the OpenCL environment. The first interface is used to create, on the host computer, objects of type stream, which are regarded as parallel virtual generators and can be used both on the host and on parallel devices (graphics processing units, multi-core CPUs, etc.) to generate sequences of random numbers. The second interface can also be used on the host or on devices to generate random variables according to different continuous and discrete probability distributions. In this thesis, we recall the basic concepts of random number generators, describe heterogeneous systems and techniques for parallel random number generation, and then present the different models that make up the OpenCL environment. We detail the structures of the developed APIs. In clRNG, we distinguish the functions that create streams, the functions that generate uniform random variables, and the functions that manipulate the states of the streams. We also describe clProbDist, which allows the generation of non-uniform random variables based on the inversion technique and returns various statistics of the implemented distributions. We evaluate these APIs with two simulations: the first implements a simplified example of an inventory model, and the second estimates the value of an Asian call option. Finally, we provide experimental results on the performance of the implemented generators. RNG Parallelism OpenCL GPU Parallel computing
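The inversion technique mentioned in record 404 can be sketched in a few lines of plain Python. This does not use or reproduce the clRNG or clProbDist APIs: the `make_streams` helper is hypothetical, and seeding separate `random.Random` objects is only a stand-in for real streams, since it does not give the non-overlap guarantees a dedicated stream library is designed to provide.

```python
import math
import random


def exponential_by_inversion(u, rate):
    """Map a uniform u in [0, 1) to an exponential variate by applying
    the inverse CDF: F^{-1}(u) = -ln(1 - u) / rate."""
    return -math.log(1.0 - u) / rate


def make_streams(count, base_seed=12345):
    """Hypothetical stand-in for parallel streams: one independent
    generator object per worker, each with its own seed."""
    return [random.Random(base_seed + i) for i in range(count)]


streams = make_streams(4)
rate = 2.0
samples_per_stream = 20000

# Each stream draws its own uniforms and transforms them by inversion.
means = [
    sum(exponential_by_inversion(s.random(), rate) for _ in range(samples_per_stream))
    / samples_per_stream
    for s in streams
]
print(means)  # each sample mean should be close to 1 / rate
```

Inversion consumes exactly one uniform per variate, so the state of each stream advances in a predictable way regardless of the target distribution.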
405	GPU implementace algoritmů irradiance a radiance caching / GPU implementation of the irradiance and radiance caching algorithms Bulant, Martin January 2015 The object of this work is to create software implementing two algorithms for global illumination computation. Irradiance and radiance caching should be implemented in the CUDA framework on a graphics card (GPU). A parallel implementation on the GPU should dramatically improve algorithm speed compared to a CPU implementation. The software will be written using an existing framework for global illumination computation, which allows the work to focus on the algorithm implementation only. This work should speed up testing of new or existing methods for global illumination computation, because saving and reusing intermediate results can be used for other algorithms too.
406	Riešenie problému globálnej optimalizácie využitím GPU / Employing GPUs in Global Optimization Problems Hošala, Michal January 2014 The global optimization problem (i.e., the problem of finding the global extreme points of a given function on a restricted domain of values) appears in many real-world applications. Improving the efficiency of this task can reduce the latency of the application or provide a more precise result, since the task is usually solved by an approximate algorithm. This thesis focuses on the practical aspects of global optimization algorithms, especially in the domain of algorithmic trading data analysis. Successful implementations of global optimization solvers already exist for CPUs, but they are quite time-demanding. The main objective of this thesis is to design a GO solver that utilizes the raw computational power of GPU devices. Although GPUs have significantly more computational cores than CPUs, the parallelization of a known serial algorithm is often quite challenging due to the specific execution model and the memory architecture constraints of existing GPU architectures. Therefore, the thesis will explore multiple approaches to the problem and present their experimental results.
407	Implementace neúplného inverzního rozkladu na grafických kartách / Implementing incomplete inverse decomposition on graphical processing units Dědeček, Jan January 2013 The goal of this thesis was to evaluate the possibility of solving systems of linear algebraic equations with the help of graphical processing units (GPUs). While such solvers for generally dense systems seem to be more or less a part of standard production libraries, the thesis concentrates on low-level parallelization for sparse systems, which still presents a challenge. In particular, the thesis considers a specific algorithm for approximate inverse decomposition of symmetric positive definite systems, combined with the conjugate gradient method. An important part of this work is an innovative parallel implementation. The experimental results presented for systems of various sizes and sparsity structures indicate that the approach is rather promising and should be developed further. In summary, efficient preconditioning of sparse systems by approximate inverses on GPUs seems to be worth considering.
408	GPU implementace algoritmů irradiance a radiance caching / GPU implementation of the irradiance and radiance caching algorithms Bulant, Martin January 2015 The objective of this work is to create software implementing two algorithms for global illumination computation. Irradiance and radiance caching should be implemented in the CUDA framework on a graphics card (GPU). A parallel implementation on the GPU should improve algorithm speed compared to a CPU implementation. The software will be written using an existing framework for global illumination computation, which allows the work to focus on the algorithm implementation only. This work should speed up testing of new or existing methods for global illumination computation, because saving and reusing intermediate results can be used for other algorithms too.
409	Quality and real-time performance assessment of color-correction methods : A comparison between histogram-based prefiltering and global color transfer Nilsson, Linus January 2018 In the field of computer vision, and more specifically multi-camera systems, color correction is an important topic of discussion. The need for color-tone similarity among multiple images that are used to construct a single scene is self-evident. The strengths and weaknesses of color-correction methods can be assessed by using metrics to measure structural and color-tone similarity and by timing the methods. Color transfer has better structural similarity than histogram-based prefiltering and worse color-tone similarity. The color transfer method is also faster than histogram-based prefiltering. Color transfer is the better method if the goal is a structurally similar image after correction; if better color-tone similarity at the cost of structural similarity is acceptable, histogram-based prefiltering is the better choice. Color transfer is faster and easier to run with a parallel computing approach than histogram-based prefiltering, and might therefore be the better pick for real-time applications. There is, however, more room to optimize an implementation of histogram-based prefiltering using parallel computing. GPU Color correction Computer vision Color transfer Histogram-based prefiltering Computer Systems Datorsystem
410	Risk Measures Extracted from Option Market Data Using Massively Parallel Computing Zhao, Min 27 April 2011 The famous Black-Scholes formula provided the first mathematically sound mechanism to price financial options. It is based on the assumption that daily random stock returns are identically normally distributed and hence stock prices follow a stochastic process with a constant volatility. Observed prices, at which options trade on the markets, don't fully support this hypothesis. Options corresponding to different strike prices trade as if they were driven by different volatilities. To capture this so-called volatility smile, we need a more sophisticated option-pricing model that assumes the volatility itself is a random process. The price we have to pay for this stochastic volatility model is that such models are computationally extremely intensive to simulate and hence difficult to fit to observed market prices. This difficulty has severely limited the use of stochastic volatility models in practice. In this project we propose to overcome the obstacle of computational complexity by executing the simulations in a massively parallel fashion on the graphics processing unit (GPU) of the computer, utilizing its hundreds of parallel processors. We succeed in generating the trillions of random numbers needed to fit a monthly options contract in 3 hours on a desktop computer with a Tesla GPU. This enables us to accurately price any derivative security based on the same underlying stock. In addition, our method also allows extracting quantitative measures of the riskiness of the underlying stock that are implied by the views of the forward-looking traders on the option markets. Financial risk management Massively parallel GPU computing Stochastic volatility model Black-Scholes Formula
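For reference, the constant-volatility assumption discussed in record 410 can be seen directly in the standard Black-Scholes formula for a European call, sketched below in plain Python. This is the textbook closed form, not the stochastic volatility simulation of the thesis; the function names and the example inputs are assumptions chosen for illustration.

```python
import math


def normal_cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def black_scholes_call(spot, strike, rate, volatility, maturity):
    """Price of a European call under Black-Scholes, where a single
    constant volatility applies to every strike."""
    sigma_sqrt_t = volatility * math.sqrt(maturity)
    d1 = (math.log(spot / strike) + (rate + 0.5 * volatility ** 2) * maturity) / sigma_sqrt_t
    d2 = d1 - sigma_sqrt_t
    return spot * normal_cdf(d1) - strike * math.exp(-rate * maturity) * normal_cdf(d2)


# Hypothetical inputs: at-the-money call, 5% rate, 20% volatility, 1 year.
price = black_scholes_call(spot=100.0, strike=100.0, rate=0.05, volatility=0.2, maturity=1.0)
print(round(price, 4))
```

Because the formula takes one volatility for all strikes, fitting it separately to market prices at each strike yields different implied volatilities, which is the volatility smile the abstract refers to.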
