Global ETD Search

81	gcn.MOPS: accelerating cn.MOPS with GPU Alkhamis, Mohammad 16 June 2017 cn.MOPS is a model-based algorithm used to quantitatively detect copy-number variations in next-generation DNA sequencing data. The algorithm is implemented as an R package and can speed up processing with multi-CPU parallelism. However, the maximum achievable speedup is limited by the overhead of multi-CPU parallelism, which increases with the number of CPU cores used. In this thesis, an alternative acceleration mechanism is proposed. Using one CPU core and a GPU device, the proposed solution, gcn.MOPS, achieved a speedup factor of 159× and decreased memory usage by more than half. This speedup was substantially higher than the maximum achievable speedup in cn.MOPS, which was ∼20×. / Graduate / 0984 / 0544 / 0715 / alkhamis@uvic.ca GPU GPGPU cn.MOPS gcn.MOPS CUDA C++ parallel computing CNV
82	Online 3D rekonstrukce / Online 3D reconstruction Bastl, Jiří January 2011 This thesis describes the reconstruction of a scene scanned by two cameras. It describes methods for calibrating the camera system, for finding corners, and for finding correspondences. Corners are detected with the FAST detector, and correspondences are found using normalized cross-correlation. Rectification is implemented as part of the 3D reconstruction. The final shape is saved in VRML format. The thesis also describes parallelization options. The correlation calculation is optimized for multiprocessor CPUs, and implementations of the algorithm for GPU and FPGA are designed as well.
83	Zpracování stereo snímků na grafické kartě / GPU accelerated stereo image processing Polák, Jaromir January 2013 This thesis deals with 3D reconstruction using stereo cameras. The work aims to show the usefulness of GPU acceleration for sophisticated algorithm
84	Použití OpenCl v AVG na platformě Windows / Using of OpenCl at AVG in Windows Platform Bajcar, Martin January 2012 The main topic of this thesis is the practical use of OpenCL at AVG. AVG is looking for ways to decrease the hardware requirements of its security product and to decrease the computation time of some algorithms. Using OpenCL is one way to meet these goals. A significant part of this thesis deals with optimization strategies for AMD and NVIDIA graphics cards, as they are the most common cards among users. The practical part of the thesis describes the parallelization of two algorithms, their analysis, and their implementation. The obtained results are then presented, and the cases in which the use of OpenCL is beneficial are identified. As part of the implementation, a library of utility functions that can help programmers implement OpenCL-based code was developed.
85	Akcelerace algoritmů pro hledání triplexů v DNA sekvencích / Acceleration of Algorithms for Triplex Detection in DNA Sequences Weiser, Michal January 2012 Triplex forms of DNA act as main factors in some important cell functions. However, their positions within the genome and their effect on cell functions are not well known. Triplex search algorithms often do not consider many of the features of triplexes or the possibility of errors. On the other hand, the complexity of full-featured algorithms is extremely high. This paper shows a way to speed up an algorithm that considers all known triplex features. A parallel approach based on CUDA technology allows an acceleration of up to 50.
86	Akcelerace kryptografie pomocí GPU / Cryptography Acceleration Using GPU Potěšil, Josef January 2011 This work introduces the reader to selected concepts of cryptography. The AES algorithm was selected, together with a description of the architecture and software for programming graphics cards (CUDA, OpenCL), in order to create a GPU-accelerated version of it. The thesis surveys the APIs for communicating with crypto-coprocessors that exist in the kernels of Linux/BSD operating systems (CryptoAPI, OCF). It examines this support in the cross-platform OpenSSL library. Subsequently, the work discusses implementation details, the results achieved, and integration with the OpenSSL library. The conclusion suggests how the developed application could be used and briefly outlines its use directly by the operating system kernel.
87	Interaktivní simulace chování tkaniny akcelerovaná pomocí GPU / Interactive Cloth Simulation Accelerated by GPU Melichar, Vojtěch January 2016 This master's thesis deals with interactive cloth simulation accelerated by GPU. The first part describes the technologies used in implementing the program. The second part discusses various simulation methods, focusing mainly on particle systems as the most commonly used method. These parts are followed by the design of the program implemented as part of this thesis. The program was implemented in four variants. The first variant is a CPU implementation, which was then optimized with OpenMP. The CUDA implementation is based on these implementations. The last variant is an optimized CUDA implementation. All of these implementations are evaluated in terms of computational complexity and suitability for real-time graphics.
88	Improving Performance of a Mixed Reality Application on the Edge with Hardware Acceleration Eriksson, Jesper, Akouri, Christoffer January 2020 Using specialized hardware to accelerate workloads has the potential to bring large performance gains in various applications. Using specialized hardware to speed up the slowest component of an application makes the whole application execute faster, since it cannot be faster than its slowest part. This work investigates two modifications intended to improve an existing virtual reality application with the help of more hardware support. The existing application uses a server computer that handles virtual object rendering; the rendered objects are later sent to the mobile phone, which is the end-user device. In this project, the server part of the application, where the Simultaneous Localization And Mapping (SLAM) library runs, was modified to use a Compute Unified Device Architecture (CUDA) accelerated variant. The software encoder and decoder used for video streaming were modified to use specialized hardware. Small changes were made to the client-side application to allow the latency measurement to work when changing the server-side encoder. Accelerating SLAM with CUDA improved the number of frames processed per second and the frame processing time, at the cost of latency between the end device and the edge device. Using the hardware encoder and decoder resulted in no improvement in latency or processed frames; in fact, the hardware encoder and decoder performed worse than the baseline configuration. The reduced frame processing time indicates that the CUDA platform is beneficial, provided that the additional latency introduced by the implementation is reduced or removed. CUDA Mixed Reality Edge Computer Sciences Datavetenskap (datalogi)
89	Contributions to Parallel Simulation of Equation-Based Models on Graphics Processing Units Stavåker, Kristian January 2011 In this thesis we investigate techniques and methods for parallel simulation of equation-based, object-oriented (EOO) Modelica models on graphics processing units (GPUs). Modelica is being developed through an international effort via the Modelica Association. With Modelica it is possible to build computationally heavy models; however, simulating such models may take a considerable amount of time. Therefore, techniques for utilizing parallel multi-core architectures for simulation are desirable. The goal of this work is mainly automatic parallelization of equation-based models; that is, it is up to the compiler, and not the end-user modeler, to make sure that the generated code can efficiently utilize parallel multi-core architectures. Not only does the code generation process have to be altered, but the accompanying run-time system has to be modified as well. Adding explicit parallel language constructs to Modelica is also discussed to some extent. GPUs can be used for general-purpose scientific and engineering computing. The theoretical processing power of GPUs has surpassed that of CPUs due to the highly parallel structure of GPUs. GPUs are, however, only good at solving certain problems of a data-parallel nature. In this thesis we relate several contributions, by the author and co-workers, to each other. We conclude that massively parallel GPU architectures are currently suitable only for a limited set of Modelica models. This might change with future GPU generations. CUDA, for instance, the main software platform used in the thesis for general-purpose computing on graphics processing units (GPGPU), is changing rapidly, and more features are being added, such as recursion, function pointers, and C++ templates; however, the underlying hardware architecture is still optimized for data parallelism. Modelica GPU CUDA OpenCL Modeling Simulation Computer Systems Datorsystem
90	Deep Learning with Go Stinson, Derek L. 05 1900 Indiana University-Purdue University Indianapolis (IUPUI) / Current research in deep learning is primarily focused on using Python as a support language. Go, an emerging language with many benefits, including native support for concurrency, has seen a rise in adoption over the past few years. However, this language is not widely used to develop learning models because of the lack of supporting libraries and frameworks for model development. In this thesis, the use of Go for the development of neural network models in general, and convolutional neural networks in particular, is explored. The proposed study is based on a Go-CUDA implementation of neural network models called GoCuNets. This implementation is then compared to ConvNetGo, a Go-CPU deep learning implementation that takes advantage of Go's built-in concurrency. A comparison of these two implementations shows a significant performance gain when using GoCuNets compared to ConvNetGo. Go Golang CUDA Deep Learning Framework Deep Learning GPU
