561

Evoluční návrh kolektivních komunikací akcelerovaný pomocí GPU / Evolutionary Design of Collective Communications Accelerated by GPUs

Tyrala, Radek January 2012 (has links)
This thesis analyses an application for evolutionary scheduling of collective communications and proposes ways to accelerate it using general-purpose computing on graphics processing units (GPUs). It offers a theoretical overview of systems on a chip and of collective-communication scheduling, together with a more detailed description of evolutionary algorithms. It then describes the GPU architecture and its memory hierarchy in terms of the OpenCL memory model. Based on profiling, the work defines a concept for parallel execution of the fitness function and presents an estimate of the achievable acceleration. The implementation process is described with a closer look at optimization. Another important part is the comparison of the original CPU-based solution with the massively parallel GPU version. Finally, the thesis proposes distributing the computation among different devices supported by the OpenCL standard, and the conclusion discusses further advantages, constraints and acceleration possibilities of such distribution on heterogeneous computing systems.
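
The abstract does not reproduce the fitness function itself. As a rough, hedged illustration of the kind of per-candidate evaluation that can be executed independently for each schedule (and therefore offloaded to one GPU work-item each), the C++ sketch below scores a schedule by counting link conflicts per time step. All names (Transfer, Schedule, countConflicts) and the conflict metric are hypothetical, not taken from the thesis.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical representation of one point-to-point transfer in a
// collective-communication schedule: which link it occupies and in
// which time step the evolutionary algorithm has placed it.
struct Transfer {
    int link;   // index of the physical link used
    int step;   // time step assigned to this transfer
};

using Schedule = std::vector<Transfer>;

// Toy fitness: a schedule is better (lower score) when fewer transfers
// compete for the same link in the same time step. Each candidate
// schedule can be scored independently, which is what makes the fitness
// stage a natural target for data-parallel GPU execution.
int countConflicts(const Schedule& s, int numLinks, int numSteps) {
    std::vector<int> usage(static_cast<std::size_t>(numLinks) * numSteps, 0);
    int conflicts = 0;
    for (const Transfer& t : s) {
        int& u = usage[static_cast<std::size_t>(t.step) * numLinks + t.link];
        if (u > 0) ++conflicts;  // link already busy in this step
        ++u;
    }
    return conflicts;
}
```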
562

Fyzikální simulace na GPU / Physics Simulation on GPU

Janošík, Ondřej January 2016 (has links)
This thesis addresses rigid-body simulation and the possibilities of parallelizing it on a GPU. It covers the basics needed to implement a simple physics engine for blocks and the technologies that can be used for acceleration. The thesis describes an approach that gradually accelerates the physics simulation using OpenCL; each significant change is described in its own section, together with measurement results and a short summary.
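
As a hedged illustration of why such a simulation parallelizes well, the plain C++ fragment below (names hypothetical, gravity-only forces, no collision handling) performs a semi-implicit Euler step that touches only one body's own state, so the loop maps directly onto one OpenCL work-item per body.

```cpp
#include <vector>

struct Vec3 { float x, y, z; };

// Minimal rigid-body state for the illustration; a real engine would
// also carry orientation, angular velocity and an inertia tensor.
struct Body {
    Vec3  position;
    Vec3  velocity;
    float invMass;   // 0 marks a static body
};

// Semi-implicit Euler step under gravity. Each body is updated
// independently, so the loop body is exactly the kind of code that can
// become an OpenCL kernel with one work-item per body.
void integrate(std::vector<Body>& bodies, float dt) {
    const Vec3 g{0.0f, -9.81f, 0.0f};
    for (Body& b : bodies) {
        if (b.invMass == 0.0f) continue;   // skip static blocks
        b.velocity.x += g.x * dt;
        b.velocity.y += g.y * dt;
        b.velocity.z += g.z * dt;
        b.position.x += b.velocity.x * dt;
        b.position.y += b.velocity.y * dt;
        b.position.z += b.velocity.z * dt;
    }
}
```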
563

Konstrukce kD stromu na GPU / Building kD Tree on GPU

Bajza, Jakub January 2016 (has links)
This thesis addresses the construction of kD-tree acceleration structures and the parallelization of this construction on a GPU. It first introduces the reader to the CUDA platform for parallel programming, describing both its general principles and the specific features used in this work. The reader is then introduced to acceleration structures for ray tracing; these structures are described, and the kD tree and its variants are covered in detail. The chosen kD-tree variant is then analysed, and the problems of its parallel implementation are addressed. The implementation description includes a short account of the CPU variant and detailed specifications of the CUDA kernels. The testing section presents the results as a CPU-versus-GPU comparison and evaluates how well the metrics set in the design were met. The thesis ends with a summary of the achieved goals and results, followed by possible future improvements to the implementation.
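
The abstract does not reproduce the construction algorithm. As a point of reference for what the CUDA kernels parallelize, here is a minimal CPU sketch of a median-split kD-tree build over points; the names and the split policy are illustrative only and are not taken from the thesis. A GPU build typically replaces this depth-first recursion with breadth-first, level-by-level processing, but the splitting logic stays conceptually the same.

```cpp
#include <algorithm>
#include <cstddef>
#include <memory>
#include <vector>

struct Point { float v[3]; };

struct KdNode {
    Point point;                           // point stored at this node
    int   axis;                            // splitting axis (0, 1 or 2)
    std::unique_ptr<KdNode> left, right;
};

// Recursive median-split build over pts[lo, hi).
std::unique_ptr<KdNode> build(std::vector<Point>& pts,
                              std::size_t lo, std::size_t hi, int depth) {
    if (lo >= hi) return nullptr;
    const int axis = depth % 3;
    const std::size_t mid = lo + (hi - lo) / 2;
    // Partially sort so the median along 'axis' lands at 'mid'.
    std::nth_element(pts.begin() + lo, pts.begin() + mid, pts.begin() + hi,
                     [axis](const Point& a, const Point& b) {
                         return a.v[axis] < b.v[axis];
                     });
    auto node   = std::make_unique<KdNode>();
    node->point = pts[mid];
    node->axis  = axis;
    node->left  = build(pts, lo, mid, depth + 1);
    node->right = build(pts, mid + 1, hi, depth + 1);
    return node;
}
```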
564

Paralelizace ultrazvukových simulací na svazku grafických karet / Parallelisation of Ultrasound Simulations on Multi-GPU Clusters

Dujíček, Aleš January 2015 (has links)
This work is part of the k-Wave project, a toolbox designed for time-domain ultrasound simulations in complex and heterogeneous media. The simulation functions are based on the k-space pseudospectral method. The goal of this work is to run these simulations on graphics cards using local domain decomposition, which allows them to be computed faster and on larger data grids; efficiency and scalability are the main objectives.
565

Paralelizace Goertzelova algoritmu / Parallel implementation of Goertzel algorithm

Skulínek, Zdeněk January 2017 (has links)
Technical limits make it impossible to keep raising processor clock frequencies, so processing power now grows mainly through increasing core counts. This calls for new approaches to programming such parallel systems. This thesis shows how to use parallelism in digital signal processing; as an example, it presents an implementation of the Goertzel algorithm that uses the processing power of the graphics chip.
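
Since the abstract only names the algorithm, a minimal single-bin Goertzel sketch in C++ is included below for orientation; it computes the squared magnitude of one DFT bin, which is the per-bin (or per-channel) unit of work that a parallel implementation can replicate across threads. The function name and interface are my own, not the thesis's.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Squared magnitude of DFT bin k for an N-sample block, computed with
// the Goertzel recurrence s[n] = x[n] + 2*cos(w)*s[n-1] - s[n-2].
// Running one bin (or one channel) per thread is the usual way to
// parallelize this on a CPU or GPU.
double goertzelPower(const std::vector<double>& x, std::size_t k) {
    const std::size_t N = x.size();
    const double pi    = std::acos(-1.0);
    const double w     = 2.0 * pi * static_cast<double>(k) / static_cast<double>(N);
    const double coeff = 2.0 * std::cos(w);
    double s1 = 0.0, s2 = 0.0;            // s[n-1] and s[n-2]
    for (std::size_t n = 0; n < N; ++n) {
        const double s0 = x[n] + coeff * s1 - s2;
        s2 = s1;
        s1 = s0;
    }
    // |X[k]|^2 without forming the complex output explicitly.
    return s1 * s1 + s2 * s2 - coeff * s1 * s2;
}
```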
566

Real-time Rendering with Heterogeneous GPUs

Xiao Lei (8803037) 06 May 2020 (has links)
Over the years, the performance demand of graphics applications has been steadily increasing. While upgrading hardware is one direct solution, the emergence of new low-level, low-overhead graphics APIs such as Vulkan has also opened the possibility of improving rendering performance from the software side.

Most middle- to high-end personal computers of recent years are equipped with both an integrated and a discrete GPU. With previous graphics APIs, however, it was hard to put these two heterogeneous GPUs to work concurrently in the same application without tailored driver support.

This thesis explores the use of such heterogeneous GPUs in real-time rendering with the help of the Vulkan API. It first presents the design and implementation details of the proposed heterogeneous-GPU working model, and then tests two workload-offloading strategies: offloading screen-space output work to the integrated GPU, and offloading asynchronous compute work to the integrated GPU.

While the study did not obtain a performance improvement from offloading screen-space output work, it did validate that offloading asynchronous compute work from the discrete GPU to the integrated GPU can improve overall system performance. The study shows that the integrated and discrete GPUs can be used concurrently in the same application with the help of Vulkan, and that offloading asynchronous compute work from the discrete GPU to the integrated GPU can provide up to a 3-4% performance improvement with combinations such as UHD Graphics 630 + RTX 2070 Max-Q and HD Graphics 630 + GTX 1050.
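
As a hedged illustration of the first step in such a heterogeneous setup, the C++ fragment below enumerates the physical devices exposed by Vulkan and picks out the integrated and discrete GPUs. Error handling, device/queue creation, and the thesis's actual work-distribution logic are omitted; the function name is my own.

```cpp
#include <vulkan/vulkan.h>
#include <vector>

// Finds the first integrated and the first discrete GPU exposed by the
// Vulkan instance, as a starting point for driving both from one
// application. Error checks are omitted for brevity.
void findHeterogeneousPair(VkInstance instance,
                           VkPhysicalDevice& integrated,
                           VkPhysicalDevice& discrete) {
    integrated = VK_NULL_HANDLE;
    discrete   = VK_NULL_HANDLE;

    uint32_t count = 0;
    vkEnumeratePhysicalDevices(instance, &count, nullptr);
    std::vector<VkPhysicalDevice> devices(count);
    vkEnumeratePhysicalDevices(instance, &count, devices.data());

    for (VkPhysicalDevice device : devices) {
        VkPhysicalDeviceProperties props{};
        vkGetPhysicalDeviceProperties(device, &props);
        if (props.deviceType == VK_PHYSICAL_DEVICE_TYPE_INTEGRATED_GPU &&
            integrated == VK_NULL_HANDLE) {
            integrated = device;
        } else if (props.deviceType == VK_PHYSICAL_DEVICE_TYPE_DISCRETE_GPU &&
                   discrete == VK_NULL_HANDLE) {
            discrete = device;
        }
    }
    // Each selected VkPhysicalDevice then gets its own VkDevice and queues;
    // moving results between them goes through external memory or host
    // copies, which is one place where offloading strategies can differ.
}
```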
567

Deep Learning with Go

Derek Leigh Stinson (8812109) 08 May 2020 (has links)
Current research in deep learning primarily uses Python as the supporting language. Go, an emerging language whose benefits include native support for concurrency, has seen a rise in adoption over the past few years, but it is not widely used to develop learning models because of the lack of supporting libraries and frameworks. This thesis explores the use of Go for developing neural network models in general and convolutional neural networks in particular. The study is based on a Go-CUDA implementation of neural network models called GoCuNets, which is compared to ConvNetGo, a Go CPU deep-learning implementation that takes advantage of Go's built-in concurrency. The comparison shows a significant performance gain for GoCuNets over ConvNetGo.
568

Image Space Tensor Field Visualization Using a LIC-like Method

Eichelbaum, Sebastian 20 October 2017 (has links)
Tensors are of great interest to many applications in engineering and medical imaging, but their proper analysis and visualization remains challenging. Physics-based visualization has proven able to show the main features of symmetric second-order tensor fields in a continuous representation, conveying the most important information in the data: the main directions (for example, in medical diffusion tensor data) via texture, and additional attributes via color coding. Nevertheless, its application and usability remain limited because of its computationally expensive and sensitive nature. We introduce a novel approach, motivated by image-space line integral convolution (LIC), that computes a fabric-like texture pattern from tensor fields on arbitrary non-self-intersecting surfaces. Our main focus lies on regaining the three-dimensionality of the data under user interaction such as rotation and scaling. We employ a multi-pass rendering approach that estimates a proper modification of the LIC noise input texture to support three-dimensional perception during user interaction.
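
For readers unfamiliar with LIC, the sketch below is a heavily simplified CPU version of the core "smearing" step on a regular 2D grid: each output pixel averages a noise texture along a short streamline of a direction field through that pixel. It is only an illustrative analogy under my own assumptions; the thesis operates in image space on surfaces and adds shading, multi-pass rendering and interaction on top.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

struct Vec2 { float x, y; };

// Basic 2D line integral convolution: average 'noise' along a short
// streamline of 'field' through every pixel of a w-by-h grid.
std::vector<float> lic(const std::vector<Vec2>& field,
                       const std::vector<float>& noise,
                       int w, int h, int halfLength) {
    std::vector<float> out(static_cast<std::size_t>(w) * h, 0.0f);
    for (int y = 0; y < h; ++y) {
        for (int x = 0; x < w; ++x) {
            const std::size_t center = static_cast<std::size_t>(y) * w + x;
            float sum = noise[center];
            int   cnt = 1;
            for (int dir = -1; dir <= 1; dir += 2) {      // backward and forward
                float px = x + 0.5f, py = y + 0.5f;
                for (int s = 0; s < halfLength; ++s) {
                    const Vec2 v = field[static_cast<std::size_t>(py) * w
                                         + static_cast<std::size_t>(px)];
                    const float len = std::sqrt(v.x * v.x + v.y * v.y);
                    if (len < 1e-6f) break;               // degenerate direction
                    px += dir * v.x / len;                // roughly one-pixel step
                    py += dir * v.y / len;
                    if (px < 0.0f || py < 0.0f || px >= w || py >= h) break;
                    sum += noise[static_cast<std::size_t>(py) * w
                                 + static_cast<std::size_t>(px)];
                    ++cnt;
                }
            }
            out[center] = sum / static_cast<float>(cnt);
        }
    }
    return out;
}
```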
569

Robust Real-Time Model Predictive Control for High Degree of Freedom Soft Robots

Hyatt, Phillip Edmond 04 June 2020 (has links)
This dissertation focuses on the modeling and robust model-based control of high degree-of-freedom (DoF) systems. While most of the contributions apply to any difficult-to-model system, the dissertation focuses specifically on large-scale soft robots, because their many joints and pressures constitute a high-DoF system and their inherent softness makes them difficult to model accurately. First, a joint-angle estimation and kinematic calibration method for soft robots is developed and shown to decrease the pose prediction error at the end of a 1.5 m robot arm by about 85%. A novel dynamic modeling approach that can be evaluated within microseconds is then formulated for continuum-type soft robots. We show that deep neural networks (DNNs) can approximate soft robot dynamics given training examples from physics-based models like those described above, and we demonstrate how these machine-learning-based models can be evaluated quickly enough to perform a form of optimal control called model predictive control (MPC). We describe a control-trajectory parameterization that enables MPC to be applied to systems with more DoF and longer prediction horizons than previously possible; this parameterization decreases MPC's sensitivity to model error and drastically reduces MPC solve times. A novel form of MPC is developed, based on an evolutionary optimization algorithm, that allows the optimization to be parallelized on a computer's graphics processing unit (GPU). We show that this evolutionary MPC (EMPC) can greatly decrease MPC solve times for high-DoF systems without large performance losses, especially given a large GPU. Combining machine-learned DNN models of robot dynamics with parameterized and parallelized MPC yields a nonlinear version of EMPC that can run at higher rates and find better solutions than many state-of-the-art optimal control methods. Finally, we demonstrate an adaptive form of MPC that can compensate for model error or changes in the controlled system; it inherits MPC's robustness to completely unmodeled disturbances and adaptive control's ability to decrease trajectory tracking errors over time.
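
The dissertation's own EMPC formulation is not reproduced in the abstract. The sketch below shows, under simplifying assumptions of my own (random shooting instead of a full evolutionary update, a constant control held over the horizon, user-supplied dynamics and cost callbacks), the structure that makes such an optimizer easy to parallelize: every candidate control sequence is rolled out and scored independently.

```cpp
#include <cstddef>
#include <functional>
#include <limits>
#include <random>
#include <vector>

using State   = std::vector<double>;
using Control = std::vector<double>;

// One greatly simplified MPC solve: sample candidate controls, roll each
// one out over the horizon, keep the cheapest. The per-candidate rollouts
// are independent, which is what an evolutionary, GPU-parallel MPC
// exploits; a real EMPC would also recombine and mutate the population.
Control randomShootingMpc(
    const State& x0, int horizon, int numCandidates, std::size_t controlDim,
    const std::function<State(const State&, const Control&)>& dynamics,
    const std::function<double(const State&, const Control&)>& stageCost) {
    std::mt19937 rng(42);
    std::uniform_real_distribution<double> dist(-1.0, 1.0);

    Control best(controlDim, 0.0);
    double  bestCost = std::numeric_limits<double>::infinity();

    for (int c = 0; c < numCandidates; ++c) {
        Control u(controlDim);                 // candidate: one constant control
        for (double& ui : u) ui = dist(rng);

        State  x    = x0;
        double cost = 0.0;
        for (int t = 0; t < horizon; ++t) {    // independent rollout
            cost += stageCost(x, u);
            x = dynamics(x, u);
        }
        if (cost < bestCost) { bestCost = cost; best = u; }
    }
    return best;  // apply only the first control, then re-solve next step
}
```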
570

Visualisation et traitements interactifs de grilles régulières 3D haute-résolution virtualisées sur GPU. Application aux données biomédicales pour la microscopie virtuelle en environnement HPC. / Interactive visualisation and processing of high-resolution regular 3D grids virtualised on GPU. Application to biomedical data for virtual microscopy in HPC environment.

Courilleau, Nicolas 29 August 2019 (has links)
Data visualisation is an essential aspect of scientific research in many fields. It helps in understanding observed or simulated phenomena and in extracting information from them, for purposes such as experimental validation or simply project review. The focus of this thesis is the visualisation of volume data in medical and biomedical imaging. The acquisition devices generate scalar or vector fields represented as regular 3D grids. The increasing accuracy of these devices implies ever larger volume data, which requires adapting the visualisation algorithms to handle such volumes. Moreover, visualisation mostly relies on GPUs because they suit such problems well, yet they possess a very limited amount of memory compared to the generated volume data. The question then arises of how to dissociate the computing units, which perform the visualisation, from the storage. Algorithms based on the so-called "out-of-core" principle are the solution for managing large volume data sets. In this thesis, we propose a complete GPU-based pipeline allowing real-time visualisation and processing of volume data significantly larger than the CPU and GPU memory capacities. The interest of the pipeline lies in its out-of-core data-management approach, which virtualises the memory through a virtual addressing structure entirely managed and maintained on the GPU and is particularly well suited to volume data. We validate our approach with several real-time visualisation and processing applications. First, we propose an interactive virtual microscope allowing 3D auto-stereoscopic visualisation of stacks of high-resolution images. Then, we verify the adaptability of our structure to all data types with a multimodal virtual microscope. Finally, we demonstrate the multi-role capabilities of our structure through a concurrent real-time visualisation and processing application.
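
The abstract stays high-level. As a rough CPU-side analogy of the out-of-core idea (all names and the brick size are hypothetical, not from the thesis), the sketch below maps a voxel coordinate to a brick identifier and consults a residency table before sampling; this is the kind of indirection that the thesis maintains directly on the GPU to virtualise memory.

```cpp
#include <cstddef>
#include <cstdint>
#include <unordered_map>
#include <vector>

// Hypothetical brick cache: the full volume is split into BRICK^3 bricks,
// and only bricks listed in 'resident' currently live in fast (GPU) memory.
struct BrickCache {
    static constexpr int BRICK = 32;              // voxels per brick edge
    int bricksX, bricksY, bricksZ;                // bricks per axis
    std::unordered_map<std::uint64_t, std::vector<std::uint8_t>> resident;

    std::uint64_t brickId(int x, int y, int z) const {
        const std::uint64_t bx = x / BRICK, by = y / BRICK, bz = z / BRICK;
        return (bz * bricksY + by) * bricksX + bx;
    }

    // Returns true and fills 'value' when the voxel's brick is resident;
    // otherwise the caller must request the brick from the out-of-core
    // store (the miss path) and retry or fall back to coarser data.
    bool sample(int x, int y, int z, std::uint8_t& value) const {
        const auto it = resident.find(brickId(x, y, z));
        if (it == resident.end()) return false;
        const int lx = x % BRICK, ly = y % BRICK, lz = z % BRICK;
        value = it->second[(static_cast<std::size_t>(lz) * BRICK + ly) * BRICK + lx];
        return true;
    }
};
```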
