  • About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
701

Ray Tracing Bézier Surfaces on GPU

Löw, Joakim January 2006 (has links)
In this report, we show how to implement direct ray tracing of Bézier surfaces on graphics processing units (GPUs), in particular bicubic rectangular Bézier surfaces and nonparametric cubic Bézier triangles. We use Newton's method for the rectangular case and show how to use this method to find the ray-surface intersection. For Newton's method to work we must build a spatial partitioning hierarchy around each surface patch, and in general, hierarchies are essential to speed up the process of ray tracing. We have chosen to use bounding box hierarchies and show how to implement stackless traversal of such a structure on a GPU. For the nonparametric triangular case, we show how to find the wanted intersection by simply solving a cubic polynomial. Because of the limited precision of current GPUs, we also propose a numerical approach to solve the problem, using a one-dimensional Newton search.
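The one-dimensional Newton search mentioned at the end of the abstract can be sketched in a few lines. This is an illustrative stand-in: the cubic below is a toy polynomial, not one derived from an actual Bézier triangle, and the thesis runs the equivalent search per-ray on the GPU:

```python
def newton_1d(f, df, t0, iters=20, tol=1e-10):
    # classic scalar Newton iteration: t <- t - f(t)/f'(t)
    t = t0
    for _ in range(iters):
        ft = f(t)
        if abs(ft) < tol:
            break
        t -= ft / df(t)
    return t

# toy cubic standing in for the ray/surface intersection polynomial
f  = lambda t: t**3 - 2.0*t - 5.0
df = lambda t: 3.0*t**2 - 2.0
root = newton_1d(f, df, 2.0)   # converges to ~2.0945515
```

Starting near the root, the iteration converges quadratically, which is why a handful of iterations suffices even under the limited floating-point precision the abstract mentions.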
702

Simulations Physiques Interactives sur des Architectures Multi-Core et Multi-GPU

Hermann, Everton 30 June 2010 (has links) (PDF)
Interactive physical simulation is a key component of virtual environments. However, the amount of computation and the complexity of the code grow rapidly with the variety, number and size of the simulated objects. In this thesis we studied different ways of improving interactivity while minimizing the impact on the simulation code. First, we developed a new collision detection approach for deformable objects that is fast and more robust than traditional proximity-based detection approaches. To take advantage of multi-core machines, we propose a parallelization approach based on task parallelism. Before executing a time step, we extract a task dependency graph, which is partitioned to define the distribution of tasks among the processors. This approach has a low impact on the physical simulation algorithms, since parallelism is obtained by changing only the code that orchestrates the launching of tasks. Finally, we extended our work to multi-CPU and multi-GPU architectures. Using these resources efficiently and transparently is a major challenge. We propose a parallelization scheme for dynamic load balancing between several CPUs and GPUs. We rely on a two-level approach combining task-graph partitioning with load balancing through work stealing guided by affinity criteria between processors. These criteria aim to limit task migrations between computing units and to favour assigning small tasks to the CPUs and large ones to the GPUs, in order to exploit the heterogeneity.
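The dependency-graph step described in the abstract can be illustrated with a minimal sketch. The task names and the round-robin assignment below are invented for illustration; the thesis partitions the graph and balances load with affinity-guided work stealing rather than this naive scheme:

```python
from collections import deque

def topological_order(deps):
    # deps: task -> set of tasks it depends on (the extracted dependency graph)
    indeg = {t: len(d) for t, d in deps.items()}
    succ = {t: [] for t in deps}
    for t, d in deps.items():
        for p in d:
            succ[p].append(t)
    ready = deque(t for t, n in indeg.items() if n == 0)
    order = []
    while ready:
        t = ready.popleft()
        order.append(t)
        for s in succ[t]:
            indeg[s] -= 1
            if indeg[s] == 0:
                ready.append(s)
    return order

def assign_round_robin(order, n_procs):
    # placeholder for the real graph partitioning / work-stealing scheduler
    return {t: i % n_procs for i, t in enumerate(order)}

deps = {"collide": set(), "solveA": {"collide"}, "solveB": {"collide"},
        "integrate": {"solveA", "solveB"}}
order = topological_order(deps)
sched = assign_round_robin(order, 2)
```

The key property the abstract relies on is that only this orchestration layer changes: the bodies of "collide", "solveA", etc. remain the sequential simulation kernels.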
703

Efficient Information Visualization of Multivariate and Time-Varying Data

Johansson, Jimmy January 2008 (has links)
Data can be found everywhere, for example in the form of price, size, weight and colour of all products sold by a company, or as time series of daily observations of temperature, precipitation, wind and visibility from thousands of stations. Due to their size and complexity it is intrinsically hard to form a global overview and understanding of them. Information visualization aims at overcoming these difficulties by transforming data into representations that can be more easily interpreted. This thesis presents work on the development of methods to enable efficient information visualization of multivariate and time-varying data sets by conveying information in a clear and interpretable way, and in a reasonable time. The work presented is primarily based on a popular multivariate visualization technique called parallel coordinates but many of the methods can be generalized to apply to other information visualization techniques. A three-dimensional, multi-relational version of parallel coordinates is presented that enables a simultaneous analysis of all pairwise relationships between a single focus variable and all other variables included in the display. This approach permits a more rapid analysis of highly multivariate data sets. Through a number of user studies the multi-relational parallel coordinates technique has been evaluated against standard, two-dimensional parallel coordinates and been found to better support a number of different types of task. High precision density maps and transfer functions are presented as a means to reveal structure in large data displayed in parallel coordinates. These two approaches make it possible to interactively analyse arbitrary regions in a parallel coordinates display without risking the loss of significant structure. Another focus of this thesis relates to the visualization of time-varying, multivariate data. 
This has been studied both in the specific application area of system identification using volumetric representations, and in the general case through the introduction of temporal parallel coordinates. The methods described in this thesis have all been implemented using modern computer graphics hardware, which enables the display and manipulation of very large data sets in real time. A wide range of data sets, both synthetically generated and taken from real applications, have been used to test these methods. It is expected that these methods can be employed efficiently on any data set with multivariate properties.
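The core geometric mapping behind parallel coordinates can be sketched briefly: each variable is normalized to a shared axis range so that every record becomes a polyline across the axes. The field names below are invented; the thesis's density maps and transfer functions then operate on the resulting line geometry:

```python
def parallel_coords(records, variables):
    # per-variable min/max for axis normalization
    lo = {v: min(r[v] for r in records) for v in variables}
    hi = {v: max(r[v] for r in records) for v in variables}
    lines = []
    for r in records:
        ys = []
        for v in variables:
            span = hi[v] - lo[v]
            # constant variables are pinned to mid-axis
            ys.append(0.5 if span == 0 else (r[v] - lo[v]) / span)
        lines.append(ys)
    return lines

data = [{"price": 10, "weight": 2.0}, {"price": 30, "weight": 1.0}]
lines = parallel_coords(data, ["price", "weight"])   # one polyline per record
```

Each inner list gives the vertical positions of one record on consecutive axes; drawing one line segment per adjacent pair of axes reproduces the classical two-dimensional display.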
705

Real-time Soft Tissue Modelling on GPU for Medical Simulation

Comas, Olivier 16 December 2010 (has links) (PDF)
Modelling the deformation of anatomical structures in real time is a crucial problem in medical simulation. Because of the large differences in their shape and composition, a single model cannot cover the variety of mechanical behaviours. We therefore identified two main types of structure: solid organs (brain, liver, prostate, etc.) and hollow organs (colon, blood vessels, stomach, etc.). Our answer to this problem is twofold. Our first contribution is a GPU implementation of a non-linear, anisotropic and viscoelastic finite element model for solid structures. Our second contribution is a framework for modelling thin structures in real time via a parallelizable, co-rotational model using shell elements, together with an approach for meshing a complex surface with curved shell elements. Although both tissue models are based on continuum mechanics for better accuracy, both are able to simulate organ deformation in real time. Finally, their implementation in the open source framework SOFA will help disseminate both models and contribute to improving the realism of medical simulators.
706

Simplification Techniques for Interactive Applications

González Ballester, Carlos 09 July 2010 (has links)
Interactive applications with 3D graphics are used every day in many different fields, such as games, teaching, learning environments and virtual reality. The scenarios shown in interactive applications tend to present detailed worlds and characters that are as realistic as possible. Detailed 3D models require a great deal of geometric complexity, but the available graphics hardware cannot always handle all this geometry while maintaining a realistic frame rate. Simplification methods attempt to solve this problem by generating simplified versions of the original 3D models, which contain less geometry than the originals. This simplification has to follow a reasonable criterion in order to preserve the appearance of the original models as far as possible. Geometry is not the only important factor in 3D models, however: they also carry additional attributes that matter for the final appearance of the models to the viewer. A great deal of work on simplification can be found in the literature, yet several points still lack an efficient solution. This thesis therefore focuses on simplification techniques for the 3D models commonly used in interactive applications.
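The kind of geometry reduction the abstract describes can be illustrated with a single shortest-edge collapse using midpoint placement. This is a deliberately minimal sketch; real simplification methods use appearance-preserving error metrics (and handle the attribute data the abstract mentions) rather than edge length alone:

```python
import math

def collapse_shortest_edge(vertices, triangles):
    # vertices: list of (x, y, z); triangles: list of (i, j, k) index triples
    edges = set()
    for a, b, c in triangles:
        for u, v in ((a, b), (b, c), (c, a)):
            edges.add((min(u, v), max(u, v)))
    def length(e):
        return math.dist(vertices[e[0]], vertices[e[1]])
    u, v = min(edges, key=length)
    # move u to the edge midpoint, redirect v to u, drop degenerate faces
    vertices[u] = tuple((a + b) / 2 for a, b in zip(vertices[u], vertices[v]))
    remap = lambda i: u if i == v else i
    new_tris = []
    for tri in triangles:
        t = tuple(remap(i) for i in tri)
        if len(set(t)) == 3:          # skip triangles collapsed to a line
            new_tris.append(t)
    return vertices, new_tris

# two triangles sharing an edge, with one very short edge (1, 3)
verts = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (1.0, 0.1, 0.0)]
tris = [(0, 1, 2), (1, 3, 2)]
verts, tris = collapse_shortest_edge(verts, tris)
```

Repeating such collapses, ordered by an error metric instead of raw length, yields the progressively simplified model versions the abstract refers to.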
707

Models for Parallel Computation in Multi-Core, Heterogeneous, and Ultra Wide-Word Architectures

Salinger, Alejandro January 2013 (has links)
Multi-core processors have become the dominant processor architecture with 2, 4, and 8 cores on a chip being widely available and an increasing number of cores predicted for the future. In addition, the decreasing costs and increasing programmability of Graphics Processing Units (GPUs) have made these an accessible source of parallel processing power in general purpose computing. Among the many research challenges that this scenario has raised are the fundamental problems related to theoretical modeling of computation in these architectures. In this thesis we study several aspects of computation in modern parallel architectures, from modeling of computation in multi-cores and heterogeneous platforms, to multi-core cache management strategies, through the proposal of an architecture that exploits bit-parallelism on thousands of bits. Observing that in practice multi-cores have a small number of cores, we propose a model for low-degree parallelism for these architectures. We argue that assuming a small number of processors (logarithmic in a problem's input size) simplifies the design of parallel algorithms. We show that in this model a large class of divide-and-conquer and dynamic programming algorithms can be parallelized with simple modifications to sequential programs, while achieving optimal parallel speedups. We further explore low-degree-parallelism in computation, providing evidence of fundamental differences in practice and theory between systems with a sublinear and linear number of processors, and suggesting a sharp theoretical gap between the classes of problems that are efficiently parallelizable in each case. Efficient strategies to manage shared caches play a crucial role in multi-core performance. We propose a model for paging in multi-core shared caches, which extends classical paging to a setting in which several threads share the cache. We show that in this setting traditional cache management policies perform poorly, and that any effective strategy must partition the cache among threads, with a partition that adapts dynamically to the demands of each thread. Inspired by the shared cache setting, we introduce the minimum cache usage problem, an extension to classical sequential paging in which algorithms must account for the amount of cache they use. This cache-aware model seeks algorithms with good performance in terms of faults and the amount of cache used, and has applications in energy efficient caching and in shared cache scenarios. The wide availability of GPUs has added to the parallel power of multi-cores; however, most applications underutilize the available resources. We propose a model for hybrid computation in heterogeneous systems with multi-cores and GPUs, and describe strategies for generic parallelization and efficient scheduling of a large class of divide-and-conquer algorithms. Lastly, we introduce the Ultra-Wide Word architecture and model, an extension of the word-RAM model, that allows for constant time operations on thousands of bits in parallel. We show that a large class of existing algorithms can be implemented in the Ultra-Wide Word model, achieving speedups comparable to those of multi-threaded computations, while avoiding the more difficult aspects of parallel programming.
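The ultra-wide word idea can be imitated on a conventional machine using Python's arbitrary-precision integers: many small lanes are packed into one "word", and a single arithmetic operation then acts on all lanes at once. This is a SWAR-style illustration of the flavor of bit-parallelism, not the thesis's actual architecture or model:

```python
LANE = 8   # lane width in bits; values are masked to 7 bits so lanes never carry over

def pack(values):
    # pack small integers into one wide word, one per 8-bit lane
    word = 0
    for i, v in enumerate(values):
        word |= (v & 0x7F) << (i * LANE)
    return word

def lanewise_add(a, b):
    # one wide addition updates every lane simultaneously
    return a + b

def unpack(word, n):
    return [(word >> (i * LANE)) & 0xFF for i in range(n)]

a = pack([1, 2, 3])
b = pack([10, 20, 30])
s = unpack(lanewise_add(a, b), 3)   # [11, 22, 33]
```

With hardware lanes thousands of bits wide, the same trick turns one constant-time word operation into a small parallel computation, which is the speedup source the abstract describes.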
708

gNek: A GPU Accelerated Incompressible Navier Stokes Solver

Stilwell, Nichole 16 September 2013 (has links)
This thesis presents a GPU accelerated implementation of a high order splitting scheme with a spectral element discretization for the incompressible Navier Stokes (INS) equations. While others have implemented this scheme on clusters of processors using the Nek5000 code, to my knowledge this thesis is the first to explore its performance on the GPU. This work implements several of the Nek5000 algorithms using OpenCL kernels that efficiently utilize the GPU memory architecture and achieve massively parallel on-chip computations. These rapid computations have the potential to significantly enhance computational fluid dynamics (CFD) simulations that arise in areas such as weather modeling or aircraft design procedures. I present convergence results for several test cases, including channel, shear, Kovasznay, and lid-driven cavity flow problems, which achieve the proven convergence rates.
709

GPU-accelerated Model Checking of Periodic Self-Suspending Real-Time Tasks

Liberg, Tim, Måhl, Per-Erik January 2012 (has links)
Efficient model checking is important in order to make this type of software verification useful for systems with a complex structure. If a system is too large or complex, model checking simply does not scale: it can take too much time to verify the system. This is one strong argument for making model checking faster. Another interesting aim is to make model checking so fast that it can be used to predict scheduling decisions for real-time schedulers at runtime. This, of course, requires the model checking to complete within an order of milliseconds or even microseconds. The aim is set very high, but the results of this thesis will at least give a hint as to whether this seems possible. The magic card for (maybe) making this possible is the Graphics Processing Unit (GPU). This thesis investigates if and how a model checking algorithm can be ported to and executed on a GPU. Modern GPU architectures offer a high degree of processing power, since they are equipped with up to 1000 (NVIDIA GTX 590) or 3000 (NVIDIA Tesla K10) processor cores. The drawback is that they offer poor thread-communication possibilities and memory caches compared to CPUs, which makes it very difficult to port CPU programs to GPUs. The example model (system) used in this thesis represents a real-time task scheduler that can schedule up to three periodic self-suspending tasks. The aim is to verify these tasks, i.e., find a feasible schedule for them, as fast as possible with the help of the GPU.
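At its core, explicit-state model checking is a reachability search over the system's state space. A minimal CPU-side sketch follows, with a toy two-task state space standing in for the scheduler model; the GPU version the thesis studies would spread this frontier exploration across thousands of threads:

```python
from collections import deque

def reachable(initial, successors, goal):
    # breadth-first exploration of the state space; returns a witness path
    seen = {initial}
    queue = deque([(initial, [initial])])
    while queue:
        state, path = queue.popleft()
        if goal(state):
            return path
        for nxt in successors(state):
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, path + [nxt]))
    return None   # goal unreachable: property verified / no feasible schedule

# toy state: remaining work units of two tasks; a step runs one task for one unit
def successors(state):
    a, b = state
    out = []
    if a > 0: out.append((a - 1, b))
    if b > 0: out.append((a, b - 1))
    return out

path = reachable((2, 1), successors, lambda s: s == (0, 0))
```

The returned path is the analogue of a feasible schedule: a sequence of scheduler steps driving the system from its initial state to a goal state.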
710

Data Parallelism For Ray Casting Large Scenes On A Cpu-gpu Cluster

Topcu, Tumer 01 June 2008 (has links) (PDF)
In the last decade, the computational power, memory bandwidth and programmability of graphics processing units (GPUs) have rapidly evolved, and much research has therefore been devoted to using GPUs for advanced graphics rendering. Because of its high degree of parallelism, ray tracing was one of the first algorithms studied on GPUs. However, rendering large scenes with ray tracing can easily exceed the GPU's memory capacity. The algorithm proposed in this work uses a data parallel approach, in which the scene is partitioned and assigned to CPU-GPU couples in a cluster, to overcome this problem. Our algorithm focuses on ray casting, a special case of ray tracing mainly used in the visualization of volumetric data. CPUs are quite efficient at flow control and branching, while GPUs are very fast at performing intense floating point operations. Using these facts, the GPUs in the cluster are assigned the task of performing ray casting while the CPUs are responsible for traversing the rays. In the end, we were able to visualize large scenes successfully by utilizing CPU-GPU couples effectively, and observed that the performance is highly dependent on the viewing angle as a result of load imbalance.
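The partitioning step can be illustrated with a simple greedy load balancer that assigns scene partitions to CPU-GPU couples. The partition names, costs, and largest-first heuristic below are invented for illustration; they merely show why, as the abstract notes, a static partition leaves the system exposed to view-dependent load imbalance:

```python
def partition_scene(bricks, n_nodes):
    # greedy largest-first: give each brick to the currently lightest-loaded couple
    loads = [0] * n_nodes
    assignment = {}
    for brick, cost in sorted(bricks.items(), key=lambda kv: -kv[1]):
        node = loads.index(min(loads))
        assignment[brick] = node
        loads[node] += cost
    return assignment, loads

# hypothetical per-partition rendering costs (e.g. estimated ray counts)
bricks = {"b0": 8, "b1": 5, "b2": 4, "b3": 3}
assign, loads = partition_scene(bricks, 2)
```

Because the true per-brick cost depends on the viewing angle, any such static estimate drifts from reality as the camera moves, which is exactly the load imbalance the authors observed.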
