Global ETD Search

581	Gene-Environment Interaction Analysis Using Graphic Cards / Analys av genmiljöinteraktion med använding av grafikkort Berglund, Daniel January 2015 (has links) Genome-wide association studies (GWAS) are used to find associations between genetic markers and diseases. One part of GWAS is to study interactions between markers, which can play an important role in the risk for the disease. The search for interactions can be computationally intensive. The aim of this thesis was to improve the performance of software used for gene-environment interaction by using parallel programming techniques on graphical processors. A study of the new program's performance, speedup and efficiency was made using multiple simulated datasets. The program shows significantly better performance compared with the older program. HPC high performance computing CUDA GPU GWAS gene-environment interaction interaction Computer Sciences Datavetenskap (datalogi)
582	A new approach for Enterprise Application Architecture for Financial Information Systems : An investigation of the architectural implications of adopting serialization and RPC frameworks, NoSQL/hybrid data stores and heterogeneous computing in Financial Information Systems Eriksson, Peter January 2015 (has links) This thesis investigates the architectural implications of adopting serialisation and remote procedure call (RPC) frameworks, NoSQL/hybrid data stores and heterogeneous computing in financial information systems. Each technology and its implications are analysed separately, together with its benefits and drawbacks for the system implementor. The investigation shows that all three technologies can help alleviate technical challenges facing financial enterprises, but they all come at a cost of complexity. Enterprise Application Architecture database tier serialization gpu Computer Sciences Datavetenskap (datalogi)
583	Aplicación de técnicas de computación paralela para la aceleración de algoritmos de ingeniería Rico, Héctor 02 December 2021 (has links) The use of optimization algorithms in engineering problems has grown considerably in recent years, leading to a proliferation of new algorithms for solving optimization problems. In addition, the emergence of new parallelization techniques that can be applied to these algorithms to improve their convergence time has made them a subject of study for many authors. Among all these algorithms, this research focuses on two optimization algorithms: Jaya and TLBO (and its discrete version, DTLBO). One of the main advantages of both algorithms over other optimization methods is that they do not require tuning parameters specific to the problem to which they are applied. This work compares parallel implementations of Teaching-Learning Based Optimization and Jaya. Both algorithms are parallelized using manycore GPU techniques. Different scenarios are created, starting from a theoretical approach that uses functions from the current literature for evaluating optimization algorithms and ending with the application of these algorithms to real route optimization problems, in this case the travelling salesman problem and drilling problems on plates. The results allow the two parallel algorithms to be compared in terms of the number of iterations, and the time needed to perform them, required to reach a predetermined error level. GPU resource occupancy is also analysed in each case. Optimization Metaheuristic algorithms Jaya TLBO CUDA GPU Parallelism
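As a rough illustration of why Jaya needs no problem-specific tuning parameters, the following is a minimal sequential sketch of the standard Jaya update rule, in which each candidate moves toward the current best solution and away from the current worst. It is not the thesis's GPU implementation; the function names, the sphere benchmark, the bounds and the seed are all assumptions chosen for the example. The per-candidate loop is the part a GPU version would plausibly run in parallel.

```python
import random


def sphere(x):
    """Benchmark objective (assumed for illustration): sum of squares."""
    return sum(v * v for v in x)


def jaya(objective, dim, pop_size, iterations, lower, upper, seed=0):
    """Minimise `objective` with the basic Jaya rule; returns (best_x, best_cost)."""
    rng = random.Random(seed)
    pop = [[rng.uniform(lower, upper) for _ in range(dim)] for _ in range(pop_size)]
    cost = [objective(x) for x in pop]
    for _ in range(iterations):
        best = pop[cost.index(min(cost))]
        worst = pop[cost.index(max(cost))]
        for i, x in enumerate(pop):
            cand = []
            for j in range(dim):
                r1, r2 = rng.random(), rng.random()
                # Move toward the best solution and away from the worst one.
                v = x[j] + r1 * (best[j] - abs(x[j])) - r2 * (worst[j] - abs(x[j]))
                cand.append(min(upper, max(lower, v)))  # clamp to the bounds
            c = objective(cand)
            if c < cost[i]:  # greedy acceptance: keep only improvements
                pop[i], cost[i] = cand, c
    k = cost.index(min(cost))
    return pop[k], cost[k]


if __name__ == "__main__":
    x, c = jaya(sphere, dim=3, pop_size=20, iterations=200, lower=-10.0, upper=10.0)
    print(x, c)
```

The only inputs are the population size and the iteration budget, which are common to most population-based methods; there is no crossover rate, inertia weight or similar algorithm-specific parameter to tune.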
584	Paralleles konturbasiertes Connected-Component-Labeling für 2D-Bilddaten mit OpenCL und Cuda Wenke, Henning 09 October 2015 (has links) Connected-component labeling (CCL) for 2D image data is a well-known problem in image processing. The goal is to identify connected groups of pixels with the same properties and assign each group a unique label. Both sequential and parallel algorithms are studied for solving CCL problems on 2D image data. Some of the known algorithms have asymptotically optimal properties. Contour-based algorithms are also of particular interest for image processing, since the additionally extracted contours can be used, for example, for character recognition. In recent years, graphics processors (GPUs) have been used with great success for general-purpose computing. Several GPU implementations of connected-component labeling algorithms exist, and they are often considerably faster than CPU variants. These GPU-based approaches typically process the pixel grid directly. This thesis proposes several new parallel CCL algorithms that are based on contours and are suitable for both GPUs and multicore CPUs. They are compared experimentally with implementations from the literature using current GPUs and CPUs. In many cases the proposed techniques achieve better runtime behaviour. On GPUs this is especially pronounced when the evaluated datasets have a small proportion of contours relative to the area of the connected components. Parallel computing Parallel algorithms 2D image data Contour detection GPU Cuda OpenCL Connected-component labeling ddc:000
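To make the labeling task concrete, here is a minimal sketch of the classic sequential two-pass CCL with union-find, which works directly on the pixel grid. It is a textbook baseline, not one of the contour-based parallel algorithms the thesis proposes; the function name, the binary-image input format and the choice of 4-connectivity are assumptions made for the example.

```python
def label_components(image):
    """Label 4-connected nonzero pixels; returns (labels, component_count)."""
    rows, cols = len(image), len(image[0])
    parent = [0]  # parent[k] is the union-find parent of provisional label k

    def find(k):
        while parent[k] != k:
            parent[k] = parent[parent[k]]  # path halving
            k = parent[k]
        return k

    labels = [[0] * cols for _ in range(rows)]
    # First pass: assign provisional labels and record equivalences.
    for r in range(rows):
        for c in range(cols):
            if not image[r][c]:
                continue
            up = labels[r - 1][c] if r > 0 else 0
            left = labels[r][c - 1] if c > 0 else 0
            if up == 0 and left == 0:
                parent.append(len(parent))  # new label is its own root
                labels[r][c] = len(parent) - 1
            else:
                labels[r][c] = up or left
                if up and left:
                    a, b = find(up), find(left)
                    if a != b:
                        parent[max(a, b)] = min(a, b)
    # Second pass: replace provisional labels by compact final labels.
    final = {}
    for r in range(rows):
        for c in range(cols):
            if labels[r][c]:
                root = find(labels[r][c])
                labels[r][c] = final.setdefault(root, len(final) + 1)
    return labels, len(final)


if __name__ == "__main__":
    grid = [[1, 0, 1],
            [1, 1, 1]]
    print(label_components(grid))
```

Every foreground pixel is visited in both passes, which is why approaches that only trace contours can plausibly do less work when components are large relative to their boundaries.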
585	Skeleton Programming for Heterogeneous GPU-based Systems Dastgeer, Usman January 2011 (has links) In this thesis, we address issues associated with programming modern heterogeneous systems while focusing on a special kind of heterogeneous systems that include multicore CPUs and one or more GPUs, called GPU-based systems. We consider the skeleton programming approach to achieve high level abstraction for efficient and portable programming of these GPU-based systems and present our work on the SkePU library, which is a skeleton library for these systems. We extend the existing SkePU library with a two-dimensional (2D) data type and skeleton operations and implement several new applications using the newly made skeletons. Furthermore, we consider the algorithmic choice present in SkePU and implement support to specify and automatically optimize the algorithmic choice for a skeleton call, on a given platform. To show how to achieve performance, we provide a case study on an optimized GPU-based skeleton implementation for 2D stencil computations and introduce two metrics to maximize resource utilization on a GPU. By devising a mechanism to automatically calculate these two metrics, performance can be retained while porting an application from one GPU architecture to another. Another contribution of this thesis is the implementation of runtime support for the SkePU skeleton library. This is achieved with the help of the StarPU runtime system. With this implementation, support for dynamic scheduling and load balancing for SkePU skeleton programs is achieved. Furthermore, a capability to do hybrid execution by parallel execution on all available CPUs and GPUs in a system, even for a single skeleton invocation, is developed. SkePU initially supported only data-parallel skeletons.
The first task-parallel skeleton (farm) in SkePU is implemented with support for performance-aware scheduling and hierarchical parallel execution by enabling all data parallel skeletons to be usable as tasks inside the farm construct. Experimental evaluations are carried out and presented for algorithmic selection, performance portability, dynamic scheduling and hybrid execution aspects of our work. Skeleton programming GPU programming SkePU performance portability Computer Sciences Datavetenskap (datalogi)
586	Apports des architectures hybrides à l'imagerie profondeur : étude comparative entre CPU, APU et GPU. / Contributions of hybrid architectures to depth imaging : a CPU, APU and GPU comparative study Said, Issam 21 December 2015 (has links) In an exploration context, Oil and Gas (O&G) companies rely on HPC to accelerate depth imaging algorithms. Solutions based on CPU clusters and hardware accelerators are widely embraced by the industry. Graphics Processing Units (GPUs), with huge compute power and high memory bandwidth, have attracted significant interest. However, deploying heavy imaging workflows, Reverse Time Migration (RTM) being the best known, on such hardware suffers from a few limitations: limited memory capacity, frequent CPU-GPU communications that may be bottlenecked by the PCI transfer rate, and high power consumption. Recently, AMD launched the Accelerated Processing Unit (APU): a processor that merges a CPU and a GPU on the same die, with promising features, notably a unified CPU-GPU memory. Throughout this thesis, we explore how efficiently the APU technology can be applied in an O&G context, and study whether it can overcome the limitations that characterize CPU-based and GPU-based solutions. The APU is evaluated with the help of memory, applicative and power efficiency OpenCL benchmarks. The feasibility of the hybrid utilization of APUs is surveyed. The efficiency of a directive-based approach is also investigated. By means of a thorough review of a selection of seismic applications (modeling and RTM) at the node level and at large scale, a comparative study between the CPU, the APU and the GPU is conducted. We show the relevance of overlapping I/O and MPI communications with computations for APU and GPU clusters, that APUs deliver performance that ranges between that of CPUs and that of GPUs, and that the APU can be as power efficient as the GPU. HPC GPU computing Hybrid architectures APU Geophysics RTM Graphics processors 004
587	Synthèse géométrique temps réel / Real-time geometry synthesis Holländer, Matthias 07 March 2013 (has links) Real-time digital geometry is an emerging research area in computer graphics. To generate high-definition photorealistic images, many applications require methods that are often prohibitively expensive and relatively slow. Such applications include architectural pre-visualization, animated film production, and the creation of advertisements or special effects for so-called realistic films. In these cases it is often necessary to use many computers together, each with several Graphics Processing Units (GPUs). However, some so-called real-time applications cannot accommodate such techniques, because they need to generate more than 30 frames per second to provide comfortable use and interaction with rich, realistic 3D virtual worlds. The main idea of this thesis is to use geometry synthesis, digital geometry and geometric analysis to address classic problems in computer graphics, such as the generation of subdivision surfaces, global illumination and anti-aliasing, in real-time interactive contexts. We present new algorithms suited to current hardware architectures to achieve this goal.
/ Real-time geometry synthesis is an emerging topic in computer graphics. Today's interactive 3D applications have to face a variety of challenges to fulfill the consumer's request for more realism and high quality images. Often, visual effects and quality known from offline-rendered feature films or special effects in movie productions are the ultimate goal but hard to achieve in real time. This thesis offers real-time solutions by exploiting the Graphics Processing Unit (GPU) and efficient geometry processing. In particular, a variety of topics related to classical fields in computer graphics, such as subdivision surfaces, global illumination and anti-aliasing, are discussed, and new approaches and techniques are presented. 3D computer graphics Graphics processor Photorealistic rendering GPU Rendering
588	Split Latency Allocator: Process Variation-Aware Register Access Latency Boost in a Near-Threshold Graphics Processing Unit Pal, Asmita 01 August 2018 (has links) Over the last decade, Graphics Processing Units (GPUs) have been used extensively in gaming consoles, mobile phones, workstations and data centers, as they have exhibited immense performance improvement over CPUs, in graphics intensive applications. Due to their highly parallel architecture, general purpose GPUs (GPGPUs) have gained the foreground in applications where large data blocks can be processed in parallel. However, the performance improvement is constrained by a large power consumption. Likewise, Near Threshold Computing (NTC) has emerged as an energy-efficient design paradigm. Hence, operating GPUs at NTC seems like a plausible solution to counteract the high energy consumption. This work investigates the challenges associated with NTC operation of GPUs and proposes a low-power GPU design, Split Latency Allocator, to sustain the performance of GPGPU applications. GPU Near Threshold Computing Process Variation Register Access Latency Wavefront Scheduling Computer Engineering
589	Parallel Processing For Adaptive Optics Optical Coherence Tomography (AO-OCT) Image Registration Using GPU Do, Nhan Hieu 08 July 2016 (has links) Indiana University-Purdue University Indianapolis (IUPUI) / Adaptive Optics Optical Coherence Tomography (AO-OCT) is a high-speed, high-resolution ophthalmic imaging technique offering detailed 3D analysis of retina structure in vivo. However, AO-OCT volume images are sensitive to involuntary eye movements that occur even during steady fixation and include tremor, drifts, and micro-saccades. To correct eye motion artifacts within a volume and to stabilize a sequence of volumes acquired of the same retina area, we propose a stripe-wise 3D image registration algorithm with phase correlation. In addition, using several ideas such as coarse-to-fine approach, spike noise filtering, pre-computation caching, and parallel processing on a GPU, our approach can register a volume of size 512 x 512 x 512 in less than 6 seconds, which is a 33x speedup as compared to an equivalent CPU version in MATLAB. Moreover, our 3D registration approach is reliable even in the presence of large motions (micro-saccades) that distort the volumes. Such motion was an obstacle for a previous en face approach based on 2D projected images. The thesis also investigates GPU implementations for 3D phase correlation and 2D normalized cross-correlation, which could be useful for other image processing algorithms. 3D Phase Correlation AO-OCT GPU Image Registration Normalized Cross-correlation OCT
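For readers unfamiliar with phase correlation, the following is a minimal 1D sketch of the idea: normalise the cross-power spectrum of two signals so that only phase remains, transform back, and read the shift off the peak. The thesis applies it stripe-wise to 3D volumes on a GPU; this version uses a naive O(N²) DFT on short lists purely for illustration, and the function names and the `eps` guard are assumptions.

```python
import cmath


def dft(x, inverse=False):
    """Naive discrete Fourier transform (forward, or inverse with 1/N scaling)."""
    n = len(x)
    sign = 1 if inverse else -1
    out = [
        sum(x[t] * cmath.exp(sign * 2j * cmath.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]
    return [v / n for v in out] if inverse else out


def phase_correlation_shift(f, g, eps=1e-12):
    """Estimate the circular shift s such that g[n] == f[(n - s) % N]."""
    F, G = dft(f), dft(g)
    # Cross-power spectrum; dividing by the magnitude keeps only the phase.
    cross = [a.conjugate() * b for a, b in zip(F, G)]
    normalized = [c / max(abs(c), eps) for c in cross]
    corr = dft(normalized, inverse=True)
    # The correlation surface peaks at the displacement between the signals.
    return max(range(len(corr)), key=lambda n: corr[n].real)


if __name__ == "__main__":
    f = [0, 1, 3, 7, 2, 0, 0, 0]
    g = [f[(n - 3) % len(f)] for n in range(len(f))]
    print(phase_correlation_shift(f, g))
```

Because the transforms dominate the cost and are computed independently per frequency bin, this kind of computation maps naturally onto FFT libraries and GPU parallelism.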
590	Realistické zobrazování voxelových scén v reálném čase / Real-Time Photorealistic Rendering of Voxel Scenes Flajšingr, Petr January 2021 (has links) The subject of this thesis is an implementation of realistic rendering of voxel scenes using a graphics card. This work explains the fundamentals of realistic rendering and voxel representation of visual data. It also presents selected hierarchical structures usable for acceleration and describes the design of a solution focusing on the representation of voxel data and their rendering. The thesis describes the libraries and algorithms created as part of the project. It also evaluates the time and memory requirements of the application along with its graphical output.
