21 |
Estratégia paralela exata para o alinhamento múltiplo de sequências biológicas utilizando Unidades de Processamento Gráfico (GPU) / Lima, Daniel Sundfeld, 28 August 2012
Dissertação (mestrado)—Universidade de Brasília, Instituto de Ciências Exatas,
Departamento de Ciência da Computação, 2012. / O alinhamento múltiplo de sequências biológicas é um problema muito importante em
Biologia Molecular, pois permite que sejam detectadas similaridades e diferenças entre um conjunto de sequências. Esse problema foi provado NP-Difícil e, por essa razão, geralmente algoritmos heurísticos são usados para resolvê-lo. No entanto, a obtenção da solução ótima é bastante desejada e, por essa razão, existem alguns algoritmos exatos que solucionam esse problema para um número reduzido de sequências. Dentre esses algoritmos, destaca-se o método exato de Carrillo-Lipman, que permite reduzir o espaço de busca utilizando um limite inferior e um limite superior. Mesmo com essa redução, o algoritmo de Carrillo-Lipman executa em tempo exponencial. Com o objetivo de acelerar a obtenção de resultados,
plataformas computacionais de alto desempenho podem ser utilizadas para resolver o
problema do alinhamento múltiplo. Dentre essas plataformas, destacam-se as Unidades
de Processamento Gráfico (GPU) devido ao seu potencial para paralelismo massivo e
baixo custo. O objetivo dessa dissertação de mestrado é propor e avaliar uma estratégia
paralela para execução do algoritmo Carrillo-Lipman em GPU. A nossa estratégia permite
a exploração do paralelismo em granularidade fina, onde o espaço de busca é percorrido
por várias threads em um cubo tridimensional, dividido em janelas de processamento que
são diagonais projetadas em duas dimensões. Os resultados obtidos com a comparação de
conjuntos de 3 sequências reais e sintéticas de diversos tamanhos mostram que speedups
de até 8,60x podem ser atingidos com a nossa estratégia. ______________________________________________________________________________ ABSTRACT / Multiple Sequence Alignment is a very important problem in Molecular Biology since
it is able to detect similarities and differences in a set of sequences. This problem has been proven NP-Hard and, for this reason, heuristic algorithms are usually used to solve it. Nevertheless, obtaining the optimal solution is highly desirable and there are indeed some exact algorithms that solve this problem for a reduced number of sequences. Carrillo-Lipman is a well-known exact algorithm for the Multiple Sequence Alignment problem that is able to reduce the search space by using lower and upper bounds. Even with this reduction, the Carrillo-Lipman algorithm executes in exponential time. High Performance
Computing (HPC) Platforms can be used in order to produce results faster. Among
the existing HPC platforms, GPUs (Graphics Processing Units) are receiving a lot of
attention due to their massive parallelism and low cost. The goal of this MSc dissertation is to propose and evaluate a parallel strategy to execute the Carrillo-Lipman algorithm on GPU. Our strategy explores parallelism at fine granularity, where the search space is a tridimensional cube, divided into processing windows with bidimensional diagonals, explored by multiple threads. The results obtained when comparing several sets of 3 real and synthetic sequences show that speedups of up to 8.60x can be obtained with our strategy.
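As a rough illustration of the exact approach described above, the following Python sketch fills the full 3-sequence dynamic programming cube with sum-of-pairs costs. The cost scheme (match 0, mismatch 1, gap 2) is a hypothetical choice, and no Carrillo-Lipman pruning is applied; the thesis's GPU strategy traverses this same cube with many threads along projected diagonals.

```python
from itertools import product

MATCH, MISMATCH, GAP = 0, 1, 2  # hypothetical costs (lower is better)

def pair_cost(a, b):
    # Cost contributed by one pair of symbols in an aligned column.
    if a == '-' and b == '-':
        return 0
    if a == '-' or b == '-':
        return GAP
    return MATCH if a == b else MISMATCH

def msa3_cost(s1, s2, s3):
    """Optimal sum-of-pairs cost for 3 sequences over the full DP cube."""
    n1, n2, n3 = len(s1), len(s2), len(s3)
    INF = float('inf')
    D = [[[INF] * (n3 + 1) for _ in range(n2 + 1)] for _ in range(n1 + 1)]
    D[0][0][0] = 0
    # 7 moves: each sequence either consumes a symbol (1) or emits a gap (0).
    moves = [m for m in product((0, 1), repeat=3) if m != (0, 0, 0)]
    for i in range(n1 + 1):
        for j in range(n2 + 1):
            for k in range(n3 + 1):
                for di, dj, dk in moves:
                    pi, pj, pk = i - di, j - dj, k - dk
                    if pi < 0 or pj < 0 or pk < 0:
                        continue
                    a = s1[pi] if di else '-'
                    b = s2[pj] if dj else '-'
                    c = s3[pk] if dk else '-'
                    col = pair_cost(a, b) + pair_cost(a, c) + pair_cost(b, c)
                    D[i][j][k] = min(D[i][j][k], D[pi][pj][pk] + col)
    return D[n1][n2][n3]
```

The cube has (n1+1)(n2+1)(n3+1) cells with 7 predecessors each, which is why Carrillo-Lipman bounds and massive parallelism are both needed for sequences of realistic length.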
|
22 |
Optimizing Tensor Contractions on GPUs / Kim, Jinsung, 06 November 2019
No description available.
|
23 |
[en] MASSIVELY PARALLEL GENETIC PROGRAMMING ON GPUS / [pt] PROGRAMAÇÃO GENÉTICA MACIÇAMENTE PARALELA EM GPUS / CLEOMAR PEREIRA DA SILVA, 25 February 2015
[pt] A Programação Genética permite que computadores resolvam problemas
automaticamente, sem que eles tenham sido programados para tal. Utilizando
a inspiração no princípio da seleção natural de Darwin, uma população
de programas, ou indivíduos, é mantida, modificada baseada em variação
genética, e avaliada de acordo com uma função de aptidão (fitness). A
programação genética tem sido usada com sucesso por uma série de aplicações
como projeto automático, reconhecimento de padrões, controle robótico,
mineração de dados e análise de imagens. Porém, a avaliação da gigantesca
quantidade de indivíduos gerados requer excessiva quantidade de computação,
levando a um tempo de execução inviável para problemas grandes. Este
trabalho explora o alto poder computacional de unidades de processamento
gráfico, ou GPUs, para acelerar a programação genética e permitir a geração
automática de programas para grandes problemas. Propomos duas novas
metodologias para se explorar a GPU em programação genética: compilação em
linguagem intermediária e a criação de indivíduos em código de máquina. Estas
metodologias apresentam vantagens em relação às metodologias tradicionais
usadas na literatura. A utilização de linguagem intermediária reduz etapas de
compilação e trabalha com instruções que estão bem documentadas. A criação
de indivíduos em código de máquina não possui nenhuma etapa de compilação,
mas requer engenharia reversa das instruções que não estão documentadas
neste nível. Nossas metodologias são baseadas em programação genética
linear e inspiradas em computação quântica. O uso de computação quântica
permite uma convergência rápida, capacidade de busca global e inclusão da
história passada dos indivíduos. As metodologias propostas foram comparadas
com as metodologias existentes e apresentaram ganhos consideráveis de
desempenho. Foi observado um desempenho máximo de até 2,74 trilhões de
GPops (operações de programação genética por segundo) para o benchmark
Multiplexador de 20 bits e foi possível estender a programação genética para
problemas que apresentam bases de dados de até 7 milhões de amostras. / [en] Genetic Programming enables computers to solve problems
automatically, without being explicitly programmed to do so. Inspired by
Darwin's principle of natural selection, a population of programs or
individuals is maintained, modified based on genetic variation, and evaluated
according to a fitness function. Genetic programming has been successfully
applied to many different applications such as automatic design, pattern
recognition, robotic control, data mining and image analysis. However, the
evaluation of the huge amount of individuals requires excessive computational
demands, leading to extremely long computational times for large size
problems. This work exploits the high computational power of graphics
processing units, or GPUs, to accelerate genetic programming and to enable
the automatic generation of programs for large problems. We propose two
new methodologies to exploit the power of the GPU in genetic programming:
intermediate-language compilation and the creation of individuals in machine
language. These methodologies have advantages over traditional methods
used in the literature. The use of an intermediate language reduces the
compilation steps, and works with instructions that are well-documented.
Creating individuals in machine language involves no compilation step, but
requires reverse engineering of the instructions that are not documented at
this level. Our methodologies are based on linear genetic programming and are
inspired by quantum computing. The use of quantum computing allows rapid
convergence, global search capability and the inclusion of the individuals' past history.
The proposed methodologies were compared against existing methodologies
and they showed considerable performance gains. A maximum performance of
2.74 trillion GPops (genetic programming operations per second) was observed
for the 20-bit Multiplexer benchmark, and it was possible to extend genetic
programming to problems with databases of up to 7 million samples.
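As a rough sketch of the linear genetic programming model described above, the fragment below evaluates one individual represented as a list of register instructions. The instruction set, register count, output convention and fitness function are illustrative assumptions; the thesis performs this evaluation massively in parallel on the GPU rather than serially on the CPU as here.

```python
# One linear-GP individual: a list of (op, dst, src1, src2) register instructions.
OPS = {
    'add': lambda x, y: x + y,
    'sub': lambda x, y: x - y,
    'mul': lambda x, y: x * y,
    'div': lambda x, y: x / y if y != 0 else 1.0,  # protected division
}

def run_program(program, inputs, n_regs=4):
    """Execute an individual: inputs preload the first registers."""
    regs = list(inputs) + [0.0] * (n_regs - len(inputs))
    for op, dst, a, b in program:
        regs[dst] = OPS[op](regs[a], regs[b])
    return regs[0]  # by convention, register 0 holds the result

def fitness(program, dataset):
    """Mean absolute error over (inputs, target) pairs; lower is better."""
    err = sum(abs(run_program(program, x) - t) for x, t in dataset)
    return err / len(dataset)
```

For example, the individual `[('mul', 2, 0, 0), ('add', 0, 2, 1)]` computes x*x + y from inputs (x, y). Because each fitness case is independent, the dataset dimension is what maps naturally onto thousands of GPU threads.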
|
24 |
Acceleration of CFD and Data Analysis Using Graphics Processors / Khajeh Saeed, Ali, 01 February 2012
Graphics processing units function well as high performance computing devices for scientific computing. The non-standard processor architecture and high memory bandwidth allow graphics processing units (GPUs) to provide some of the best performance in terms of FLOPS per dollar. Recently these capabilities became accessible for general purpose computations with the CUDA programming environment on NVIDIA GPUs and the ATI Stream Computing environment on ATI GPUs. Many applications in computational science are constrained by memory access speeds and can be accelerated significantly by using GPUs as the compute engine. Using graphics processing units as a compute engine gives the personal desktop computer a processing capacity that competes with supercomputers. Graphics processing units represent an energy-efficient architecture for high performance computing in flow simulations and many other fields. This document reviews the graphics processing unit and its features and limitations.
|
25 |
Algorithmic and Software System Support to Accelerate Data Processing in CPU-GPU Hybrid Computing Environments / Wang, Kaibo, January 2015
No description available.
|
26 |
Accelerating Component-Based Dataflow Middleware with Adaptivity and Heterogeneity / Hartley, Timothy D. R., 25 July 2011
No description available.
|
27 |
Enabling the use of Heterogeneous Computing for Bioinformatics / Bijanapalli Chakri, Ramakrishna, 02 October 2013
The huge amount of information encoded in DNA sequences and the increasing interest in uncovering new discoveries have spurred efforts to accelerate the DNA sequencing and alignment processes. The use of heterogeneous systems, which combine different types of computational units, has gained new momentum in high performance computing in recent years; however, the expertise in multiple domains and the skills required to program these systems hinder bioinformaticians from rapidly deploying their applications on them. This work attempts to make a heterogeneous system, the Convey HC-1, with an x86-based host processor and an FPGA-based co-processor, accessible to bioinformaticians. First, a highly efficient dynamic programming based Smith-Waterman kernel is implemented in hardware, which is able to achieve a peak throughput of 307.2 Giga Cell Updates per Second (GCUPS) on the Convey HC-1. A dynamic programming accelerator interface is provided to any application that uses Smith-Waterman. This implementation is also extended to General Purpose Graphics Processing Units (GP-GPUs), achieving a peak throughput of 9.89 GCUPS on an NVIDIA GTX580 GPU. Second, a well-known graphical programming tool, LabVIEW, is enabled as a programming tool for the Convey HC-1. A connection is established between the graphical interface and the Convey HC-1 to control and monitor the application running on the FPGA-based co-processor. / Master of Science
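For context, the Smith-Waterman recurrence that the hardware kernel accelerates can be sketched in a few lines of Python. The scoring parameters and linear gap penalty here are illustrative assumptions (the abstract does not give the kernel's actual parameters), and the GCUPS metric quoted above is simply cell updates per second scaled by 10^9.

```python
def smith_waterman(a, b, match=2, mismatch=-1, gap=-1):
    """Best local alignment score between a and b (linear gap penalty)."""
    H = [[0] * (len(b) + 1) for _ in range(len(a) + 1)]
    best = 0
    for i in range(1, len(a) + 1):
        for j in range(1, len(b) + 1):
            # Local alignment: scores are clamped at zero.
            diag = H[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            H[i][j] = max(0, diag, H[i - 1][j] + gap, H[i][j - 1] + gap)
            best = max(best, H[i][j])
    return best

def gcups(len_a, len_b, seconds):
    """Giga Cell Updates per Second for one pairwise comparison."""
    return (len_a * len_b) / seconds / 1e9
```

Filling the len(a) x len(b) matrix dominates the cost, which is why FPGA systolic arrays and GPU wavefront schemes target exactly this loop.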
|
28 |
Paralelização do algoritmo FDK para reconstrução 3D de imagens tomográficas usando unidades gráficas de processamento e CUDA-C / Parallelization of the FDK algorithm for 3D reconstruction of tomographic images using graphics processing units and CUDA-C / Joel Sánchez Domínguez, 12 January 2012
Conselho Nacional de Desenvolvimento Científico e Tecnológico / A obtenção de imagens usando tomografia computadorizada revolucionou o
diagnóstico de doenças na medicina e é usada amplamente em diferentes áreas da pesquisa
científica. Como parte do processo de obtenção das imagens tomográficas tridimensionais um
conjunto de radiografias são processadas por um algoritmo computacional, o mais usado
atualmente é o algoritmo de Feldkamp, Davis e Kress (FDK). Os usos do processamento
paralelo para acelerar os cálculos em algoritmos computacionais usando as diferentes
tecnologias disponíveis no mercado têm mostrado sua utilidade para diminuir os tempos de
processamento. No presente trabalho é apresentada a paralelização do algoritmo de
reconstrução de imagens tridimensionais FDK usando unidades gráficas de processamento
(GPU) e a linguagem CUDA-C. São apresentadas as GPUs como uma opção viável para
executar computação paralela e abordados os conceitos introdutórios associados à tomografia
computadorizada, GPUs, CUDA-C e processamento paralelo. A versão paralela do algoritmo
FDK executada na GPU é comparada com uma versão serial do mesmo, mostrando maior
velocidade de processamento. Os testes de desempenho foram feitos em duas GPUs de
diferentes capacidades: a placa NVIDIA GeForce 9400GT (16 núcleos) e a placa NVIDIA
Quadro 2000 (192 núcleos). / Imaging using computed tomography has revolutionized the diagnosis of diseases
in medicine and is widely used in different areas of scientific research. As part of the process
to obtain three-dimensional tomographic images, a set of radiographs is processed by a
computer algorithm; the most widely used is the Feldkamp, Davis and Kress (FDK) algorithm.
The use of parallel processing to speed up computer algorithms with the different
technologies available on the market has shown its usefulness in decreasing processing times.
This work presents the parallelization of the FDK algorithm for three-dimensional image
reconstruction using graphics processing units (GPU) and the CUDA-C language. GPUs are
presented as a viable option for parallel computing, and the introductory concepts associated
with computed tomography, GPUs, CUDA-C and parallel processing are covered. The parallel
version of the FDK algorithm executed on the GPU is compared with a serial version,
showing higher processing speed. Performance tests were made on two GPUs of different
capacities: the NVIDIA GeForce 9400GT (16 cores) and the NVIDIA Quadro 2000 (192 cores).
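The voxel-level parallelism that makes FDK-style reconstruction attractive for GPUs can be illustrated with a deliberately simplified backprojection sketch: every voxel accumulates independently, so each one can be assigned to its own GPU thread. This is a parallel-beam 2D simplification with nearest-neighbor detector sampling; it omits the cone-beam weighting and ramp filtering of real FDK and is not the thesis's implementation.

```python
import math

def backproject(volume_shape, projections, angles, det_spacing=1.0):
    """Naive unfiltered backprojection over a 2D voxel grid.

    The inner two loops are embarrassingly parallel: in a CUDA-C version
    each (ix, iy) pair would be computed by one GPU thread.
    """
    nx, ny = volume_shape
    vol = [[0.0] * ny for _ in range(nx)]
    cx, cy = (nx - 1) / 2.0, (ny - 1) / 2.0
    for proj, theta in zip(projections, angles):
        c, s = math.cos(theta), math.sin(theta)
        for ix in range(nx):
            for iy in range(ny):
                # Project the voxel center onto the detector for this angle.
                t = (ix - cx) * c + (iy - cy) * s
                u = int(round(t / det_spacing + (len(proj) - 1) / 2.0))
                if 0 <= u < len(proj):
                    vol[ix][iy] += proj[u]
    return vol
```

Because each voxel only reads projection data and writes its own cell, no synchronization is needed between threads, which is the property the GPU parallelization exploits.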
|
30 |
Δημιουργία, μελέτη και βελτιστοποίηση φωτορεαλιστικών απεικονίσεων πραγματικού χρόνου με χρήση προγραμματιζόμενων επεξεργαστών γραφικών / Σταυρόπουλος, Ασημάκης, 22 September 2009
Οι προγραμματιζόμενοι επεξεργαστές γραφικών (Graphics Processing Units -
GPUs), είναι πανίσχυροι παράλληλοι επεξεργαστές και πλέον υπάρχουν σε κάθε
σύγχρονο προσωπικό υπολογιστή (PC). Οι GPUs αναλαμβάνουν κι επιταχύνουν την
σχεδίαση δισδιάστατων και τρισδιάστατων γραφικών στην οθόνη του υπολογιστή.
Η εξέλιξή τους είναι τόσο ραγδαία τα τελευταία χρόνια, που πλέον ξεπερνούν
σε πολυπλοκότητα τις σύγχρονες κεντρικές μονάδες επεξεργασίας (CPUs), ενώ
είναι ικανές να επιταχύνουν εκτός από γραφικά κι άλλες απαιτητικές σε
επεξεργαστική ισχύ εφαρμογές, όπως είναι η τεχνητή νοημοσύνη και η
προσομοίωση φυσικών αλληλεπιδράσεων μεταξύ αντικειμένων (συγκρούσεις,
εκρήξεις, προσομοίωση κίνησης υγρών) κ.α.
Σκοπός της συγκεκριμένης εργασίας είναι η δημιουργία, η μελέτη και η
βελτιστοποίηση αλγορίθμων σκίασης με χρήση GPUs. Ο όρος σκίαση (shading)
αναφέρεται στην αλληλεπίδραση του φωτός με τα αντικείμενα ενός εικονικού
περιβάλλοντος. Παρουσιάζονται τα εργαλεία (APIs) και οι γλώσσες
προγραμματισμού των GPUs καθώς και τρόποι βελτιστοποίησης της εκτέλεσης των
σκιάσεων που είναι ένα θέμα μείζονος σημασίας σε προσομοιώσεις πραγματικού
χρόνου. / Graphics processing units (GPUs) are powerful parallel processors and today are found in every modern personal computer (PC). GPUs accelerate the drawing of two- and three-dimensional graphics on the monitor. The evolution of this hardware has been very rapid over the last decade, and today these circuits are more complex than CPUs. Besides graphics, they are capable of accelerating many other demanding applications, such as artificial intelligence and physics simulation.
The purpose of this thesis is to implement, study and optimize the execution of shading algorithms that run on GPUs in real time. The term shading refers to the interactions between light and the material of every object in a virtual three-dimensional environment. This thesis presents the tools, the programming languages and techniques for optimizing the execution of shaders, which is a matter of major importance in real-time simulations.
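The per-fragment lighting computation that such shaders perform can be sketched in Python. This is the classic Phong model (ambient, diffuse, specular) with hypothetical coefficients, not code from the thesis; a GPU fragment shader evaluates a computation like this in parallel for every pixel, which is why optimizing it matters for real-time rendering.

```python
import math

def normalize(v):
    n = math.sqrt(sum(c * c for c in v))
    return tuple(c / n for c in v)

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def phong(normal, light_dir, view_dir, ka=0.1, kd=0.7, ks=0.2, shininess=32):
    """Phong shading intensity: ambient + diffuse + specular terms."""
    n = normalize(normal)
    l = normalize(light_dir)
    v = normalize(view_dir)
    diffuse = max(dot(n, l), 0.0)
    # Reflect the light direction about the normal: r = 2(n.l)n - l
    r = tuple(2 * dot(n, l) * nc - lc for nc, lc in zip(n, l))
    specular = max(dot(r, v), 0.0) ** shininess if diffuse > 0 else 0.0
    return ka + kd * diffuse + ks * specular
```

In a real GLSL shader the same expression runs per fragment on the GPU; typical optimizations (moving work to the vertex stage, precomputing the half vector, lowering precision) target exactly this arithmetic.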
|