1 |
High performance bioinformatics and computational biology on general-purpose graphics processing units
Ling, Cheng. January 2012
Bioinformatics and Computational Biology (BCB) is a relatively new multidisciplinary field which brings together many aspects of the fields of biology, computer science, statistics, and engineering. Bioinformatics extracts useful information from biological data and makes these more intuitive and understandable by applying principles of information sciences, while computational biology harnesses computational approaches and technologies to answer biological questions conveniently. Recent years have seen an explosion of the size of biological data at a rate which outpaces the rate of increases in the computational power of mainstream computer technologies, namely general purpose processors (GPPs). The aim of this thesis is to explore the use of off-the-shelf Graphics Processing Unit (GPU) technology in the high performance and efficient implementation of BCB applications in order to meet the demands of biological data increases at affordable cost. The thesis presents detailed design and implementations of GPU solutions for a number of BCB algorithms in two widely used BCB applications, namely biological sequence alignment and phylogenetic analysis. Biological sequence alignment can be used to determine the potential information about a newly discovered biological sequence from other well-known sequences through similarity comparison. On the other hand, phylogenetic analysis is concerned with the investigation of the evolution and relationships among organisms, and has many uses in the fields of system biology and comparative genomics. In molecular-based phylogenetic analysis, the relationship between species is estimated by inferring the common history of their genes and then phylogenetic trees are constructed to illustrate evolutionary relationships among genes and organisms. 
However, both biological sequence alignment and phylogenetic analysis are computationally expensive applications, as their computing and memory requirements grow polynomially or even worse with the size of sequence databases. The thesis first presents a multi-threaded parallel design of the Smith-Waterman (SW) algorithm alongside an implementation on NVIDIA GPUs. A novel technique is put forward to remove the restriction on the length of the query sequence found in previous GPU-based implementations of the SW algorithm. Based on this implementation, the difference between the two main task parallelization approaches (inter-task and intra-task parallelization) is presented. The resulting GPU implementation matches the speed of existing GPU implementations while providing more flexibility, i.e. flexible sequence lengths in real-world applications. It also outperforms an equivalent GPP-based implementation by 15x-20x. After this, the thesis presents the first reported multi-threaded design and GPU implementation of the Gapped BLAST with Two-Hit method algorithm, which is widely used for aligning biological sequences heuristically. This achieved up to a 3x speed-up compared to the most optimised GPP implementations. The thesis then presents a multi-threaded design and GPU implementation of a Neighbor-Joining (NJ)-based method for phylogenetic tree construction and multiple sequence alignment (MSA). This achieves an 8x-20x speed-up compared to an equivalent GPP implementation based on the widely used ClustalW software. The NJ method, however, gives only one possible tree, which strongly depends on the evolutionary model used. A more advanced method uses maximum likelihood (ML) for scoring phylogenies with Markov Chain Monte Carlo (MCMC)-based Bayesian inference. 
The latter was the subject of another multi-threaded design and GPU implementation presented in this thesis, which achieved 4x-8x speed up compared to an equivalent GPP implementation based on the widely used MrBayes software. Finally, the thesis presents a general evaluation of the designs and implementations achieved in this work as a step towards the evaluation of GPU technology in BCB computing, in the context of other computer technologies including GPPs and Field Programmable Gate Arrays (FPGA) technology.
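As a point of reference for the Smith-Waterman algorithm named in this abstract, the following is a minimal sequential sketch of the local-alignment score recurrence. It is not the thesis's GPU design: the function name, the linear gap penalty, and the scoring values (`match`, `mismatch`, `gap`) are illustrative assumptions.

```python
def smith_waterman_score(query, target, match=2, mismatch=-1, gap=-2):
    """Best local-alignment score under a linear gap penalty.

    Scoring values are illustrative assumptions, not taken from the thesis.
    Only two rows of the dynamic-programming table are kept in memory.
    """
    cols = len(target) + 1
    prev = [0] * cols
    best = 0
    for i in range(1, len(query) + 1):
        curr = [0] * cols
        for j in range(1, cols):
            sub = match if query[i - 1] == target[j - 1] else mismatch
            curr[j] = max(
                0,                   # start a new local alignment
                prev[j - 1] + sub,   # align query[i-1] with target[j-1]
                prev[j] + gap,       # gap in the target
                curr[j - 1] + gap,   # gap in the query
            )
            best = max(best, curr[j])
        prev = curr
    return best
```

Each cell depends only on its left, upper, and upper-left neighbours. Scoring many query/database pairs independently corresponds to what the abstract calls inter-task parallelization, while splitting the cells of a single table across threads corresponds to intra-task parallelization.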
|
2 |
Αξιοποίηση υπολογιστικών πόρων / Exploitation of computational resources
Σίψας, Κωνσταντίνος. 13 December 2010
This diploma thesis examines whether the graphics processing unit (GPU) can be used to execute and accelerate an algorithm for matrix-vector multiplication and three sorting algorithms. The architecture examined and described in the thesis is Tesla, developed by Nvidia. The CUDA (Compute Unified Device Architecture) programming model and development environment were used to implement the algorithms.
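A minimal sketch of why matrix-vector multiplication suits a GPU: every output element depends on a single matrix row and the shared input vector, so the rows can be computed independently. The function names and the one-row-per-thread mapping mentioned in the comments are assumptions for illustration; the abstract does not describe the kernel design.

```python
def matvec_row(matrix, vector, row):
    """Dot product of one matrix row with the vector.

    On a GPU this is the work a single thread could do (assumed mapping).
    """
    return sum(a * x for a, x in zip(matrix[row], vector))


def matvec(matrix, vector):
    """Sequential reference: the rows are independent of one another."""
    return [matvec_row(matrix, vector, r) for r in range(len(matrix))]
```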
|
3 |
Determinação de autovalores e autovetores de matrizes tridiagonais simétricas usando CUDA / Determination of eigenvalues and eigenvectors of symmetric tridiagonal matrices using CUDA
Rocha, Lindomar José. 04 August 2015
Dissertation (master's), Universidade de Brasília, Universidade UnB de Planaltina, Programa de Pós-Graduação em Ciência de Materiais, 2015.
Several branches of human knowledge make use of eigenvalues and eigenvectors, among them physics, engineering, and economics. Eigenvalues and eigenvectors can be determined with various computational routines, some faster than others. Where speed matters, parallel computing is an option; more specifically, Nvidia's CUDA offers a significant speed gain. In this model the routines run on the GPU, which has many processing cores. Given the importance of eigenvalues and eigenvectors, the objective of this work is to develop routines that compute them for real symmetric tridiagonal matrices more quickly and reliably, through parallel computing with CUDA. This objective was achieved by combining several numerical methods to obtain the eigenvalues and by modifying the inverse iteration method used to determine the eigenvectors. LAPACK routines were used as the baseline for comparison with the routines developed in CUDA. According to the results, the CUDA routine has a clear speed advantage, in both single and double precision, over the state-of-the-art CPU routines from the LAPACK library.
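The abstract does not name the numerical methods that were combined to obtain the eigenvalues. As an illustration only, the sketch below uses Sturm-sequence bisection, a standard technique for real symmetric tridiagonal matrices; it should not be read as the author's method. The function names, the `tiny` pivot guard, and the tolerance are assumptions.

```python
def count_below(diag, off, x, tiny=1e-30):
    """Count eigenvalues smaller than x for a symmetric tridiagonal matrix.

    diag holds the n diagonal entries, off the n-1 off-diagonal entries.
    The count is the number of negative pivots of T - x*I.
    """
    count = 0
    q = 1.0
    for i, d in enumerate(diag):
        e2 = off[i - 1] ** 2 if i > 0 else 0.0
        q = d - x - e2 / q
        if abs(q) < tiny:
            q = -tiny  # guard against division by zero (assumed pivot floor)
        if q < 0.0:
            count += 1
    return count


def kth_eigenvalue(diag, off, k, tol=1e-12):
    """k-th smallest eigenvalue (k = 0, 1, ...) located by bisection."""
    n = len(diag)
    # Gershgorin discs give an interval that contains every eigenvalue.
    radius = [
        (abs(off[i - 1]) if i > 0 else 0.0) + (abs(off[i]) if i < n - 1 else 0.0)
        for i in range(n)
    ]
    lo = min(d - r for d, r in zip(diag, radius))
    hi = max(d + r for d, r in zip(diag, radius))
    while hi - lo > tol * max(1.0, abs(lo), abs(hi)):
        mid = 0.5 * (lo + hi)
        if count_below(diag, off, mid) > k:
            hi = mid  # at least k+1 eigenvalues lie below mid
        else:
            lo = mid
    return 0.5 * (lo + hi)
```

Each index `k` can be searched independently of the others, which is one reason bisection is a common choice for parallel hardware.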
|
4 |
Resolução numérica de escoamentos compressíveis empregando um método de partículas livre de malhas e o processamento em paralelo (CUDA) / Numerical resolution of compressible flows employing a meshfree particle method and CUDA
Josecley Fialho Góes. 25 August 2011
Conventional mesh-based numerical methods have been widely applied
to problems in Computational Fluid Dynamics. However, in fluid flow problems
involving free surfaces, large explosions, large deformations, discontinuities,
shock waves, etc., these methods suffer from inherent difficulties that limit
their applicability. Meshfree particle methods have emerged
as a viable alternative to the conventional grid-based methods. This work introduces
Smoothed Particle Hydrodynamics (SPH), a meshfree Lagrangian particle method,
for the numerical simulation of compressible and quasi-incompressible Newtonian
fluid flows. Two numerical codes were developed, a serial and a
parallel version, using the C/C++ programming language and the Compute Unified Device
Architecture (CUDA). CUDA is NVIDIA's parallel computing architecture, which
enables parallel processing on the cores of
the Graphics Processing Units (GPUs) of NVIDIA video cards. The numerical results were validated and the
computational efficiency evaluated for the one-dimensional Shock Tube and Blast Wave problems and
the two-dimensional Shear Driven Cavity Problem.
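To make the SPH idea concrete, here is a minimal one-dimensional sketch of the density summation that particle methods of this kind rely on. The cubic spline kernel and the brute-force neighbour loop are common textbook choices assumed for illustration; the abstract does not state which kernel or neighbour search the thesis uses.

```python
def cubic_spline_1d(r, h):
    """Cubic spline smoothing kernel in 1D with support radius 2h."""
    q = abs(r) / h
    sigma = 2.0 / (3.0 * h)  # 1D normalisation so the kernel integrates to 1
    if q < 1.0:
        return sigma * (1.0 - 1.5 * q * q + 0.75 * q ** 3)
    if q < 2.0:
        return sigma * 0.25 * (2.0 - q) ** 3
    return 0.0


def densities(positions, masses, h):
    """Density at each particle as a kernel-weighted sum over all particles.

    The O(n^2) loop is for clarity only; each particle's sum is independent,
    which is what makes the computation easy to spread across GPU threads.
    """
    return [
        sum(m_j * cubic_spline_1d(x_i - x_j, h)
            for x_j, m_j in zip(positions, masses))
        for x_i in positions
    ]
```

For evenly spaced particles of equal mass, interior particles recover the reference density, while particles near the ends of the domain see fewer neighbours and report a lower value.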
|
5 |
Desenvolvimento de um simulador numérico empregando o método Smoothed Particle Hydrodynamics para a resolução de escoamentos incompressíveis. Implementação computacional em paralelo (CUDA) / Numerical modelling of incompressible flows with the smoothed particles hydrodynamics method. Implementation of parallel numerical algorithms using CUDA
Marciana Lima Góes. 30 August 2012
Coordenação de Aperfeiçoamento de Pessoal de Nível Superior / In this work, a numerical simulator based on the mesh-free
Smoothed Particle Hydrodynamics (SPH) method was developed to solve incompressible Newtonian
fluid flows. Unlike most existing versions of this method, the numerical code uses an
iterative technique to determine the pressure field. This approach employs the differential
form of an equation of state for a compressible fluid, together with the continuity equation, to calculate
the pressure correction. A parallel version of the numerical code was implemented
using the C/C++ programming language and the Compute Unified Device Architecture
(CUDA) from NVIDIA Corporation. The numerical results were validated and the
speed-up evaluated for three problems: the one-dimensional Couette flow and the two-dimensional Shear
Driven Cavity and Dambreak problems.
|
7 |
Novas Abordagens Sequencial e Paralela da meta-heurística C-GRASP Aplicadas à Otimização Global Contínua / New sequential and parallel approaches to the C-GRASP metaheuristic applied to continuous global optimization
Andrade, Lisieux Marie Marinho dos Santos. 08 August 2013
Coordenação de Aperfeiçoamento de Pessoal de Nível Superior / The present work deals with the Continuous Global Optimization Problem, in its minimization form, through two approaches to the Continuous Greedy Randomized Adaptive Search Procedure (C-GRASP). The first method, sequential and hybrid, is motivated by the failure of current approaches to explore the neighborhood space well. It combines two metaheuristics, standard C-GRASP and Continuous General Variable Neighborhood Search (C-GVNS), as a strategy for systematically changing neighborhood structures, and it proved efficient in the computational tests performed. The second procedure is motivated by the large amount of time that the standard C-GRASP construction procedure consumes on high-dimensional functions. As the dimensionality of optimization problems grows quickly, it is desirable to have parallel versions of the optimization method in order to handle larger problems. Thus, the new procedure was developed on the Compute Unified Device Architecture (CUDA) parallel computing platform, which, according to the experiments performed, provided a promising reduction in processing time.
|
9 |
Otimização de multidões em jogos digitais utilizando CUDA / Crowd optimization in digital games using CUDA
Bardella, Tiago Ungaro. 19 October 2015
Since their beginnings, digital games have featured many types of enemy models to confront and many types of characters to control, as in Real-Time Strategy games. The large numbers of models that compose an important scene are called crowds. Crowds require high computing power and specific algorithms for their handling, to avoid the loss of immersion caused by problems that can arise when crowds are not treated properly. With the emergence of graphics card languages such as NVIDIA CUDA, new algorithms were created to improve the performance of crowds in digital games, and their superiority over methods based on sequential programming has been demonstrated in several studies. The goal of this work is to build on these GPU techniques to implement a new API using CUDA technology that aims to improve on existing crowd algorithms for digital games in terms of performance and ease of implementation. On completion of the project, the API made crowd handling easier for digital game developers using the Unity3D game engine integrated with the TBX crowd simulation API: they now only need to include a DLL in their project instead of writing their own crowd-handling algorithm from scratch, which takes development time.
|
10 |
Analysis of GPU-based convolution for acoustic wave propagation modeling with finite differences: Fortran to CUDA-C step-by-step
Sadahiro, Makoto. 04 September 2014
By projecting observed microseismic data backward in time to when fracturing occurred, it is possible to locate the fracture events in space, assuming a correct velocity model. In order to achieve this task in near real-time, a robust computational system to handle backward propagation, or Reverse Time Migration (RTM), is required. We can then test many different velocity models for each run of the RTM. We investigate the use of a Graphics Processing Unit (GPU) based system using Compute Unified Device Architecture for C (CUDA-C) as the programming language. Our preliminary results show a large improvement in run-time over conventional programming methods based on conventional Central Processing Unit (CPU) computing with Fortran. Considerable room for improvement still remains.
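The finite-difference convolution mentioned in the title can be illustrated with a minimal one-dimensional time step of the acoustic wave equation. This is a sketch under stated assumptions: a second-order stencil in space and time, a constant Courant number (`courant` = velocity × time step / grid spacing), and fixed zero boundaries. The stencil order, dimensionality, and boundary treatment of the actual work are not given in the abstract.

```python
def wave_step(u_prev, u_curr, courant):
    """Advance the 1D wave field by one time step.

    u_prev and u_curr are the fields at the two previous time levels.
    The spatial term is a convolution of u_curr with the stencil [1, -2, 1].
    """
    c2 = courant * courant
    n = len(u_curr)
    u_next = [0.0] * n  # end points stay at zero (assumed fixed boundaries)
    for i in range(1, n - 1):
        laplacian = u_curr[i - 1] - 2.0 * u_curr[i] + u_curr[i + 1]
        u_next[i] = 2.0 * u_curr[i] - u_prev[i] + c2 * laplacian
    return u_next
```

Every interior grid point is updated from the same small neighbourhood of the previous field, so the loop body can run as one GPU thread per grid point.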
|