  • About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
1

Analysis of Hardware Usage Of Shuffle Instruction Based Performance Optimization in the Blinds-II Image Quality Assessment Algorithm

January 2017
abstract: With the advent of GPGPU, many applications are being accelerated using the CUDA programming paradigm. Speedups of around 10x to 100x can be achieved by simply porting the application to the GPU and running the parallel portion of the code on its multi-core SIMT (single instruction, multiple thread) architecture. For optimal performance, however, it is necessary to ensure that all GPU resources are used efficiently and that latencies in the application are minimized. To this end, it is essential to monitor the hardware usage of the algorithm and thereby diagnose the compute and memory bottlenecks in the implementation. This thesis analyzes how the CUDA implementation of the BLIINDS-II algorithm maps onto the underlying GPU hardware and proposes a Kepler-architecture-specific solution that uses the shuffle instruction via the CUB library to tackle the two major bottlenecks in the algorithm. Experiments were conducted to demonstrate the advantage of using the shuffle instruction in the algorithm over using only shared memory as a buffer to global memory. With the new implementation of the BLIINDS-II algorithm using the CUB library, a speedup of around 13.7% was achieved. / Dissertation/Thesis / Masters Thesis Engineering 2017
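The abstract contrasts the shuffle instruction with staging data in shared memory. As a rough illustration of that idea only, the sketch below emulates on the CPU, in plain Python, the data movement of a shuffle-down tree reduction within one 32-thread warp. It is not the thesis's code and does not use CUB; the names `shfl_down` and `warp_reduce_sum` are hypothetical helpers chosen for this example, and the warp size of 32 is an assumption that matches NVIDIA GPUs such as Kepler.

```python
WARP_SIZE = 32  # assumed warp width (true for NVIDIA GPUs, including Kepler)


def shfl_down(values, delta):
    """Emulate a shuffle-down step: lane i reads the value held by lane i + delta.

    Lanes whose source lane would fall outside the warp keep their own value.
    """
    n = len(values)
    return [values[i + delta] if i + delta < n else values[i] for i in range(n)]


def warp_reduce_sum(values):
    """Sum one warp's values with log2(WARP_SIZE) shuffle-down steps.

    Each step halves the offset; afterwards lane 0 holds the full sum.
    No shared buffer is used, only lane-to-lane exchanges.
    """
    assert len(values) == WARP_SIZE
    lanes = list(values)
    offset = WARP_SIZE // 2
    while offset > 0:
        shifted = shfl_down(lanes, offset)
        lanes = [own + other for own, other in zip(lanes, shifted)]
        offset //= 2
    return lanes[0]


if __name__ == "__main__":
    print(warp_reduce_sum(list(range(WARP_SIZE))))
```

On real hardware each list element would live in a thread's register and every step would be a single shuffle instruction, which is why the pattern can avoid the shared-memory round trip that the abstract mentions.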
2

Paralelização do algoritmo FDK para reconstrução 3D de imagens tomográficas usando unidades gráficas de processamento e CUDA-C / Parallelization of the FDK algorithm for 3D reconstruction of tomographic images using graphic processing units and CUDA-C

Joel Sánchez Domínguez 12 January 2012
Conselho Nacional de Desenvolvimento Científico e Tecnológico / Imaging with computed tomography has revolutionized the diagnosis of diseases in medicine and is widely used in different areas of scientific research. As part of the process of obtaining three-dimensional tomographic images, a set of radiographs is processed by a computer algorithm; the most widely used at present is the algorithm of Feldkamp, Davis and Kress (FDK). Parallel processing, using the different technologies available on the market, has proven useful for speeding up the calculations in computer algorithms and reducing processing times. This work presents the parallelization of the FDK algorithm for three-dimensional image reconstruction using graphics processing units (GPUs) and the CUDA-C language. GPUs are presented as a viable option for parallel computing, and the introductory concepts associated with computed tomography, GPUs, CUDA-C and parallel processing are covered. The parallel version of the FDK algorithm running on the GPU is compared with a serial version of the same algorithm and shows higher processing speed. Performance tests were carried out on two GPUs of different capacities: the NVIDIA GeForce 9400GT (16 cores) and the NVIDIA Quadro 2000 (192 cores).
