
Methods for improving performance of particle tracking and image registration in computational lung modeling using multi-core CPUs and GPUs

Graphics Processing Units (GPUs) have grown in popularity beyond the original video game enthusiast audience. They have been embraced by the high-performance computing community for their high computational throughput, low cost, low energy demands, wide availability, and ability to dramatically improve application performance. As hybrid computing continues to move into mainstream applications, the use of GPUs will continue to grow. However, because of architectural differences between CPUs and GPUs, adapting CPU-based scientific computing applications to fully exploit the potential speedup that GPUs offer is a non-trivial task. Algorithms must be designed with the architecture's benefits and limitations in mind in order to unlock the full performance gains afforded by the GPU. In this work, we develop fast GPU methods to improve the performance of two important components in computational lung modeling: image registration and particle tracking. We first propose a novel method for multi-level mass-preserving deformable image registration. The strength of this method is its flexibility in the choice of similarity criterion used by the registration, which makes it possible to implement both simple and complex similarity measures on the GPU with excellent performance. The method is tested using three similarity criteria for registering two CT lung datasets: the commonly used sum of squared intensity differences (SSD), the sum of squared tissue volume differences (SSTVD), and a symmetric version of SSTVD currently being developed by our research group. The GPU method is validated against a previously validated single-threaded CPU counterpart using six healthy human subjects, and the results show strong agreement. Separately, three GPU methods were developed for tracking particle trajectories and deposition efficiencies in the human airway tree, including a multiple-GPU method. Though parallelization was straightforward, the complex geometry of the lungs and the use of an unstructured mesh posed challenges that were addressed by the GPU methods. The GPU methods were tested with various numbers of particles and compared against a previously validated single-threaded CPU version; they demonstrated dramatic speedup over both the single-threaded and 12-threaded CPU versions.
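
As an illustration of the simplest of the three similarity criteria named in the abstract, the sum of squared intensity differences (SSD) can be sketched as follows. This is a minimal CPU sketch in plain Python, not the GPU method developed in the dissertation; the function name `ssd`, the flat-list image layout, and the sample intensities are assumptions made only for this example, and the moving image is assumed to have already been warped onto the fixed image grid.

```python
def ssd(fixed, warped):
    """Sum of squared intensity differences between two images.

    Assumption for this sketch: both images are flat sequences of voxel
    intensities of equal length, and `warped` is the moving image after
    it has been resampled onto the grid of `fixed`.
    """
    if len(fixed) != len(warped):
        raise ValueError("images must have the same number of voxels")
    # Each voxel contributes independently, which is what makes this
    # measure easy to evaluate in parallel.
    return sum((f - w) ** 2 for f, w in zip(fixed, warped))


# Hypothetical intensities, chosen only to exercise the function.
fixed = [0.0, 1.0, 2.0, 3.0]
warped = [0.0, 1.5, 2.0, 1.0]
print(ssd(fixed, warped))  # 0.25 + 4.0 = 4.25
```

A registration method would minimize a cost like this over the parameters of the deformation; identical images give a cost of zero.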

Identifier: oai:union.ndltd.org:uiowa.edu/oai:ir.uiowa.edu:etd-6855
Date: 01 December 2014
Creators: Ellingwood, Nathan David
Contributors: Lin, Ching-Long
Publisher: University of Iowa
Source Sets: University of Iowa
Language: English
Detected Language: English
Type: dissertation
Format: application/pdf
Source: Theses and Dissertations
Rights: Copyright © 2014 Nathan David Ellingwood
