The first component of this work is a parallel algorithm for constructing non-uniform octree meshes for finite element computations. Before an octree can be meshed, the linear octree data structure must be constructed and a constraint known as "2:1 balancing" must be enforced; parallel algorithms for these two subproblems are also presented. The second component is a parallel matrix-free geometric multigrid algorithm for solving elliptic partial differential equations (PDEs) on these octree meshes. The last component is a parallel multiscale Gauss-Newton optimization algorithm for solving the elastic image registration problem. The registration problem is discretized using finite elements on octree meshes, and the parallel geometric multigrid algorithm is used as a preconditioner for the Conjugate Gradient (CG) algorithm, which solves the linear system of equations formed in each Gauss-Newton iteration.
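As a rough serial illustration of the two subproblems named above, the sketch below stores the leaves of an octree as a flat array in Morton (Z-order) order, which is the usual meaning of "linear octree", and checks the 2:1 balance constraint by brute force. It is not the parallel algorithm developed in this work. The names `Octant`, `kMaxDepth`, `mortonKey`, `touches`, `isBalanced`, `children`, and `refine` are hypothetical, the maximum depth of 4 is chosen only to keep the numbers small, and the balance condition is assumed to apply across faces, edges, and corners.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical maximum tree depth; coordinates lie in [0, 2^kMaxDepth).
constexpr unsigned kMaxDepth = 4;

// A leaf octant identified by its anchor (lowest corner) and its level.
struct Octant {
  std::uint32_t x, y, z;
  unsigned level;
};

// Edge length of an octant in units of the finest possible cell.
std::uint32_t edgeLength(const Octant& o) {
  assert(o.level <= kMaxDepth);
  return 1u << (kMaxDepth - o.level);
}

// Morton key: interleave the bits of the anchor as ... z1 y1 x1 z0 y0 x0.
std::uint64_t mortonKey(const Octant& o) {
  std::uint64_t key = 0;
  for (unsigned bit = 0; bit < kMaxDepth; ++bit) {
    key |= static_cast<std::uint64_t>((o.x >> bit) & 1u) << (3 * bit);
    key |= static_cast<std::uint64_t>((o.y >> bit) & 1u) << (3 * bit + 1);
    key |= static_cast<std::uint64_t>((o.z >> bit) & 1u) << (3 * bit + 2);
  }
  return key;
}

// True if the closed boxes of a and b share a face, an edge, or a corner.
bool touches(const Octant& a, const Octant& b) {
  const std::uint32_t sa = edgeLength(a), sb = edgeLength(b);
  return a.x <= b.x + sb && b.x <= a.x + sa &&
         a.y <= b.y + sb && b.y <= a.y + sa &&
         a.z <= b.z + sb && b.z <= a.z + sa;
}

// 2:1 balance: touching leaves differ by at most one level.
// O(n^2) brute force, for illustration only.
bool isBalanced(const std::vector<Octant>& leaves) {
  for (std::size_t i = 0; i < leaves.size(); ++i) {
    for (std::size_t j = i + 1; j < leaves.size(); ++j) {
      const unsigned hi = leaves[i].level > leaves[j].level ? leaves[i].level : leaves[j].level;
      const unsigned lo = leaves[i].level > leaves[j].level ? leaves[j].level : leaves[i].level;
      if (hi - lo > 1 && touches(leaves[i], leaves[j])) return false;
    }
  }
  return true;
}

// The eight children of an octant, listed in Morton order.
std::vector<Octant> children(const Octant& parent) {
  assert(parent.level < kMaxDepth);
  const std::uint32_t half = edgeLength(parent) / 2;
  std::vector<Octant> out;
  for (unsigned c = 0; c < 8; ++c) {
    out.push_back({parent.x + (c & 1u) * half,
                   parent.y + ((c >> 1) & 1u) * half,
                   parent.z + ((c >> 2) & 1u) * half,
                   parent.level + 1});
  }
  return out;
}

// Replace leaf i by its children; the Morton ordering of the array is preserved.
std::vector<Octant> refine(std::vector<Octant> leaves, std::size_t i) {
  assert(i < leaves.size());
  const std::vector<Octant> kids = children(leaves[i]);
  leaves.erase(leaves.begin() + i);
  leaves.insert(leaves.begin() + i, kids.begin(), kids.end());
  return leaves;
}
```

In this sketch, refining one corner of the domain twice in a row leaves a level-3 octant touching a level-1 octant, so `isBalanced` reports a violation; a balancing step would have to refine the coarse neighbor until the level difference is at most one.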
Several ideas were used to reduce the overhead of constructing the octree meshes. These include (a) a way to lower communication costs by reducing both the number of synchronizations and the message size, (b) a way to reduce the number of searches required to build element-to-vertex mappings, and (c) a compression scheme to reduce the memory footprint of the entire data structure. To our knowledge, the multigrid algorithm presented in this work is the only matrix-free multiplicative geometric multigrid implementation for solving finite element equations on octree meshes using thousands of processors. The proposed registration algorithm is also unique in that it combines several ideas: adaptivity, parallelism, fast optimization algorithms, and fast linear solvers.
All the algorithms were implemented in C++ using the Message Passing Interface (MPI) standard and were built on top of the PETSc library from Argonne National Laboratory. The multigrid implementation has been released as open-source software under the name Dendro. Several numerical experiments were performed on a variety of NSF TeraGrid platforms to test the performance of the algorithms. Our largest run was a highly non-uniform elasticity calculation with 8 billion unknowns on 32,000 processors.
Identifier | oai:union.ndltd.org:GATECH/oai:smartech.gatech.edu:1853/29702 |
Date | 24 June 2009 |
Creators | Sampath, Rahul Srinivasan |
Publisher | Georgia Institute of Technology |
Source Sets | Georgia Tech Electronic Thesis and Dissertation Archive |
Detected Language | English |
Type | Dissertation |