151

Shared and distributed memory parallel algorithms to solve big data problems in biological, social network and spatial domain applications

Sharma, Rahil 01 December 2016 (has links)
Big data refers to information which cannot be processed and analyzed using traditional approaches and tools, due to four V's: sheer Volume, the Velocity at which data is received and processed, and data Variety and Veracity. Today massive volumes of data originate in domains such as geospatial analysis and biological and social networks. Hence, designing scalable algorithms for efficient processing of this massive data is a significant challenge in the field of computer science. One way to achieve such efficient and scalable algorithms is by using shared- and distributed-memory parallel programming models. In this thesis, we present a variety of such algorithms to solve problems in the domains mentioned above. We solve five problems that fall into two categories. The first group of problems deals with community detection. Detecting communities in real-world networks is of great importance because communities consist of patterns that can be viewed as independent components, each of which has distinct features and can be detected from the network structure. For example, communities in social networks can help target users for marketing purposes, provide user recommendations for connecting with others and joining communities or forums, etc. We develop a novel sequential algorithm to accurately detect community structures in biological protein-protein interaction networks, where a community corresponds to a functional module of proteins. Generally, such sequential algorithms are computationally expensive, which makes them impractical for large real-world networks. To address this limitation, we develop a new, highly scalable Symmetric Multiprocessing (SMP) based parallel algorithm to detect high-quality communities in large subsections of social networks like Facebook and Amazon. Due to the SMP architecture, however, our algorithm cannot process networks whose size exceeds the RAM of a single machine. With the increasing size of social networks, community detection has become even more difficult, since network sizes can reach hundreds of millions of vertices and edges. Processing such massive networks requires several hundred gigabytes of RAM, which is only possible with a distributed infrastructure. To address this, we develop a novel hybrid (shared + distributed memory) parallel algorithm to efficiently detect high-quality communities in massive Twitter and .uk domain networks. The second group of problems deals with efficiently processing spatial Light Detection and Ranging (LiDAR) data. LiDAR data is widely used in forest and agricultural crop studies, landscape classification, 3D urban modeling, etc. Technological advancements in LiDAR sensors have enabled highly accurate and dense LiDAR point clouds, resulting in massive data volumes that pose processing and storage challenges. We develop the first published landscape-driven data reduction algorithm, which uses the slope map of the terrain as a filter to reduce the data without sacrificing its accuracy. Our algorithm is highly scalable and adopts a shared-memory parallel architecture. We also develop a parallel interpolation technique that is used to generate highly accurate continuous terrains, i.e. Digital Elevation Models (DEMs), from discrete LiDAR point clouds.
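The abstract does not reproduce the thesis's algorithms, so as a purely illustrative sketch of the shared-memory style of community detection it describes, the block below shows a minimal OpenMP label-propagation detector on a toy graph. Label propagation is a standard technique, not necessarily the method developed in this work, and the graph and names are invented for the example.

```cpp
// Minimal OpenMP label-propagation sketch (illustrative only; not the
// algorithm developed in the thesis). Each vertex repeatedly adopts the most
// frequent label among its neighbours; groups sharing a label are reported
// as communities.
#include <cstdio>
#include <map>
#include <vector>
#include <omp.h>

int main() {
    // Tiny toy graph as an adjacency list: two obvious communities.
    std::vector<std::vector<int>> adj = {
        {1, 2}, {0, 2}, {0, 1, 3},      // community A: vertices 0, 1, 2
        {2, 4, 5}, {3, 5}, {3, 4}       // community B: vertices 3, 4, 5
    };
    const int n = static_cast<int>(adj.size());
    std::vector<int> label(n);
    for (int v = 0; v < n; ++v) label[v] = v;   // each vertex starts in its own community

    for (int iter = 0; iter < 10; ++iter) {
        std::vector<int> next = label;
        // Vertices are processed independently: each thread reads `label`
        // and writes only its own entries of `next`, so there is no race.
        #pragma omp parallel for schedule(dynamic)
        for (int v = 0; v < n; ++v) {
            std::map<int, int> count;
            for (int u : adj[v]) ++count[label[u]];
            int best = label[v], bestCount = 0;
            for (auto& kv : count)
                if (kv.second > bestCount) { best = kv.first; bestCount = kv.second; }
            next[v] = best;
        }
        label.swap(next);
    }
    for (int v = 0; v < n; ++v)
        std::printf("vertex %d -> community %d\n", v, label[v]);
    return 0;
}
```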
152

Parallel computing techniques for computed tomography

Deng, Junjun 01 May 2011 (has links)
X-ray computed tomography is a widely adopted medical imaging method that uses projections to recover the internal image of a subject. Since the invention of X-ray computed tomography in the 1970s, several generations of CT scanners have been developed. As 3D image reconstruction grows in popularity, the long processing time associated with these machines has to be significantly reduced before they can be practically employed in everyday applications. Parallel computing is a technique that utilizes multiple computer resources to process a computational task simultaneously; each resource computes only a part of the whole task, thereby greatly reducing computation time. In this thesis, we use parallel computing technology to speed up reconstruction while preserving image quality. Three representative reconstruction algorithms--namely, the Katsevich, EM, and Feldkamp algorithms--are investigated in this work. With the Katsevich algorithm, a distributed-memory PC cluster is used to conduct the experiment. This parallel algorithm partitions and distributes the projection data to different computer nodes to perform the computation. Upon completion of each sub-task, the results are collected by the master computer to produce the final image. This parallel algorithm uses the same reconstruction formula as its sequential counterpart, which gives an identical image result. The parallel implementation of the iterative CT algorithm uses the same PC cluster as the first. However, because it is based on a local CT reconstruction algorithm, which differs from the sequential EM algorithm, the image results differ from those of the sequential counterpart. Moreover, a special strategy using inhomogeneous resolution was used to further speed up the computation. The results showed that image quality was largely preserved while computational time was greatly reduced. Unlike the two previous approaches, the third parallel implementation uses a shared-memory computer. Three major accelerating methods--SIMD (single instruction, multiple data), multi-threading, and OS (ordered subsets)--were employed to speed up the computation. Initial investigations showed that image quality was comparable to that of the conventional approach, while computation speed was significantly increased.
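As an illustration of the master/worker pattern described for the cluster implementation (partition the projection data across nodes, compute locally, collect on the master), the sketch below uses MPI with a placeholder backprojection kernel. It is not the Katsevich, EM, or Feldkamp algorithm; the projection count and image size are assumptions made for the example.

```cpp
// Illustrative MPI sketch of the data-partitioning pattern described above:
// each rank backprojects its own share of the projections, and the partial
// images are summed on rank 0. The kernel is a placeholder, not a real
// reconstruction formula.
#include <mpi.h>
#include <vector>
#include <cstdio>

static const int NPROJ = 360;       // total number of projections (assumed)
static const int NPIX  = 256 * 256; // image size (assumed)

// Placeholder for "backproject one projection into the image".
void backproject(int proj, std::vector<float>& image) {
    (void)proj;  // a real kernel would use the projection index and its data
    for (std::size_t i = 0; i < image.size(); ++i)
        image[i] += 1.0f / NPROJ;    // stand-in for filtered backprojection
}

int main(int argc, char** argv) {
    MPI_Init(&argc, &argv);
    int rank, size;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    std::vector<float> partial(NPIX, 0.0f);
    // Static partition of the projection indices across ranks.
    for (int p = rank; p < NPROJ; p += size)
        backproject(p, partial);

    std::vector<float> image(rank == 0 ? NPIX : 0);
    // The master collects (sums) the partial images.
    MPI_Reduce(partial.data(), rank == 0 ? image.data() : nullptr,
               NPIX, MPI_FLOAT, MPI_SUM, 0, MPI_COMM_WORLD);

    if (rank == 0)
        std::printf("reconstructed %d pixels from %d projections on %d ranks\n",
                    NPIX, NPROJ, size);
    MPI_Finalize();
    return 0;
}
```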
153

Three Environmental Fluid Dynamics Papers

Furtak-Cole, Eden 01 May 2018 (has links)
Three papers are presented, applying computational fluid dynamics methods to fluid flows in the geosciences. In the first paper, a numerical method is developed for single-phase potential flow in the subsurface. For a class of monotonically advancing flows, the method provides computational savings compared to classical methods and can be applied to problems such as forced groundwater recharge. The second paper investigates the shear-stress-reducing action of an erosion control roughness array. Incompressible Navier-Stokes simulations are performed for multiple wind angles to understand the changing aerodynamics of individual and grouped roughness elements. In the third paper, a 1D analytical flow model is compared with multiphase Navier-Stokes simulations in a parabolic fissure. Sampling the numerical results allows the isolation of flow factors such as surface tension, which are difficult to measure in physical experiments.
154

Optimisation of Urban Water Supply Headworks Systems Using Probabilistic Search Methods and Parallel Computing

Cui, Lijie January 2003 (has links)
Realistic optimisation of the operation and planning of urban water supply headworks systems requires that the issues of complexity and stochastic forcing be addressed. The only reliable way of accomplishing this is to use simulation models in conjunction with the Monte Carlo method, which generates multiple hydro-climate replicates. However, such models do not easily interface with traditional optimisation methods. Probabilistic search methods such as the genetic algorithm (GA) and the shuffled complex evolution method (SCE) can be coupled to a generalised simulation model and thus accommodate complexity as well as stochastic inputs. However, optimisation of complex urban water supply systems is computationally intractable if Monte Carlo methods have to be used. This study first compared the GA and the SCE method using a simple case study. Both methods were found to cope well with the piecewise-flat objective function surface typical of the headworks optimisation problem. This is because they have the inherent capability of vigorously exploring beyond the domain of a flat region. The SCE method is recommended especially when fast location of a good solution is desired. Nonetheless, the GA was preferred due to its inherent parallelism. Two methods were then explored to improve computational efficiency and turnaround time: parallel computing and replicate compression. The Sydney headworks system was used as a case study to investigate the key aspects of a full-scale headworks optimisation. It was concluded that the speedup was nearly proportional to the number of processors employed. Replicate compression can very significantly reduce the computational turnaround time for Monte Carlo simulation; unfortunately, this conclusion must be tempered by the limitation that the objective function depends on penalties arising from restrictions only. Critical analysis of the GA results suggested the optimised results were sound. The case study demonstrated the feasibility of a parallel GA for identifying near-optimal solutions for a complex system subject to stochastic forcing. / PhD Doctorate
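To illustrate where the parallelism in this kind of GA/Monte Carlo coupling lives, the sketch below evaluates the fitness of one candidate operating policy as the mean penalty over many synthetic replicates, with the replicate loop parallelised in OpenMP. The "headworks simulation" is a stand-in written for the example, not the Sydney system model, and the policy, penalty rule, and replicate count are assumptions.

```cpp
// Illustrative sketch of the GA / Monte Carlo coupling described above: the
// fitness of one candidate policy is the average restriction penalty over
// many hydro-climate replicates, and the replicate loop is where the
// near-proportional speedup comes from. The simulation is a toy stand-in.
#include <cstdio>
#include <random>
#include <vector>
#include <omp.h>

// Stand-in headworks simulation: returns a restriction penalty for one replicate.
double simulateReplicate(const std::vector<double>& policy, unsigned seed) {
    std::mt19937 gen(seed);
    std::uniform_real_distribution<double> inflow(0.0, 1.0);
    double storage = 0.5, penalty = 0.0;
    for (int week = 0; week < 52 * 20; ++week) {       // a 20-year replicate
        storage += inflow(gen) * 0.1 - 0.05;            // inflow minus demand
        if (storage < policy[0]) penalty += 1.0;        // restriction triggered
        if (storage > 1.0) storage = 1.0;
        if (storage < 0.0) storage = 0.0;
    }
    return penalty;
}

// Fitness of one GA individual = mean penalty over all replicates.
double fitness(const std::vector<double>& policy, int nReplicates) {
    double total = 0.0;
    #pragma omp parallel for reduction(+ : total)
    for (int r = 0; r < nReplicates; ++r)
        total += simulateReplicate(policy, 1000u + r);
    return total / nReplicates;
}

int main() {
    std::vector<double> policy = {0.3};   // a single trigger level, for illustration
    std::printf("mean penalty = %.2f\n", fitness(policy, 200));
    return 0;
}
```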
155

Design, development and implementation of a parallel algorithm for computed tomography using algebraic reconstruction technique

Melvin, Cameron 05 October 2007 (has links)
This project implements a parallel algorithm for computed tomography based on the Algebraic Reconstruction Technique (ART). This technique for reconstructing images from projections is useful for applications such as computed tomography (CT or CAT). The algorithm requires fewer views, and hence less radiation, to produce an image of comparable or better quality. However, the approach is not widely used because of its computationally intensive nature in comparison with rival technologies. A faster ART algorithm could reduce the amount of radiation needed for CT imaging by producing a better image with fewer projections. A two-dimensional, reconstruction-from-projections version of the ART algorithm was implemented in parallel using the Message Passing Interface (MPI) and OpenMP extensions for C. The message passing implementation did not result in faster reconstructions due to prohibitively long and variable communication latency. The shared memory implementation produced positive results, showing a clear computational advantage for multiple processors and measured efficiency ranging from 60% to 95%. Consistent with the literature, image quality proved to be significantly better than that of the industry-standard Filtered Backprojection algorithm, especially when reconstructing from fewer projection angles. / October 2006
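For readers unfamiliar with ART, the sketch below shows the basic Kaczmarz-style update on a toy linear system, with the per-ray updates parallelised in OpenMP in the shared-memory spirit described above. It is a generic illustration under simplifying assumptions (a tiny dense system, simultaneous updates within a sweep), not the thesis implementation.

```cpp
// Minimal ART (Kaczmarz) sketch for Ax ~= b: each ray's equation relaxes the
// current image estimate toward consistency with that projection sample.
// Parallelising over rays within one sweep, as below, ignores the small
// serial dependence between successive updates, a common approximation in
// shared-memory ART codes; this is an illustration only.
#include <cstdio>
#include <vector>
#include <omp.h>

int main() {
    // Toy system: 3 rays, 2 pixels (real systems are sparse and huge).
    std::vector<std::vector<double>> A = {{1, 1}, {1, 0}, {0, 1}};
    std::vector<double> b = {3, 1, 2};
    std::vector<double> x = {0, 0};
    const double lambda = 0.5;                     // relaxation factor

    for (int sweep = 0; sweep < 200; ++sweep) {
        #pragma omp parallel for
        for (int i = 0; i < (int)A.size(); ++i) {
            double dot = 0, norm = 0;
            for (std::size_t j = 0; j < x.size(); ++j) {
                dot  += A[i][j] * x[j];
                norm += A[i][j] * A[i][j];
            }
            double c = lambda * (b[i] - dot) / norm;
            for (std::size_t j = 0; j < x.size(); ++j) {
                #pragma omp atomic
                x[j] += c * A[i][j];               // project toward ray i's equation
            }
        }
    }
    std::printf("x = (%.3f, %.3f)\n", x[0], x[1]);  // converges near (1, 2)
    return 0;
}
```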
156

Fast Stochastic Global Optimization Methods and Their Applications to Cluster Crystallization and Protein Folding

Zhan, Lixin January 2005 (has links)
Two global optimization methods are proposed in this thesis. They are the multicanonical basin hopping (MUBH) method and the basin paving (BP) method. The MUBH method combines the basin hopping (BH) method, which can be used to efficiently map out an energy landscape associated with local minima, with the multicanonical Monte Carlo (MUCA) method, which encourages the system to move out of energy traps during the computation. It is found to be more efficient than the original BH method when applied to the Lennard-Jones systems containing 150-185 particles. The asynchronous multicanonical basin hopping (AMUBH) method, a parallelization of the MUBH method, is also implemented using the message passing interface (MPI) to take advantage of the full usage of multiprocessors in either a homogeneous or a heterogeneous computational environment. AMUBH, MUBH and BH are used together to find the global minimum structures for Co nanoclusters with system size N ≤ 200. The BP method is based on the BH method and the idea of the energy landscape paving (ELP) strategy. In comparison with the acceptance scheme of the ELP method, moving towards the low energy region is enhanced and no low energy configuration may be missed during the simulation. The applications to both the pentapeptide Met-enkephalin and the villin subdomain HP-36 locate new configurations having energies lower than those determined previously. The MUBH, BP and BH methods are further employed to search for the global minimum structures of several proteins/peptides using the ECEPP/2 and ECEPP/3 force fields. These two force fields may produce global minima with different structures. The present study indicates that the global minimum determination from ECEPP/3 prefers helical structures. Also discussed in this thesis is the effect of the environment on the formation of beta hairpins.
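The sketch below shows the plain basin-hopping loop (perturb, locally minimise, accept or reject with a Metropolis rule) on a one-dimensional toy potential. As described above, MUBH and BP keep this outer loop but replace the plain Metropolis acceptance with multicanonical or landscape-paving weights; the potential, the crude gradient-descent minimiser, and all parameters here are illustrative assumptions.

```cpp
// Minimal basin-hopping sketch on a 1D toy potential: perturb the coordinate,
// quench to the nearest local minimum, then accept or reject with a
// Metropolis rule. Not the MUBH/BP methods themselves, only the BH loop
// they build on.
#include <cmath>
#include <cstdio>
#include <random>

double energy(double x) {                 // rugged toy landscape, global min near x = 2.2
    return std::sin(5.0 * x) + 0.1 * (x - 2.0) * (x - 2.0);
}

double localMinimise(double x) {          // crude gradient descent as the local minimiser
    for (int i = 0; i < 200; ++i) {
        double grad = (energy(x + 1e-5) - energy(x - 1e-5)) / 2e-5;
        x -= 0.01 * grad;
    }
    return x;
}

int main() {
    std::mt19937 gen(42);
    std::normal_distribution<double> step(0.0, 0.5);
    std::uniform_real_distribution<double> unif(0.0, 1.0);
    const double kT = 0.5;                // fictitious temperature for acceptance

    double x = localMinimise(0.0), e = energy(x);
    double bestX = x, bestE = e;
    for (int hop = 0; hop < 500; ++hop) {
        double xTrial = localMinimise(x + step(gen));   // perturb, then quench
        double eTrial = energy(xTrial);
        if (eTrial < e || unif(gen) < std::exp(-(eTrial - e) / kT)) {  // Metropolis test
            x = xTrial; e = eTrial;
        }
        if (e < bestE) { bestE = e; bestX = x; }
    }
    std::printf("lowest minimum found: E = %.4f at x = %.3f\n", bestE, bestX);
    return 0;
}
```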
157

Development of a Parallel Computational Framework to Solve Flow and Transport in Integrated Surface-Subsurface Hydrologic Systems

Hwang, Hyoun-Tae January 2012 (has links)
HydroGeoSphere (HGS) is a 3D control-volume finite element hydrologic model describing fully-integrated surface-subsurface water flow and solute and thermal energy transport. Because the model solves tightly-coupled, highly-nonlinear partial differential equations, often applied at regional and continental scales (for example, to analyze the impact of climate change on water resources), high performance computing (HPC) is essential. The target parallelization includes the composition of the Jacobian matrix for the iterative linearization method and the sparse-matrix solver, preconditioned BiCGSTAB. The Jacobian matrix assembly is parallelized using a static scheduling scheme that takes into account the data race conditions which may occur during matrix construction. The parallelization of the solver is achieved by partitioning the domain into equal-size sub-domains with an efficient reordering scheme. The computational flow of the BiCGSTAB solver is also modified to reduce parallelization overhead and to suit parallel architectures. The parallelized model is tested on several benchmark cases that include linear and nonlinear problems involving various domain sizes and degrees of hydrologic complexity. The performance is evaluated in terms of computational robustness and efficiency, using standard scaling performance measures. Simulation profiling results indicate that efficiency becomes higher in three situations: 1) with an increasing number of nodes/elements in the mesh, because the computational work per CPU grows relative to the parallel overhead, reducing the overhead's share of total computing time; 2) for increasingly nonlinear transient simulations, because this increases the diagonal dominance of the coefficient matrix; and 3) with domains of irregular geometry, which increase the condition number. These characteristics are promising for the large-scale analysis of water resource problems that involve integrated surface-subsurface flow regimes. Large-scale real-world simulations illustrate the importance of node reordering, which is associated with the domain partitioning process. With node reordering, super-scalable parallel speedup was obtained when compared to a serial simulation performed with natural node ordering. The results indicate that the number of iterations increases as the number of threads increases, due to the increased number of elements in the off-diagonal blocks of the coefficient matrix. In terms of the privatization scheme, the parallel efficiency with privatization was higher than that with the shared scheme for most of the simulations performed.
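To make the assembly-time race condition concrete, the sketch below shows two elements that share a node both adding their contributions to the same global entry, resolved here with atomic updates. The thesis describes a static scheduling scheme for the same purpose, so this is a generic illustration on an invented 1D "mesh", not the HydroGeoSphere implementation.

```cpp
// Illustrative sketch of the data race that parallel element-by-element
// assembly must avoid: neighbouring elements share a node, so two threads
// may add to the same global entry at once. Here the conflict is resolved
// with atomic updates; this is not the HGS scheme.
#include <cstdio>
#include <vector>
#include <omp.h>

int main() {
    const int nNodes = 5;
    // 1D "mesh": element e connects nodes e and e+1, so neighbours share a node.
    const int nElems = nNodes - 1;
    std::vector<double> rhs(nNodes, 0.0);   // assembled global right-hand side

    #pragma omp parallel for schedule(static)
    for (int e = 0; e < nElems; ++e) {
        double local[2] = {1.0, 1.0};       // stand-in for the element-level residual
        int nodes[2] = {e, e + 1};
        for (int a = 0; a < 2; ++a) {
            // Without the atomic, two threads assembling neighbouring elements
            // could update rhs[nodes[a]] simultaneously and lose a contribution.
            #pragma omp atomic
            rhs[nodes[a]] += local[a];
        }
    }

    for (int n = 0; n < nNodes; ++n)
        std::printf("rhs[%d] = %.1f\n", n, rhs[n]);   // interior nodes receive 2.0
    return 0;
}
```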
158

Parallel Performance Analysis of The Finite Element-Spherical Harmonics Radiation Transport Method

Pattnaik, Aliva 21 November 2006 (has links)
In this thesis, the parallel performance of the finite element-spherical harmonics (FE-PN) method implemented in the general-purpose radiation transport code EVENT is studied both analytically and empirically. EVENT solves the coupled set of space-angle discretized FE-PN equations using a parallel block-Jacobi domain decomposition method. As part of the analytical study, the thesis presents complexity results for EVENT when solving a 3D criticality benchmark radiation transport problem in parallel. The empirical analysis is concerned with the impact of the main algorithmic factors affecting performance. Firstly, EVENT supports two solution strategies, namely MOD (Moments Over Domains) and DOM (Domains Over Moments), to solve the transport equation in parallel. The two strategies differ in the way they solve the multi-level space-angle coupled systems of equations. The thesis presents empirical evidence of which of the two solution strategies is more efficient. Secondly, different preconditioners are used in the Preconditioned Conjugate Gradient (PCG) solver inside EVENT. The performance of EVENT is compared when using three preconditioners, namely diagonal, SSOR (Symmetric Successive Over-Relaxation), and ILU. The other two factors, the angular and spatial resolutions of the problem, affect both the performance and the precision of EVENT. The thesis presents comparative results on EVENT's performance as these two resolutions are increased. From the empirical performance study of EVENT, a bottleneck is identified that limits the improvement in performance as the number of processors used by EVENT is increased. In some experiments, it is observed that uneven assignment of computational load among processors causes a significant portion of the total time to be spent in synchronization among processors. The thesis presents two indicators that identify when such inefficiency occurs; in such a case, a load rebalancing strategy is applied that computes a new partition of the problem so that each partition corresponds to an equal amount of computational load.
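As one illustration of the kind of load-imbalance indicator described above, the sketch below compares the most-loaded processor's work to the mean load and flags the case for repartitioning. The per-processor loads and the trigger threshold are invented for the example; the thesis's actual indicators are not reproduced here.

```cpp
// Minimal sketch of a load-imbalance indicator: compare the most-loaded
// processor's work to the average. A ratio well above 1 means processors
// will idle at synchronisation points and repartitioning is warranted.
// Loads and threshold are illustrative.
#include <algorithm>
#include <cstdio>
#include <numeric>
#include <vector>

int main() {
    // Hypothetical per-processor computational loads (e.g. element counts).
    std::vector<double> load = {120.0, 115.0, 240.0, 118.0};

    double maxLoad   = *std::max_element(load.begin(), load.end());
    double meanLoad  = std::accumulate(load.begin(), load.end(), 0.0) / load.size();
    double imbalance = maxLoad / meanLoad;   // 1.0 means perfectly balanced

    std::printf("imbalance factor = %.2f\n", imbalance);
    if (imbalance > 1.2)                      // illustrative threshold
        std::printf("-> trigger load rebalancing / repartition the problem\n");
    return 0;
}
```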
159

The Comparison of Using MATLAB, C++ and Parallel Computing for Proton Echo Planar Spectroscopic Imaging Reconstruction

Tai, Chia-Hsing 10 July 2012 (has links)
Proton echo planar spectroscopic imaging (PEPSI) is a novel and rapid technique for magnetic resonance spectroscopic imaging (MRSI). To analyze the metabolites in PEPSI data using LCModel, an automatic reconstruction system is necessary. Recently, many studies have used graphics processing units (GPUs) to accelerate image reconstruction, and the Compute Unified Device Architecture (CUDA) is based on the C language, so programmers can write parallel computing programs easily. PEPSI data acquisition includes non-water-suppression and water-suppression scans; each scan contains odd and even echoes, and these two data sets are reconstructed separately. The image reconstruction comprises a k-space filter, a time-domain filter, a three-dimensional fast Fourier transform (FFT), phase correction, and the combination of the odd and even data. We implemented PEPSI reconstruction in MATLAB, C++, and parallel computing, the last using CUDA as proposed by NVIDIA. In our study, the averaged non-water-suppression spectroscopic images produced by the three different programming approaches are almost identical. At our data scale, the execution time with parallel computing is shorter than with MATLAB or C++, especially in the FFT step. Therefore, we simulated and compared the performance of one- to three-dimensional FFTs. Our results show that the accelerating performance of the GPU depends on the number of data points, according to the performance of the FFT and the execution time of single-coil PEPSI reconstruction. When the number of data points exceeds 65536, as demonstrated in our study, parallel computing contributes substantial computational acceleration.
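For the 3D FFT stage singled out above, the sketch below shows a CPU version using FFTW (the GPU pipeline would call cuFFT instead). The grid dimensions are arbitrary example values, the input is dummy data, and the program assumes FFTW is installed and linked with -lfftw3.

```cpp
// Illustrative CPU analogue of the 3D FFT stage in the reconstruction chain.
// Dummy data, arbitrary grid size; requires FFTW (link with -lfftw3).
#include <cstdio>
#include <fftw3.h>

int main() {
    const int nx = 32, ny = 32, nz = 512;   // e.g. spatial x spatial x spectral points
    const int n = nx * ny * nz;

    fftw_complex* data = (fftw_complex*)fftw_malloc(sizeof(fftw_complex) * n);
    for (int i = 0; i < n; ++i) { data[i][0] = 1.0; data[i][1] = 0.0; }  // dummy k-space data

    // Plan and execute an in-place forward 3D transform.
    fftw_plan plan = fftw_plan_dft_3d(nx, ny, nz, data, data,
                                      FFTW_FORWARD, FFTW_ESTIMATE);
    fftw_execute(plan);

    // For an all-ones input, the DC bin equals the total number of points.
    std::printf("DC term after FFT: %.1f (expected %d)\n", data[0][0], n);

    fftw_destroy_plan(plan);
    fftw_free(data);
    return 0;
}
```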
160

Hp-spectral Methods for Structural Mechanics and Fluid Dynamics Problems

Ranjan, Rakesh 2010 May 1900 (has links)
We consider the use of higher-order spectral element methods for the solution of problems in structural and fluid mechanics. In structural applications we study different beam theories, with mixed and displacement-based formulations, consider the analysis of plates subject to external loadings, and perform large-deformation analysis of beams with continuum-based formulations. Higher-order methods alleviate the locking problems that have plagued finite element method applications to structures, and also provide spectral accuracy of the solutions. For applications in computational fluid dynamics we consider the driven cavity problem with least-squares based finite element methods. In the context of higher-order methods, efficient techniques need to be devised for the solution of the resulting algebraic systems of equations, and we explore the use of element-by-element bi-orthogonal conjugate gradient solvers for solving problems effectively, along with domain decomposition algorithms for fluid problems. In the context of least-squares finite element methods we also explore the use of multigrid techniques to obtain faster convergence of the solutions for the problems of interest. Applications of the traditional Lagrange-based finite element methods with the penalty finite element method are presented for modelling porous media flow problems. Finally, we explore applications to some CFD problems, namely the flow past a cylinder and the forward-facing step.
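The abstract leans on Krylov solvers for the resulting algebraic systems; as background only, the sketch below shows a Jacobi-preconditioned conjugate gradient iteration on a small dense symmetric positive-definite system. The element-by-element bi-orthogonal solvers used in the thesis are more elaborate; this example, with an invented 3x3 system, only illustrates the basic preconditioned Krylov loop they build on.

```cpp
// Minimal Jacobi-preconditioned conjugate-gradient sketch for an SPD system
// Ax = b. A dense toy example, not the element-by-element solver described
// in the thesis.
#include <cmath>
#include <cstdio>
#include <vector>

using Vec = std::vector<double>;
using Mat = std::vector<Vec>;

Vec matVec(const Mat& A, const Vec& x) {
    Vec y(x.size(), 0.0);
    for (std::size_t i = 0; i < A.size(); ++i)
        for (std::size_t j = 0; j < x.size(); ++j) y[i] += A[i][j] * x[j];
    return y;
}
double dot(const Vec& a, const Vec& b) {
    double s = 0.0;
    for (std::size_t i = 0; i < a.size(); ++i) s += a[i] * b[i];
    return s;
}

int main() {
    Mat A = {{4, 1, 0}, {1, 3, 1}, {0, 1, 2}};   // small SPD example
    Vec b = {1, 2, 3}, x(3, 0.0);

    Vec r = b;                                    // residual r = b - A*0
    Vec z(3); for (int i = 0; i < 3; ++i) z[i] = r[i] / A[i][i];  // Jacobi preconditioner
    Vec p = z;
    double rz = dot(r, z);

    for (int it = 0; it < 100 && std::sqrt(dot(r, r)) > 1e-10; ++it) {
        Vec Ap = matVec(A, p);
        double alpha = rz / dot(p, Ap);
        for (int i = 0; i < 3; ++i) { x[i] += alpha * p[i]; r[i] -= alpha * Ap[i]; }
        for (int i = 0; i < 3; ++i) z[i] = r[i] / A[i][i];        // re-apply preconditioner
        double rzNew = dot(r, z);
        for (int i = 0; i < 3; ++i) p[i] = z[i] + (rzNew / rz) * p[i];
        rz = rzNew;
    }
    std::printf("x = (%.4f, %.4f, %.4f)\n", x[0], x[1], x[2]);
    return 0;
}
```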
