About

The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.

Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
131

Shared and distributed memory parallel algorithms to solve big data problems in biological, social network and spatial domain applications

Sharma, Rahil 01 December 2016 (has links)
Big data refers to information that cannot be processed and analyzed using traditional approaches and tools because of the "4 V's": the sheer Volume of data, the Velocity at which it is received and processed, and its Variety and Veracity. Today massive volumes of data originate in domains such as geospatial analysis and biological and social networks. Designing scalable algorithms to process this massive data efficiently is therefore a significant challenge in computer science. One way to achieve such efficient and scalable algorithms is to use shared- and distributed-memory parallel programming models. In this thesis, we present a variety of such algorithms to solve problems in the domains mentioned above. We solve five problems that fall into two categories. The first group of problems deals with community detection. Detecting communities in real-world networks is of great importance because communities correspond to patterns that can be viewed as independent components, each with distinct features that can be detected from the network structure. For example, communities in social networks can help target users for marketing purposes, or provide recommendations for users to connect with and for communities or forums to join. We develop a novel sequential algorithm to accurately detect community structures in biological protein-protein interaction networks, where a community corresponds to a functional module of proteins. Such sequential algorithms are generally computationally expensive, which makes them impractical for large real-world networks. To address this limitation, we develop a new, highly scalable Symmetric Multiprocessing (SMP) based parallel algorithm to detect high-quality communities in large subsections of social networks such as Facebook and Amazon. Due to the SMP architecture, however, our algorithm cannot process networks larger than the RAM of a single machine.
With the increasing size of social networks, community detection has become even more difficult: networks can reach hundreds of millions of vertices and edges. Processing such massive networks requires several hundred gigabytes of RAM, which is only possible on distributed infrastructure. To address this, we develop a novel hybrid (shared + distributed memory) parallel algorithm to efficiently detect high-quality communities in massive Twitter and .uk domain networks. The second group of problems deals with efficiently processing spatial Light Detection and Ranging (LiDAR) data. LiDAR data is widely used in forest and agricultural crop studies, landscape classification, 3D urban modeling, and other areas. Advances in LiDAR sensors have enabled highly accurate and dense LiDAR point clouds, resulting in massive data volumes that pose processing and storage challenges. We develop the first published landscape-driven data reduction algorithm, which uses the slope map of the terrain as a filter to reduce the data without sacrificing accuracy. Our algorithm is highly scalable and adopts a shared-memory parallel architecture. We also develop a parallel interpolation technique to generate highly accurate continuous terrains, i.e., Digital Elevation Models (DEMs), from discrete LiDAR point clouds.
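The abstract does not include the algorithms themselves. As a point of reference, a minimal label-propagation sketch — a standard community detection baseline, not the thesis's method — fits in a few lines of Python. The toy graph and the deterministic tie-breaking rule are illustrative assumptions:

```python
import random

def label_propagation(adj, seed=0, max_iters=100):
    """Each vertex repeatedly adopts the most frequent label among its
    neighbours; ties are broken by the smallest label for determinism
    (real implementations usually break ties randomly)."""
    rng = random.Random(seed)
    labels = {v: v for v in adj}              # every vertex starts alone
    for _ in range(max_iters):
        changed = False
        order = list(adj)
        rng.shuffle(order)                    # asynchronous sweep order
        for v in order:
            if not adj[v]:
                continue
            counts = {}
            for u in adj[v]:
                counts[labels[u]] = counts.get(labels[u], 0) + 1
            best = max(counts.values())
            new = min(l for l, c in counts.items() if c == best)
            if new != labels[v]:
                labels[v] = new
                changed = True
        if not changed:                       # converged
            break
    return labels

# Two disjoint triangles: each converges to a single internal label.
adj = {0: [1, 2], 1: [0, 2], 2: [0, 1],
       3: [4, 5], 4: [3, 5], 5: [3, 4]}
labels = label_propagation(adj)
```

On real networks with bridges between dense regions, the tie-breaking and update order matter considerably; this sketch only conveys the overall shape of a propagation-style detector.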
132

Parallel computing techniques for computed tomography

Deng, Junjun 01 May 2011 (has links)
X-ray computed tomography is a widely adopted medical imaging method that uses projections to recover the internal image of a subject. Since the invention of X-ray computed tomography in the 1970s, several generations of CT scanners have been developed. As 3D image reconstruction grows in popularity, the long processing times of these machines must be significantly reduced before they can be practically employed in everyday applications. Parallel computing is a technique that applies multiple computing resources to a single computational task simultaneously; each resource computes only a part of the whole task, greatly reducing computation time. In this thesis, we use parallel computing technology to speed up reconstruction while preserving image quality. Three representative reconstruction algorithms, namely the Katsevich, EM, and Feldkamp algorithms, are investigated in this work. With the Katsevich algorithm, a distributed-memory PC cluster is used to conduct the experiment. This parallel algorithm partitions and distributes the projection data to different computer nodes to perform the computation. Upon completion of each sub-task, the results are collected by the master computer to produce the final image. This parallel algorithm uses the same reconstruction formula as its sequential counterpart, so it produces an identical image. The parallel version of the iterative (EM) CT algorithm uses the same PC cluster as the first experiment. However, because it is based on a local CT reconstruction algorithm, which differs from the sequential EM algorithm, its image results differ from those of the sequential counterpart. Moreover, a special strategy using inhomogeneous resolution was used to further speed up the computation. The results showed that image quality was largely preserved while computational time was greatly reduced.
Unlike the two previous approaches, the third type of parallel implementation uses a shared-memory computer. Three major accelerating methods--SIMD (Single instruction, multiple data), multi-threading, and OS (ordered subsets)--were employed to speed up the computation. Initial investigations showed that the image quality was comparable to those of the conventional approach though the computation speed was significantly increased.
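The partition-and-collect strategy described for the Katsevich experiment can be illustrated with a toy sketch, with plain Python threads standing in for cluster nodes. The data layout (projections as lists of pixel contributions, naive unfiltered accumulation) is an assumption for illustration, not the thesis's implementation:

```python
from concurrent.futures import ThreadPoolExecutor

def backproject(projections, size):
    """Naive accumulation: each projection contributes its measured
    values back into the image pixels it traversed."""
    image = [0.0] * size
    for proj in projections:
        for pixel, value in proj:           # (pixel index, measured value)
            image[pixel] += value
    return image

def parallel_backproject(projections, size, workers=4):
    """Partition the projections among workers, backproject each share
    independently, then let the master sum the partial images."""
    chunks = [projections[i::workers] for i in range(workers)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = pool.map(lambda chunk: backproject(chunk, size), chunks)
    return [sum(column) for column in zip(*partials)]

# Toy data: 32 single-sample projections over an 8-pixel "image".
projections = [[(i % 8, float(i))] for i in range(32)]
image = parallel_backproject(projections, 8)
```

Because backprojection is a pure sum over projections, any partition of the data yields the same image as the sequential pass, which is exactly the property the distributed Katsevich implementation exploits.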
133

Three Environmental Fluid Dynamics Papers

Furtak-Cole, Eden 01 May 2018 (has links)
Three papers are presented, applying computational fluid dynamics methods to fluid flows in the geosciences. In the first paper, a numerical method is developed for single-phase potential flow in the subsurface. For a class of monotonically advancing flows, the method provides computational savings compared to classical methods and can be applied to problems such as forced groundwater recharge. The second paper investigates the shear-stress-reducing action of an erosion control roughness array. Incompressible Navier-Stokes simulations are performed for multiple wind angles to understand the changing aerodynamics of individual and grouped roughness elements. In the third paper, a 1D analytical flow model is compared with multiphase Navier-Stokes simulations in a parabolic fissure. Sampling the numerical results allows the isolation of flow factors, such as surface tension, that are difficult to measure in physical experiments.
134

Design, development and implementation of a parallel algorithm for computed tomography using algebraic reconstruction technique

Melvin, Cameron 05 October 2007 (has links)
This project implements a parallel algorithm for computed tomography based on the Algebraic Reconstruction Technique (ART). This technique for reconstructing pictures from projections is useful for applications such as computed tomography (CT or CAT). The algorithm requires fewer views, and hence less radiation, to produce an image of comparable or better quality. However, the approach is not widely used because it is computationally intensive compared with rival technologies. A faster ART algorithm could reduce the amount of radiation needed for CT imaging by producing a better image from fewer projections. A two-dimensional reconstruction-from-projections version of the ART algorithm was implemented in parallel using the Message Passing Interface (MPI) and OpenMP extensions for C. The message-passing implementation did not yield faster reconstructions due to prohibitively long and variable communication latency. The shared-memory implementation produced positive results, showing a clear computational advantage for multiple processors, with measured efficiency ranging from 60% to 95%. Consistent with the literature, image quality proved significantly better than the industry-standard Filtered Backprojection algorithm, especially when reconstructing from fewer projection angles. / October 2006
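The core update that both the MPI and OpenMP versions parallelize is the classic row-action step of ART. Below is a sequential sketch of one Kaczmarz-style sweep in Python; the two-ray toy system and the relaxation factor are illustrative, not taken from the thesis:

```python
def art_sweep(A, b, x, relax=1.0):
    """One ART (Kaczmarz) sweep: project the current estimate x onto
    each ray's hyperplane a_i . x = b_i in turn."""
    for a_i, b_i in zip(A, b):
        norm2 = sum(a * a for a in a_i)
        if norm2 == 0.0:
            continue                        # ray misses every pixel
        residual = b_i - sum(a * xj for a, xj in zip(a_i, x))
        step = relax * residual / norm2
        for j, a in enumerate(a_i):
            x[j] += step * a
    return x

# Two "rays" through a two-pixel image: sum and difference measurements.
A = [[1.0, 1.0], [1.0, -1.0]]
b = [3.0, 1.0]
x = [0.0, 0.0]
art_sweep(A, b, x)        # orthogonal rays: converges in a single sweep
```

For orthogonal rows, as here, one sweep recovers the exact solution [2, 1]; in general several relaxed sweeps are needed, and the sequential dependence between row updates is precisely what makes naive message-passing parallelization latency-bound.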
135

Fast Stochastic Global Optimization Methods and Their Applications to Cluster Crystallization and Protein Folding

Zhan, Lixin January 2005 (has links)
Two global optimization methods are proposed in this thesis: the multicanonical basin hopping (MUBH) method and the basin paving (BP) method.

The MUBH method combines the basin hopping (BH) method, which can be used to efficiently map out an energy landscape associated with local minima, with the multicanonical Monte Carlo (MUCA) method, which encourages the system to move out of energy traps during the computation. It is found to be more efficient than the original BH method when applied to Lennard-Jones systems containing 150-185 particles.

The asynchronous multicanonical basin hopping (AMUBH) method, a parallelization of the MUBH method, is also implemented using the Message Passing Interface (MPI) to take full advantage of multiprocessors in either a homogeneous or a heterogeneous computational environment. AMUBH, MUBH and BH are used together to find the global minimum structures for Co nanoclusters with system size N ≤ 200.

The BP method is based on the BH method and the idea of the energy landscape paving (ELP) strategy. In comparison with the acceptance scheme of the ELP method, movement towards the low-energy region is enhanced and no low-energy configuration may be missed during the simulation. Applications to both the pentapeptide Met-enkephalin and the villin subdomain HP-36 locate new configurations with energies lower than those determined previously.

The MUBH, BP and BH methods are further employed to search for the global minimum structures of several proteins/peptides using the ECEPP/2 and ECEPP/3 force fields. These two force fields may produce global minima with different structures. The present study indicates that global minimum determination with ECEPP/3 favors helical structures. Also discussed in this thesis is the effect of the environment on the formation of beta hairpins.
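As an illustration of the plain BH method that MUBH and BP build on, here is a minimal one-dimensional sketch. The objective function, step sizes, and temperature are invented for the example; the thesis applies BH to Lennard-Jones clusters and proteins, with a proper local minimizer in place of the crude grid descent used here:

```python
import math
import random

def local_minimize(f, x, step=0.01):
    """Crude 1D descent on a fixed grid (stand-in for a real local minimizer)."""
    for dx in (step, -step):
        while f(x + dx) < f(x):
            x += dx
    return x

def basin_hopping(f, x0, hops=200, hop_size=2.0, temp=1.0, seed=1):
    """Random perturbation + local minimization, with Metropolis
    acceptance on the resulting 'stepped' landscape."""
    rng = random.Random(seed)
    x = local_minimize(f, x0)
    best_x, best_f = x, f(x)
    for _ in range(hops):
        trial = local_minimize(f, x + rng.uniform(-hop_size, hop_size))
        if f(trial) < f(x) or rng.random() < math.exp((f(x) - f(trial)) / temp):
            x = trial
            if f(x) < best_f:
                best_x, best_f = x, f(x)
    return best_x, best_f

# Tilted double well: shallow minimum near x = +0.96, global one near x = -1.0.
f = lambda x: (x * x - 1.0) ** 2 + 0.3 * x
xmin, fmin = basin_hopping(f, x0=1.0)
```

Starting in the shallow right-hand basin, the hops carry the walker over the barrier and the bookkeeping retains the lowest minimum seen, which is the essential mechanism MUCA and the paving idea then improve upon.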
136

Parallel Performance Analysis of The Finite Element-Spherical Harmonics Radiation Transport Method

Pattnaik, Aliva 21 November 2006 (has links)
In this thesis, the parallel performance of the finite element-spherical harmonics (FE-PN) method implemented in the general-purpose radiation transport code EVENT is studied both analytically and empirically. EVENT solves the coupled set of space-angle discretized FE-PN equations using a parallel block-Jacobi domain decomposition method. As part of the analytical study, the thesis presents complexity results for EVENT when solving a 3D criticality benchmark radiation transport problem in parallel. The empirical analysis is concerned with the impact of the main algorithmic factors affecting performance. Firstly, EVENT supports two solution strategies, namely MOD (Moments Over Domains) and DOM (Domains Over Moments), to solve the transport equation in parallel. The two strategies differ in the way they solve the multi-level space-angle coupled systems of equations. The thesis presents empirical evidence of which of the two solution strategies is more efficient. Secondly, different preconditioners are used in the Preconditioned Conjugate Gradient (PCG) solver inside EVENT. The performance of EVENT is compared when using three preconditioners, namely diagonal, SSOR (Symmetric Successive Over-Relaxation), and ILU. The other two factors, the angular and spatial resolutions of the problem, affect both the performance and the precision of EVENT. The thesis presents comparative results on EVENT's performance as these two resolutions are increased. From the empirical performance study of EVENT, a bottleneck is identified that limits the improvement in performance as the number of processors used by EVENT is increased. In some experiments, it is observed that uneven assignment of computational load among processors causes a significant portion of the total time to be spent in synchronization among processors.
The thesis presents two indicators that identify when such inefficiency occurs; in such cases, a load rebalancing strategy is applied that computes a new partition of the problem so that each partition corresponds to an equal amount of computational load.
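The rebalancing strategy itself is not spelled out in the abstract. A standard greedy heuristic for this kind of repartitioning is longest-processing-time (LPT) assignment, sketched below; the per-subdomain cost model is an assumption for illustration, not EVENT's actual scheme:

```python
import heapq

def rebalance(loads, nprocs):
    """LPT heuristic: repeatedly give the heaviest remaining subdomain
    to the processor with the smallest total load so far."""
    heap = [(0.0, p, []) for p in range(nprocs)]   # (total, proc id, items)
    heapq.heapify(heap)
    for load in sorted(loads, reverse=True):
        total, p, assigned = heapq.heappop(heap)   # least-loaded processor
        assigned.append(load)
        heapq.heappush(heap, (total + load, p, assigned))
    return {p: assigned for _, p, assigned in heap}

# Uneven per-subdomain costs to spread over two processors.
parts = rebalance([5.0, 4.0, 3.0, 3.0, 3.0], 2)
```

LPT is not optimal (here it yields totals of 8 and 10 where a 9/9 split exists) but it is cheap, and the spread it achieves is typically enough to cut the synchronization wait the abstract describes.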
137

Hp-spectral Methods for Structural Mechanics and Fluid Dynamics Problems

Ranjan, Rakesh 2010 May 1900 (has links)
We consider the use of higher-order spectral element methods for the solution of problems in structural and fluid mechanics. In structural applications we study different beam theories, with mixed and displacement-based formulations, consider the analysis of plates subject to external loadings, and perform large-deformation analysis of beams with continuum-based formulations. Higher-order methods alleviate the locking problems that have plagued finite element method applications to structures, and also provide spectral accuracy of the solutions. For applications in computational fluid dynamics we consider the driven cavity problem with least-squares based finite element methods. In the context of higher-order methods, efficient techniques need to be devised for the solution of the resulting algebraic systems of equations, and we explore the use of element-by-element bi-orthogonal conjugate gradient solvers together with domain decomposition algorithms for fluid problems. In the context of least-squares finite element methods we also explore the use of multigrid techniques to obtain faster convergence of the solutions for the problems of interest. Applications of traditional Lagrange-based finite element methods with the penalty finite element method are presented for modelling porous media flow problems. Finally, we explore applications to some CFD problems, namely the flow past a cylinder and the flow over a forward-facing step.
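As a reference point for the solver discussion, here is a minimal matrix-free conjugate gradient sketch for the symmetric positive-definite case. The thesis uses element-by-element bi-orthogonal CG variants for its (generally nonsymmetric) systems; this plain CG on a toy 1D Laplacian only illustrates the matrix-free structure those solvers exploit:

```python
def conjugate_gradient(matvec, b, tol=1e-10, max_iters=100):
    """Matrix-free CG: needs only the action v -> A v, which is what
    element-by-element spectral solvers assemble on the fly."""
    x = [0.0] * len(b)
    r = list(b)                    # residual b - A x for x = 0
    p = list(r)
    rs = sum(ri * ri for ri in r)
    for _ in range(max_iters):
        Ap = matvec(p)
        alpha = rs / sum(pi * api for pi, api in zip(p, Ap))
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, Ap)]
        rs_new = sum(ri * ri for ri in r)
        if rs_new < tol:
            break
        p = [ri + (rs_new / rs) * pi for ri, pi in zip(r, p)]
        rs = rs_new
    return x

# SPD tridiagonal system (1D Laplacian-like stiffness), applied matrix-free.
def matvec(v):
    n = len(v)
    return [2 * v[i] - (v[i - 1] if i > 0 else 0) - (v[i + 1] if i < n - 1 else 0)
            for i in range(n)]

x = conjugate_gradient(matvec, [1.0] * 5)
```

For this 5-unknown system CG terminates in at most five iterations, recovering the exact solution [2.5, 4, 4.5, 4, 2.5]; preconditioning and domain decomposition change only how `matvec` and the inner products are evaluated.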
138

Performance Analysis Of Stacked Generalization

Ozay, Mete 01 September 2008 (has links) (PDF)
Stacked Generalization (SG) is an ensemble learning technique which aims to increase the performance of individual classifiers by combining them under a hierarchical architecture. This study consists of two major parts. In the first part, the performance of the Stacked Generalization technique is analyzed with respect to the performance of the individual classifiers and the content of the training data. In the second part, based on these findings, a new class of algorithms, called Meta-Fuzzified Yield Value (Meta-FYV), is introduced. The first part introduces and verifies two hypotheses through a set of controlled experiments to assure the performance gain of SG. The learning mechanisms by which SG achieves high performance are explored, and the relationship between the performance of the individual classifiers and that of SG is investigated. It is shown that if the samples in the training set are correctly classified by at least one base-layer classifier, then the generalization performance of SG is increased compared to the performance of the individual classifiers. The second hypothesis investigates the effect of spurious samples, which are not correctly labeled by any of the base-layer classifiers. In the second part of the thesis, six theorems are constructed based on the analysis of the feature spaces and the stacked generalization architecture. Based on the theorems and hypotheses, a new class of SG algorithms is proposed. The experiments are performed on both Corel data and synthetically generated data, using parallel programming techniques on a high-performance cluster.
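A minimal sketch of the SG architecture itself may help: threshold rules as base-layer classifiers and a lookup-table meta-classifier, all toy choices of my own. Note the meta-level here trains on in-sample base predictions for brevity, where a real SG setup would use held-out predictions:

```python
from collections import Counter

def train_threshold(xs, ys, feature):
    """Base learner: the best single-threshold rule on one feature."""
    vals = sorted({x[feature] for x in xs})
    cuts = [(a + b) / 2 for a, b in zip(vals, vals[1:])] or [vals[0]]
    best = None
    for t in cuts:
        for sign in (1, -1):
            preds = [1 if sign * (x[feature] - t) >= 0 else 0 for x in xs]
            acc = sum(p == y for p, y in zip(preds, ys))
            if best is None or acc > best[0]:
                best = (acc, t, sign)
    _, t, sign = best
    return lambda x: 1 if sign * (x[feature] - t) >= 0 else 0

def train_meta(meta_xs, ys):
    """Meta learner: majority class for each base-prediction pattern."""
    votes = {}
    for mx, y in zip(meta_xs, ys):
        votes.setdefault(tuple(mx), Counter())[y] += 1
    table = {k: c.most_common(1)[0][0] for k, c in votes.items()}
    return lambda mx: table.get(tuple(mx), 0)

def stacked(base_trainers, xs, ys):
    """The meta level sees only the base classifiers' predictions."""
    bases = [train(xs, ys) for train in base_trainers]
    meta = train_meta([[b(x) for b in bases] for x in xs], ys)
    return lambda x: meta([b(x) for b in bases])

# XOR data: each threshold base is 50% accurate, yet the stack is exact.
xs = [(0, 0), (0, 1), (1, 0), (1, 1)] * 3
ys = [0, 1, 1, 0] * 3
clf = stacked([lambda X, Y: train_threshold(X, Y, 0),
               lambda X, Y: train_threshold(X, Y, 1)], xs, ys)
```

The XOR example mirrors the thesis's first hypothesis in miniature: every sample is classified correctly by the combination even though no base-layer classifier beats chance on its own.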
139

Parallel Closet+ Algorithm For Finding Frequent Closed Itemsets

Sen, Tayfun 01 July 2009 (has links) (PDF)
Data mining is proving to be a very important field as the amount of available data increases exponentially, thanks first to computerization and now to the growth of the internet. At the same time, cluster computing systems built from commodity hardware are becoming widespread, along with multicore processor architectures. This high computing power is combined with data mining to process huge amounts of data and extract information and knowledge. Frequent itemset mining is a special subtopic of data mining because it is an integral part of many types of data mining tasks. Often this task is a prerequisite for many other data mining algorithms, most notably algorithms in the association rule mining area. For this reason, it is studied heavily in the literature. In this thesis, a parallel implementation of CLOSET+, a frequent closed itemset mining algorithm, is presented. The CLOSET+ algorithm has been modified to run on multiple processors simultaneously in order to obtain results faster. The Open MPI and Boost libraries have been used for communication between processes, and the program has been tested on different inputs and parameters. Experimental results show that the algorithm exhibits high speedup and efficiency for dense data when the support value is above a certain threshold. The proposed parallel algorithm could prove useful in application areas where a fast response is needed for a low to medium number of frequent closed itemsets. A particular application area is the Web, where online applications have similar requirements.
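For reference, the notion of a closed itemset — frequent, with no proper superset of equal support — can be made executable in a tiny brute-force miner. This is nothing like CLOSET+'s FP-tree machinery; it is just the definition applied to a toy dataset:

```python
from itertools import combinations

def closed_itemsets(transactions, min_support):
    """Enumerate all frequent itemsets, then keep those with no proper
    superset of identical support (the definition of 'closed').
    Checking closure only against frequent supersets is valid: a
    superset's support never exceeds the subset's, so equality implies
    the superset is frequent too."""
    items = sorted({i for t in transactions for i in t})
    support = {}
    for k in range(1, len(items) + 1):
        for cand in combinations(items, k):
            s = sum(1 for t in transactions if set(cand) <= t)
            if s >= min_support:
                support[cand] = s
    return {iset: s for iset, s in support.items()
            if not any(set(iset) < set(other) and s == s2
                       for other, s2 in support.items())}

transactions = [{'a', 'b', 'c'}, {'a', 'b'}, {'a', 'c'}, {'a'}]
closed = closed_itemsets(transactions, min_support=2)
```

On this dataset the frequent itemsets b and c are absorbed by ab and ac (same support), so only three closed itemsets survive — the compression that makes closed-itemset mining attractive on dense data.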
140

Massive Crowd Simulation With Parallel Processing

Yilmaz, Erdal 01 February 2010 (has links) (PDF)
This thesis analyzes how parallel processing on the Graphics Processing Unit (GPU) can be used for massive crowd simulation, not only for rendering but also for the computation required for realistic simulation. The extreme population in massive crowd simulation introduces an extra computational load that is quite difficult to meet using Central Processing Unit (CPU) resources alone. The thesis shows specific methods and approaches that maximize the throughput of GPU parallel computing while using the GPU as the main processor for massive crowd simulation. The methodology introduced in this thesis makes it possible to simulate and visualize hundreds of thousands of virtual characters in real time. In order to achieve speedups of two orders of magnitude using GPU parallel processing, various stream compaction and effective memory access approaches were employed. To simulate crowd behavior, fuzzy logic functionality was implemented on the GPU from scratch. This implementation is capable of computing more than half a billion fuzzy inferences per second.
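The abstract does not give the rule base, but the shape of a per-agent fuzzy inference step can be sketched in scalar Python with invented memberships and consequents. The thesis evaluates kernels of this kind on the GPU, one instance per agent, which is what makes half a billion inferences per second feasible:

```python
def rising(a, b):
    """Membership ramp: 0 at a, 1 at b."""
    return lambda x: max(0.0, min(1.0, (x - a) / (b - a)))

def falling(a, b):
    """Membership ramp: 1 at a, 0 at b."""
    return lambda x: max(0.0, min(1.0, (b - x) / (b - a)))

# Rule base (invented): IF neighbour is NEAR THEN move SLOW;
#                       IF neighbour is FAR  THEN move FAST.
near = falling(0.0, 10.0)
far = rising(0.0, 10.0)
SLOW, FAST = 0.5, 2.0                      # consequent speeds, toy units

def agent_speed(distance):
    """Sugeno-style weighted-average defuzzification over the two rules."""
    w_near, w_far = near(distance), far(distance)
    return (w_near * SLOW + w_far * FAST) / (w_near + w_far)
```

Each agent's inference is independent of every other's, so the whole crowd maps naturally onto one GPU thread per agent, with stream compaction used to skip inactive agents.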
