11

Parallel Algorithm for Memory Efficient Pairwise and Multiple Genome Alignment in Distributed Environment

Ahmed, Nova 20 December 2004 (has links)
Genome sequence alignment problems are important from the computational biology perspective. They deal with large amounts of data and are both memory intensive and computation intensive. In this work, two algorithms from the literature are studied and improved: a pairwise sequence alignment algorithm, which aligns pairs of genome sequences with reduced memory and parallelized computation, and a multiple sequence alignment algorithm, which aligns many genome sequences and is parallelized so that the workload of the alignment program is well distributed. Such parallel applications can be launched in different environments, and shared memory is very well suited to them; however, shared-memory machines are limited in memory capacity and scalability, and they are costly. A better approach is to use a cluster of computers, and the cluster environment can be further extended to a grid environment so that scalability improves as multiple clusters are introduced. Here the grid environment is studied alongside the shared-memory and cluster environments for the two applications. For carefully designed algorithms, the grid environment is comparable in performance to the other distributed environments, and it sometimes outperforms them given the resource limitations those environments face.
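The memory reduction referred to above is typically achieved with linear-space dynamic programming in the style of Hirschberg. As a hedged illustration (a standard textbook technique, not necessarily the thesis's exact algorithm), the sketch below computes a global alignment score while keeping only two rows of the DP table, so memory is O(min(m, n)) rather than O(mn); the scoring parameters are arbitrary choices of mine.

```python
def align_score(a, b, match=1, mismatch=-1, gap=-2):
    """Needleman-Wunsch score in linear space: keep only two DP rows."""
    prev = [j * gap for j in range(len(b) + 1)]   # row for the empty prefix of a
    for i, ca in enumerate(a, 1):
        curr = [i * gap]                          # first column: all gaps
        for j, cb in enumerate(b, 1):
            diag = prev[j - 1] + (match if ca == cb else mismatch)
            curr.append(max(diag, prev[j] + gap, curr[j - 1] + gap))
        prev = curr
    return prev[-1]

print(align_score("ACGTGA", "ACTGA"))  # small example pair
```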
12

Hierarchical Matrix Techniques on Massively Parallel Computers

Izadi, Mohammad 11 December 2012 (has links) (PDF)
Hierarchical matrix (H-matrix) techniques can be used to treat dense matrices efficiently. With an H-matrix, the storage requirements and all fundamental operations, namely matrix-vector multiplication, matrix-matrix multiplication, and matrix inversion, can be handled in almost linear complexity. In this work, we sought further speedup for the H-matrix arithmetic by utilizing multiple processors. Our approach to distributing an H-matrix relies on splitting the index set. The main results achieved in this work, based on the index-wise H-distribution, are: a highly scalable algorithm for H-matrix truncation and matrix-vector multiplication, a scalable algorithm for H-matrix matrix-matrix multiplication, and an algorithm with limited scalability for H-matrix inversion on large numbers of processors.
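For intuition about where the almost-linear complexity comes from: admissible blocks of an H-matrix are stored in low-rank factored form, so a block matrix-vector product never forms the dense block. A minimal numpy sketch of that single building block, with names of my own choosing rather than anything from the thesis:

```python
import numpy as np

def lowrank_matvec(U, V, x):
    """y = (U @ V.T) @ x evaluated as U @ (V.T @ x):
    cost O(k*(m+n)) instead of O(m*n) for a rank-k block."""
    return U @ (V.T @ x)

m, n, k = 2000, 2000, 8
rng = np.random.default_rng(0)
U, V = rng.standard_normal((m, k)), rng.standard_normal((n, k))
x = rng.standard_normal(n)
y = lowrank_matvec(U, V, x)           # never forms the dense m-by-n block
assert np.allclose(y, (U @ V.T) @ x)  # same result as the dense product
```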
13

Efficient Nonlinear Optimization with Rigorous Models for Large Scale Industrial Chemical Processes

Zhu, Yu 2011 May 1900 (has links)
Large-scale nonlinear programming (NLP) has proven to be an effective framework for obtaining profit gains through optimal process design and operations in chemical engineering. While the classical SQP and interior point methods have been successfully applied to many optimization problems, the focus of both academia and industry on larger and more complicated problems requires further development of numerical algorithms with improved computational efficiency. The primary purpose of this dissertation is to develop effective problem formulations and advanced numerical algorithms for the efficient solution of these challenging problems. As problem sizes increase, there is a need for tailored algorithms that can exploit problem-specific structure. Furthermore, computer chip manufacturers are no longer focusing on increased clock speeds, but rather on hyperthreading and multi-core architectures; to see continued performance improvement, we must therefore focus on algorithms that can exploit emerging parallel computing architectures. In this dissertation, we develop an advanced parallel solution strategy for nonlinear programming problems with block-angular structure. The effectiveness of this strategy and of modern off-the-shelf tools is demonstrated on a wide range of problem classes: optimal design, optimal operation, dynamic optimization, and parameter estimation. Two case studies (air separation units and heat-integrated columns) are investigated to address design under uncertainty with rigorous models. For optimal operation, this dissertation takes cryogenic air separation units as a primary case study and focuses on formulations for handling uncertain product demands, contractual constraints on customer satisfaction levels, and variable power pricing; multiperiod formulations provide operating plans that use inventory to meet customer demands and improve profits. In the area of dynamic optimization, optimal reference trajectories are determined for load changes in an air separation process, again using a multiscenario programming formulation, this time with large-scale discretized dynamic models. Finally, to emphasize a different decomposition approach, we address a problem with significant spatial complexity: estimating unknown water demands within a large city-wide distribution network. This problem provides a different decomposition mechanism than the multiscenario or multiperiod problems; nevertheless, our parallel approach provides effective speedup.
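As a toy illustration of how a block-angular problem decomposes (a consensus-ADMM sketch with quadratic scenario costs of my own invention, not the dissertation's actual solution strategy), each scenario subproblem can be solved independently once the linking variable is fixed by a coordination step:

```python
import numpy as np

def scenario_solve(t, z, u, rho):
    """Closed-form solution of one scenario subproblem:
       min_x 0.5*(x - t)**2 + rho/2*(x - z + u)**2"""
    return (t + rho * (z - u)) / (1.0 + rho)

# consensus ADMM over independent scenario blocks
t = np.array([1.0, 3.0, 5.0])   # scenario data (e.g. demands)
u = np.zeros_like(t)            # scaled dual variables
z, rho = 0.0, 1.0
for _ in range(50):
    x = scenario_solve(t, z, u, rho)   # each entry is an independent subproblem
    z = np.mean(x + u)                 # coordination step couples the blocks
    u += x - z                         # dual update
print(z)  # approx 3.0, the consensus minimizer of the summed scenario costs
```

The point of the structure is that the `scenario_solve` calls share no data and could run on separate processors; only the short coordination step is serial.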
14

On the design of architecture-aware algorithms for emerging applications

Kang, Seunghwa 30 January 2011 (has links)
This dissertation maps various kernels and applications to a spectrum of programming models and architectures and presents architecture-aware algorithms for different systems. The kernels and applications discussed have widely varying computational characteristics; for example, we consider both dense numerical computations and sparse graph algorithms. The dissertation also covers emerging applications from image processing, complex network analysis, and computational biology. We map these problems to diverse multicore processors and manycore accelerators, and we use new programming models (such as Transactional Memory, MapReduce, and Intel TBB) to address the performance and productivity challenges in these problems. Our experiences highlight the importance of mapping applications to appropriate programming models and architectures. We also identify several limitations of current system software and architectures, and suggest directions for improvement; the discussion focuses on system software and architectural support for nested irregular parallelism, Transactional Memory, and hybrid data transfer mechanisms. We believe that the complexity of parallel programming can be significantly reduced via collaborative efforts among researchers and practitioners from different domains, and this dissertation contributes to those efforts by providing benchmarks and suggestions to improve system software and architectures.
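As a reminder of what the MapReduce model mentioned above looks like in miniature (an illustrative example of mine, not taken from the dissertation), a vertex-degree count can be phrased as a map phase emitting key-value pairs followed by a reduce phase that aggregates per key:

```python
from collections import defaultdict

edges = [(0, 1), (0, 2), (1, 2), (2, 3)]

# map phase: each edge independently emits (vertex, 1) pairs
mapped = [(v, 1) for e in edges for v in e]

# shuffle + reduce phase: aggregate counts per vertex key
degree = defaultdict(int)
for key, value in mapped:
    degree[key] += value

print(dict(degree))  # {0: 2, 1: 2, 2: 3, 3: 1}
```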
15

High-performance algorithms and software for large-scale molecular simulation

Liu, Xing 08 June 2015 (has links)
Molecular simulation is an indispensable tool in many disciplines, including physics, biology, chemical engineering, materials science, and drug design. Performing large-scale molecular simulation is of great interest to biologists and chemists because many important biological and pharmaceutical phenomena can only be observed in very large molecular systems and over sufficiently long time scales. On the other hand, molecular simulation methods usually have very steep computational costs, which limits current studies to relatively small systems. The gap between the scale of molecular simulation that existing techniques can handle and the scale of interest has become a major barrier to applying molecular simulation to real-world problems. Studying large-scale molecular systems therefore requires developing highly parallel simulation algorithms and constantly adapting them to rapidly changing high-performance computing architectures. However, many existing algorithms and codes for molecular simulation date from more than a decade ago and were designed for sequential computers or early parallel architectures; they may not scale efficiently and do not fully exploit the features of today's hardware. Given the rapid evolution in computer architectures, the time has come to revisit these algorithms and codes. In this thesis, we address the computational challenges of large-scale molecular simulation by presenting high-performance algorithms and software for two important molecular simulation applications on highly parallel computer architectures: Hartree-Fock (HF) calculations and hydrodynamics simulations. The algorithms and software presented in this thesis have been used by biologists and chemists to study problems that they were unable to solve with existing codes, and the parallel techniques and methods developed in this work can also be applied to other molecular simulation applications.
16

Stochastic Learning Algorithms With Improved Speed Performance

Arvind, M T 11 1900 (has links) (PDF)
No description available.
17

Parallel Computing Applications in Large-Scale Power System Operations

Wang, Chunheng 12 August 2016 (has links)
Electrical energy is a basic necessity for the economic development of human societies. In recent decades, the electricity industry has undergone enormous changes, evolving into a large-scale, competitive industry. The integration of volatile renewable energy and the emergence of transmission switching (TS) techniques pose great challenges to existing power system operations problems, especially security-constrained unit commitment (SCUC) solution engines. To deal with the uncertainty of volatile renewable energy, the scenario-based stochastic optimization approach has been widely employed to ensure the reliable and economic operation of power systems, where each scenario represents a possible system situation. Meanwhile, the emergence of TS techniques allows system operators to change the topology of transmission systems to improve economic benefits by mitigating transmission congestion. However, with the introduction of extra scenarios and decision variables, the complexity of the SCUC model increases dramatically and more computational effort is required, which can make power system operations problems difficult to solve or even intractable. Therefore, an advanced solution technique is urgently needed to solve both stochastic SCUC problems and TS-based SCUC problems effectively and quickly. In this dissertation, a decomposition framework is presented for the optimal operation of large-scale power systems; it decomposes the original large power system optimization problem into smaller, tractable subproblems and solves these subproblems in parallel with the help of high-performance computing techniques. Numerical case studies on a modified IEEE 118-bus system and a practical 1168-bus system demonstrate the effectiveness and efficiency of the proposed approach, which offers the power system secure and economic operation under various uncertainties and contingencies.
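The payoff of such a decomposition is that scenario subproblems are independent and can be farmed out to parallel workers. A schematic sketch with a deliberately simplified stand-in subproblem (a two-unit merit-order dispatch of my own invention, not the dissertation's SCUC model):

```python
from concurrent.futures import ProcessPoolExecutor

def solve_scenario(demand):
    """Stand-in for one decomposed subproblem: dispatch two units
    (capacities 60 and 80 MW, costs 10 and 30 $/MWh) to meet demand."""
    g1 = min(demand, 60.0)          # cheap unit first (merit order)
    g2 = min(demand - g1, 80.0)     # expensive unit covers the remainder
    return 10.0 * g1 + 30.0 * g2    # scenario operating cost

scenario_demands = [55.0, 90.0, 120.0, 70.0]   # one entry per scenario

if __name__ == "__main__":
    with ProcessPoolExecutor() as pool:          # subproblems run in parallel
        costs = list(pool.map(solve_scenario, scenario_demands))
    print(sum(costs) / len(costs))               # expected operating cost
```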
18

Attacks On Difficult Instances Of Graph Isomorphism: Sequential And Parallel Algorithms

Tener, Greg 01 January 2009 (has links)
The graph isomorphism problem has received a great deal of attention on both theoretical and practical fronts. However, a polynomial algorithm for the problem has yet to be found. Even so, the best of the existing algorithms perform well in practice; so well that it is challenging to find hard instances for them. The most efficient algorithms for determining whether a pair of graphs is isomorphic are based on the individualization-refinement paradigm, pioneered by Brendan McKay in 1981 with his algorithm nauty. Nauty and its improved descendants, such as bliss and saucy, solve the graph isomorphism problem by determining a canonical representative for each of the graphs: the graphs are isomorphic if and only if their canonical representatives are identical. These algorithms also detect the symmetries in a graph, which are used to speed up the search for the canonical representative--an approach that performs well in practice. Yet several families of graphs are known to be hard for nauty-like algorithms. This dissertation investigates why these graph families pose difficulty for individualization-refinement algorithms and proposes several techniques for circumventing these limitations. The first technique addresses a fundamental problem pointed out by Miyazaki in 1993: he constructed a family of colored graphs that require exponential time for nauty (and nauty's improved descendants). We analyze Miyazaki's construction to determine the source of difficulty and identify a solution. We modify the base individualization-refinement algorithm to exploit the symmetries discovered in a graph to guide the search for its canonical representative, with the help of a novel data structure called a guide tree. As a consequence, colored Miyazaki graphs are processed in polynomial time, obviating the only known exponential bound on individualization-refinement algorithms (which had stood for 16 years). The preceding technique can only help if a graph has enough symmetry to exploit; it cannot be used for another family of hard graphs that have a high degree of regularity but possess few actual symmetries. To handle these instances, we introduce an adaptive refinement method that utilizes the guide-tree data structure of the preceding technique to apply a stronger vertex invariant, but only when needed. We show that adaptive refinement is very effective and can result in dramatic speedups. We then present a third technique, ideally suited for large graphs with a preponderance of sparse symmetries. A method for dealing with these large and highly symmetric graphs was devised by Darga et al. and can reduce runtime by an order of magnitude; we explain the method and show how to incorporate it into our algorithm. Finally, we develop and implement a parallel algorithm for detecting the symmetries of a graph and finding its canonical representative. Our parallel algorithm divides the search for the symmetries and the canonical representative among the processors, allowing for a high degree of scalability. The parallel algorithm is benchmarked on the hardest problem instances and shown to be effective in subdividing the search space.
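The "refinement" half of individualization-refinement is essentially iterated color refinement (1-dimensional Weisfeiler-Leman): vertices are repeatedly re-partitioned by the multiset of their neighbors' colors until the coloring stabilizes. A compact sketch of that standard step, not of the dissertation's implementation:

```python
def color_refine(adj):
    """1-dimensional Weisfeiler-Leman refinement on an adjacency list.
    Returns a stable coloring: one color class per vertex."""
    colors = [0] * len(adj)                       # start with a uniform coloring
    while True:
        # signature = own color plus sorted multiset of neighbor colors
        sigs = [(colors[v], tuple(sorted(colors[w] for w in adj[v])))
                for v in range(len(adj))]
        relabel = {s: i for i, s in enumerate(sorted(set(sigs)))}
        new = [relabel[s] for s in sigs]
        if new == colors:                         # partition is equitable: stop
            return colors
        colors = new

# path on 4 vertices: endpoints get one color, inner vertices another
print(color_refine([[1], [0, 2], [1, 3], [2]]))  # -> [0, 1, 1, 0]
```

Hard instances are precisely those (like regular graphs) where this partition stays coarse, forcing the algorithm to individualize vertices and branch.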
19

Accuracy Study of a Free Particle Using Quantum Trajectory Method on Message Passing Architecture

Vadapalli, Ravi K 13 December 2002 (has links)
Bohm's hydrodynamic formulation (or quantum fluid dynamics) is an attractive approach because it connects the classical and quantum mechanical theories of matter through Hamilton-Jacobi (HJ) theory and the quantum potential. Lopreore and Wyatt derived and implemented the one-dimensional quantum trajectory method (QTM), a new wave-packet approach for solving the hydrodynamic equations of motion, in a serial computing environment. Brook et al. parallelized the QTM in a shared-memory computing environment using a partially implicit method and conducted an accuracy study of a free particle. Those studies exhibited a strange behavior of the relative error in the probability density, referred to as the transient effect. In the present work, the numerical experiments of Brook et al. were repeated with a view to identifying the physical origin of the transient effect and its resolution. The present work used the QTM implemented in a distributed-memory computing environment using MPI. The simulation is guided by an explicit scheme.
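For reference, the quantum potential at the heart of Bohm's formulation is Q = -(hbar^2 / 2m) * (d^2 sqrt(rho)/dx^2) / sqrt(rho). Below is a one-dimensional finite-difference evaluation on a fixed grid with unit constants; this is an illustration of the quantity itself, not the QTM's moving-grid trajectory scheme:

```python
import numpy as np

def quantum_potential(rho, dx, hbar=1.0, m=1.0):
    """Q = -(hbar^2 / 2m) * d2(sqrt(rho))/dx^2 / sqrt(rho),
    using a central second difference on a fixed grid."""
    r = np.sqrt(rho)
    d2r = np.empty_like(r)
    d2r[1:-1] = (r[2:] - 2.0 * r[1:-1] + r[:-2]) / dx**2
    d2r[0], d2r[-1] = d2r[1], d2r[-2]        # crude one-sided boundary copy
    return -(hbar**2 / (2.0 * m)) * d2r / r

x = np.linspace(-5.0, 5.0, 201)
rho = np.exp(-x**2)                          # Gaussian density of a wave packet
Q = quantum_potential(rho, x[1] - x[0])
# for a Gaussian, Q = (1 - x^2)/2: an inverted parabola peaking at the center
print(Q[100], Q.max())                       # both approx 0.5
```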
20

GPU-Specific Kalman Filtering and Retrodiction for Large-Scale Target Tracking

Tager, Sean 10 1900 (has links)
In the field of tracking and data fusion, most, if not all, computations executed by a computer are carried out serially. The sole part of the process that is not entirely serial is the collection of data from multiple sensors, which can be executed in parallel. Once the data is to be filtered, however, the most likely candidate is a serial algorithm. This is due in large part to the algorithms themselves, which have been developed over the last several decades for conventional computers that lacked parallel computing capabilities, until now. With the arrival of graphics processing units, or GPUs, the tracking community is in a favourable position to exploit parallel processing in order to track a growing number of targets. The problem, however, begins with the sheer labour of converting all the pre-existing serial tracking algorithms into parallel ones. This is clearly a daunting task when one considers the extent to which the tracking community has gone to develop modern-day filters such as Alpha-Beta filters, Probabilistic Data Association filters, Interacting Multiple Model filters, and several dozen, if not hundred, variants of the aforementioned. It is likely that these filters will find some kind of parallelization in the near future, as ever more sensors are dispersed throughout society and ever more targets are monitored with them. The volume of targets then becomes simply too unmanageable for a serial algorithm, and more focus is placed on parallel ones. Yet before the parallel algorithms can be utilized, they have to be derived. The derivation of these parallel algorithms is the focus of this thesis. It should be made clear, however, that it would be impossible to formulate a parallelization for every filter found in the literature, so the goal here is to direct attention onto one filter in particular: the Kalman filter. / Master of Applied Science (MASc)
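The data-parallel structure such a parallelization targets can be previewed with one Kalman predict/update step applied to many independent targets at once, vectorized over the batch dimension. This is a hedged sketch with numpy standing in for GPU kernels; the constant-velocity model and all dimensions are toy choices of mine, not the thesis's formulation:

```python
import numpy as np

# toy model: N independent targets, state [pos, vel], scalar position measurement
N, dt = 10_000, 1.0
F = np.array([[1.0, dt], [0.0, 1.0]])        # constant-velocity transition
H = np.array([[1.0, 0.0]])                   # we observe position only
Q = 0.01 * np.eye(2)                         # process noise covariance
R = np.array([[0.25]])                       # measurement noise covariance

rng = np.random.default_rng(1)
x = rng.standard_normal((N, 2))              # one state per target
P = np.tile(np.eye(2), (N, 1, 1))            # one covariance per target
z = rng.standard_normal((N, 1))              # one measurement per target

# predict step, batched over all N targets at once
x = x @ F.T
P = F @ P @ F.T + Q                          # broadcasts over the batch axis

# update step: innovation y, gain K, corrected state and covariance
y = z - x @ H.T                              # (N, 1) innovations
S = H @ P @ H.T + R                          # (N, 1, 1) innovation covariances
K = P @ H.T / S                              # (N, 2, 1) Kalman gains
x = x + K[..., 0] * y                        # apply each target's gain
P = (np.eye(2) - K @ H) @ P                  # covariance update per target
```

Because every target's filter is independent, each batch element maps naturally onto one GPU thread or thread block, which is the kind of decomposition the thesis pursues.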
