31

Parallel multipliers for modular arithmetic

Sanu, Moboluwaji Olusegun. January 1900 (has links) (PDF)
Thesis (Ph. D.)--University of Texas at Austin, 2005. / Vita. Includes bibliographical references.
32

Insights from the parallel implementation of efficient algorithms for the fractional calculus

Banks, Nicola E. January 2015 (has links)
This thesis concerns the development of parallel algorithms to solve fractional differential equations using a numerical approach. The methodology adopted is to adapt existing numerical schemes and to develop prototype parallel programs using the MATLAB Parallel Computing Toolbox (MPCT). The approach is to build on existing insights from the parallel implementation of ordinary differential equation methods and to test a range of potential candidates for parallel implementation in the fractional case. As a consequence of the work, new insights on the use of MPCT for prototyping are presented, alongside conclusions and algorithms for the effective implementation of parallel methods for the fractional calculus. The principal parallel approaches considered in the work include:
- a Runge-Kutta method for ordinary differential equations, including the application of an adapted Richardson extrapolation scheme;
- an implementation of the Diethelm-Chern algorithm for fractional differential equations;
- a parallel version of the well-established fractional Adams method for fractional differential equations;
- the adaptation for parallel implementation of Lubich's fractional multistep method for fractional differential equations.
An important aspect of the work is an improved understanding of the comparative difficulty of using MPCT to obtain fair comparisons of parallel implementations. We present details of experimental results which are not satisfactory, and we explain how the problems may be overcome to give meaningful experimental results. An important aspect of the conclusions is therefore advice for other users of MPCT who may be planning to use the package as a prototyping tool for parallel algorithm development: by understanding how implicit multithreading operates, controls can be put in place to allow like-for-like performance comparisons between sequential and parallel programs.
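To make the fractional Adams method mentioned above concrete, here is a minimal sequential sketch of the standard fractional Adams-Bashforth-Moulton predictor-corrector for a Caputo equation D^α y = f(t, y). This is the generic textbook scheme in plain Python, not the thesis's MPCT code; the test problem at the end is an illustrative assumption. The ever-growing history sums are what make the method a natural (and nontrivial) target for parallelization.

```python
import math

def frac_adams(f, alpha, y0, T, N):
    """Fractional Adams-Bashforth-Moulton predictor-corrector for the
    Caputo equation D^alpha y(t) = f(t, y), y(0) = y0, 0 < alpha <= 1.
    Sequential reference version; the history sums over j are the part
    a parallel implementation would distribute."""
    h = T / N
    t = [j * h for j in range(N + 1)]
    y = [y0] + [0.0] * N
    fy = [f(t[0], y0)] + [0.0] * N
    c1 = h**alpha / math.gamma(alpha + 1)   # predictor weight scale
    c2 = h**alpha / math.gamma(alpha + 2)   # corrector weight scale
    for n in range(N):
        # Predictor: product rectangle rule over the full history.
        pred = y0 + c1 * sum(
            ((n + 1 - j)**alpha - (n - j)**alpha) * fy[j] for j in range(n + 1)
        )
        # Corrector: product trapezoidal rule.
        s = (n**(alpha + 1) - (n - alpha) * (n + 1)**alpha) * fy[0]
        for j in range(1, n + 1):
            s += ((n - j + 2)**(alpha + 1) + (n - j)**(alpha + 1)
                  - 2 * (n - j + 1)**(alpha + 1)) * fy[j]
        y[n + 1] = y0 + c2 * (s + f(t[n + 1], pred))
        fy[n + 1] = f(t[n + 1], y[n + 1])
    return t, y

# Illustrative test problem (an assumption): D^0.5 y = -y, y(0) = 1.
ts, ys = frac_adams(lambda t, y: -y, 0.5, 1.0, T=1.0, N=100)
```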
33

Neural computation of all eigenpairs of a matrix with real eigenvalues

Perlepes, Serafim Theodore 01 January 1999 (has links)
No description available.
34

Scalable Algorithms for Delaunay Mesh Generation

Slatton, Andrew G. January 2014 (has links)
No description available.
35

HPC-based Parallel Algorithms for Generating Random Networks and Some Other Network Analysis Problems

Alam, Md Maksudul 06 December 2016 (has links)
The advancement of modern technologies has resulted in an explosive growth of complex systems, such as the Internet, biological, social, and various infrastructure networks, which have, in turn, contributed to the rise of massive networks. During the past decade, analyzing and mining these networks has become an emerging research area with many real-world applications. The most relevant problems in this area include collecting and managing networks, modeling and generating random networks, and developing network mining algorithms. In the era of big data, speed is no longer optional for the effective analysis of these massive systems; it is an absolute necessity. This motivates the need for parallel algorithms on modern high-performance computing (HPC) systems, including multi-core, distributed, and graphics processing unit (GPU) based systems. In this dissertation, we present distributed-memory parallel algorithms for generating massive random networks and a novel GPU-based algorithm for index searching. This dissertation is divided into two parts. In Part I, we present parallel algorithms for generating massive random networks using several widely-used models. We design and develop a novel parallel algorithm for generating random networks using the preferential-attachment model. This algorithm can generate networks with billions of edges in just a few minutes using a medium-sized computing cluster. We develop another parallel algorithm for generating random networks with a given sequence of expected degrees. We also design a new time- and space-efficient algorithmic method to generate random networks with any degree distribution. This method has been applied to generate random networks using other popular network models, such as the block two-level Erdos-Renyi and stochastic block models. Parallel algorithms for network generation pose many nontrivial challenges, such as dependency on edges, avoiding duplicate edges, and load balancing; we applied novel techniques to deal with these challenges. All of our algorithms scale very well to a large number of processors and provide almost linear speed-up. Dealing with a large number of networks collected from a variety of fields requires efficient management systems such as graph databases. Finding a record in those databases is critical and is typically the main performance bottleneck. In Part II of the dissertation, we develop a GPU-based parallel algorithm for index searching. Our algorithm achieves the fastest throughput ever reported in the literature for various benchmarks. / Ph. D.
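The abstract does not reproduce the parallel algorithms themselves; for orientation, a minimal sequential preferential-attachment (Barabási-Albert) generator is sketched below. It makes visible the edge dependency the author mentions: each new vertex samples its targets from a degree-weighted history built by all earlier vertices, which is exactly what a distributed-memory version must break apart. The function name and seeding choice are illustrative assumptions.

```python
import random

def preferential_attachment(n, m):
    """Sequential Barabasi-Albert generator: n vertices, each new vertex
    attaching to m existing vertices chosen with probability proportional
    to their current degree. Returns an edge list."""
    edges = []
    # 'repeated' holds each vertex once per incident edge, so uniform
    # sampling from it is exactly degree-proportional sampling.
    repeated = list(range(m))          # seed pool: m initial vertices
    for v in range(m, n):
        targets = set()
        while len(targets) < m:        # resample to avoid duplicate edges
            targets.add(random.choice(repeated))
        for u in targets:
            edges.append((v, u))
            repeated.extend((v, u))    # both endpoints gain one degree
    return edges

g = preferential_attachment(10_000, 3)   # ~30k edges, near-instant sequentially
```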
36

Mapping parallel graph algorithms to throughput-oriented architectures

McLaughlin, Adam 07 January 2016 (has links)
The stagnant performance of single-core processors, the increasing size of data sets, and the variety of structure in information have made the domain of parallel and high-performance computing especially crucial. Graphics Processing Units (GPUs) have recently become an exciting alternative to traditional CPU architectures for applications in this domain. Although GPUs are designed for rendering graphics, research has found that the GPU architecture is well-suited to algorithms that search and analyze unstructured, graph-based data, offering up to an order of magnitude greater memory bandwidth than their CPU counterparts. This thesis focuses on GPU graph analysis from the perspective that algorithms should be efficient on as many classes of graphs as possible, rather than being specialized to a specific class, such as social networks or road networks. Using betweenness centrality, a popular analytic used to find prominent entities of a network, as a motivating example, we show how parallelism, distributed computing, hybrid and on-line algorithms, and dynamic algorithms can all contribute to substantial improvements in the performance and energy-efficiency of these computations. We further generalize this approach and provide an abstraction that can be applied to a whole class of graph algorithms that require many simultaneous breadth-first searches. Finally, to show that our findings can be applied in real-world scenarios, we apply these techniques to the problem of verifying that a multiprocessor complies with its memory consistency model.
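Betweenness centrality, the motivating analytic, is conventionally computed with Brandes' algorithm: one breadth-first search plus a dependency back-propagation per source vertex. The sketch below is a minimal sequential Python version, assuming an adjacency-list dict; it shows the per-source structure that GPU implementations parallelize across and within, and is not code from the thesis.

```python
from collections import deque

def betweenness(adj):
    """Brandes' algorithm for unweighted graphs.
    adj: dict mapping every vertex to an iterable of its neighbours.
    One BFS per source counts shortest paths (sigma); a reverse sweep
    accumulates each vertex's dependency on those paths."""
    bc = {v: 0.0 for v in adj}
    for s in adj:
        sigma = {v: 0 for v in adj}; sigma[s] = 1
        dist = {v: -1 for v in adj}; dist[s] = 0
        preds = {v: [] for v in adj}
        order = []
        q = deque([s])
        while q:                      # BFS: the part run simultaneously per source
            v = q.popleft(); order.append(v)
            for w in adj[v]:
                if dist[w] < 0:
                    dist[w] = dist[v] + 1; q.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]; preds[w].append(v)
        delta = {v: 0.0 for v in adj}
        for w in reversed(order):     # dependency accumulation
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != s:
                bc[w] += delta[w]
    return bc
```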
37

Exploiting parallelism in decomposition methods for constraint satisfaction

Akatov, Dmitri January 2010 (has links)
Constraint Satisfaction Problems (CSPs) are NP-complete in general; however, there are many tractable subclasses that rely on restricting the structure of the underlying hypergraphs. It is a well-known fact, for instance, that CSPs whose underlying hypergraph is acyclic are tractable. Attempts to define “nearly acyclic” hypergraphs led to the definition of various hypergraph decomposition methods. An important member of this class is the hypertree decomposition method, introduced by Gottlob et al. It possesses the property that CSPs falling into this class can be solved efficiently, and that hypergraphs in this class can be recognized efficiently as well. Beyond polynomial tractability, complexity analysis has shown that both aforementioned problems lie in the low complexity class LOGCFL and are thus efficiently parallelizable. A parallel algorithm has been proposed for the “evaluation problem”; however, all algorithms for the “recognition problem” presented to date are sequential. The main contribution of this dissertation is an object-oriented programming library, including a task scheduler, which allows the parallelization of a whole range of computational problems fulfilling certain complexity-theoretic restrictions. This library merely requires the programmer to provide the implementation of several classes and methods representing a general alternating algorithm, while the mechanics of the task scheduler remain hidden. In particular, we use this library to create an efficient parallel algorithm which computes hypertree decompositions of a fixed width. Another result of a more theoretical nature is the definition of a new type of decomposition method, called balanced decompositions. Solving CSPs of bounded balanced width and recognizing such hypergraphs is only quasi-polynomial, yet still parallelizable to a certain extent. A complexity-theoretic analysis leads to the definition of a new complexity class hierarchy, called the DC-hierarchy, with the first class in this hierarchy, DC1, precisely capturing the complexity of solving CSPs of bounded balanced width.
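For intuition on the acyclic base case mentioned above: α-acyclicity of a hypergraph can be tested with the classical GYO reduction, sketched below in Python under an assumed edge-set representation. This illustrates the tractable class that hypertree decompositions generalize; it is not the dissertation's parallel recognition algorithm.

```python
def is_alpha_acyclic(edges):
    """GYO reduction: repeatedly (1) drop vertices occurring in exactly
    one hyperedge and (2) drop hyperedges contained in another hyperedge.
    The hypergraph is alpha-acyclic iff everything reduces away.
    edges: list of sets of vertices."""
    edges = [set(e) for e in edges]
    changed = True
    while changed:
        changed = False
        # Rule 1: remove vertices that belong to exactly one edge.
        for e in edges:
            lone = {v for v in e if sum(v in f for f in edges) == 1}
            if lone:
                e -= lone
                changed = True
        # Rule 2: remove edges subsumed by another edge (or emptied).
        for i, e in enumerate(edges):
            if not e or any(i != j and e <= f for j, f in enumerate(edges)):
                edges.pop(i)
                changed = True
                break
    return not edges

# A triangle of pairwise edges is cyclic; adding the full triple makes it acyclic.
assert not is_alpha_acyclic([{1, 2}, {2, 3}, {1, 3}])
assert is_alpha_acyclic([{1, 2}, {2, 3}, {1, 3}, {1, 2, 3}])
```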
38

Metaheuristická metóda mravčej kolónie pri riešení kombinatorických optimalizačných úloh / Solving the combinatorial optimization problems with the Ant Colony Optimization metaheuristic method

Chu, Andrej January 2005 (has links)
Ant Colony Optimization (ACO) belongs to the category of metaheuristic methods and has been developed quite recently. So far it has shown the capability to outperform other metaheuristic methods in solution quality. This work analyzes possible applications of the method to classical combinatorial optimization problems: the traveling salesman problem, the vehicle routing problem, the knapsack problem, the generalized assignment problem, and the maximal clique problem. It also presents practical experiments applying the method to several optimization problems and analyzes the time and memory complexity of the resulting algorithms. The last part of the work is dedicated to parallelizing the algorithm that resulted from applying the ACO method to the traveling salesman problem. It analyzes the crucial operations and data-synchronization issues and gives a practical example and demonstration of the parallelized version of the algorithm.
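As a concrete illustration of the method on the traveling salesman problem (the variant the author parallelized), below is a minimal sequential Ant System sketch in Python. The parameter defaults and function name are illustrative assumptions, not taken from the thesis; the docstring notes the independence between ants that the parallelization exploits.

```python
import random

def aco_tsp(dist, n_ants=20, n_iters=200, alpha=1.0, beta=3.0, rho=0.5, q=1.0):
    """Ant System for the TSP. dist: symmetric matrix of positive
    inter-city distances. Each ant builds a tour guided by
    pheromone^alpha * (1/distance)^beta; pheromone then evaporates by rho
    and is reinforced along completed tours. Ants are independent within
    an iteration -- the natural unit of parallelism; only the pheromone
    update needs synchronization."""
    n = len(dist)
    tau = [[1.0] * n for _ in range(n)]                    # pheromone trails
    best_tour, best_len = None, float("inf")
    for _ in range(n_iters):
        tours = []
        for _ in range(n_ants):
            tour = [random.randrange(n)]
            unvisited = set(range(n)) - {tour[0]}
            while unvisited:
                i = tour[-1]
                cand = list(unvisited)
                weights = [tau[i][j] ** alpha * (1.0 / dist[i][j]) ** beta
                           for j in cand]
                j = random.choices(cand, weights)[0]       # roulette-wheel step
                tour.append(j)
                unvisited.remove(j)
            length = sum(dist[tour[k]][tour[(k + 1) % n]] for k in range(n))
            tours.append((tour, length))
            if length < best_len:
                best_tour, best_len = tour, length
        tau = [[t * (1 - rho) for t in row] for row in tau]  # evaporation
        for tour, length in tours:                         # deposit on used edges
            for k in range(n):
                a, b = tour[k], tour[(k + 1) % n]
                tau[a][b] += q / length
                tau[b][a] += q / length
    return best_tour, best_len
```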
39

Aplicações de computação paralela em otimização contínua / Applications of parallel computing in continuous optimization

Abrantes, Ricardo Luiz de Andrade 22 February 2008 (has links)
In this work, we study some concepts related to the development of parallel programs, some ways of applying parallel computing in continuous optimization methods, and two methods that involve the use of optimization. The first method we present, called PUMA (Pointwise Unconstrained Minimization Approach), retrieves optical constants and thicknesses of thin films from transmittance data. The retrieval problem is modeled as an inverse problem and solved with the aid of an optimization method. Through the parallelization of PUMA, we made feasible the empirical retrieval of optical constants and thicknesses of structures composed of up to two superposed films. We report the results obtained and discuss the performance of the parallel version and the quality of the retrievals. The second method studied, called PACKMOL, builds initial configurations of molecules for molecular dynamics simulations. The problem of obtaining an initial configuration of molecules is modeled as a packing problem and solved with the aid of an optimization method. We developed a parallel version of PACKMOL and show the performance gains obtained with the parallelization.
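The packing formulation behind PACKMOL can be illustrated with a toy version: place equal spheres inside a box by minimizing a pairwise-overlap penalty with gradient descent. The sketch below is a deliberately simplified stand-in, with assumed names and parameters, not PACKMOL's actual objective or optimizer; its O(n²) pairwise loop is the kind of hot spot a parallel version distributes.

```python
import numpy as np

def pack_spheres(n, box, radius, steps=2000, lr=0.01, seed=0):
    """Toy packing: minimize the sum of squared pairwise overlaps of n
    equal spheres inside a cubic box, by plain gradient descent. The
    O(n^2) overlap term is the hot loop a parallel version distributes."""
    rng = np.random.default_rng(seed)
    x = rng.uniform(radius, box - radius, size=(n, 3))
    for _ in range(steps):
        diff = x[:, None, :] - x[None, :, :]          # pairwise displacements
        d = np.linalg.norm(diff, axis=-1)
        np.fill_diagonal(d, np.inf)                   # ignore self-pairs
        overlap = np.maximum(0.0, 2 * radius - d)     # penetration depth
        # Gradient of sum(overlap^2): push overlapping pairs apart.
        grad = -2 * np.sum((overlap / d)[..., None] * diff, axis=1)
        x -= lr * grad
        x = np.clip(x, radius, box - radius)          # stay inside the box
    return x

coords = pack_spheres(n=50, box=10.0, radius=0.5)     # illustrative sizes
```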
40

Deadline-ordered parallel iterative matching with QoS guarantee.

Lui, Hung Ngai. January 2000 (has links)
Thesis (M.Phil.)--Chinese University of Hong Kong, 2000. Includes bibliographical references (leaves 56-[59]). Abstracts in English and Chinese. Contents:
- Chapter 1: Introduction (p.1); 1.1 Thesis Overview (p.3)
- Chapter 2: Background & Related Work (p.4); 2.1 Scheduling problem in ATM switch (p.4); 2.2 Traffic scheduling in output-buffered switch (p.5); 2.3 Traffic scheduling in input-buffered switch (p.16)
- Chapter 3: Deadline-ordered Parallel Iterative Matching (DLPIM) (p.22); 3.1 Introduction (p.22); 3.2 Switch model (p.23); 3.3 DLPIM (p.24); 3.3.1 Motivation (p.24); 3.3.2 Algorithm (p.26); 3.3.3 An example of DLPIM (p.28); 3.4 Simulation (p.30)
- Chapter 4: DLPIM with static scheduling algorithm (p.41); 4.1 Introduction (p.41); 4.2 Static scheduling algorithm (p.42); 4.3 DLPIM with static scheduling algorithm (p.48); 4.4 An example of DLPIM with static scheduling algorithm (p.50)
- Chapter 5: Conclusion (p.54)
- Bibliography (p.56)
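A rough reconstruction of the request-grant-accept pattern of parallel iterative matching, with a deadline-ordered grant/accept policy in the spirit of DLPIM, is sketched below. This is inferred from the standard PIM literature and the thesis title, not taken from the thesis itself; all names and the data layout are assumptions.

```python
def dlpim_iteration(requests, matched_in, matched_out):
    """One iteration of deadline-ordered parallel iterative matching.
    requests: dict (input, output) -> earliest cell deadline in that VOQ.
    Unmatched outputs grant the requesting input with the earliest
    deadline; unmatched inputs accept their earliest-deadline grant.
    (Classic PIM chooses uniformly at random in both phases instead.)"""
    # Grant phase: each free output picks among requests from free inputs.
    grants = {}       # input -> list of (deadline, output)
    by_output = {}
    for (i, o), dl in requests.items():
        if i not in matched_in and o not in matched_out:
            by_output.setdefault(o, []).append((dl, i))
    for o, reqs in by_output.items():
        dl, i = min(reqs)
        grants.setdefault(i, []).append((dl, o))
    # Accept phase: each input accepts its earliest-deadline grant.
    for i, gs in grants.items():
        dl, o = min(gs)
        matched_in[i] = o
        matched_out[o] = i
    return matched_in, matched_out

# Iterate until no new matches are made (a few iterations usually suffice).
reqs = {(0, 0): 3, (0, 1): 1, (1, 0): 2, (2, 1): 5}
m_in, m_out = dlpim_iteration(reqs, {}, {})
```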
