  • About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
51

Algorithm Design Using Spectral Graph Theory

Peng, Richard 01 August 2013
Spectral graph theory is the interplay between linear algebra and combinatorial graph theory. Laplace’s equation and its discrete form, the Laplacian matrix, appear ubiquitously in mathematical physics. Due to the recent discovery of very fast solvers for these equations, they are also becoming increasingly useful in combinatorial optimization, computer vision, computer graphics, and machine learning. In this thesis, we develop highly efficient and parallelizable algorithms for solving linear systems involving graph Laplacian matrices. These solvers can also be extended to symmetric diagonally dominant matrices and M-matrices, both of which are closely related to graph Laplacians. Our algorithms build upon two decades of progress on combinatorial preconditioning, which connects numerical and combinatorial algorithms through spectral graph theory. They in turn rely on tools from numerical analysis, metric embeddings, and random matrix theory. We give two solver algorithms that take diametrically opposite approaches. The first is motivated by combinatorial algorithms, and aims to gradually break the problem into several smaller ones. It represents a major simplification over previous solver constructions, and has a theoretical running time comparable to sorting. The second is motivated by numerical analysis, and aims to rapidly improve the algebraic connectivity of the graph. It is the first highly efficient solver for Laplacian linear systems that parallelizes almost completely. Our results improve the performance of applications of fast linear system solvers ranging from scientific computing to algorithmic graph theory. We also show that these solvers can be used to address broad classes of image processing tasks, and give some preliminary experimental results.
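As a small illustration of the kind of linear system the abstract above refers to, the following sketch builds the Laplacian of an unweighted graph and solves Lx = b with plain conjugate gradient. This is a textbook method swapped in for illustration, not either of the solvers developed in the thesis; the function names are hypothetical, and b is assumed to sum to zero so that the singular system is consistent.

```python
def laplacian(n, edges):
    """Dense Laplacian (degree matrix minus adjacency matrix) of an unweighted graph."""
    L = [[0.0] * n for _ in range(n)]
    for u, v in edges:
        L[u][u] += 1.0
        L[v][v] += 1.0
        L[u][v] -= 1.0
        L[v][u] -= 1.0
    return L


def matvec(A, x):
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]


def dot(x, y):
    return sum(a * b for a, b in zip(x, y))


def conjugate_gradient(A, b, tol=1e-10, max_iter=1000):
    """Plain conjugate gradient started from zero (assumes b sums to zero)."""
    x = [0.0] * len(b)
    r = list(b)          # residual b - A x, with x = 0
    p = list(r)          # search direction
    rs = dot(r, r)
    for _ in range(max_iter):
        if rs ** 0.5 < tol:
            break
        Ap = matvec(A, p)
        alpha = rs / dot(p, Ap)
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, Ap)]
        rs_new = dot(r, r)
        p = [ri + (rs_new / rs) * pi for ri, pi in zip(r, p)]
        rs = rs_new
    return x


# Path graph 0-1-2-3 with one unit of flow injected at node 0 and removed at node 3.
L = laplacian(4, [(0, 1), (1, 2), (2, 3)])
x = conjugate_gradient(L, [1.0, 0.0, 0.0, -1.0])
```

On this path graph the solution can be read as node potentials: each edge carries one unit of flow, so neighbouring potentials differ by one.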
52

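Análisis de rendimiento y optimización de algoritmos paralelos Best-First Search sobre multicore y cluster de multicore / Performance analysis and optimization of parallel Best-First Search algorithms on multicore and multicore clusters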

Sanz, Victoria María January 2015
The general objective of this thesis is the research and development of parallel best-first search algorithms on graphs for multicore and multicore-cluster architectures that improve on existing ones and are used to solve combinatorial optimization and planning problems, accompanied by a performance analysis (speedup, efficiency, scalability) of those algorithms. The topic is of current interest because of the computational complexity of these search algorithms and the possibilities offered by the architectures mentioned. The algorithms presented in this thesis can be applied to real problems such as optimal route planning, automatic navigation of a robot or vehicle, and optimal sequence alignment, among others. The derived research topics are many and concern the parallelization of algorithms on (a) shared-memory architectures, such as multicores, (b) distributed-memory architectures, such as clusters, and (c) hybrid architectures, such as multicore clusters. The contribution of the thesis is the development of two original parallel best-first search algorithms, one suited to shared-memory machines (multicore) and the other to distributed-memory machines (cluster), both based on the HDA* (Hash Distributed A*) algorithm and incorporating original techniques that optimize their performance. A performance analysis of the developed algorithms is also presented as the workload and the underlying parallel architecture scale. Finally, the memory consumed by both algorithms and the performance achieved when they run on a multicore machine are compared; these analyses are original in the area.
The results indicate that converting HDA* into a hybrid application would be beneficial when the underlying architecture is a multicore cluster, which lays the groundwork for this hybrid algorithm.
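The core idea of HDA* (Hash Distributed A*), on which the abstract above builds, is that a hash function assigns each search state to exactly one worker, which keeps that state's open-list entry and its duplicate-detection record. The sketch below simulates that ownership rule sequentially in a single process. It is a simplified illustration under stated assumptions (round-robin workers, no message passing, an admissible heuristic), not the algorithms developed in the thesis; `owner` and `hda_star_simulated` are hypothetical names.

```python
import heapq
import zlib


def owner(state, num_workers):
    """Deterministically map a state to the worker that owns it."""
    return zlib.crc32(repr(state).encode()) % num_workers


def hda_star_simulated(start, goal, neighbors, heuristic, num_workers=4):
    """Round-robin simulation of hash-distributed A*; returns the optimal cost."""
    open_lists = [[] for _ in range(num_workers)]  # one priority queue per worker
    best_g = [{} for _ in range(num_workers)]      # one duplicate table per worker
    w0 = owner(start, num_workers)
    best_g[w0][start] = 0
    heapq.heappush(open_lists[w0], (heuristic(start), 0, start))
    incumbent = float("inf")                       # cost of the best goal path so far

    while True:
        progressed = False
        for w in range(num_workers):
            queue = open_lists[w]
            # A worker idles when it has nothing that could beat the incumbent.
            if not queue or queue[0][0] >= incumbent:
                continue
            f, g, state = heapq.heappop(queue)
            progressed = True
            if g > best_g[w].get(state, float("inf")):
                continue                           # stale entry, a better path is known
            if state == goal:
                incumbent = min(incumbent, g)
                continue
            for nxt, cost in neighbors(state):
                ng = g + cost
                dest = owner(nxt, num_workers)     # "send" the successor to its owner
                if ng < best_g[dest].get(nxt, float("inf")):
                    best_g[dest][nxt] = ng
                    heapq.heappush(open_lists[dest], (ng + heuristic(nxt), ng, nxt))
        if not progressed:
            return incumbent
```

In a real implementation the push into another worker's queue would be an asynchronous message, and termination would need a distributed detection protocol; the simulation sidesteps both by running every worker in one loop.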
53

Integrated compiler optimizations for tensor contractions

Gao, Xiaoyang, January 2008
Thesis (Ph. D.)--Ohio State University, 2008. / Title from first page of PDF file. Includes bibliographical references (p. 140-144).
54

Architectures and algorithms for high performance switching

Prakash, Amit, Aziz, Adnan, January 2004
Thesis (Ph. D.)--University of Texas at Austin, 2004. / Supervisor: Adnan Aziz. Vita. Includes bibliographical references. Also available from UMI.
55

Parallelization of light scattering spectroscopy and its integration with computational grid environments

Paladugula, Jithendar. January 2004
Thesis (M.S.)--University of Florida, 2004. / Title from title page of source document. Document formatted into pages; contains 74 pages. Includes vita. Includes bibliographical references.
56

Performance of MIMO space-time coding algorithms on a parallel DSP test platform

Neal, Beau C., January 2007
Thesis (M.S.)--Brigham Young University. Dept. of Electrical and Computer Engineering, 2007. / Includes bibliographical references (p. 119-121).
57

Optimal dimensional synthesis of planar parallel manipulators with respect to workspaces

Hay, Alexander Morrison. January 2004
Thesis (Ph.D.(Mechanical Engineering))--University of Pretoria, 2003. / Summaries in Afrikaans and English. Includes bibliographical references.
58

Développement de schémas de découplage pour la résolution de systèmes dynamiques sur architecture de calcul distribuée / Development of decoupled numerical scheme in solving dynamical systems on parallel computing architecture

Pham, Duc Toan 30 September 2010
This thesis studies methods for parallelizing the solution of dynamical systems by decoupling. Many numerical applications today lead to large dynamical systems that require suitable parallelization methods to be solved on multiprocessor machines. Our goal is to find a numerical method that is both consistent and stable and that reduces the time of the numerical solution. The first approach decouples the dynamical system into subsystems containing independent subsets of variables and replaces the coupling terms with polynomial extrapolation. Such a method was introduced under the name C(p, q, j) scheme; we improve this scheme by making it possible to use adaptive time steps. However, our study shows that this decoupling method satisfies the required numerical properties only under very strict conditions, and therefore cannot be applied to stiff problems with strong coupling between the subsystems. To address the decoupling of strongly coupled systems, we introduce a second line of research whose main tool is model order reduction. The idea is to replace the coupling between the subsets of variables of the system with their reduced-order representations. The resulting subsystems can be distributed over a parallel computing architecture. Our analysis of the resulting decoupling scheme leads us to define a mathematical criterion for updating the reduced bases between subsystems. The model order reduction method used is based on the Proper Orthogonal Decomposition (POD).
However, because the data required to build the reduced basis are not available a priori, we propose an algorithm that builds the reduced basis incrementally during the simulation, so that it represents as much as possible of the solution dynamics over the simulation interval. We applied the proposed method to several dynamical systems, including a stiff ODE system derived from a PDE and an ODE system derived from the Navier-Stokes equations. The results show the advantage of the decoupling algorithm based on order reduction: the numerical solutions are obtained with good accuracy compared with those of a classical solution method, while performing very well on parallel computers, depending on the number of subsystems defined.
59

Parallel algorithms of timetable generation / Parallella algoritmer för att generera scheman.

Antkowiak, Łukasz January 2013
Context: Most school timetabling problems belong to the class of NP-hard problems. Their complexity and practical value make this kind of problem interesting for parallel computing. Objectives: This paper focuses on the Class-Teacher problem with weighted time slots and proves that it is NP-complete. A branch and bound scheme and two approaches to distributing simulated annealing are proposed. The described methods are evaluated empirically in an elementary school computer laboratory. Methods: Simulated annealing implementations described in the literature are adapted to the problem and prepared for execution in distributed systems. The empirical evaluation uses real data from a Polish elementary school. Results: The proposed branch and bound scheme scales nearly logarithmically with the number of nodes in the computing cluster. The proposed parallel simulated annealing models tend to increase solution quality. Conclusions: Despite a significant increase in computing power, school computer laboratories are still unprepared for heavy computation. The proposed branch and bound method is infeasible for real instances. The Parallel Moves approach tends to provide better solutions at the beginning of execution, but the Multiple Independent Runs approach overtakes it after some time.
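The Multiple Independent Runs approach named in the abstract above can be sketched as several simulated annealing runs with different random seeds, keeping the best result. The code below is a minimal sequential illustration, not the thesis implementation: the toy lessons, slot weights, cooling parameters, and all function names are assumptions made up for the example, and in a distributed setting each seed would run on its own machine.

```python
import math
import random

# Hypothetical toy instance: (class, teacher) pairs and a weight per time slot.
LESSONS = [("1A", "Smith"), ("1A", "Jones"), ("1B", "Smith"), ("1B", "Jones")]
SLOT_WEIGHT = [0, 1, 2, 3]


def timetable_cost(slots):
    """Sum of slot weights plus a large penalty per class or teacher clash."""
    penalty = sum(SLOT_WEIGHT[s] for s in slots)
    for i in range(len(slots)):
        for j in range(i + 1, len(slots)):
            same_slot = slots[i] == slots[j]
            shared = LESSONS[i][0] == LESSONS[j][0] or LESSONS[i][1] == LESSONS[j][1]
            if same_slot and shared:
                penalty += 100
    return penalty


def move_one_lesson(slots, rng):
    """Neighbour move: reassign one randomly chosen lesson to a random slot."""
    new = list(slots)
    new[rng.randrange(len(new))] = rng.randrange(len(SLOT_WEIGHT))
    return tuple(new)


def simulated_annealing(cost, neighbor, initial, seed,
                        steps=2000, t0=2.0, cooling=0.995):
    rng = random.Random(seed)
    current, current_cost = initial, cost(initial)
    best, best_cost = current, current_cost
    temperature = t0
    for _ in range(steps):
        candidate = neighbor(current, rng)
        delta = cost(candidate) - current_cost
        # Always accept improvements; accept worse moves with Boltzmann probability.
        if delta <= 0 or rng.random() < math.exp(-delta / temperature):
            current, current_cost = candidate, current_cost + delta
            if current_cost < best_cost:
                best, best_cost = current, current_cost
        temperature *= cooling
    return best, best_cost


def multiple_independent_runs(cost, neighbor, initial, seeds):
    """Run one annealing chain per seed and keep the cheapest result."""
    results = [simulated_annealing(cost, neighbor, initial, s) for s in seeds]
    return min(results, key=lambda r: r[1])


best_slots, best_cost = multiple_independent_runs(
    timetable_cost, move_one_lesson, (0, 0, 0, 0), seeds=range(4))
```

Because the runs share nothing, they need no communication until the final comparison, which is the practical appeal of this scheme on loosely coupled machines.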
60

Bibliotheken zur Entwicklung paralleler Algorithmen / Libraries for the development of parallel algorithms

Haase, G., Hommel, T., Meyer, A., Pester, M. 30 October 1998
The purpose of this paper is to supply a summary of library subroutines and functions for parallel MIMD computers. The subroutines were developed at the University of Chemnitz over the last five years. They cover vector operations, inter-processor communication, and simple graphic output to workstations. One of the most valuable features is the machine independence of the communication subroutines proposed in this paper for a hypercube topology of the parallel processors (except for a kernel of only two primitive system-dependent operations). They were implemented and tested on different hardware and operating systems, including transputer, nCube, KSR, and PVM. The vector subroutines are optimized through the use of the C language and unrolled loops (BLAS1-like). The paper includes hints for using the libraries with both Fortran and C programs.
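To show why a hypercube topology suits collective communication of the kind the abstract above mentions, the sketch below simulates a global sum by dimension exchange: in step d every node swaps its partial sum with the neighbour whose index differs in bit d. It is a single-process Python illustration with a hypothetical function name, not the API of the library described in the paper (which targets Fortran and C).

```python
def hypercube_allreduce_sum(values):
    """Simulate a global sum on a hypercube; values[i] is node i's local value."""
    p = len(values)
    if p == 0 or p & (p - 1):
        raise ValueError("number of processors must be a power of two")
    current = list(values)
    dimension = 0
    while (1 << dimension) < p:
        # Each node adds the partial sum held by its neighbour across this dimension.
        current = [current[node] + current[node ^ (1 << dimension)]
                   for node in range(p)]
        dimension += 1
    return current  # after log2(p) steps every node holds the full sum
```

With eight nodes, three exchange steps are enough for every node to hold the total, compared with seven steps for a naive sequential accumulation.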
