Global ETD Search

31	Méthodes hybrides pour la résolution de grands systèmes linéaires creux sur calculateurs parallèles / The solution of large sparse linear systems on parallel computers using a hybrid implementation of the block Cimmino method Zenadi, Mohamed 18 December 2013 (has links) Nous nous intéressons à la résolution en parallèle de système d’équations linéaires creux et de large taille. Le calcul de la solution d’un tel type de système requiert un grand espace mémoire et une grande puissance de calcul. Il existe deux principales méthodes de résolution de systèmes linéaires. Soit la méthode est directe et de ce fait est rapide et précise, mais consomme beaucoup de mémoire. Soit elle est itérative, économe en mémoire, mais assez lente à atteindre une solution de qualité suffisante. Notre travail consiste à combiner ces deux techniques pour créer un solveur hybride efficient en consommation mémoire tout en étant rapide et robuste. Nous essayons ensuite d’améliorer ce solveur en introduisant une nouvelle méthode pseudo directe qui contourne certains inconvénients de la méthode précédente. Dans les premiers chapitres nous examinons les méthodes de projections par lignes, en particulier la méthode Cimmino en bloc, certains de leurs aspects numériques et comment ils affectent la convergence. Ensuite, nous analyserons l’accélération de ces techniques avec la méthode des gradients conjugués et comment cette accélération peut être améliorée avec une version en bloc du gradient conjugué. Nous regarderons ensuite comment le partitionnement du système linéaire affecte lui aussi la convergence et comment nous pouvons améliorer sa qualité. Finalement, nous examinerons l’implantation en parallèle du solveur hybride, ses performances ainsi que les améliorations possible. Les deux derniers chapitres introduisent une amélioration à ce solveur hybride, en améliorant les propriétés numériques du système linéaire, de sorte à avoir une convergence en une seule itération et donc un solveur pseudo direct. Nous commençons par examiner les propriétés numériques du système résultants, analyser la solution parallèle et comment elle se comporte face au solveur hybride et face à un solveur direct. Finalement, nous introduisons de possible amélioration au solveur pseudo direct. Ce travail a permis d’implanter un solveur hybride "ABCD solver" (Augmented Block Cimmino Distributed solver) qui peut soit fonctionner en mode itératif ou en mode pseudo direct. / We are interested in solving large sparse systems of linear equations in parallel. Computing the solution of such systems requires a large amount of memory and computational power. The two main ways to obtain the solution are direct and iterative approaches. The former achieves this goal fast but with a large memory footprint while the latter is memory friendly but can be slow to converge. In this work we try first to combine both approaches to create a hybrid solver that can be memory efficient while being fast. Then we discuss a novel approach that creates a pseudo-direct solver that compensates for the drawback of the earlier approach. In the first chapters we take a look at row projection techniques, especially the block Cimmino method and examine some of their numerical aspects and how they affect the convergence. We then discuss the acceleration of convergence using conjugate gradients and show that a block version improves the convergence. Next, we see how partitioning the linear system affects the convergence and show how to improve its quality. We finish by discussing the parallel implementation of the hybrid solver, discussing its performance and seeing how it can be improved. The last two chapters focus on an improvement to this hybrid solver. We try to improve the numerical properties of the linear system so that we converge in a single iteration which results in a pseudo-direct solver. We first discuss the numerical properties of the new system, see how it works in parallel and see how it performs versus the iterative version and versus a direct solver. We finally consider some possible improvements to the solver. This work led to the implementation of a hybrid solver, our "ABCD solver" (Augmented Block Cimmino Distributed solver), that can either work in a fully iterative mode or in a pseudo-direct mode. Matrices creuses Méthodes hybrides Partitionnement Hypergraphes Calcul haute performance Calcul parallèle Sparse matrices Iterative methods for linear systems Direct methods for linear systems Hybrid methods Partitioning Hypergraphs High-performance computing Parallel computing
32	Solveurs multifrontaux exploitant des blocs de rang faible : complexité, performance et parallélisme / Block low-rank multifrontal solvers : complexity, performance, and scalability Mary, Théo 24 November 2017 (has links) Nous nous intéressons à l'utilisation d'approximations de rang faible pour réduire le coût des solveurs creux directs multifrontaux. Parmi les différents formats matriciels qui ont été proposés pour exploiter la propriété de rang faible dans les solveurs multifrontaux, nous nous concentrons sur le format Block Low-Rank (BLR) dont la simplicité et la flexibilité permettent de l'utiliser facilement dans un solveur multifrontal algébrique et généraliste. Nous présentons différentes variantes de la factorisation BLR, selon comment les mises à jour de rang faible sont effectuées, et comment le pivotage numérique est géré. D'abord, nous étudions la complexité théorique du format BLR qui, contrairement à d'autres formats comme les formats hiérarchiques, était inconnue jusqu'à présent. Nous prouvons que la complexité théorique de la factorisation multifrontale BLR est asymptotiquement inférieure à celle du solveur de rang plein. Nous montrons ensuite comment les variantes BLR peuvent encore réduire cette complexité. Nous étayons nos bornes de complexité par une étude expérimentale. Après avoir montré que les solveurs multifrontaux BLR peuvent atteindre une faible complexité, nous nous intéressons au problème de la convertir en gains de performance réels sur les architectures modernes. Nous présentons d'abord une factorisation BLR multithreadée, et analysons sa performance dans des environnements multicœurs à mémoire partagée. Nous montrons que les variantes BLR sont cruciales pour exploiter efficacement les machines multicœurs en améliorant l'intensité arithmétique et la scalabilité de la factorisation. Nous considérons ensuite à la factorisation BLR sur des architectures à mémoire distribuée. Les algorithmes présentés dans cette thèse ont été implémentés dans le solveur MUMPS. Nous illustrons l'utilisation de notre approche dans trois applications industrielles provenant des géosciences et de la mécanique des structures. Nous comparons également notre solveur avec STRUMPACK, basé sur des approximations Hierarchically Semi-Separable. Nous concluons cette thèse en rapportant un résultat sur un problème de très grande taille (130 millions d'inconnues) qui illustre les futurs défis posés par le passage à l'échelle des solveurs multifrontaux BLR. / We investigate the use of low-rank approximations to reduce the cost of sparse direct multifrontal solvers. Among the different matrix representations that have been proposed to exploit the low-rank property within multifrontal solvers, we focus on the Block Low-Rank (BLR) format whose simplicity and flexibility make it easy to use in a general purpose, algebraic multifrontal solver. We present different variants of the BLR factorization, depending on how the low-rank updates are performed and on the constraints to handle numerical pivoting. We first investigate the theoretical complexity of the BLR format which, unlike other formats such as hierarchical ones, was previously unknown. We prove that the theoretical complexity of the BLR multifrontal factorization is asymptotically lower than that of the full-rank solver. We then show how the BLR variants can further reduce that complexity. We provide an experimental study with numerical results to support our complexity bounds. After proving that BLR multifrontal solvers can achieve a low complexity, we turn to the problem of translating that low complexity in actual performance gains on modern architectures. We first present a multithreaded BLR factorization, and analyze its performance in shared-memory multicore environments on a large set of real-life problems. We put forward several algorithmic properties of the BLR variants necessary to efficiently exploit multicore systems by improving the arithmetic intensity and the scalability of the BLR factorization. We then move on to the distributed-memory BLR factorization, for which additional challenges are identified and addressed. The algorithms presented throughout this thesis have been implemented within the MUMPS solver. We illustrate the use of our approach in three industrial applications coming from geosciences and structural mechanics. We also compare our solver with the STRUMPACK package, based on Hierarchically Semi-Separable approximations. We conclude this thesis by reporting results on a very large problem (130 millions of unknowns) which illustrates future challenges posed by BLR multifrontal solvers at scale. Matrices creuses Systèmes linéaires creux Méthodes directes Méthode multifrontale Approximations de rang-faible Calcul haute performance Calcul parallèle Sparse matrices Direct methods for linear systems Multifrontal method Low-rank approximations High-performance computing Parallel computing Partial differential equations

Search results

Méthodes hybrides pour la résolution de grands systèmes linéaires creux sur calculateurs parallèles / The solution of large sparse linear systems on parallel computers using a hybrid implementation of the block Cimmino method

Solveurs multifrontaux exploitant des blocs de rang faible : complexité, performance et parallélisme / Block low-rank multifrontal solvers : complexity, performance, and scalability