41 |
A parallel iterative solver for large sparse linear systems enhanced with randomization and GPU accelerator, and its resilience to soft errors / Un solveur parallèle itératif pour les grands systèmes linéaires creux, amélioré par la randomisation et l'utilisation des accélérateurs GPU, et sa résilience aux fautes logicielles
Jamal, Aygul 28 September 2017
In this PhD thesis, we address three challenges faced by linear algebra solvers in the perspective of future exascale systems: accelerating convergence using innovative techniques at the algorithmic level, taking advantage of GPU (Graphics Processing Unit) accelerators to enhance the performance of computations on hybrid CPU/GPU systems, and evaluating the impact of errors in the context of an increasing level of parallelism in supercomputers. We are interested in studying methods that accelerate the convergence and reduce the execution time of iterative solvers for large sparse linear systems. The solver specifically considered in this work is the parallel Algebraic Recursive Multilevel Solver (pARMS), a distributed-memory parallel solver based on Krylov subspace methods.
First, we integrate a randomization technique referred to as Random Butterfly Transformations (RBT), which has been successfully applied to remove the cost of pivoting in the solution of dense linear systems. Our objective is to apply this method in the ARMS preconditioner to solve more efficiently the last Schur complement system in the application of the recursive multilevel process in pARMS. The experimental results on some matrices from the Davis collection show an improvement in convergence and accuracy over existing implementations. Because the last Schur complement is very large for some test problems, which raises memory concerns, we also propose to use a sparse variant of RBT followed by a sparse direct solver (SuperLU), resulting in an improvement of the execution time.
Then, we explain how a non-intrusive approach can be applied to implement GPU computing in the pARMS solver, more specifically for the local preconditioning phase, which represents a significant part of the time to compute the solution. We compare the CPU-only and hybrid CPU/GPU variants of the solver on several test problems coming from physical applications. The performance results of the hybrid CPU/GPU solver using the ARMS preconditioning combined with RBT, or the ILU(0) preconditioning, show a performance gain of up to 30% on the test problems considered in our experiments.
Finally, we study the effect of soft faults on the convergence of the flexible GMRES (FGMRES) algorithm, which is commonly used to solve the preconditioned system in pARMS. The test problem in our experiments is an elliptic PDE problem on a regular grid. We consider two types of preconditioners: an incomplete LU factorization with dual threshold (ILUT), and the ARMS preconditioner combined with RBT randomization. We consider two soft fault models, in which we perturb the matrix-vector multiplication and the application of the preconditioner respectively, and we compare their potential impact on the convergence of the solver.
|
42 |
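Parallelism and robustness of hybrid solvers for large linear systems: application to optimization in fluid dynamics / Parallélisme et robustesse des solveurs hybrides pour grands systèmes linéaires : Application à l'optimisation en dynamique des fluides
Nuentsa Wakam, Désiré 07 December 2011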
This thesis presents a set of routines for solving large sparse linear systems on parallel architectures. The proposed approaches follow a hybrid scheme that combines direct and iterative methods through domain decomposition techniques. In such a scheme, the initial problem is divided into subproblems by partitioning the graph of the coefficient matrix of the system. Schwarz methods are then used as preconditioners for Krylov methods based on GMRES. We first consider the scheme that uses a multiplicative Schwarz preconditioner. We define two levels of parallelism: the first is associated with preconditioned GMRES on the global system, and the second is used to solve the subsystems with a parallel direct method. We show that this splitting guarantees a certain robustness of the method by limiting the total number of subdomains. In addition, this approach makes more efficient use of all the processors allocated on a compute node. We then turn to the convergence and parallelism of GMRES, which is used as the global accelerator in the hybrid approach. The general observation is that the total number of iterations, and hence the overall computing time, increases with the number of partitions. To reduce this effect, we propose several versions of GMRES based on deflation. The proposed deflation techniques use either an adaptive preconditioner or an augmented basis. We show the usefulness of these approaches through their ability to limit the influence of the choice of a suitable Krylov basis size, and thus to avoid stagnation of the global hybrid method. Moreover, they considerably reduce the memory cost, the computing time and the number of messages exchanged between the processors.
The performance of these methods is demonstrated numerically on large linear systems arising from several application fields, mainly from the optimization of certain design parameters in fluid dynamics.
|
43 |
Numerical tools for the large eddy simulation of incompressible turbulent flows and application to flows over re-entry capsules / Outils numériques pour la simulation des grandes échelles d'écoulements incompressibles turbulents et application aux écoulements autour de capsules de rentrée
Rasquin, Michel 29 April 2010
The context of this thesis is the numerical simulation of turbulent flows at moderate Reynolds numbers and the improvement of the capabilities of an in-house 3D unsteady and incompressible flow solver called SFELES to simulate such flows.
In addition to this abstract, this thesis includes five other chapters.
The second chapter of this thesis presents the numerical methods implemented in the two CFD solvers used as part of this work, namely SFELES and PHASTA.
The third chapter concentrates on the implementation of a new library called FlexMG. This library allows the use of various types of iterative solvers preconditioned by algebraic multigrid methods, which require much less memory to solve linear systems than a direct sparse LU solver available in SFELES. Multigrid is an iterative procedure that relies on a series of increasingly coarser approximations of the original 'fine' problem. The underlying concept is the following: low wavenumber errors on fine grids become high wavenumber errors on coarser levels, which can be effectively removed by applying fixed-point methods on coarser levels.
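The multigrid idea described above can be illustrated with a small sketch. It is not taken from FlexMG: it assumes a 1D Poisson matrix tridiag(-1, 2, -1) and a weighted Jacobi smoother with weight 2/3, and all function names are hypothetical. The same smooth error mode is sampled on a 63-point fine grid and on a 31-point coarse grid, and the fixed-point smoother is applied on each.

```python
import math

def jacobi_error_sweeps(error, sweeps, omega=2.0 / 3.0):
    """Propagate an error vector through weighted Jacobi sweeps.

    Assumes the 1D Poisson matrix tridiag(-1, 2, -1) with zero boundary
    values, so each sweep computes e <- e - omega * D^{-1} A e.
    """
    e = list(error)
    n = len(e)
    for _ in range(sweeps):
        updated = []
        for i in range(n):
            left = e[i - 1] if i > 0 else 0.0
            right = e[i + 1] if i < n - 1 else 0.0
            updated.append(e[i] - omega * (2.0 * e[i] - left - right) / 2.0)
        e = updated
    return e

def sine_mode(n, k):
    """Sample sin(k*pi*x) at the n interior points of a uniform grid on (0, 1)."""
    return [math.sin(k * math.pi * (i + 1) / (n + 1)) for i in range(n)]

def max_norm(v):
    return max(abs(x) for x in v)

# The same continuous error sin(16*pi*x), seen on two grids.
# On the fine grid it is a fairly smooth mode (damping factor about 0.8 per sweep);
# on the coarse grid it is the middle of the spectrum (damping factor 1/3 per sweep).
fine_error = jacobi_error_sweeps(sine_mode(63, 16), sweeps=5)
coarse_error = jacobi_error_sweeps(sine_mode(31, 16), sweeps=5)

print("fine grid  :", max_norm(fine_error))
print("coarse grid:", max_norm(coarse_error))
```

After five sweeps the error is still a sizeable fraction of its initial amplitude on the fine grid but has almost vanished on the coarse grid, which is why a hierarchy of coarser levels complements a simple fixed-point smoother.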
Two families of algebraic multigrid preconditioners have been implemented in FlexMG, namely smoothed aggregation-type and non-nested finite element-type. Unlike pure gridless multigrid, both of these families use the information contained in the initial fine mesh. A hierarchy of coarse meshes is also needed for the non-nested finite element-type multigrid so that our approaches can be considered as hybrid. Our aggregation-type multigrid is smoothed with either a constant or a linear least-squares fitting function, whereas the non-nested finite element-type multigrid is already smooth by construction. All these multigrid preconditioners are tested as stand-alone solvers or coupled with a GMRES (Generalized Minimal RESidual) method. After analyzing the accuracy of the solutions obtained with our solvers on a typical test case in fluid mechanics (unsteady flow past a circular cylinder at low Reynolds number), their performance in terms of convergence rate, computational speed and memory consumption is compared with the performance of a direct sparse LU solver as a reference. Finally, the importance of using smooth interpolation operators is also underlined in this work.
The fourth chapter is devoted to the study of subgrid scale models for the large eddy simulation (LES) of turbulent flows.
It is well known that turbulence features a cascade process by which kinetic energy is transferred from the large turbulent scales to the smaller ones. Below a certain size, the smallest structures are dissipated into heat because of the effect of the viscous term in the Navier-Stokes equations.
In the classical formulation of LES models, all the resolved scales are used to model the contribution of the unresolved scales. However, most of the energy exchanges between scales are local, which means that the energy of the unresolved scales derives mainly from the energy of the small resolved scales.
In this fourth chapter, constant-coefficient-based Smagorinsky and WALE models are considered under different formulations. This includes a classical version of both the Smagorinsky and WALE models and several scale-separation formulations, where the resolved velocity field is filtered in order to separate the small turbulent scales from the large ones. From this separation of turbulent scales, the strain rate tensor and/or the eddy viscosity of the subgrid scale model is computed from the small resolved scales only. One important advantage of these scale-separation models is that the dissipation they introduce through their subgrid scale stress tensor is better controlled compared to their classical version, where all the scales are taken into account without any filtering. More precisely, the filtering operator (based on a top hat filter in this work) allows the decomposition u' = u - ubar, where u is the resolved velocity field (large and small resolved scales), ubar is the filtered velocity field (large resolved scales) and u' is the small resolved scales field.
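The decomposition u' = u - ubar can be sketched on a toy signal. The example below is only an illustration under stated assumptions: a periodic 1D grid, a discrete top-hat (box) filter of five points, and a synthetic field made of one large-scale and one small-scale sine wave. The function name top_hat_filter and all parameters are hypothetical and do not come from SFELES or PHASTA.

```python
import math

def top_hat_filter(u, half_width=2):
    """Periodic 1D top-hat filter: plain average over 2*half_width + 1 points."""
    n = len(u)
    width = 2 * half_width + 1
    return [
        sum(u[(i + j) % n] for j in range(-half_width, half_width + 1)) / width
        for i in range(n)
    ]

n = 64
x = [2.0 * math.pi * i / n for i in range(n)]

# Synthetic "resolved" field: a large scale plus a weaker small scale.
large = [math.sin(xi) for xi in x]
u = [math.sin(xi) + 0.3 * math.sin(16.0 * xi) for xi in x]

u_bar = top_hat_filter(u)                       # large resolved scales
u_prime = [ui - bi for ui, bi in zip(u, u_bar)]  # small resolved scales

# How far each field is from the pure large-scale component.
dist_u = max(abs(a - b) for a, b in zip(u, large))
dist_u_bar = max(abs(a - b) for a, b in zip(u_bar, large))
print("max |u - large|    :", dist_u)
print("max |ubar - large| :", dist_u_bar)
```

The filtered field stays close to the large-scale component, while u' carries most of the small-scale content; a scale-separation model would evaluate the strain rate tensor and/or the eddy viscosity from u' only.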
Finally, two variational multiscale (VMS) methods are also considered.
The philosophy of the variational multiscale methods differs significantly from the philosophy of the scale-separation models. Concretely, the discrete Navier-Stokes equations have to be projected into two disjoint spaces so that a set of equations characterizes the evolution of the large resolved scales of the flow, whereas another set governs the small resolved scales.
Once the Navier-Stokes equations have been projected into these two spaces associated with the large and small scales respectively, the variational multiscale method consists in adding an eddy viscosity model to the small scales equations only, leaving the large scales equations unchanged. This projection is obvious in the case of a full spectral discretization of the Navier-Stokes equations, where the evolution of the large and small scales is governed by the equations associated with the low and high wavenumber modes respectively. This projection is more complex to achieve in the context of a finite element discretization.
For that purpose, two variational multiscale concepts are examined in this work.
The first projector is based on the construction of aggregates, whereas the second projector relies on the implementation of hierarchical linear basis functions.
In order to gain some experience in the field of LES modeling, some of the above-mentioned models were implemented first in another code called PHASTA and presented along with SFELES in the second chapter.
Finally, the relevance of our models is assessed with the large eddy simulation of a fully developed turbulent channel flow at a low Reynolds number under statistical equilibrium. In addition to the analysis of the mean eddy viscosity computed for all our LES models, comparisons in terms of shear stress, root mean square velocity fluctuation and mean velocity are performed with a fully resolved direct numerical simulation as a reference.
The fifth chapter of the thesis focuses on the numerical simulation of the 3D turbulent flow over a re-entry Apollo-type capsule at low speed with SFELES. The Reynolds number based on the heat shield is set to Re=10^4 and the angle of attack is set to 180°, that is, with the heat shield facing the free stream. Only the final stage of the flight is considered in this work, before the splashdown or the landing, so that the incompressibility hypothesis in SFELES is still valid.
Two LES models are considered in this chapter, namely a classical and a scale-separation version of the WALE model. Although the capsule geometry is axisymmetric, the flow field in its wake is not and induces unsteady forces and moments acting on the capsule. The characterization of the phenomena occurring in the wake of the capsule and the determination of their main frequencies are essential to ensure the static and dynamic stability during the final stage of the flight.
Visualizations by means of 3D isosurfaces and 2D slices of the Q-criterion and the vorticity field confirm the presence of a large meandering recirculation zone characterized by a low Strouhal number, that is St≈0.15.
Due to the detachment of the flow at the shoulder of the capsule, a resulting annular shear layer appears. This shear layer is then affected by some Kelvin-Helmholtz instabilities and ends up rolling up, leading to the formation of vortex rings characterized by a high frequency. This vortex shedding depends on the Reynolds number so that a Strouhal number St≈3 is detected at Re=10^4.
Finally, the analysis of the force and moment coefficients reveals the existence of a lateral force perpendicular to the streamwise direction in the case of the scale-separation WALE model, which suggests that the wake of the capsule may have some preferential orientations during the vortex shedding. In the case of the classical version of the WALE model, no lateral force has been observed so far so that the mean flow is thought to be still axisymmetric after 100 units of non-dimensional physical time.
Finally, the last chapter of this work recalls the main conclusions drawn from the previous chapters.
|
44 |
Numerical tools for the large eddy simulation of incompressible turbulent flows and application to flows over re-entry capsules / Outils numériques pour la simulation des grandes échelles d'écoulements incompressibles turbulents et application aux écoulements autour de capsules de rentrée
Rasquin, Michel 29 April 2010
Doctorat en Sciences de l'ingénieur / info:eu-repo/semantics/nonPublished
|
45 |
On numerical resilience in linear algebra / Conception d'algorithmes numériques pour la résilience en algèbre linéaire
Zounon, Mawussi 01 April 2015
As the computational power of high performance computing (HPC) systems continues to increase through the use of a huge number of cores or specialized processing units, HPC applications are increasingly prone to faults. As a consequence, the HPC community has proposed many contributions to design fault-tolerant applications. This study covers a new class of numerical fault tolerance algorithms at the application level that do not require extra resources, i.e., computational units or computing time, when no fault occurs. Assuming that a separate mechanism ensures fault detection, we propose numerical algorithms to extract relevant information from the data available after a fault. After data extraction, a well-chosen part of the missing data is regenerated through interpolation strategies to constitute meaningful inputs to numerically restart the algorithm. We have designed these methods, called Interpolation-restart techniques, for numerical linear algebra problems such as the solution of linear systems or eigenproblems, which are the innermost numerical kernels in many scientific and engineering applications and often among their most time-consuming parts. In the framework of Krylov subspace linear solvers, the lost entries of the iterate are interpolated using the available entries on the surviving nodes to define a new initial guess before restarting the Krylov method. In particular, we consider two interpolation policies that preserve key numerical properties of well-known linear solvers, namely the monotonic decrease of the A-norm of the error of the conjugate gradient method or the monotonic decrease of the residual norm of GMRES. We assess the impact of the fault rate and the amount of lost data on the robustness of the resulting linear solvers.
For eigensolvers, we revisited state-of-the-art methods for solving large sparse eigenvalue problems, namely the Arnoldi methods, subspace iteration methods and the Jacobi-Davidson method, in the light of Interpolation-restart strategies. For each considered eigensolver, we adapted the Interpolation-restart strategies to regenerate as much spectral information as possible. Through intensive experiments, we illustrate the qualitative numerical behavior of the resulting schemes when the number of faults and the amount of lost data are varied, and we demonstrate that they exhibit a numerical robustness close to that of fault-free calculations.
In order to assess the efficiency of our numerical strategies, we have considered an actual fully-featured parallel sparse hybrid (direct/iterative) linear solver, MaPHyS, and we proposed numerical remedies to design a resilient version of the solver. The solver being hybrid, we focus in this study on the iterative solution step, which is often the dominant step in practice. The numerical remedies we propose are twofold. Whenever possible, we exploit the natural data redundancy between processes from the solver to perform an exact recovery through clever copies over processes. Otherwise, data that has been lost and is not available anymore on any process is recovered through Interpolation-restart strategies. These numerical remedies have been implemented in the MaPHyS parallel solver so that we can assess their efficiency on a large number of processing units (up to 12,288 CPU cores) for solving large-scale real-life problems.
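The interpolation step for linear systems can be sketched as follows. The abstract does not spell out the interpolation formulas, so this example assumes one plausible policy: the lost entries x_I are rebuilt by solving the local system A_II x_I = b_I - A_IJ x_J, where J indexes the surviving entries. The small symmetric positive definite test matrix, the dense solver and all function names are hypothetical illustrations, not code from MaPHyS.

```python
def solve_dense(M, rhs):
    """Gaussian elimination with partial pivoting for a small dense system."""
    n = len(rhs)
    aug = [list(M[i]) + [rhs[i]] for i in range(n)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(aug[r][c]))
        aug[c], aug[p] = aug[p], aug[c]
        for r in range(c + 1, n):
            f = aug[r][c] / aug[c][c]
            for k in range(c, n + 1):
                aug[r][k] -= f * aug[c][k]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = aug[i][n] - sum(aug[i][k] * x[k] for k in range(i + 1, n))
        x[i] = s / aug[i][i]
    return x

def interpolate_lost(A, b, x, lost):
    """Rebuild lost entries from A_II x_I = b_I - A_IJ x_J (assumed policy)."""
    lost_set = set(lost)
    alive = [j for j in range(len(b)) if j not in lost_set]
    A_II = [[A[i][k] for k in lost] for i in lost]
    rhs = [b[i] - sum(A[i][j] * x[j] for j in alive) for i in lost]
    x_I = solve_dense(A_II, rhs)
    restarted = list(x)
    for i, value in zip(lost, x_I):
        restarted[i] = value
    return restarted

def a_norm_error(A, x, x_star):
    """A-norm of the error x - x_star for a symmetric positive definite A."""
    e = [xi - si for xi, si in zip(x, x_star)]
    Ae = [sum(A[i][j] * e[j] for j in range(len(e))) for i in range(len(e))]
    return sum(ei * aei for ei, aei in zip(e, Ae)) ** 0.5

n = 8
A = [[2.0 if i == j else -1.0 if abs(i - j) == 1 else 0.0 for j in range(n)]
     for i in range(n)]
x_star = [float(i + 1) for i in range(n)]            # known exact solution
b = [sum(A[i][j] * x_star[j] for j in range(n)) for i in range(n)]

x_before = [s + 0.1 * (-1) ** i for i, s in enumerate(x_star)]  # iterate before the fault
lost = [2, 3, 4]                                     # entries held by the failed node
x_faulty = [0.0 if i in lost else v for i, v in enumerate(x_before)]

x_restart = interpolate_lost(A, b, x_faulty, lost)   # new initial guess for the restart
print("A-norm error before fault :", a_norm_error(A, x_before, x_star))
print("A-norm error after restart:", a_norm_error(A, x_restart, x_star))
```

For a symmetric positive definite matrix, this choice minimizes the A-norm of the error over the lost entries with the surviving entries held fixed, so the restarted guess is never worse in that norm than the iterate held just before the fault. This is consistent with the monotonic decrease property mentioned for the conjugate gradient method.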
|