1 |
Performance Oriented Partial Checkpoint and Migration of LAM/MPI Applications
Singh, Rajendra, 21 January 2011
In parallel computing, MPI is heavily used because it supports popular cluster-based parallel machines and the Single Program Multiple Data (SPMD) model. Normally, cluster nodes are dedicated to a single parallel job or application, but MPI can also be used on nodes that are concurrently shared by multiple users. In that case, nodes can become overloaded with work from other users, and even a few overloaded nodes can slow down the whole application. It is therefore desirable to relocate the affected processes of a running application to lightly loaded nodes by checkpointing and migrating only those processes (partial checkpoint and migration).
In some MPI applications, groups of processes communicate frequently with one another. The members of such a group must be placed near one another to keep communication efficient. Thus, if any member of a group is to be checkpointed and migrated, all members should be. It must therefore be possible to identify such groups.
I have built a prototype, using LAM/MPI, that supports partial checkpoint, migration and restart of MPI processes. To identify process groups for checkpoint and migration, I adapted TEIRESIAS (an algorithm for pattern discovery from bioinformatics) to identify frequent, recurring patterns of communication using data gathered by LAM/MPI. I then created predictors that use the discovered patterns to predict groups of communicating processes that should be checkpointed and migrated together.
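The grouping idea can be illustrated with a minimal Python sketch. It is not the TEIRESIAS-based predictor described in the abstract: it swaps in a much simpler heuristic (count messages per pair of ranks, link pairs above a threshold, and take connected components). The function names, the `(src, dst)` event format and the `min_messages` threshold are all assumptions made for illustration.

```python
from collections import Counter


def communication_groups(events, min_messages):
    """Partition ranks into groups linked by frequent communication.

    events: iterable of (src_rank, dst_rank) pairs, one per message
            (hypothetical trace format, not LAM/MPI's own).
    min_messages: a pair of ranks is linked when it exchanged at least
                  this many messages, counting both directions.
    """
    events = list(events)
    counts = Counter(frozenset(e) for e in events if e[0] != e[1])
    parent = {}

    def find(x):
        # Union-find lookup with path halving.
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for src, dst in events:          # register every rank seen in the trace
        find(src)
        find(dst)
    for pair, n in counts.items():   # link frequently communicating pairs
        if n >= min_messages:
            a, b = tuple(pair)
            parent[find(a)] = find(b)

    groups = {}
    for rank in list(parent):
        groups.setdefault(find(rank), set()).add(rank)
    return list(groups.values())


def migration_group(events, min_messages, overloaded_rank):
    """Return the set of ranks that should move together with overloaded_rank."""
    for group in communication_groups(events, min_messages):
        if overloaded_rank in group:
            return group
    return {overloaded_rank}         # rank never appeared in the trace


# Toy trace: ranks 0 and 1 talk often, 2 and 3 talk often, 1 and 2 rarely.
trace = [(0, 1)] * 5 + [(1, 0)] * 3 + [(2, 3)] * 6 + [(1, 2)]
group_for_rank_0 = migration_group(trace, 4, 0)
```

With this toy trace and a threshold of 4, only rank 1 would accompany rank 0. A threshold-based heuristic like this ignores the ordering and recurrence of messages over time, which is what a pattern discovery approach is meant to capture.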
I have assessed the effectiveness of my technique using synthetic and real communication data (for a small set of representative applications) to show that my predictors can accurately predict process groups for those applications. Additionally, I have created a simple simulation system to allow me to explore scenarios related to network characteristics and overload conditions under which my system might provide useful speedup.
Not all MPI applications will benefit from my approach (e.g. those with unpredictable communication patterns or large groups of frequently communicating processes). However, my experimental and simulation results suggest that my technique should be effective for a number of common application types, network characteristics and overload conditions. Using partial checkpoint and migration should therefore allow many long-running applications to finish faster than if a subset of their processes were left running on overloaded nodes.
|
3 |
Advanced Analyses of the Parallel and Distributed Asynchronous Hybrid GMRES/LS-Arnoldi Method for Computing Grids and Supercomputers
He, Haiwu, 08 July 2005 (PDF)
Many scientific and industrial problems require solving large-scale nonsymmetric linear systems described by very large sparse matrices. Iterative numerical methods are commonly used in this setting, together with parallelism, to obtain a fast and efficient solution. The GMRES(m) algorithm is an iterative method that gives good results in most cases, but its parallelization is limited by the large number of communications it generates. In some cases convergence is very slow, or is never reached. In this thesis we present a hybrid GMRES(m)/LS-Arnoldi method for the real case, which accelerates convergence using eigenvalues computed in parallel by the Arnoldi method, together with its implementation on supercomputers. An extension to the complex case is also studied. The latest trend in global computing, grid computing, proposes the massive exploitation of idle resources on local networks as well as across the Internet. Its benefit for running parallel applications can be enormous. The XtremWeb environment is a lightweight, fault-tolerant and secure grid system for running parallel applications. It is a high-performance computing environment and an experimental grid software platform for academic or industrial institutions. In this thesis we present implementations of the GMRES(m) method on the XtremWeb grid system as well as on a LAM-MPI distributed computing environment. We ran numerous tests on both grid and supercomputer. From the performance results obtained, we identify the advantages and drawbacks of these different computing platforms.
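For readers unfamiliar with GMRES(m), the following is a minimal serial sketch in pure Python of plain restarted GMRES on a small dense matrix. It covers only the GMRES(m) building block: the LS-Arnoldi hybridization, the asynchronous parallel scheme, and the XtremWeb and LAM-MPI implementations described in the abstract are not represented. The function name, the dense list-of-lists matrix format and the default parameters are assumptions chosen for illustration.

```python
import math


def matvec(A, x):
    return [sum(a * xj for a, xj in zip(row, x)) for row in A]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def gmres_restarted(A, b, m=5, tol=1e-10, max_restarts=50):
    """Solve A x = b with GMRES restarted every m Arnoldi steps."""
    x = [0.0] * len(b)
    for _ in range(max_restarts):
        r = [bi - ai for bi, ai in zip(b, matvec(A, x))]
        beta = math.sqrt(dot(r, r))
        if beta < tol:               # absolute residual tolerance
            break
        V = [[ri / beta for ri in r]]   # orthonormal Krylov basis
        H = []                          # H[j]: rotated column j of the Hessenberg matrix
        cs, sn = [], []                 # Givens rotation coefficients
        g = [beta]                      # rotated right-hand side
        for j in range(m):
            # Arnoldi step with modified Gram-Schmidt.
            w = matvec(A, V[j])
            h = []
            for v in V:
                hij = dot(w, v)
                h.append(hij)
                w = [wi - hij * vi for wi, vi in zip(w, v)]
            h_next = math.sqrt(dot(w, w))
            h.append(h_next)
            # Apply the earlier rotations to the new column.
            for i in range(j):
                t = cs[i] * h[i] + sn[i] * h[i + 1]
                h[i + 1] = -sn[i] * h[i] + cs[i] * h[i + 1]
                h[i] = t
            # New rotation that zeroes the subdiagonal entry.
            d = math.hypot(h[j], h[j + 1])
            if d == 0.0:
                break
            c, s = h[j] / d, h[j + 1] / d
            cs.append(c)
            sn.append(s)
            h[j], h[j + 1] = d, 0.0
            g.append(-s * g[j])
            g[j] = c * g[j]
            H.append(h)
            # |g[j+1]| is the current residual norm; also stop on breakdown.
            if abs(g[j + 1]) < tol or h_next < 1e-14:
                break
            V.append([wi / h_next for wi in w])
        k = len(H)
        if k == 0:
            break
        # Back substitution on the k-by-k upper triangular system.
        y = [0.0] * k
        for i in range(k - 1, -1, -1):
            acc = g[i] - sum(H[j][i] * y[j] for j in range(i + 1, k))
            y[i] = acc / H[i][i]
        for j in range(k):
            x = [xi + y[j] * vj for xi, vj in zip(x, V[j])]
    return x


# Small nonsymmetric test system whose exact solution is [1, 2, 3].
A = [[4.0, 1.0, 0.0], [2.0, 5.0, 1.0], [0.0, 1.0, 3.0]]
b = [6.0, 15.0, 11.0]
solution = gmres_restarted(A, b, m=3)
```

Each Arnoldi step needs inner products against every basis vector built so far. In a distributed setting these become global reductions, which is one way to read the abstract's remark that communication limits the parallelization of GMRES(m).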
|