1 |
Efficient scheduling of parallel applications on workstation clusters
Dantas, Mario A. R. January 1996
No description available.
|
2 |
Calcul haute performance pour la simulation d'interactions fluide-structure / High performance computing for the simulation of fluid-structure interactions
Partimbene, Vincent 25 April 2018
This thesis deals with the solution of fluid-structure interaction problems by an algorithm that couples two solvers: one for the fluid and one for the structure. To ensure consistency between the fluid and structure meshes, each domain is also discretized by finite volumes. Because of the difficulties of decomposing the domain into sub-domains, we consider for each environment a parallel multi-splitting algorithm, which gives a unified presentation of sub-domain methods with or without overlapping. This method combines several contracting fixed-point mappings, and we show that, under appropriate assumptions, each fixed-point mapping is contracting in finite-dimensional spaces equipped with Hilbertian and non-Hilbertian norms. In addition, we show that this study is valid for the synchronous and, more generally, asynchronous parallel solution of the large linear systems arising from the discretization of fluid-structure interaction problems, and that it can be extended to cases where the displacement of the structure is subject to constraints. Moreover, the convergence of these asynchronous parallel multi-splitting methods can also be analyzed by partial ordering techniques, linked to the discrete maximum principle, both in the linear framework and in the one obtained when the displacements of the structure are subject to constraints.
We carry out parallel simulations for various fluid-structure test cases on different clusters, considering blocking and non-blocking communications. In the latter case, we had to solve an implementation problem: an unrecoverable error occurred during execution. This issue was overcome by introducing a method that ensures the termination of all non-blocking communications before the mesh update. The performance of the parallel simulations is presented and analyzed. Finally, we apply the methodology presented above to various industrial-type fluid-structure interaction cases on unstructured meshes, which represents an additional difficulty.
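The multi-splitting idea in this abstract can be illustrated with a small synchronous sketch. It is not the thesis's solver: the matrix, the two-block decomposition, and the choice of one Gauss-Seidel sweep inside each block are assumptions made for illustration, and the function name `multisplitting_solve` is hypothetical. Each block stands in for one sub-domain that relaxes its own unknowns while seeing only the previous iterate from the other sub-domains. For a strictly diagonally dominant matrix, this fixed-point mapping is a contraction in the maximum norm, which is a non-Hilbertian norm.

```python
def multisplitting_solve(A, b, blocks, tol=1e-12, max_sweeps=500):
    """Synchronous multi-splitting sketch without overlap.

    A      : square matrix as a list of rows (assumed strictly diagonally dominant)
    b      : right-hand side
    blocks : list of row-index lists, one per (hypothetical) sub-domain
    Returns the approximate solution and the number of sweeps performed.
    """
    n = len(b)
    x = [0.0] * n
    for sweep in range(1, max_sweeps + 1):
        previous = x[:]          # iterate shared by all sub-domains at the start of the sweep
        x_next = previous[:]
        for rows in blocks:      # in a real code, each block would run on its own process
            local = previous[:]  # a sub-domain sees only old values from the other blocks
            for i in rows:       # one Gauss-Seidel sweep restricted to this block
                coupling = sum(A[i][j] * local[j] for j in range(n) if j != i)
                local[i] = (b[i] - coupling) / A[i][i]
                x_next[i] = local[i]
        x = x_next
        # Stop when the update is small in the maximum norm.
        if max(abs(x[i] - previous[i]) for i in range(n)) < tol:
            return x, sweep
    return x, max_sweeps


# Illustrative 1-D diffusion-like system split into two sub-domains.
A = [[4.0, -1.0, 0.0, 0.0],
     [-1.0, 4.0, -1.0, 0.0],
     [0.0, -1.0, 4.0, -1.0],
     [0.0, 0.0, -1.0, 4.0]]
b = [2.0, 4.0, 6.0, 13.0]
solution, sweeps = multisplitting_solve(A, b, blocks=[[0, 1], [2, 3]])
```

An asynchronous variant would let each block read whatever values the other blocks have published so far instead of waiting for the end of the sweep. The abstract's remark about completing all non-blocking communications before the mesh update concerns exactly that setting.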
|
3 |
Estudos de algumas ferramentas de coleta e visualização de dados e desempenho de aplicações paralelas no ambiente MPI
Fernandes, Cláudio Antônio Costa 23 September 2003
Made available in DSpace on 2014-12-17T14:56:04Z (GMT). No. of bitstreams: 1
ClaudioACF.pdf: 1310703 bytes, checksum: 20942a00fb9b1da452758bbafaf1b59d (MD5)
Previous issue date: 2003-09-23 / Coordenação de Aperfeiçoamento de Pessoal de Nível Superior / Recent years have seen growing acceptance and adoption of parallel processing, both for high-performance scientific computing and for general-purpose applications. This acceptance has been driven mainly by the development of massively parallel processing (MPP) environments and of distributed computing. A common point between distributed systems and MPP architectures is the notion of message exchange, which allows processes to communicate. A message-passing environment consists essentially of a communication library that acts as an extension of programming languages such as C, C++ and Fortran and allows parallel applications to be written. In developing parallel applications, a fundamental aspect is the analysis of their performance. Several metrics can be used in this analysis: execution time, efficiency in the use of the processing elements, and scalability of the application with respect to an increase in the number of processors or in the size of the problem instance. Establishing models or mechanisms that allow this analysis can be quite complicated, given the parameters and degrees of freedom involved in implementing a parallel application. One alternative has been to use tools for collecting and visualizing performance data, which allow the user to identify bottlenecks and sources of inefficiency in an application. Efficient visualization requires identifying and collecting data about the execution of the application, a stage called instrumentation.
This work first presents a study of the main techniques used to collect performance data, followed by a detailed analysis of the main available tools that can be used on Beowulf-type cluster architectures running Linux on the x86 platform with MPI (Message Passing Interface) communication libraries such as LAM and MPICH. The analysis is validated on parallel applications that train perceptron-type neural networks using backpropagation. The conclusions show the potential and ease of use of the analyzed tools.
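The metrics named in this abstract (execution time, efficiency, scalability) are commonly computed as speedup and parallel efficiency. The sketch below uses those standard definitions. The helper names and the timings in the usage lines are hypothetical and are not taken from the thesis. The `timed` helper is a minimal stand-in for instrumentation and does not reflect how the surveyed tools collect data.

```python
import time


def timed(func, *args):
    """Run func(*args) and return (result, elapsed wall-clock seconds)."""
    start = time.perf_counter()
    result = func(*args)
    return result, time.perf_counter() - start


def speedup(t_serial, t_parallel):
    """Speedup S = T_1 / T_p."""
    return t_serial / t_parallel


def efficiency(t_serial, t_parallel, processes):
    """Parallel efficiency E = S / p, where 1.0 means ideal use of all processes."""
    return speedup(t_serial, t_parallel) / processes


# Hypothetical timings (seconds) for the same problem on 1, 2, 4 and 8 processes.
timings = {1: 100.0, 2: 52.0, 4: 28.0, 8: 20.0}
report = {p: (speedup(timings[1], t), efficiency(timings[1], t, p))
          for p, t in timings.items()}
```

Falling efficiency as the process count grows, as in these made-up numbers, is the kind of symptom that would prompt a closer look with a trace or profile visualization tool.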
|
4 |
Návrh komunikačního protokolu pro generické simulátory mikroprocesorů / Design of Communication Protocol for Generic Simulators of Microprocessors
Moskovčák, Jiří Unknown Date
This work deals with the design of a communication protocol for a generic processor simulator. The main objective was to design a communication protocol that allows a multiprocessor system to be simulated on a cluster of computers.
|
5 |
MPI Performance Engineering with the MPI Tools Information Interface
Ramesh, Srinivasan 06 September 2018
The desire for high performance on scalable parallel systems is increasing the complexity and the need to tune MPI implementations. The MPI Tools Information Interface (MPI_T) introduced in the MPI 3.0 standard provides an opportunity for performance tools and external software to introspect and understand MPI runtime behavior at a deeper level to detect scalability issues. The interface also provides a mechanism to fine-tune the performance of the MPI library dynamically at runtime.
This thesis describes the motivation, design, and challenges involved in developing an MPI performance engineering infrastructure using MPI_T for two performance toolkits: the TAU Performance System and Caliper. I validate the design of the infrastructure for TAU by developing optimizations for production and synthetic applications. I show that the MPI_T runtime introspection mechanism in Caliper enables a meaningful analysis of performance data.
This thesis includes previously published co-authored material.
|
6 |
Optimized Composition of Parallel Components on a Linux Cluster
Al-Trad, Anas January 2012
We develop a novel framework for optimized composition of explicitly parallel software components with different implementation variants, given the problem size, data distribution scheme and processor group size on a Linux cluster. We consider two approaches (or two cases of the framework). In the first approach, dispatch tables are built using measurement data obtained offline by executions for some (sample) points in the ranges of the context properties. Inter-/extrapolation is then used to do the actual variant selection for a given execution context at run-time. In the second approach, a cost function for each component variant is provided by the component writer for variant selection. These cost functions can internally look up measurement tables, built either offline or at deployment time, for computation- and communication-specific primitives. In both approaches, the call to an explicitly parallel software component (with different implementation variants) is made via a dispatcher instead of calling a variant directly. As a case study, we apply both approaches to a parallel component for matrix multiplication with multiple implementation variants. We implemented our variants using the Message Passing Interface (MPI). The results show the reduction in execution time for the optimally composed applications compared to applications with hard-coded composition. In addition, the results compare estimated and measured times for each variant using different data distributions, processor group sizes and problem sizes.
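The first approach (a dispatch table plus inter-/extrapolation) can be sketched as follows. This is an illustration under simplifying assumptions, not the framework's code: only one context property (problem size) is used, the interpolation is piecewise linear, and the variant names and timings in the table are invented.

```python
from bisect import bisect_left


def interpolate(samples, size):
    """Piecewise-linear inter-/extrapolation over sorted (size, seconds) samples.

    Needs at least two samples; outside the sampled range the nearest
    segment is extended linearly.
    """
    sizes = [s for s, _ in samples]
    k = bisect_left(sizes, size)
    k = min(max(k, 1), len(sizes) - 1)  # clamp so the end segments also extrapolate
    (x0, y0), (x1, y1) = samples[k - 1], samples[k]
    return y0 + (y1 - y0) * (size - x0) / (x1 - x0)


def dispatch(table, size):
    """Return the name of the variant with the lowest predicted time for this size."""
    return min(table, key=lambda variant: interpolate(table[variant], size))


# Hypothetical offline measurements: variant name -> [(problem size, seconds), ...]
table = {
    "naive": [(100, 0.01), (1000, 8.0)],
    "blocked": [(100, 0.05), (1000, 2.0)],
}
small_choice = dispatch(table, 100)
large_choice = dispatch(table, 800)
```

A caller would invoke the dispatcher once per call site and then run the selected variant, rather than hard-coding one implementation. Extending the lookup key to data distribution and processor group size would turn the one-dimensional interpolation into a multi-dimensional one.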
|
7 |
Interprocess Communication Mechanisms With Inter-Virtual Machine Shared Memory
Ke, Xiaodi Unknown Date
No description available.
|
8 |
McMPI: a managed-code message passing interface library for high performance communication in C#
Holmes, Daniel John January 2012
This work endeavours to achieve technology transfer between established best-practice in academic high-performance computing and current techniques in commercial high-productivity computing. It shows that a credible high-performance message-passing communication library, with semantics and syntax following the Message-Passing Interface (MPI) Standard, can be built in pure C# (one of the .Net suite of computer languages). Message-passing has been the dominant paradigm in high-performance parallel programming of distributed-memory computer architectures for three decades. The MPI Standard originally distilled architecture-independent and language-agnostic ideas from existing specialised communication libraries and has since been enhanced and extended. Object-oriented languages can increase programmer productivity, for example by allowing complexity to be managed through encapsulation. Both the C# computer language and the .Net common language runtime (CLR) were originally developed by Microsoft Corporation but have since been standardised by the European Computer Manufacturers Association (ECMA) and the International Standards Organisation (ISO), which facilitates portability of source-code and compiled binary programs to a variety of operating systems and hardware. Combining these two open and mature technologies enables mainstream programmers to write tightly-coupled parallel programs in a popular standardised object-oriented language that is portable to most modern operating systems and hardware architectures. This work also establishes that a thread-to-thread delivery option increases shared-memory communication performance between MPI ranks on the same node. This suggests that the thread-as-rank threading model should be explicitly specified in future versions of the MPI Standard and then added to existing MPI libraries for use by thread-safe parallel codes. 
This work also ascertains that the C# socket object suffers from undesirable characteristics that are critical to communication performance and proposes ways of improving the implementation of this object.
|
9 |
Calcul haute performance pour la simulation d'interactions fluide-structure
Partimbene, Vincent 25 April 2018
This thesis deals with the solution of fluid-structure interaction problems by an algorithm that couples two solvers: one for the fluid and one for the structure. To ensure consistency between the fluid and structure meshes, each domain is also discretized by finite volumes. Because of the difficulties of decomposing the domain into sub-domains, we consider for each environment a parallel multi-splitting algorithm, which gives a unified presentation of sub-domain methods with or without overlapping. This method combines several contracting fixed-point mappings, and we show that, under appropriate assumptions, each fixed-point mapping is contracting in finite-dimensional spaces equipped with Hilbertian and non-Hilbertian norms. In addition, we show that this study is valid for the synchronous and, more generally, asynchronous parallel solution of the large linear systems arising from the discretization of fluid-structure interaction problems, and that it can be extended to the case where the displacement of the structure is subject to constraints. Moreover, the convergence of these asynchronous parallel multi-splitting methods can also be analyzed by partial ordering techniques, linked to the discrete maximum principle, both in the linear framework and in the one obtained when the displacements of the structure are subject to constraints. We carry out parallel simulations for various fluid-structure test cases on different clusters, considering blocking and non-blocking communications.
In the latter case, we had to solve an implementation difficulty: an unrecoverable error occurred during execution. This difficulty was overcome by introducing a method that ensures the termination of all non-blocking communications before the mesh update. The performance of the parallel simulations is presented and analyzed. Finally, we apply the methodology presented above to various industrial-type fluid-structure interaction contexts on unstructured meshes, which represents an additional difficulty.
|
10 |
Paralelizace sledování paprsku / Parallelization of Ray Tracing
Čižek, Martin January 2009
Ray tracing is a widely used technique for realistic rendering of computer scenes. Its major drawback is the time needed to compute the image, so it is usually parallelized. This thesis describes parallelization and ray tracing in general. It explains how ray tracing can be parallelized and identifies the problems that may occur in the process. The result is a parallel rendering application built on selected ray tracing software, together with a measurement of how well the application performs.
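One common way to parallelize ray tracing is to split the image into scanlines and assign them to workers. The abstract does not say which scheme the thesis uses, so the sketch below is only an assumption-laden illustration of two standard static partitionings, with hypothetical function names. Block partitioning gives each worker a contiguous band of rows. Cyclic partitioning interleaves rows, which tends to balance load better when some image regions are more expensive to trace than others.

```python
def block_rows(height, workers, rank):
    """Contiguous band of scanlines for worker `rank` (0-based)."""
    base, extra = divmod(height, workers)
    start = rank * base + min(rank, extra)       # the first `extra` workers get one more row
    length = base + (1 if rank < extra else 0)
    return list(range(start, start + length))


def cyclic_rows(height, workers, rank):
    """Every `workers`-th scanline starting at `rank`, so rows are interleaved."""
    return list(range(rank, height, workers))


# Illustrative image of 10 scanlines rendered by 3 workers.
height, workers = 10, 3
block_plan = [block_rows(height, workers, r) for r in range(workers)]
cyclic_plan = [cyclic_rows(height, workers, r) for r in range(workers)]
```

Load imbalance between workers is a typical problem with static schemes like these. Dynamic schemes, where idle workers request the next tile from a coordinator, trade extra communication for better balance.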
|