Global ETD Search

171	Co-designing Communication Middleware and Deep Learning Frameworks for High-Performance DNN Training on HPC Systems Awan, Ammar Ahmad 10 September 2020 (has links) No description available. Computer Science Artificial Intelligence Data Parallelism Model Parallelism Hybrid Parallelism Keras Caffe TensorFlow PyTorch, MPI, Eager Execution Deep Learning Scalable DNN Training MVAPICH2-GDR CUDA-Aware MPI
172	P2P-MPI : A fault-tolerant Message Passing Interface Implementation for Grids Rattanapoka, Choopan 22 April 2008 (has links) (PDF) This thesis demonstrates the feasibility of a middleware for computational grids that takes into account both the dynamic nature of such platforms and the requirements of message-passing parallel programs. To that end, we argue for an architecture that is as distributed as possible: we adopt a peer-to-peer infrastructure for organizing resources, which notably simplifies resource discovery, and we rely on distributed failure detectors to handle fault tolerance. The dynamic nature of this kind of environment is also a problem for the execution model underlying MPI, because the failure of a single process causes the whole application to stop. P2P-MPI's contribution in this area is fault tolerance through replication. We believe replication is the approach best suited to a peer-to-peer architecture, since classical techniques based on checkpoint and restart require one or more backup servers. Moreover, replication is fully transparent to the user and thus serves the goal of ease of use that we set ourselves. We believe that keeping the environment very simple to use and entirely under the user's control is one of the factors that can increase the number of resources available on the grid. Finally, the major contribution of P2P-MPI is its communication library, an implementation of MPJ (MPI adapted to Java) that integrates process replication. This particular aspect of our work argues for close collaboration between the middleware, which knows the state of the grid (failure detection, for example), and the communication layer, which can adapt its behavior accordingly. 
distributed systems fault tolerance MPI grid
173	A parallel explicit incompressible smoothed particle hydrodynamics (ISPH) model for nonlinear hydrodynamic applications Yeylaghi, Shahab 09 December 2016 (has links) Fluid structure interactions in the presence of a free surface include complex phenomena, such as slamming, air entrainment, transient loads, complex free surface profiles and turbulence. Hence, an appropriate and efficient numerical method is required to deal with these types of problems (efficient both in problem setup and numerical solution). Eulerian mesh-based methods can be used to solve different types of problems; however, they have difficulties with problems involving moving boundaries and discontinuities (e.g. fluid structure interactions in the presence of a free surface). Smoothed Particle Hydrodynamics (SPH) is a mesh-less Lagrangian particle method, well suited to problems with large deformation and fragmentation such as complex free surface flows. The SPH method was originally developed for astrophysical applications and requires modifications in order to be applied to hydrodynamic applications. Applying solid boundary conditions for hydrodynamic applications in SPH is a key difference from the original SPH developed for astrophysics. There are several methods available in the literature to apply solid boundaries in SPH. In this research, an accurate solid boundary condition is used to calculate the pressure at the boundary particles based on the surrounding fluid particles. The two main methods to calculate the pressure in the SPH method are the weakly compressible SPH (WCSPH) and the incompressible SPH (ISPH) approaches. WCSPH uses an equation of state, while ISPH solves Poisson's equation to determine the pressure. In this dissertation, an explicit incompressible SPH (ISPH) method is used to study nonlinear free surface applications. In the explicit ISPH method, Poisson's equation is explicitly solved to calculate the pressure within a projection-based algorithm. 
Unlike the implicit method, this method does not require solving a set of algebraic equations for pressure at each time step. Here, an accurate boundary condition along with an accurate source term for Poisson's equation is used within the explicit method. Also, the sub-particle turbulent calculation is applied to the explicit ISPH method (which handles large-scale turbulent structures implicitly) in order to calculate the flow field quantities, and consequently the forces on the device, more accurately. The SPH method is typically computationally more expensive than Eulerian-based CFD methods. Therefore, parallelization methods are required to improve the performance of the method, especially for 3D simulations. In this dissertation, two novel parallel schemes are developed based on the Open Multi-Processing (OpenMP) and Message Passing Interface (MPI) standards. The explicit ISPH approach is advantageous for parallel computing, but the proposed schemes could also be applied to WCSPH or implicit ISPH. The proposed SPH model is used to simulate and analyze several nonlinear free surface problems. First, the proposed explicit ISPH method is used to simulate a transient wave overtopping a horizontal deck. Second, a wave impacting a scaled oscillating wave surge converter (OWSC) is simulated and studied. Third, the performance and accuracy of the code are tested for a dam break impacting tall and short structures. Fourth, the hydrodynamic loads from the spar of a scaled self-reacting point absorber wave energy converter (WEC) design are studied. Finally, a comprehensive set of landslide-generated waves is modeled and analyzed, and a new technique is proposed to calculate the motion of a slide on an inclined ramp implicitly without using a prescribed motion. / Graduate Hydrodynamic Applications SPH Oscillating Wave Surge Converter Landslide Generated Waves Explicit ISPH MPI
174	Hybrid MPI - uma implementação MPI para ambientes distribuídos híbridos. / Hybrid MPI - a MPI implementation for hybrid distributed systems. Massetto, Francisco Isidro 04 October 2007 (has links) The development of high-performance applications is growing steadily. At the same time, the diversity of computer architectures (uniprocessor and multiprocessor machines, clusters with or without a front-end node), operating systems, and MPI implementations keeps increasing. In this scenario, programming libraries that allow the integration of several MPI implementations, operating systems, and computer architectures are needed. This thesis introduces HyMPI, an MPI implementation that aims to integrate, within a single distributed high-performance environment, nodes with different architectures, clusters with or without a front-end machine, operating systems, and MPI implementations. 
HyMPI offers a set of primitives compatible with the MPI specification, including point-to-point communication, collective operations, startup and finalization, and some other utility functions. Distributed systems High performance computing Middleware MPI Computer networks
175	Uma biblioteca para desenvolvimento de aplicações CUDA em aglomerados de GPUS / A library for developing CUDA applications on GPU clusters Morais Junior, Aderbal de January 2013 (has links) Advisor: Raphael Yokoingawa de Camargo / Dissertation (master's) - Universidade Federal do ABC, Graduate Program in Computer Science, 2013 CUDA MPI GPU CLUSTER
176	Prostředky paralelního programování a jejich implementace / Means of parallel programming and their implementation Krejčová, Iva January 2011 (has links) The aim of this Diploma thesis is to become acquainted with approaches to parallel programming and the possibilities for their practical implementation, including their use in management. An important part of the Diploma thesis is the practical implementation of a parallel program in a PC cluster environment, which was set up in the computer laboratory of the Faculty of Management VŠE. The practical part consists of an example of decision-making under uncertainty (risk), which is solved using the Monte Carlo method.
177	Hybrid MPI - uma implementação MPI para ambientes distribuídos híbridos. / Hybrid MPI - a MPI implementation for hybrid distributed systems. Francisco Isidro Massetto 04 October 2007 (has links) The development of high-performance applications is growing steadily. At the same time, the diversity of computer architectures (uniprocessor and multiprocessor machines, clusters with or without a front-end node), operating systems, and MPI implementations keeps increasing. In this scenario, programming libraries that allow the integration of several MPI implementations, operating systems, and computer architectures are needed. This thesis introduces HyMPI, an MPI implementation that aims to integrate, within a single distributed high-performance environment, nodes with different architectures, clusters with or without a front-end machine, operating systems, and MPI implementations. 
HyMPI offers a set of primitives compatible with the MPI specification, including point-to-point communication, collective operations, startup and finalization, and some other utility functions. Middleware Computer networks Distributed systems High performance computing MPI
178	An Analyzer for Message Passing Programs Huang, Yu 01 May 2016 (has links) Asynchronous message passing systems are fast becoming a common means of communication between devices. Two problems in message passing programs are difficult to solve. The first is the message race, intended or otherwise, in which a receive may match more than one send at runtime. This non-determinism often leads to intermittent and unexpected behavior depending on how the race is resolved. The second is deadlock, a situation in which each member process of a group is waiting for some member process to communicate with it, but no member is attempting to do so. Detecting whether a message race or a deadlock exists in a message passing program is NP-complete in both cases. The difficulty of solving the two problems also comes from three factors that complicate the semantics: asynchronous communication, synchronous barriers, and buffering settings, including infinite buffering (the system can buffer messages) and zero buffering (the system has no internal buffering). To solve the above problems in the presence of these complicating factors, this research provides a novel predictive analysis that starts from a concrete execution and then predicts the behavior of other executions that arise from it. This research begins with Satisfiability Modulo Theories (SMT) based model checking, which provides a precise analysis of program behavior. Unfortunately, a precise analysis using SMT does not scale to large programs. As such, the SMT based model checking is combined with heuristic search for witnessing program properties. 
The heuristic search is efficient in identifying how sends may match receives at runtime, as it initially looks for match relations between sends and receives only in a small search space; the space is enlarged only if the program property is not witnessed, until all possible match relations arising from message non-determinism are found. This research also gives a static analysis approach that is scalable because it does not need to analyze the full set of program behaviors; rather, the static analysis uses only polynomial-time algorithms to identify all potential deadlocks in send-receive templates, given a set of pre-defined deadlock patterns. With a predictive analysis consisting of SMT based model checking with heuristic search and static analysis, this research is able to solve the two problems above. The work in this dissertation also demonstrates that the predictive analysis is more efficient than existing tools for verifying message passing programs. Message passing MPI MCAPI SMT Static analysis Model checking Computer Sciences
179	Hierarchical Implementation of Aggregate Functions Quevedo, Pablo 01 January 2017 (has links) Most systems in HPC make use of hierarchical designs that allow multiple levels of parallelism to be exploited by programmers. The use of multiple multi-core/multi-processor computers to form a computer cluster supports both fine-grain and large-grain parallel computation. Aggregate function communications provide an easy to use and efficient set of mechanisms for communicating and coordinating between processing elements, but the model originally targeted only fine grain parallel hardware. This work shows that a hierarchical implementation of aggregate functions is a viable alternative to MPI (the standard Message Passing Interface library) for programming clusters that provide both fine grain and large grain execution. Performance of a prototype implementation is evaluated and compared to that of MPI. AFN aggregate functions parallel computation MPI OpenMP Computer and Systems Architecture Computer Engineering Digital Communications and Networking
180	Communications à hautes performances portables en environnements hiérarchiques, hétérogènes et dynamiques / Portable high-performance communications in hierarchical, heterogeneous and dynamic environments Mercier, Guillaume 20 December 2004 (has links) (PDF) This thesis addresses communication in parallel machines in the context of high-performance computing. Hardware evolution has made it necessary to adapt the software that exploits parallel machines. Cluster architectures are now widespread, and the emergence of computational grids complicates matters further, because achieving high performance requires exploiting the various fast networks available and taking into account the intrinsic hierarchy of the configurations considered. At the application level, new requirements such as dynamicity are emerging. Yet these aspects are too often only partially addressed, particularly in implementations of the MPI message-passing programming standard. Existing solutions focus either on hierarchy and heterogeneity or on dynamicity, and only exceptionally on both. Regarding the former aspects, simplifications lead to suboptimal use of the hardware potentially available. We analyzed existing MPI implementations and proposed an architecture that meets the stated needs. This architecture relies on strong interaction between communications and lightweight processes (threads), and its core is a communication progress engine that substantially improves on existing mechanisms. The two fundamental software components are a lightweight-process library (Marcel) and a generic communication layer (Madeleine). The implementation of this architecture resulted in the MPICH-Madeleine software, which has been used or evaluated by several research teams and projects in France and abroad. 
The performance evaluation (comparisons with Madeleine, measurements of point-to-point operations, application kernels), carried out with several high-speed networks on homogeneous clusters of multiprocessor machines, together with comparisons against MPICH-G2 and PACX-MPI in heterogeneous environments, shows that MPICH-Madeleine achieves results similar or even superior to those of specialized MPI implementations. PC clusters MPI fast networks hierarchy heterogeneity dynamicity high performance
