Global ETD Search

1	Outils pour la parallélisation automatique Boulet, Pierre 18 January 1996 (has links) (PDF) La parallélisation automatique est une des approches visant une plus grande facilité d'utilisation des ordinateurs parallèles. La parallélisation consiste prendre un programme écrit pour une machine séquentielle (qui n'a qu'un processeur) et de l'adapter une machine parallèle. L'intérêt de faire faire cette parallélisation automatiquement par un programme appelé paralléliseur est qu'on pourrait alors réutiliser tout le code déjà écrit en Fortran pour machine séquentielles, après parallélisation, sur des machines parallèles. Nous n'y sommes pas encore, mais on s'en approche. C'est dans ce cadre que se situe mon travail. Une moitié approximativement de ma thèse est consacrée à la réalisation d'un logiciel qui parallélise automatiquement une classe réduite de programmes (les nids de boucles uniformes qui utilisent des translations comme accès aux tableaux de données) en HPF (High Performance Fortran). J'insiste surtout sur la partie génération de code HPF, qui est la partie la plus novatrice de ce programme. Outre la réalisation de Bouclettes, ma contribution au domaine est aussi théorique avec une étude sur un partitionnement des données appelé pavage par des parallélépipèdes et une étude de l'optimisation des calculs d' « expressions de tableaux » dans le langage High Performance Fortran. Le pavage est une technique permettant d'optimiser la taille des tâches qu'on répartit sur les processeurs pour diminuer le temps passé en communications. L'évaluation d'expressions de tableaux est une étape d'optimisation du compilateur parallèle (le programme qui traduit le code parallèle écrit dans un langage de haut niveau comme HPF en code machine directement exécutable par l'ordinateur parallèle). parallélisation automatique data-parallélisme High Performance Fortran nids de boucles compilation optimisation
2	Automatic data distribution for massively parallel processors García Almiñana, Jordi 16 April 1997 (has links) Massively Parallel Processor systems provide the required computational power to solve most large scale High Performance Computing applications. Machines with physically distributed memory allow a cost-effective way to achieve this performance, however, these systems are very diffcult to program and tune. In a distributed-memory organization each processor has direct access to its local memory, and indirect access to the remote memories of other processors. But the cost of accessing a local memory location can be more than one order of magnitude faster than accessing a remote memory location. In these systems, the choice of a good data distribution strategy can dramatically improve performance, although different parts of the data distribution problem have been proved to be NP-complete.The selection of an optimal data placement depends on the program structure, the program's data sizes, the compiler capabilities, and some characteristics of the target machine. In addition, there is often a trade-off between minimizing interprocessor data movement and load balancing on processors. Automatic data distribution tools can assist the programmer in the selection of a good data layout strategy. These use to be source-to-source tools which annotate the original program with data distribution directives.Crucial aspects such as data movement, parallelism, and load balance have to be taken into consideration in a unified way to efficiently solve the data distribution problem.In this thesis a framework for automatic data distribution is presented, in the context of a parallelizing environment for massive parallel processor (MPP) systems. The applications considered for parallelization are usually regular problems, in which data structures are dense arrays. The data mapping strategy generated is optimal for a given problem size and target MPP architecture, according to our current cost and compilation model.A single data structure, named Communication-Parallelism Graph (CPG), that holds symbolic information related to data movement and parallelism inherent in the whole program, is the core of our approach. This data structure allows the estimation of the data movement and parallelism effects of any data distribution strategy supported by our model. Assuming that some program characteristics have been obtained by profiling and that some specific target machine features have been provided, the symbolic information included in the CPG can be replaced by constant values expressed in seconds representing data movement time overhead and saving time due to parallelization. The CPG is then used to model a minimal path problem which is solved by a general purpose linear 0-1 integer programming solver. Linear programming techniques guarantees that the solution provided is optimal, and it is highly effcient to solve this kind of problems.The data mapping capabilities provided by the tool includes alignment of the arrays, one or two-dimensional distribution with BLOCK or CYCLIC fashion, a set of remapping actions to be performed between phases if profitable, plus the parallelization strategy associated. The effects of control flow statements between phases are taken into account in order to improve the accuracy of the model. The novelty of the approach resides in handling all stages of the data distribution problem, that traditionally have been treated in several independent phases, in a single step, and providing an optimal solution according to our model. massively parallel processors high performance fortran automatic data recomposition multicomputers automatic data distribution automatic data happing 3304. Tecnologia dels ordinadors 004 62
3	Genetic Algorithm Based Automatic Data Partitioning Scheme For HPF On A Linux Cluster Anand, Sunil Kumar 12 1900 (has links) (PDF) No description available. Data Partitioning (Computer Science) High Performance Fortran Fortran (Computer Program Language) Linux Computing Clusters Genetic Algorithms Automatic Data Partitioning Cluster (Computing) Linux Cluster Computer Science

1

Page generated in 0.0856 seconds