Analyzing and understanding the performance behavior of parallel applicationson various compute infrastructures is a long-standing concern in the HighPerformance Computing community. When the targeted execution environments arenot available, simulation is a reasonable approach to obtain objectiveperformance indicators and explore various ''what-if?'' scenarios. In thiswork we present a framework for the off-line simulation of MPIapplications. The main originality of our work with regard to the literature is to rely on\tit execution traces. This allows for an extreme scalability as heterogeneousand distributed resources can be used to acquire a trace. We propose a formatwhere for each event that occurs during the execution of an application we logthe volume of instructions for a computation phase or the bytes and the type ofa communication. To acquire time-independent traces of the execution of MPI applications, wehave to instrument them to log the required data. There exist many profilingtools which can instrument an application. We propose a scoring system thatcorresponds to our framework specific requirements and evaluate the mostwell-known and open source profiling tools according to it. Furthermore weintroduce an original tool called Minimal Instrumentation that was designed tofulfill the requirements of our framework. We study different instrumentationmethods and we also investigate several acquisition strategies. We detail thetools that extract the \tit traces from the instrumentation traces of somewell-known profiling tools. Finally we evaluate the whole acquisition procedureand we present the acquisition of large scale instances. We describe in detail the procedure to provide a realistic simulated platformfile to our trace replay tool taking under consideration the topology of thereal platform and the calibration procedure with regard to the application thatis going to be simulated. Moreover we present the implemented trace replaytools that we used during this work. We show that our simulator can predictthe performance of some MPI benchmarks with less than 11\% relativeerror between the real execution and simulation for the cases that there is noperformance issue. Finally, we identify the reasons of the performance issuesand we propose solutions.
Identifer | oai:union.ndltd.org:CCSD/oai:tel.archives-ouvertes.fr:tel-00951125 |
Date | 20 January 2014 |
Creators | Markomanolis, Georgios |
Publisher | Ecole normale supérieure de lyon - ENS LYON |
Source Sets | CCSD theses-EN-ligne, France |
Language | English |
Detected Language | English |
Type | PhD thesis |
Page generated in 0.0018 seconds