MPI I/O replay (MPIOR) is an I/O performance modeling and prediction tool that traces and replays a parallel application to determine its performance under a new I/O subsystem. The trace collector deduces the synchronization interdependencies between nodes and the I/O demands each node places on the storage subsystem. It uses a novel runtime graph traversal technique to filter and log only those MPI calls that affect I/O, thus substantially reducing both the number of runs and the size of the trace file. Unlike other such tools, MPIOR collects a valid trace in a single run and does not rely on node sampling or I/O sampling. MPIOR's post-processing engine analyzes the trace files and sets up the re-player. Because trace collection adds minimal overhead, MPIOR can be used during production runs rather than just as a debugging tool. The re-player mimics the behavior of the application across a variety of storage systems by mapping multiple processes to multiple threads running on a single node. We show that the average replay error for parallel applications is below 30%. / Master of Science
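The process-to-thread mapping used by the re-player can be pictured with a minimal sketch. This is not MPIOR's implementation: the trace format, the `replay` function, and the use of a single barrier to stand in for inter-node synchronization are all assumptions made for illustration.

```python
import os
import tempfile
import threading

# Hypothetical trace format (an assumption, not MPIOR's): one event list per
# MPI rank. ("write", nbytes) issues I/O; ("barrier",) stands in for a
# synchronization dependency between ranks.


def replay(traces, directory):
    """Replay per-rank traces, one thread per rank; return bytes written per rank."""
    barrier = threading.Barrier(len(traces))
    written = [0] * len(traces)

    def run_rank(rank, events):
        path = os.path.join(directory, f"rank{rank}.dat")
        with open(path, "wb") as f:
            for event in events:
                if event[0] == "write":
                    f.write(b"\0" * event[1])
                    written[rank] += event[1]
                elif event[0] == "barrier":
                    # Every rank must reach the barrier before any continues.
                    barrier.wait()

    threads = [
        threading.Thread(target=run_rank, args=(rank, events))
        for rank, events in enumerate(traces)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return written


# Two traced ranks replayed as two threads on a single node.
traces = [
    [("write", 1024), ("barrier",), ("write", 512)],
    [("barrier",), ("write", 2048)],
]
with tempfile.TemporaryDirectory() as d:
    totals = replay(traces, d)
```

Each rank's trace must contain the same number of barrier events, otherwise the replay threads would wait on each other indefinitely. Timing the `replay` call against different storage back ends would give a rough analogue of the performance comparison the abstract describes.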
Identifier | oai:union.ndltd.org:VTETD/oai:vtechworks.lib.vt.edu:10919/31484 |
Date | 11 April 2012 |
Creators | Banerjee, Shankha |
Contributors | Computer Science and Applications, Varadarajan, Srinidhi, Tilevich, Eli, Ribbens, Calvin J. |
Publisher | Virginia Tech |
Source Sets | Virginia Tech Theses and Dissertation |
Detected Language | English |
Type | Thesis |
Format | application/pdf |
Rights | In Copyright, http://rightsstatements.org/vocab/InC/1.0/ |
Relation | Banerjee_Shankha_T_2012.pdf |