This thesis proposes to reduce the latency of MPI receive operations on cacheless architectures, by removing the delay of copying messages when they are first received. This is achieved by copying the messages directly into buffers in the lowest level of the memory hierarchy (e.g., scratchpad memory). The previously proposed solution introduced an Indirection Cache which would map between the receive variables and the buffered message payload locations. This proved somewhat beneficial, but the lookup penalty of the Indirection Cache limited its effectiveness. Therefore this thesis proposes that a most recently used buffer (i.e., an Indirection Buffer) be placed in front of the Indirection Cache to eliminate this penalty and speed up access. The tests conducted demonstrated that this method was indeed effective and improved over the original method by at least an order of magnitude. Finally, examination of implementation feasibility showed that this could be implemented with a small Cache, and that even with access times 6x slower than initially assumed, the approach with the Indirection Buffer would still be effective. / Graduate
Identifer | oai:union.ndltd.org:uvic.ca/oai:dspace.library.uvic.ca:1828/3813 |
Date | 12 January 2012 |
Creators | Kroeker, Anthony |
Contributors | Dimopoulos, Nikitas J. |
Source Sets | University of Victoria |
Language | English, English |
Detected Language | English |
Type | Thesis |
Rights | Available to the World Wide Web |
Page generated in 0.0016 seconds