Parallel scientific programs executing in a NUMA environment face the problem of how to place their data in the system's physically separate memories so as to minimise the latency of the accesses that the program's threads make to this data. Motivated by this problem, this thesis describes a technique by which the partition of a parallel program's workload created by a load-balancing routine may be used to identify the affinities of the program's threads for regions of the program's address space.
|Publisher||University of Manchester|
|Source Sets||Ethos UK|
|Type||Electronic Thesis or Dissertation|