Chip multiprocessors (CMPs) commonly share a large portion of memory
system resources among different cores. Since memory requests from
different threads executing on different cores significantly interfere
with one another in these shared resources, the design of the shared
memory subsystem is crucial for achieving high performance and
fairness.
Inter-thread memory system interference has different implications
based on the type of workload running on a CMP. In multi-programmed
workloads, different applications can experience significantly
different slowdowns. If left uncontrolled, large disparities in
slowdowns result in low system performance and make system software's
priority-based thread scheduling policies ineffective. In a single
multi-threaded application, memory system interference between threads
of the same application can slow each thread down significantly. Most
importantly, the critical path of execution can also be
significantly slowed down, resulting in increased application
execution time.
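To make the notion of disparate slowdowns concrete, here is a minimal sketch. It assumes a common convention in which an application's slowdown is its execution time when sharing the memory system divided by its execution time when running alone, and in which unfairness is the ratio of the largest to the smallest slowdown. These definitions, the function names, and all numbers are illustrative assumptions, not taken from the dissertation.

```python
def slowdown(shared_time, alone_time):
    """Slowdown of one application: time when sharing / time when alone (assumed definition)."""
    return shared_time / alone_time


def unfairness(slowdowns):
    """Ratio of the largest to the smallest slowdown (assumed metric; 1.0 means equal slowdowns)."""
    values = list(slowdowns)
    return max(values) / min(values)


# Hypothetical execution times in arbitrary units; not measured data.
alone = {"app_a": 10.0, "app_b": 20.0}
shared = {"app_a": 30.0, "app_b": 30.0}

slowdowns = {name: slowdown(shared[name], alone[name]) for name in alone}
print(slowdowns)
print(unfairness(slowdowns.values()))
```

In this made-up case app_a is slowed down twice as much as app_b, the kind of disparity that would undermine a priority-based scheduling policy that expects both applications to make comparable progress.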
This dissertation proposes three mechanisms that address different
shortcomings of current shared resource management techniques targeted
at multi-programmed workloads, and one mechanism that speeds up a
single multi-threaded application by managing main-memory related
interference between its different threads.
With multi-programmed workloads, the key idea is that both demand- and
prefetch-caused inter-application interference should be taken into
account in shared resource management techniques across the entire
shared memory system. Our evaluations demonstrate that doing so
significantly improves both system performance and fairness compared
to the state of the art. When executing a single multi-threaded
application on a CMP, the key idea is to take into account the
inter-dependence of threads in memory scheduling decisions. Our
evaluation shows that doing so significantly reduces the execution
time of the multi-threaded application compared to using
state-of-the-art memory schedulers designed for multi-programmed
workloads.
This dissertation concludes that the performance and fairness of CMPs
can be significantly improved by better management of inter-thread
interference in the shared memory resources, both for multi-programmed
workloads and multi-threaded applications.
Identifier | oai:union.ndltd.org:UTEXAS/oai:repositories.lib.utexas.edu:2152/ETD-UT-2011-12-4855 |
Date | 31 January 2012 |
Creators | Ebrahimi, Eiman |
Source Sets | University of Texas |
Language | English |
Detected Language | English |
Type | thesis |
Format | application/pdf |