1 |
Operating System Management of Shared Caches on Multicore ProcessorsTam, David Kar Fai 01 September 2010 (has links)
Our thesis is that operating systems should manage the on-chip shared caches of multicore processors for the purposes of achieving performance gains. Consequently, this dissertation demonstrates how the operating system can profitably manage these shared caches. Two shared-cache management principles are investigated: (1) promoting shared use of the shared cache, demonstrated by an automated online thread clustering technique, and (2) providing cache space isolation, demonstrated by a software-based cache partitioning technique. In support of providing isolation, cache provisioning is also investigated, demonstrated by an automated online technique called RapidMRC. We show how these software-based techniques are feasible on existing multicore systems with the help of their hardware performance monitoring units and their associated hardware performance counters. On a 2-chip IBM POWER5 multicore system, promoting sharing reduced processor pipeline stalls caused by cross-chip cache accesses by up to 70%, resulting in performance improvements of up to 7%. On a larger 8-chip IBM POWER5+ multicore system, the potential for up to 14% performance improvement was measured. Providing isolation improved performance by up to 50%, using an exhaustive offline search method to determine optimal partition size. On the other hand, up to 27% performance improvement was extracted from the corresponding workload using an automated online approximation technique, made possible by RapidMRC.
|
2 |
Operating System Management of Shared Caches on Multicore ProcessorsTam, David Kar Fai 01 September 2010 (has links)
Our thesis is that operating systems should manage the on-chip shared caches of multicore processors for the purposes of achieving performance gains. Consequently, this dissertation demonstrates how the operating system can profitably manage these shared caches. Two shared-cache management principles are investigated: (1) promoting shared use of the shared cache, demonstrated by an automated online thread clustering technique, and (2) providing cache space isolation, demonstrated by a software-based cache partitioning technique. In support of providing isolation, cache provisioning is also investigated, demonstrated by an automated online technique called RapidMRC. We show how these software-based techniques are feasible on existing multicore systems with the help of their hardware performance monitoring units and their associated hardware performance counters. On a 2-chip IBM POWER5 multicore system, promoting sharing reduced processor pipeline stalls caused by cross-chip cache accesses by up to 70%, resulting in performance improvements of up to 7%. On a larger 8-chip IBM POWER5+ multicore system, the potential for up to 14% performance improvement was measured. Providing isolation improved performance by up to 50%, using an exhaustive offline search method to determine optimal partition size. On the other hand, up to 27% performance improvement was extracted from the corresponding workload using an automated online approximation technique, made possible by RapidMRC.
|
Page generated in 0.1163 seconds