Chip Multi-Processor (CMP) platforms, well established in the server, desktop and embedded domains, overcame the power consumption and heat dissipation bottlenecks by integrating multiple cores on a single die, each less complex and less powerful than its single-core ancestors. A major issue induced by the design of CMPs is contention for the shared resources of the platform, namely the Last Level Cache (LLC) and main memory bandwidth. Applications running concurrently on the cores compete with each other for these shared resources and are subject to performance degradation. The way applications are assigned to the CMP is therefore crucial for the overall performance of the system. A scheduling policy that accounts for contention can yield substantial performance speed-ups, whereas a contention-agnostic one will generate unpredictable contention conditions. For this reason the significance of the scheduler has been elevated, as it is the component that determines which applications utilize the resources in each time period.

In this thesis, we address cross-core interference on CMP platforms by designing scheduling policies that improve performance and fairness. We deal with contention in three ways. In our first approach, we incorporate the notion of progress in order to reduce unfairness among the applications of the workload. Performance degradation is not evenly distributed, and progress varies greatly among applications. In order to provide a fair execution environment, we monitor, at run-time, the applications assigned to the CPU and prioritize them based on the extent to which they are affected by contention.

In our second approach, we target performance by mitigating contention on shared resources. It is necessary to select, out of all the possible application schedules, the one that generates the least resource interference. To achieve that, the first indispensable step is to extract an interference profile for the applications executed on the CMP. We accomplish that by applying pressure to all levels of the memory hierarchy and identifying the point at which performance is compromised. Our analysis shows that shared resources can tolerate a certain amount of pressure; applications can be grouped together if the overall generated pressure does not reach the saturation point of the shared resources. Having extracted this information, we place the applications in such a way that overall resource requirements are as balanced as possible across the execution.

Finally, we design a policy that improves performance and fairness at the same time. Applications that rely heavily on the LLC are separated from those with high main memory bandwidth demands, in order to avoid the destructive effects caused by the LLC thrashing behavior of the latter. The group executed on the CPU is determined based on the key observation that the overall requirements of the group should not exceed the saturation limits of the CMP. Additionally, during execution, the progress of each application is estimated and those with the least accumulated progress are prioritized.

Our proposed policies are evaluated on an Intel Xeon E5-2620 v3 processor. A variety of benchmark suites were used to generate mixes of diverse characteristics. Our methodologies are implemented in user space and can be deployed on Linux-based systems. Experimental results show the benefits of tackling contention for shared resources. We achieve throughput gains of up to 16%, and unfairness is reduced by 2.37x on average compared to the Linux scheduler.
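The group-selection idea described above (run together only those applications whose combined demand stays below the saturation limit, favoring the ones with the least accumulated progress) can be illustrated with a minimal sketch. This is not the thesis implementation: the `App` fields, the single scalar `pressure` value, and the `select_group` helper are hypothetical simplifications, and the thesis treats LLC and memory bandwidth demands separately.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class App:
    name: str
    pressure: float  # hypothetical: estimated demand on shared resources
    progress: float  # hypothetical: estimated accumulated progress, 0.0 to 1.0


def select_group(apps: List[App], saturation_limit: float, num_cores: int) -> List[App]:
    """Greedily pick the next group of applications to run.

    Applications are visited from least to most accumulated progress.
    An application joins the group only if the group's total pressure
    stays at or below the saturation limit and a core is still free.
    """
    group: List[App] = []
    total_pressure = 0.0
    for app in sorted(apps, key=lambda a: a.progress):
        if len(group) == num_cores:
            break
        if total_pressure + app.pressure <= saturation_limit:
            group.append(app)
            total_pressure += app.pressure
    # Fallback (an assumption of this sketch): if no application fits under
    # the limit, run the least-progressed one alone so that nothing starves.
    if not group and apps:
        group = [min(apps, key=lambda a: a.progress)]
    return group
```

In this sketch, fairness comes from the visiting order and contention control comes from the limit check; a real policy would re-estimate progress and pressure at run-time and re-run the selection every scheduling period.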
Identifier | oai:union.ndltd.org:siu.edu/oai:opensiuc.lib.siu.edu:dissertations-2761 |
Date | 01 December 2019 |
Creators | Marinakis, Theodoros |
Publisher | OpenSIUC |
Source Sets | Southern Illinois University Carbondale |
Detected Language | English |
Type | text |
Format | application/pdf |
Source | Dissertations |