This dissertation develops algorithms and systems to improve the efficiency of operating mega datacenters with hundreds of thousands of servers. In particular, it addresses two challenges. First, how should the workload be distributed among a set of datacenters deployed geographically across the wide area? Second, how should the server resources of a datacenter be managed using virtualization technology?
In the first part, we consider the workload management problem in geo-distributed datacenters. We first present a novel distributed workload management algorithm that jointly considers request mapping, which determines how to direct user requests to an appropriate datacenter for processing, and response routing, which selects a path among a datacenter's ISP links to route the response packets back to a user. In the next chapter, we study some key aspects of cost and workload in geo-distributed datacenters that have not been fully understood before. Through extensive empirical studies of climate data and cooling systems, we make a case for temperature-aware workload management, where the geographical diversity of temperature and its impact on cooling energy efficiency can be used to reduce the overall cooling energy. Moreover, we advocate holistic workload management for both interactive and batch jobs, where the delay-tolerant, elastic nature of batch jobs can be exploited to further reduce the energy cost. Extensive trace-driven simulations show a consistent 15% to 20% reduction in cooling energy and a 5% to 20% reduction in overall cost.
In the second part of the thesis, we consider the resource management problem in virtualized datacenters. We design Anchor, a scalable and flexible architecture that efficiently supports a variety of resource management policies. We implement a prototype of Anchor on a small-scale in-house datacenter with 20 servers. Experimental results and trace-driven simulations show that Anchor is effective in realizing various resource management policies, and that its simple algorithms make it practical to solve virtual machine allocation problems with thousands of VMs and servers in just ten seconds.
Identifier | oai:union.ndltd.org:TORONTO/oai:tspace.library.utoronto.ca:1807/36071 |
Date | 13 August 2013 |
Creators | Xu, Hong |
Contributors | Li, Baochun |
Source Sets | University of Toronto |
Language | en_ca |
Detected Language | English |
Type | Thesis |