1 |
Scalable and Energy-Efficient SIMT Systems for Deep Learning and Data Center MicroservicesMahmoud Khairy A. Abdallah (12894191) 04 July 2022 (has links)
<p> </p>
<p>Moore’s law is dead. The physical and economic principles that enabled an exponential rise in transistors per chip have reached their breaking point. As a result, High-Performance Computing (HPC) domain and cloud data centers are encountering significant energy, cost, and environmental hurdles that have led them to embrace custom hardware/software solutions. Single Instruction Multiple Thread (SIMT) accelerators, like Graphics Processing Units (GPUs), are compelling solutions to achieve considerable energy efficiency while still preserving programmability in the twilight of Moore’s Law.</p>
<p>In the HPC and Deep Learning (DL) domain, the death of single-chip GPU performance scaling will usher in a renaissance in multi-chip Non-Uniform Memory Access (NUMA) scaling. Advances in silicon interposers and other inter-chip signaling technology will enable single-package systems, composed of multiple chiplets that continue to scale even as per-chip transistors do not. Given this evolving, massively parallel NUMA landscape, the placement of data on each chiplet, or discrete GPU card, and the scheduling of the threads that use that data is a critical factor in system performance and power consumption.</p>
<p>Aside from the supercomputer space, general-purpose compute units are still the main driver of data center’s total cost of ownership (TCO). CPUs consume 60% of the total data center power budget, half of which comes from the CPU pipeline’s frontend. Coupled with the hardware efficiency crisis is an increased desire for programmer productivity, flexible scalability, and nimble software updates that have led to the rise of software microservices. Consequently, single servers are now packed with many threads executing the same, relatively small task on different data.</p>
<p>In this dissertation, I discuss these new paradigm shifts, addressing the following concerns: (1) how do we overcome the non-uniform memory access overhead for next-generation multi-chiplet GPUs in the era of DL-driven workloads?; (2) how can we improve the energy efficiency of data center’s CPUs in the light of microservices evolution and request similarity?; and (3) how to study such rapidly-evolving systems with an accurate and extensible SIMT performance modeling?</p>
|
2 |
Fast Algorithms for Mining Co-evolving Time SeriesLi, Lei 01 September 2011 (has links)
Time series data arise in many applications, from motion capture, environmental monitoring, temperatures in data centers, to physiological signals in health care. In the thesis, I will focus on the theme of learning and mining large collections of co-evolving sequences, with the goal of developing fast algorithms for finding patterns, summarization, and anomalies. In particular, this thesis will answer the following recurring challenges for time series:
1. Forecasting and imputation: How to do forecasting and to recover missing values in time series data?
2. Pattern discovery and summarization: How to identify the patterns in the time sequences that would facilitate further mining tasks such as compression, segmentation and anomaly detection?
3. Similarity and feature extraction: How to extract compact and meaningful features from multiple co-evolving sequences that will enable better clustering and similarity queries of time series?
4. Scale up: How to handle large data sets on modern computing hardware?
We develop models to mine time series with missing values, to extract compact representation from time sequences, to segment the sequences, and to do forecasting. For large scale data, we propose algorithms for learning time series models, in particular, including Linear Dynamical Systems (LDS) and Hidden Markov Models (HMM). We also develop a distributed algorithm for finding patterns in large web-click streams. Our thesis will present special models and algorithms that incorporate domain knowledge. For motion capture, we will describe the natural motion stitching and occlusion filling for human motion. In particular, we provide a metric for evaluating the naturalness of motion stitching, based which we choose the best stitching. Thanks to domain knowledge (body structure and bone lengths), our algorithm is capable of recovering occlusions in mocap sequences, better in accuracy and longer in missing period. We also develop an algorithm for forecasting thermal conditions in a warehouse-sized data center. The forecast will help us control and manage the data center in a energy-efficient way, which can save a significant percentage of electric power consumption in data centers.
|
3 |
Energy-Efficient Key/Value StoreTena, Frezewd Lemma 11 September 2017 (has links) (PDF)
Energy conservation is a major concern in todays data centers, which are the 21st century data processing factories, and where large and complex software systems such as distributed data management stores run and serve billions of users. The two main drivers of this major concern are the pollution impact data centers have on the environment due to their waste heat, and the expensive cost data centers incur due to their enormous energy demand. Among the many subsystems of data centers, the storage system is one of the main sources of energy consumption. Among the many types of storage systems, key/value stores happen to be the widely used in the data centers. In this work, I investigate energy saving techniques that enable a consistent hash based key/value store save energy during low activity times, and whenever there is an opportunity to reuse the waste heat of data centers.
|
4 |
Energy-Efficient Key/Value StoreTena, Frezewd Lemma 29 August 2017 (has links)
Energy conservation is a major concern in todays data centers, which are the 21st century data processing factories, and where large and complex software systems such as distributed data management stores run and serve billions of users. The two main drivers of this major concern are the pollution impact data centers have on the environment due to their waste heat, and the expensive cost data centers incur due to their enormous energy demand. Among the many subsystems of data centers, the storage system is one of the main sources of energy consumption. Among the many types of storage systems, key/value stores happen to be the widely used in the data centers. In this work, I investigate energy saving techniques that enable a consistent hash based key/value store save energy during low activity times, and whenever there is an opportunity to reuse the waste heat of data centers.
|
Page generated in 0.105 seconds