Large-scale machine learning has many characteristics that system designs can exploit to improve efficiency. This dissertation demonstrates that the characteristics of ML computations can be exploited in the design and implementation of parameter server systems, improving their efficiency by an order of magnitude or more. We support this thesis statement with three case-study systems: IterStore, GeePS, and MLtuner.

IterStore is an optimized parameter server that exploits the repeated data access patterns of iterative ML computations. Its optimizations reduce the total run time of our ML benchmarks by up to 50×.

GeePS is a parameter server specialized for deep learning on distributed GPUs. By exploiting the layer-by-layer data access and computation pattern of deep learning, GeePS provides almost linear scalability over single-machine baselines (13× more training throughput with 16 machines) and supports neural networks that do not fit in GPU memory.

MLtuner is a system for automatically tuning the training tunables of ML tasks. It exploits the characteristic that good tunable settings can often be identified with just a short trial. Using optimization-guided online trial-and-error, MLtuner robustly finds and re-tunes tunable settings for a variety of machine learning applications, including image classification, video classification, and matrix factorization, and is over an order of magnitude faster than traditional hyperparameter tuning approaches.
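To make the repeated-access idea concrete, the sketch below (Python, with illustrative names; IterStore's actual interface and optimizations differ) records the parameter accesses of one "virtual" iteration and then uses that pattern for one-time setup, standing in for cache pre-allocation and prefetch scheduling:

    class RepeatedAccessStore:
        """Toy parameter store that learns its access pattern once.

        Illustrative sketch only: the class and method names are
        assumptions, not IterStore's real API.
        """

        def __init__(self):
            self.table = {}        # parameter row id -> value
            self.pattern = []      # accesses seen in the virtual iteration
            self.recording = True

        def read(self, key):
            if self.recording:
                self.pattern.append(key)
            return self.table.get(key, 0.0)

        def update(self, key, delta):
            if self.recording:
                self.pattern.append(key)
            self.table[key] = self.table.get(key, 0.0) + delta

        def end_virtual_iteration(self):
            # Because iterative ML repeats the same accesses every
            # iteration, the recorded pattern can drive one-time setup.
            # Pre-allocating entries here stands in for the real
            # cache-sizing, partitioning, and prefetch decisions.
            self.recording = False
            for key in self.pattern:
                self.table.setdefault(key, 0.0)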
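GeePS's layer-by-layer memory management can be sketched similarly: keep only the active layer's data in GPU memory and return its buffers when the layer finishes, so the rest of the model can live in CPU memory. Plain dicts stand in for GPU and CPU buffers here; this is an assumption-laden schematic, not GeePS's implementation:

    class GpuBufferPool:
        """Stand-in for a bounded pool of reusable GPU buffers."""

        def fetch(self, cpu_store, keys):
            # In a real system this is a CPU-to-GPU copy, overlapped
            # with the computation of the previous layer.
            return {k: cpu_store[k] for k in keys}

        def release(self, gpu_params):
            # Returning buffers after each layer is what allows
            # networks larger than GPU memory to be trained.
            gpu_params.clear()

    def forward_pass(layers, pool, cpu_store):
        # Each layer touches only its own parameters, so GPU memory
        # needs to hold just one layer's working set at a time.
        for layer in layers:
            gpu_params = pool.fetch(cpu_store, layer["param_keys"])
            layer["compute"](gpu_params)
            pool.release(gpu_params)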
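MLtuner's short-trial insight can be sketched as: run each candidate tunable setting for a brief trial, score it by training progress (e.g., the rate of loss decrease), and keep the best. The exhaustive loop below is a simplification; MLtuner's actual search is optimization-guided rather than exhaustive, and re-tuning amounts to running the search again when measured progress stalls:

    def pick_tunables(candidates, run_trial, trial_steps=100):
        """Return the candidate whose short trial made the most progress.

        run_trial(setting, steps) is assumed to train for `steps` steps
        from the current model state and return a progress score, such
        as the per-step decrease in training loss.
        """
        best_setting, best_progress = None, float("-inf")
        for setting in candidates:
            progress = run_trial(setting, trial_steps)
            if progress > best_progress:
                best_setting, best_progress = setting, progress
        return best_setting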
Identifier | oai:union.ndltd.org:cmu.edu/oai:repository.cmu.edu:dissertations-1947
Date | 01 May 2017
Creators | Cui, Henggang
Publisher | Research Showcase @ CMU
Source Sets | Carnegie Mellon University
Detected Language | English
Type | text
Format | application/pdf
Source | Dissertations