
Accelerating machine learning with memory management and persistent memory

Machine Learning (ML) is expensive: it requires machines with large compute capability, high memory and storage capacity, and exceedingly long training times. When training a model, ML programs generally follow a standard blueprint: the model makes at least one “pass” through the dataset, where a single “pass” consists of supplying the data to the model in chunks. During each pass through the dataset, the model updates its internal state with the goal of improving its predictive performance on the data it sees. One of the primary reasons ML programs are expensive is an artifact of this blueprint: especially in the age of “big data,” the resources needed to run ML programs are so large that the work must be distributed across a cluster of machines, as no single machine can meet all of the requirements.
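As a rough illustration of this blueprint (a minimal sketch, not code from the dissertation), the fragment below makes several passes over a dataset, supplying it to a model in fixed-size chunks and updating the model's internal state after each chunk; the names train, model.update, and chunk_size are hypothetical placeholders.

    # Minimal, hypothetical sketch of the training blueprint: several passes
    # over the dataset, each pass supplying the data to the model in chunks.
    def train(model, dataset, chunk_size, num_passes):
        for _ in range(num_passes):                        # one "pass" per iteration
            for start in range(0, len(dataset), chunk_size):
                chunk = dataset[start:start + chunk_size]  # supply the data in chunks
                model.update(chunk)                        # update the model's internal state
        return model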
However, the hardware landscape is changing. With the introduction of a new memory technology called Persistent Memory (PM), a single machine can now satisfy program requirements that it could not in the past. Persistent Memory is unique in that it can play multiple roles within the memory hierarchy, the collection of memory devices with which a machine is equipped. By exploiting the unique properties of PM, ML programs can be further optimized for performance metrics such as runtime and crash consistency.
In this dissertation, I accelerate each stage of the ML training blueprint. First, I show that, for algorithms that normally cannot execute on a single machine, the entire blueprint can be executed on one machine using PM. I then evaluate PM against other devices in a series of micro-benchmarks that emulate common memory operations used by ML programs. Through these micro-benchmarks, I provide guidelines for researchers to consider when optimizing their programs. Finally, I use these guidelines to accelerate the checkpointing operation: the process of recording the state of an ML program to persistent storage in a crash-consistent manner.
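One common way to make a checkpoint crash consistent (shown here only as an illustrative sketch under that assumption, not as the technique developed in the dissertation) is to serialize the program state to a temporary file, flush it to stable storage, and then atomically rename it over the previous checkpoint, so that a crash at any point leaves either the old or the new checkpoint intact.

    import os
    import pickle
    import tempfile

    # Hypothetical crash-consistent checkpoint: write to a temp file, fsync it,
    # then atomically replace the old checkpoint so a crash never leaves a
    # partially written file as the only copy.
    def checkpoint(state, path):
        directory = os.path.dirname(path) or "."
        fd, tmp_path = tempfile.mkstemp(dir=directory)
        with os.fdopen(fd, "wb") as f:
            pickle.dump(state, f)      # serialize the ML program's state
            f.flush()
            os.fsync(f.fileno())       # force the bytes to persistent storage
        os.replace(tmp_path, path)     # atomic rename: old or new checkpoint survives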

Identifier: oai:union.ndltd.org:bu.edu/oai:open.bu.edu:2144/48020
Date: 07 February 2024
Creators: Wood, Andrew
Contributors: Chin, Sang
Source Sets: Boston University
Language: en_US
Detected Language: English
Type: Thesis/Dissertation
Rights: Attribution 4.0 International, http://creativecommons.org/licenses/by/4.0/
