Hardware-Aware Distributed Pipelined Neural Network Models Inference

Neural network models have attracted the attention of the scientific community
for their increasing prediction accuracy and their ability to emulate some human
tasks. This has led to extensive architectural enhancements, resulting in models
with fast-growing memory and computation requirements. Because of hardware
constraints such as memory and computing capabilities, the inference of a large
neural network model can be distributed across multiple devices by a
partitioning algorithm. The proposed framework finds the optimal model splits
and chooses which device computes each split so as to minimize inference time
and energy. The framework is based on the PipeEdge algorithm and extends it by
not only increasing inference throughput but also simultaneously minimizing
inference energy consumption. Another contribution of this thesis is the
addition of devices based on the emerging compute-in-memory (CIM) technology to
the system. To the best of my knowledge, no one has studied the effect of
including CIM devices, specifically devices simulated with DNN+NeuroSim, in
distributed inference. The proposed framework could partition VGG8 and ResNet152
on ImageNet and achieved a comparable trade-off between the increase in the time
of the slowest inference stage and the reduction in energy, both when it tried
to decrease inference energy (e.g., 19% energy reduction with 34% time increase)
and when CIM devices were added to the system (e.g., 34% energy reduction with
45% time increase).

Identifier: oai:union.ndltd.org:kaust.edu.sa/oai:repository.kaust.edu.sa:10754/693470
Date: 07 1900
Creators: Alshams, Mojtaba
Contributors: Eltawil, Ahmed; Computer, Electrical and Mathematical Science and Engineering (CEMSE) Division; Salam, Khaled N.; Fahmy, Suhaib A.
Source Sets: King Abdullah University of Science and Technology
Language: English
Detected Language: English
Type: Thesis
Rights: 2024-08-06, At the time of archiving, the student author of this thesis opted to temporarily restrict access to it. The full text of this thesis will become available to the public after the expiration of the embargo on 2024-08-06.
Relation: NA