• Refine Query
  • Source
  • Publication year
  • to
  • Language
  • 1
  • Tagged with
  • 1
  • 1
  • 1
  • 1
  • 1
  • 1
  • 1
  • 1
  • 1
  • 1
  • 1
  • 1
  • About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
1

Hardware-Aware Distributed Pipelined Neural Network Models Inference

Alshams, Mojtaba 07 1900 (has links)
Neural Network models got the attention of the scientific community for their increasing accuracy in predictions and good emulation of some human tasks. This led to extensive enhancements in their architecture, resulting in models with fast-growing memory and computation requirements. Due to hardware constraints such as memory and computing capabilities, the inference of a large neural network model can be distributed across multiple devices by a partitioning algorithm. The proposed framework finds the optimal model splits and chooses which device shall compute a corresponding split to minimize inference time and energy. The framework is based on PipeEdge algorithm and extends it by not only increasing inference throughput but also simultaneously minimizing inference energy consumption. Another thesis contribution is the augmentation of the emerging technology Compute-in-memory (CIM) devices to the system. To the best of my knowledge, no one studied the effect of including CIM, specifically DNN+NeuroSim simulator, devices in a distributed inference. My proposed framework could partition VGG8 and ResNet152 on ImageNet and achieve a comparable trade-off between inference slowest stage increase and energy reduction when it tried to decrease inference energy (e.g. 19% energy reduction with 34% time increase) and when CIM devices were augmenting the system (e.g. 34% energy reduction with 45% time increase).

Page generated in 0.0926 seconds