Inference Engine: A high efficiency accelerator for Deep Neural Networks

Deep Neural Networks are state-of-the-art algorithms for various image and natural language processing tasks. These networks are composed of billions of operations working on an input to produce the desired result. Along with this computational complexity, these workloads are also massively parallel in nature. These inherent properties make deep neural networks an excellent target for custom acceleration. The main challenge faced by such accelerators is achieving a compromise between power consumption, software programmability, and resource utilization for the varied compute and data access patterns presented by DNN workloads. In this work, I present Inference Engine, a scalable and efficient DNN accelerator designed to be agnostic to the type of DNN workload. Inference Engine was designed to provide near-peak hardware resource utilization, minimize data transfer, and offer a programmer-friendly instruction set. Inference Engine scales at the level of individually programmable clusters, each of which contains several hundred compute resources. It provides an instruction set designed to exploit parallelism within the workload while also allowing freedom for compiler-based exploration of data access patterns.

DOI: 10.25394/pgs.9108539.v1
Identifier: oai:union.ndltd.org:purdue.edu/oai:figshare.com:article/9108539
Date: 12 October 2021
Creators: Aliasger Tayeb Zaidy (7043234)
Source Sets: Purdue University
Detected Language: English
Type: Text, Thesis
Rights: CC BY 4.0
Relation: https://figshare.com/articles/thesis/Inference_Engine_A_high_efficiency_accelerator_for_Deep_Neural_Networks/9108539