Global ETD Search

Inference Engine: A high efficiency accelerator for Deep Neural Networks

Deep Neural Networks (DNNs) are state-of-the-art algorithms for various image and natural language processing tasks. These networks are composed of billions of operations working on an input to produce the desired result. Along with this computational complexity, these workloads are also massively parallel in nature. These inherent properties make deep neural networks an excellent target for custom acceleration. The main challenge faced by such accelerators is achieving a compromise between power consumption, software programmability, and resource utilization for the varied compute and data access patterns presented by DNN workloads. In this work, I present Inference Engine, a scalable and efficient DNN accelerator designed to be agnostic to the type of DNN workload. Inference Engine was designed to provide near-peak hardware resource utilization, minimize data transfer, and offer a programmer-friendly instruction set. Inference Engine scales at the level of individually programmable clusters, each of which contains several hundred compute resources. It provides an instruction set designed to exploit parallelism within the workload while also allowing freedom for compiler-based exploration of data access patterns.

10.25394/pgs.9108539.v1

Computer Engineering

Computer System Architecture

hardware accelerator

Computer architecture

Deep neural network

Identifier	oai:union.ndltd.org:purdue.edu/oai:figshare.com:article/9108539
Date	12 October 2021
Creators	Aliasger Tayeb Zaidy (7043234)
Source Sets	Purdue University
Detected Language	English
Type	Text, Thesis
Rights	CC BY 4.0
Relation	https://figshare.com/articles/thesis/Inference_Engine_A_high_efficiency_accelerator_for_Deep_Neural_Networks/9108539
