
Approximate Neural Networks for Speech Applications in Resource-Constrained Environments

Abstract: Speech recognition and keyword detection are becoming increasingly popular applications for mobile systems. While deep neural network (DNN) implementations of these systems perform very well, their large memory and compute requirements make deployment on a mobile device quite challenging. This thesis presents techniques to reduce the memory and computation cost of keyword detection and speech recognition DNNs.

The first technique represents all weights and biases with a small number of bits and maps all nodal computations to fixed-point arithmetic, with minimal degradation in accuracy. Experiments conducted on the Resource Management (RM) database show that, for the keyword detection network, representing the weights with 5 bits yields a 6-fold reduction in memory compared to a floating-point implementation with very little loss in performance. Similarly, for the speech recognition network, representing the weights with 6 bits yields a 5-fold reduction in memory while maintaining an error rate similar to that of a floating-point implementation.
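
As a rough illustration of the fixed-point mapping, the sketch below uniformly quantizes a weight matrix to a signed n-bit grid. This is a generic post-training quantization sketch in Python/NumPy; the function name and the scaling/rounding choices are assumptions for illustration, not the thesis's exact scheme.

    import numpy as np

    def quantize_fixed_point(weights, num_bits=5):
        """Uniformly quantize a weight matrix to a signed fixed-point grid
        with `num_bits` total bits (sign + integer + fraction bits).
        Generic sketch; the thesis's exact rounding/scaling may differ."""
        # Spend enough integer bits to cover the weights' dynamic range,
        # and give the remaining bits to the fraction.
        max_abs = np.max(np.abs(weights))
        int_bits = max(0, int(np.ceil(np.log2(max_abs + 1e-12))))
        frac_bits = num_bits - 1 - int_bits        # 1 bit reserved for sign
        scale = 2.0 ** frac_bits

        # Round to the nearest representable level, clip to the n-bit range.
        q = np.round(weights * scale)
        q = np.clip(q, -(2 ** (num_bits - 1)), 2 ** (num_bits - 1) - 1)
        return q / scale                           # dequantized for simulation

    # 5-bit weights give 32 levels: roughly 6x smaller than 32-bit floats.
    w = np.random.randn(512, 512).astype(np.float32)
    w_q = quantize_fixed_point(w, num_bits=5)
    print("max quantization error:", np.max(np.abs(w - w_q)))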

Additional memory reduction is achieved by a technique called weight pruning, in which the weights are classified as sensitive or insensitive and only the sensitive weights are represented at higher precision. Combining these two techniques reduces the memory footprint by 81% and 84% for the speech recognition and keyword detection networks, respectively.
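
A minimal sketch of the sensitive/insensitive split is shown below. Sensitivity is approximated here by weight magnitude, which is only a common proxy; the sensitive fraction and the two bit-widths are illustrative parameters, not values from the thesis.

    import numpy as np

    def mixed_precision_weights(weights, sensitive_frac=0.1,
                                hi_bits=8, lo_bits=4):
        """Quantize 'sensitive' weights at high precision and the rest at
        low precision. Magnitude is used as a stand-in sensitivity
        criterion; the thesis may classify weights differently."""
        # Magnitude threshold separating the top `sensitive_frac` of weights.
        thresh = np.quantile(np.abs(weights), 1.0 - sensitive_frac)
        sensitive = np.abs(weights) >= thresh

        def quantize(w, bits):
            # Symmetric uniform quantizer scaled to the matrix's max value.
            scale = (2 ** (bits - 1) - 1) / (np.max(np.abs(w)) + 1e-12)
            return np.round(w * scale) / scale

        out = np.where(sensitive,
                       quantize(weights, hi_bits),   # few weights, high precision
                       quantize(weights, lo_bits))   # most weights, low precision
        return out, sensitive

    w = np.random.randn(512, 512).astype(np.float32)
    w_mp, mask = mixed_precision_weights(w)
    print("sensitive fraction:", mask.mean())        # ~0.10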

Further memory reduction is achieved by judiciously dropping connections for large blocks of weights. This technique, termed coarse-grain sparsification, introduces hardware-aware sparsity during DNN training, which enables efficient weight memory compression and significantly reduces the number of computations during classification without loss of accuracy. Keyword detection and speech recognition DNNs trained with 75% of the weights dropped and classified with 5-6 bit weight precision reduce the weight memory requirement by ~95% compared to a fully-connected network with double-precision weights, while showing similar keyword detection accuracy and word error rate.
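
The coarse-grain idea can be sketched as a block-level mask applied to the weight matrix during training, so that entire tiles of weights (and their multiply-accumulates) are skipped. The block size and random block selection below are assumptions for illustration; the thesis uses its own hardware-aware selection scheme.

    import numpy as np

    def block_sparse_mask(shape, block=(16, 16), drop_frac=0.75, rng=None):
        """Binary mask that drops whole `block`-sized tiles of a weight
        matrix, as in coarse-grain sparsification. Assumes the matrix
        dimensions are divisible by the block size."""
        rng = rng if rng is not None else np.random.default_rng(0)
        rows, cols = shape[0] // block[0], shape[1] // block[1]
        keep = rng.random((rows, cols)) >= drop_frac     # keep ~25% of blocks
        # Expand the block-level keep/drop decision to the full matrix.
        return np.kron(keep, np.ones(block)).astype(np.float32)

    # During training, reapply the mask after every weight update so the
    # dropped blocks stay at exactly zero.
    w = np.random.randn(512, 512).astype(np.float32)
    mask = block_sparse_mask(w.shape, drop_frac=0.75)
    w *= mask
    print("fraction of weights kept:", mask.mean())      # ~0.25

Because whole tiles are zero, only the kept blocks (plus a small block index) need to be stored, which is what makes the compression and compute savings hardware-friendly compared to unstructured sparsity.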

Identifier: oai:union.ndltd.org:asu.edu/item:39402
Date: January 2016
Contributors: Arunachalam, Sairam (Author), Chakrabarti, Chaitali (Advisor), Seo, Jae-sun (Advisor), Cao, Yu (Committee member), Arizona State University (Publisher)
Source Sets: Arizona State University
Language: English
Detected Language: English
Type: Masters Thesis
Format: 52 pages
Rights: http://rightsstatements.org/vocab/InC/1.0/, All Rights Reserved
