Hardware/Software Co-Design for Keyword Spotting on Edge Devices

The introduction of artificial neural networks (ANNs) to speech recognition applications has sparked the rapid development and popularization of digital assistants. These digital assistants perform keyword spotting (KWS), constantly monitoring the audio captured by a microphone for a small set of words or phrases known as keywords. Once a keyword is recognized, a larger audio recording is saved and processed by a separate, more complex neural network. More broadly, neural networks in speech recognition have popularized voice as a means of interacting with electronic devices, sparking interest among individuals in using speech recognition in their own projects. However, while large companies have the means to develop custom neural network architectures alongside proprietary hardware platforms, those lacking similar resources are precluded from developing efficient and effective neural networks for embedded systems. While small, low-power embedded systems are widely available in the hobbyist space, a clear process is needed for developing a neural network that accounts for the limitations of these resource-constrained systems. Conversely, a wide variety of neural network architectures exists, but often little thought is given to deploying these architectures on edge devices.
 
This thesis first presents an overview of audio processing techniques, artificial neural network fundamentals, and machine learning tools. A set of specific neural network architectures is then summarized. Next, the process of implementing and modifying these existing architectures and training specific models in Python using TensorFlow is demonstrated. The trained models are also subjected to post-training quantization to evaluate its effect on model performance. The models are evaluated using metrics relevant to deployment on resource-constrained systems, such as memory consumption, latency, and model size, in addition to the standard comparisons of accuracy and parameter count. Finally, the deployment of one of the trained and quantized models is explored on an Arduino Nano 33 BLE using TensorFlow Lite for Microcontrollers and on a Digilent Nexys 4 FPGA board using CFU Playground.
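
As a rough illustration of what post-training quantization does to a model's weights, the following minimal sketch maps a list of floating-point values to 8-bit integers with an affine scale and zero point, then measures the round-trip error. It is not taken from the thesis and does not use TensorFlow; the 8-bit target, the function names, and the sample weights are all assumptions made for illustration.

```python
# Illustrative sketch only: affine (asymmetric) 8-bit quantization of a weight list.
# This is NOT the thesis's code and NOT TensorFlow's implementation; all names
# and values below are hypothetical.

def quantize_int8(values):
    """Map floats to integers in [-128, 127] using a scale and zero point."""
    # Extend the range to include 0.0 so that zero is exactly representable.
    lo = min(min(values), 0.0)
    hi = max(max(values), 0.0)
    scale = (hi - lo) / 255.0 or 1.0  # fall back to 1.0 for all-zero input
    zero_point = round(-128 - lo / scale)
    # Round each value to the nearest step, shift by the zero point, and clamp.
    q = [max(-128, min(127, round(v / scale) + zero_point)) for v in values]
    return q, scale, zero_point


def dequantize(q, scale, zero_point):
    """Recover approximate float values from the quantized integers."""
    return [(x - zero_point) * scale for x in q]


weights = [-0.5, -0.1, 0.0, 0.25, 1.0]  # hypothetical weights
q, scale, zero_point = quantize_int8(weights)
restored = dequantize(q, scale, zero_point)

# Worst-case round-trip error; for in-range values it is at most half a step.
max_error = max(abs(a - b) for a, b in zip(weights, restored))
```

Each stored value shrinks from a 32-bit float to a single byte, at the cost of a bounded rounding error. That is the kind of size-versus-accuracy trade-off the evaluation described above is concerned with.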

10.25394/pgs.22701319.v1

keyword classification

keyword spotting

keyword spotting (KWS)

machine learning

artificial intelligence

hardware software co-design

hardware software codesign

edge devices

speech recognition system

neural network architecture

Identifier	oai:union.ndltd.org:purdue.edu/oai:figshare.com:article/22701319
Date	29 April 2023
Creators	Jacob Irenaeus M Bushur (15360553)
Source Sets	Purdue University
Detected Language	English
Type	Text, Thesis
Rights	CC BY 4.0
Relation	https://figshare.com/articles/thesis/Hardware_Software_Co-Design_for_Keyword_Spotting_on_Edge_Devices/22701319
