Ultra-Low-Power IoT Solutions for Sound Source Localization: Combining Mixed-Signal Processing and Machine Learning

With the prevalence of smartphones, pedestrians and joggers today often walk or run while listening to music. Deprived of the auditory cues that would warn them of danger, they are at a much greater risk of being hit by cars or other vehicles. We begin this research by building a wearable system that uses multichannel audio sensors embedded in a headset to detect and locate cars from their honks and their engine and tire noises. Based on this detection, the system warns pedestrians of the imminent danger of approaching cars. We demonstrate that a segmented architecture, consisting of headset-mounted audio sensors, front-end hardware that performs signal processing and feature extraction, and machine-learning-based classification on a smartphone, provides early danger detection in real time, from up to 80m away, with greater than 80% precision and 90% recall, and alerts the user in time (about 6s in advance for a car traveling at 30mph).
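As a rough illustration of this segmented pipeline, the sketch below extracts a few per-window audio features and trains an off-the-shelf classifier on synthetic labels. The window length, feature set, and choice of random-forest classifier are assumptions made for illustration; they stand in for the front-end feature-extraction hardware and the smartphone-side model described above.

```python
# Minimal sketch of the segmented pipeline: per-window feature extraction
# followed by a lightweight classifier. Features, window length, and the
# classifier are illustrative assumptions, not the thesis implementation.
import numpy as np
from sklearn.ensemble import RandomForestClassifier

def extract_features(frame, fs=16000):
    """Toy per-window features: energy, zero-crossing rate, spectral centroid."""
    energy = np.mean(frame ** 2)
    zcr = np.mean(np.abs(np.diff(np.sign(frame)))) / 2
    spectrum = np.abs(np.fft.rfft(frame))
    freqs = np.fft.rfftfreq(frame.size, d=1.0 / fs)
    centroid = np.sum(freqs * spectrum) / (np.sum(spectrum) + 1e-12)
    return np.array([energy, zcr, centroid])

# Synthetic data standing in for labeled "approaching car" / "background" windows.
rng = np.random.default_rng(0)
frames = rng.standard_normal((200, 1024))
labels = rng.integers(0, 2, 200)
X = np.vstack([extract_features(f) for f in frames])

clf = RandomForestClassifier(n_estimators=50, random_state=0).fit(X, labels)
print(clf.predict(extract_features(frames[0]).reshape(1, -1)))
```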

The time delay between the audio signals in a microphone array is the most important feature for sound-source localization. This work also presents a polarity-coincidence-correlation adaptive time-delay estimation (PCC-ATDE) mixed-signal technique that uses 1-bit quantized signals and a negative-feedback architecture to directly determine the time delay between the analog input signals and convert it to a digital number. This direct conversion, without a multibit ADC or further digital signal processing, allows for ultra-low power consumption. A prototype chip in 0.18μm CMOS with 4 analog inputs consumes 78nW while producing a 3-channel 8-bit digital time-delay output, sampling at 50kHz with a 20μs resolution and 6.06 ENOB. We present a theoretical analysis of the nonlinear, signal-dependent feedback loop of the PCC-ATDE. A delay-domain model of the system is developed to estimate the power bandwidth of the converter and predict its dynamic response. The results are validated with experiments using real-life stimuli, captured with a microphone array, that demonstrate the technique's ability to localize a sound source. The chip is further integrated into an embedded platform and deployed as an audio-based vehicle-bearing IoT system.
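To make the underlying measurement concrete, the sketch below estimates a time delay from 1-bit (polarity) quantized signals by computing the polarity-coincidence correlation over a block of samples and picking the best lag. This open-loop software simplification stands in for the chip's negative-feedback loop, which instead nudges a digital delay register toward the coincidence maximum; the signal model, block length, and lag range are assumptions.

```python
# Software sketch of 1-bit (polarity-coincidence) time-delay estimation.
# Open-loop simplification of PCC-ATDE: the chip closes a feedback loop
# around this measurement; here we search all lags over one block.
# Signal model, block length, and lag range are illustrative assumptions.
import numpy as np

fs = 50_000                      # 50 kHz sampling rate, as in the prototype
true_delay = 12                  # samples (240 us at 50 kHz)
max_lag = 64
n = 8_192

rng = np.random.default_rng(1)
src = np.convolve(rng.standard_normal(n + true_delay),
                  np.ones(16) / 16, mode="same")
x = np.sign(src[true_delay:])    # 1-bit (polarity) reference channel
y = np.sign(src[:n])             # 1-bit delayed channel (lags x by true_delay)

# Polarity-coincidence correlation: fraction of samples whose polarities
# agree when the reference is compared against each candidate lag.
pcc = [np.mean(x[:n - lag] == y[lag:]) for lag in range(max_lag)]
print("estimated delay:", int(np.argmax(pcc)), "samples; true:", true_delay)
```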

Finally, we investigate the signal's envelope, an important feature for a host of applications enabled by machine-learning algorithms. Conventionally, the raw analog signal is digitized first, and the feature is then extracted in the digital domain. This work presents an ultra-low-power envelope-to-digital converter (EDC) consisting of a passive switched-capacitor envelope detector and an inseparable successive-approximation-register (SAR) analog-to-digital converter (ADC). The two blocks, operating at different sampling rates, are integrated directly without a buffer between them, thanks to the ping-pong operation of their sampling capacitors. An EDC prototype was fabricated in 180nm CMOS. It provides 7.1 effective bits of ADC resolution and supports an input signal bandwidth of up to 5kHz and an envelope bandwidth of up to 50Hz while consuming 9.6nW.
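For intuition, the behavioral sketch below emulates an envelope-to-digital conversion in software: the audio-band input is rectified and low-pass filtered to recover its envelope, which is then sampled at a much lower rate and quantized. The filter, rates, and bit width are assumptions; the chip performs this function with a passive switched-capacitor detector merged with the SAR ADC rather than with the DSP shown here.

```python
# Behavioral sketch of an envelope-to-digital converter: detect the envelope
# of an audio-band signal, then sample and quantize it at a much lower rate.
# Filter type, rates, and bit width are illustrative assumptions.
import numpy as np
from scipy.signal import butter, sosfilt

fs_in = 50_000                 # model's input sampling rate (assumption)
fs_env = 500                   # envelope output rate, >= 2 x 50 Hz envelope BW
bits = 8                       # quantizer resolution (assumption)

t = np.arange(0, 1.0, 1.0 / fs_in)
carrier = np.sin(2 * np.pi * 2_000 * t)             # 2 kHz tone, within 5 kHz BW
envelope = 0.5 + 0.4 * np.sin(2 * np.pi * 10 * t)   # 10 Hz amplitude envelope
x = envelope * carrier

# Envelope detection: full-wave rectify, then low-pass below 50 Hz.
sos = butter(2, 50, fs=fs_in, output="sos")
env_est = sosfilt(sos, np.abs(x)) * (np.pi / 2)     # 2/pi is the mean of |sin|

# Decimate to the low envelope rate and quantize to 'bits' levels over [0, 1].
decim = fs_in // fs_env
codes = np.round(np.clip(env_est[::decim], 0, 1) * (2**bits - 1)).astype(int)
print(codes[100:110])          # a settled stretch of the digitized envelope
```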

Identifier: oai:union.ndltd.org:columbia.edu/oai:academiccommons.columbia.edu:10.7916/d8-we43-y259
Date: January 2019
Creators: de Godoy Peixoto, Daniel
Source Sets: Columbia University
Language: English
Type: Theses
