Global ETD Search

Return to search

Lip Detection and Adaptive Tracking

Performance of automatic speech recognition (ASR) systems utilizing only acoustic information degrades significantly in noisy environments such as a car cabins. Incorporating audio and visual information together can improve performance in these situations. This work proposes a lip detection and tracking algorithm to serve as a visual front end to an audio-visual automatic speech recognition (AVASR) system.
Several color spaces are examined that are effective for segmenting lips from skin pixels. These color components and several features are used to characterize lips and to train cascaded lip detectors. Pre- and post-processing techniques are employed to maximize detector accuracy. The trained lip detector is incorporated into an adaptive mean-shift tracking algorithm for tracking lips in a car cabin environment. The resulting detector achieves 96.8% accuracy, and the tracker is shown to recover and adapt in scenarios where mean-shift alone fails.

Lip Detection

Haar-like Features

Histograms of Oriented Gradients

Local Binary Patterns

Mean-Shift Tracking

Signal Processing

Identifer	oai:union.ndltd.org:CALPOLY/oai:digitalcommons.calpoly.edu:theses-2902
Date	01 January 2017
Creators	Wang, Benjamin
Publisher	DigitalCommons@CalPoly
Source Sets	California Polytechnic State University
Detected Language	English
Type	text
Format	application/pdf
Source	Master's Theses

Page generated in 0.0022 seconds

Lip Detection and Adaptive Tracking

Description

Links & Downloads

Tags

Additional Fields