A Study of the Automatic Speech Recognition Process and Speaker Adaptation

This thesis considers the entire automated speech recognition process and presents a standardised approach to large-vocabulary continuous speech recognition (LVCSR) experimentation with hidden Markov models (HMMs). It also discusses various approaches to speaker adaptation, such as Maximum Likelihood Linear Regression (MLLR) and multiscale adaptation, and presents experimental results for cross-task speaker adaptation. An analysis of training parameters and of data sufficiency for reasonable system performance estimates is also included. Supervised MLLR adaptation is found to give a 6% absolute reduction in word error rate (WER) given only one minute of adaptation data, as compared with an unadapted model set trained on a different task: the unadapted system performed at 24% WER and the adapted system at 18% WER. This is achieved with only 4 to 7 adaptation classes per speaker, as generated from a regression tree.
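The abstract reports the improvement as an absolute reduction in WER. As a small illustrative sketch (the function name `wer_reduction` is hypothetical and not taken from the thesis), the same two figures can also be expressed as a relative reduction:

```python
def wer_reduction(baseline_wer: float, adapted_wer: float) -> tuple[float, float]:
    """Return (absolute, relative) WER reduction.

    Both inputs are word error rates in percent. The absolute reduction is
    in percentage points; the relative reduction is a fraction of the baseline.
    """
    absolute = baseline_wer - adapted_wer
    relative = absolute / baseline_wer
    return absolute, relative


# Figures quoted in the abstract: 24% WER unadapted, 18% WER after adaptation.
absolute, relative = wer_reduction(24.0, 18.0)
print(f"absolute: {absolute:.1f} points, relative: {relative:.0%}")
# prints "absolute: 6.0 points, relative: 25%"
```

In other words, the 6-point absolute reduction quoted above corresponds to removing a quarter of the baseline errors.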

http://hdl.handle.net/10012/840

Electrical & Computer Engineering

automatic speech recognition

Identifier	oai:union.ndltd.org:LACETR/oai:collectionscanada.gc.ca:OWTU.10012/840
Date	January 2000
Creators	Stokes-Rees, Ian James
Publisher	University of Waterloo
Source Sets	Library and Archives Canada ETDs Repository / Centre d'archives des thèses électroniques de Bibliothèque et Archives Canada
Language	English
Detected Language	English
Type	Thesis or Dissertation
Format	application/pdf, 512540 bytes, application/pdf
Rights	Copyright: 2000, Stokes-Rees, Ian James. All rights reserved.
