Characterization of speakers for improved automatic speech recognition

Automatic speech recognition technology is becoming increasingly widespread in many applications. For dictation tasks, where a single talker is to use the system for long periods of time, the high recognition accuracies obtained are in part due to the user performing a lengthy enrolment procedure to ‘tune’ the parameters of the recogniser to their particular voice characteristics and speaking style. Interactive speech systems, where the speaker is using the system for only a short period of time (for example, to obtain information), do not have the luxury of long enrolments and have to adapt rapidly to new speakers and speaking styles. This thesis discusses the variations between speakers and speaking styles which result in decreased recognition performance when there is a mismatch between the talker and the system's models. An unsupervised method to rapidly identify and normalise differences in vocal tract length is presented and shown to give improvements in recognition accuracy for little computational overhead. Two unsupervised methods of identifying speakers with similar speaking styles are also presented. The first, a data-driven technique, is shown to accurately classify British and American accented speech, and is also used to improve recognition accuracy by clustering groups of similar talkers. The second uses the phonotactic information available within pronunciation dictionaries to model British and American accented speech. This model is then used to rapidly and accurately classify speakers.

http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.323266

621.3994

Accent identification

Identifier	oai:union.ndltd.org:bl.uk/oai:ethos.bl.uk:323266
Date	January 1999
Creators	Lincoln, Michael
Publisher	University of East Anglia
Source Sets	Ethos UK
Detected Language	English
Type	Electronic Thesis or Dissertation
Source	http://hdl.handle.net/1842/1191
