Global ETD Search

Embedded speech recognition systems

Apart from recognition accuracy, decoding speed and vocabulary size, another consideration when developing a practical automatic speech recognition (ASR) application is the adaptability of the system. An ASR system is more useful if it can cope with changes introduced by users, for example, new words and new grammar rules. In addition, the system can automatically update the underlying knowledge sources, such as language model probabilities, for better recognition accuracy. Since the knowledge sources need to be adaptable, it is inflexible to combine them statically, because on-line modification becomes difficult once all the knowledge sources have been combined into one static search space. The second objective of the thesis is to develop an algorithm which allows dynamic integration of knowledge sources during decoding. In this approach, each knowledge source is represented by a weighted finite state transducer (WFST). The knowledge source that is subject to adaptation is factorized from the entire search space. The adapted knowledge source is then combined with the others during decoding. In this thesis, we propose a generalized dynamic WFST composition algorithm, which avoids the creation of non-coaccessible paths, performs weight look-ahead and does not impose any constraints on the topology of the WFSTs. Experimental results on the Wall Street Journal (WSJ1) 20k-word trigram task show that our proposed approach has a better word accuracy versus real-time factor characteristic than other dynamic composition approaches.
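The idea of keeping an adaptable knowledge source separate and combining it during decoding can be illustrated with a minimal sketch. The Python below is not the algorithm proposed in the thesis: it is plain on-demand composition of two epsilon-free WFSTs in the tropical semiring, with no weight look-ahead and no filtering of non-coaccessible paths. The names `Wfst`, `LazyComposition` and `best_path`, and the toy lexicon and grammar, are assumptions made for illustration only.

```python
import heapq


class Wfst:
    """Epsilon-free WFST in the tropical semiring (weights add along a path)."""

    def __init__(self, start, finals):
        self.start = start
        self.finals = dict(finals)   # state -> final weight
        self.arcs = {}               # state -> [(ilabel, olabel, weight, next_state)]

    def add_arc(self, src, ilabel, olabel, weight, dst):
        self.arcs.setdefault(src, []).append((ilabel, olabel, weight, dst))


class LazyComposition:
    """Composition of a and b; pair states are expanded only when visited."""

    def __init__(self, a, b):
        self.a, self.b = a, b
        self.start = (a.start, b.start)

    def final_weight(self, state):
        qa, qb = state
        if qa in self.a.finals and qb in self.b.finals:
            return self.a.finals[qa] + self.b.finals[qb]
        return None

    def arcs_from(self, state):
        qa, qb = state
        for ia, oa, wa, na in self.a.arcs.get(qa, []):
            for ib, ob, wb, nb in self.b.arcs.get(qb, []):
                if oa == ib:         # output of a must match input of b
                    yield ia, ob, wa + wb, (na, nb)


def best_path(machine):
    """Dijkstra search over the lazily expanded machine (non-negative weights).

    Returns (cost, output_labels), or None if no final state is reachable.
    """
    tie = 0                          # unique tie-breaker so states are never compared
    heap = [(0.0, tie, machine.start, ())]
    done = set()
    while heap:
        cost, _, state, outputs = heapq.heappop(heap)
        if state is None:            # popped the super-final marker: best path found
            return cost, outputs
        if state in done:
            continue
        done.add(state)
        fw = machine.final_weight(state)
        if fw is not None:
            tie += 1
            heapq.heappush(heap, (cost + fw, tie, None, outputs))
        for _, olabel, weight, nxt in machine.arcs_from(state):
            tie += 1
            heapq.heappush(heap, (cost + weight, tie, nxt, outputs + (olabel,)))
    return None


# Hypothetical lexicon: one ambiguous input symbol "x" maps to two words.
lexicon = Wfst(start=0, finals={1: 0.0})
lexicon.add_arc(0, "x", "cat", 0.5, 1)
lexicon.add_arc(0, "x", "cap", 0.25, 1)

# Hypothetical unigram grammar, and an adapted copy with different weights.
grammar = Wfst(start=0, finals={0: 0.0})
grammar.add_arc(0, "cat", "cat", 1.0, 0)
grammar.add_arc(0, "cap", "cap", 2.0, 0)

adapted = Wfst(start=0, finals={0: 0.0})
adapted.add_arc(0, "cat", "cat", 3.0, 0)
adapted.add_arc(0, "cap", "cap", 0.5, 0)

print(best_path(LazyComposition(lexicon, grammar)))   # (1.5, ('cat',))
print(best_path(LazyComposition(lexicon, adapted)))   # (0.75, ('cap',))
```

Because the grammar is never merged into a static search space, swapping in the adapted grammar changes the decoding result without rebuilding anything. The price of this naive version is that every pair of arcs is tried at each visited state, and dead-end pair states are still explored, which is the kind of overhead that look-ahead and coaccessibility checks aim to reduce.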

http://hdl.handle.net/2292/3279

Speech Recognition

Embedded Systems

Identifier	oai:union.ndltd.org:ADTP/275445
Date	January 2008
Creators	Cheng, Octavian
Publisher	ResearchSpace@Auckland
Source Sets	Australasian Digital Theses Program
Language	English
Detected Language	English
Rights	Items in ResearchSpace are protected by copyright, with all rights reserved, unless otherwise indicated; http://researchspace.auckland.ac.nz/docs/uoa-docs/rights.htm; Copyright: The author