
Improved learning strategies for small vocabulary automatic speech recognition

Three areas can be explored to improve an HMM-based recognizer: parameter extraction, training methods, and vocabulary representation.

The goal of parameter extraction is not only to find a compact and robust parametric representation of the speech signal, but also to find one that allows the HMMs to obtain the best possible recognition performance. Historically, improvements at this level have usually been obtained by trial and error, using as much knowledge as possible about both the speech production and speech perception mechanisms. That is, the acoustic parameter extraction module has always been viewed as separate from the HMMs. This thesis will explore the concept of performing parameter extraction with a connectionist model whose parameters can be learned from training data.

Two HMM training techniques are used in this thesis: maximum likelihood estimation (MLE) and maximum mutual information estimation (MMIE). Parameter initialization, which is of critical importance for both, will be investigated for discrete, semi-continuous and continuous HMMs. Training processes that combine MLE and MMIE training are studied. Other issues, such as codebook exponents and the use of pause and silence models, will also be explored.

Even though the vocabulary contains only 11 words, its representation is a very important issue. The effects of vocabulary representation with phoneme-based, word-based (with no sharing) and inter-word models will be experimentally evaluated. It will be shown how a word error rate of 0.23% and a string error rate of 0.68% can be achieved on the TIDIGITS corpus, a performance rivalling the best results ever reported by any group of researchers.

Identifier: oai:union.ndltd.org:LACETR/oai:collectionscanada.gc.ca:QMM.41560
Date: January 1993
Creators: Cardin, Régis
Contributors: De Mori, Renato (advisor)
Publisher: McGill University
Source Sets: Library and Archives Canada ETDs Repository / Centre d'archives des thèses électroniques de Bibliothèque et Archives Canada
Language: English
Detected Language: English
Type: Electronic Thesis or Dissertation
Format: application/pdf
Coverage: Doctor of Philosophy (School of Computer Science)
Rights: All items in eScholarship@McGill are protected by copyright with all rights reserved unless otherwise indicated.
Relation: alephsysno: 001393624, proquestno: NN94599, Theses scanned by UMI/ProQuest.
