Three areas can be explored to improve an HMM-based recognizer: parameter extraction, training methods, and vocabulary representation.

The goal of parameter extraction is not only to find a compact and robust parametric representation of the speech signal, but also to find one that allows the HMMs to obtain the best possible recognition performance. Historically, improvements at this level have usually been obtained on a trial-and-error basis, using as much knowledge as possible about both the speech production and speech perception mechanisms. That is, the acoustic parameter extraction module has always been viewed as separate from the HMMs. This thesis explores the concept of performing parameter extraction with a connectionist model whose parameters can be learned from training data.

Two HMM training techniques are used in this thesis: maximum likelihood estimation (MLE) and maximum mutual information estimation (MMIE). Parameter initialization, which is critically important for both, is investigated for discrete, semi-continuous, and continuous HMMs. Training processes that combine MLE and MMIE training are studied. Other issues, such as codebook exponents and the use of pause and silence models, are also explored.

Even though the vocabulary contains only 11 words, its representation is a very important issue. The effects of representing the vocabulary with phoneme-based, word-based (with no sharing), and inter-word models are evaluated experimentally. It is shown how a word error rate of 0.23% and a string error rate of 0.68% can be achieved on the TIDIGITS corpus, a performance rivalling the best results reported by any group of researchers.
Identifier | oai:union.ndltd.org:LACETR/oai:collectionscanada.gc.ca:QMM.41560 |
Date | January 1993 |
Creators | Cardin, Régis |
Contributors | De Mori, Renato (advisor) |
Publisher | McGill University |
Source Sets | Library and Archives Canada ETDs Repository / Centre d'archives des thèses électroniques de Bibliothèque et Archives Canada |
Language | English |
Detected Language | English |
Type | Electronic Thesis or Dissertation |
Format | application/pdf |
Coverage | Doctor of Philosophy (School of Computer Science.) |
Rights | All items in eScholarship@McGill are protected by copyright with all rights reserved unless otherwise indicated. |
Relation | alephsysno: 001393624, proquestno: NN94599, Theses scanned by UMI/ProQuest. |