Linear Predictive Coding (LPC) has been used to compress and encode speech signals for digital transmission at a low bit rate. The Partial Correlation (PARCOR) parameters associated with LPC, which represent a vocal tract model based on a lattice filter structure, are considered for speech recognition. For the same purpose, the use of FIR coefficients and the frequency response of the AR model was previously investigated. <p>In this thesis, we investigate the mechanics of human speech production and discuss the place and manner of articulation for each of the major phoneme classes of American English. We then characterize some typical vowel and consonant phonemes using the eighth-order PARCOR parameters associated with LPC.<p>This thesis explores a method to detect phonemes from a continuous stream of speech. The system under development slides a 16 ms time window and calculates PARCOR parameters continuously, feeding them to a phoneme classifier. The phoneme classifier is a supervised classifier that requires training. Training uses the TIMIT speech database, which contains recordings of 630 speakers of 8 major dialects of American English. The training data are grouped into a vowel group comprising the phonemes [ae], [iy] and [uw] and a consonant group comprising [sh] and [f]. After training, the decision rule is derived. We design two classifiers in this thesis, a vowel classifier and a consonant classifier; both use the maximum likelihood decision rule to classify unknown phonemes. <p>The thesis presents classification results for the vowel and the consonant in a one-syllable word. The correct classification rate is 65.22% for the vowel group and 93.51% for the consonant group. The results indicate that PARCOR parameters have the potential to characterize phonemes.
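The pipeline summarized in the abstract (a sliding 16 ms window, eighth-order PARCOR parameters, and a maximum likelihood decision rule) can be sketched as follows. This is a minimal illustration, not the thesis implementation. The abstract does not say how the PARCOR parameters are computed, what hop size is used, or which probability model underlies the likelihood, so the sketch assumes the autocorrelation method with the Levinson-Durbin recursion, a hypothetical 8 ms hop, and a diagonal-covariance Gaussian per phoneme. The 16 kHz default matches the TIMIT sampling rate. All function names are hypothetical, and the sign convention for the reflection coefficients is one of several in use.

```python
import math

def autocorrelation(frame, max_lag):
    """Biased autocorrelation r[0..max_lag] of one frame."""
    n = len(frame)
    return [sum(frame[i] * frame[i + lag] for i in range(n - lag))
            for lag in range(max_lag + 1)]

def parcor(frame, order=8):
    """PARCOR (reflection) coefficients via Levinson-Durbin.

    Assumed convention: A(z) = 1 + sum_j a[j] z^-j and k_m = a_m^(m).
    """
    r = autocorrelation(frame, order)
    if r[0] <= 0.0:
        return [0.0] * order          # silent frame: no information
    a = [1.0] + [0.0] * order
    err = r[0]
    ks = []
    for m in range(1, order + 1):
        acc = r[m] + sum(a[j] * r[m - j] for j in range(1, m))
        k = -acc / err
        ks.append(k)
        new_a = a[:]
        for j in range(1, m):
            new_a[j] = a[j] + k * a[m - j]
        new_a[m] = k
        a = new_a
        err *= (1.0 - k * k)
        if err <= 0.0:                # numerically degenerate: stop early
            ks.extend([0.0] * (order - m))
            break
    return ks

def sliding_parcor(signal, fs=16000, win_ms=16.0, hop_ms=8.0, order=8):
    """PARCOR vector for each window position (hop_ms is an assumption)."""
    win = int(round(fs * win_ms / 1000.0))
    hop = max(1, int(round(fs * hop_ms / 1000.0)))
    return [parcor(signal[s:s + win], order)
            for s in range(0, len(signal) - win + 1, hop)]

def fit_gaussian(vectors):
    """Per-dimension mean and variance of the training vectors of one class."""
    n = len(vectors)
    dim = len(vectors[0])
    mean = [sum(v[d] for v in vectors) / n for d in range(dim)]
    var = [max(sum((v[d] - mean[d]) ** 2 for v in vectors) / n, 1e-6)
           for d in range(dim)]
    return mean, var

def log_likelihood(x, model):
    """Log-density of x under a diagonal Gaussian (assumed model)."""
    mean, var = model
    return sum(-0.5 * (math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v)
               for xi, m, v in zip(x, mean, var))

def classify(x, models):
    """Maximum likelihood rule: pick the label with the highest likelihood."""
    return max(models, key=lambda label: log_likelihood(x, models[label]))
```

In this sketch, one model per phoneme would be fitted from labeled training frames with `fit_gaussian`, and each PARCOR vector returned by `sliding_parcor` would be passed to `classify` with the vowel or consonant model set.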
Identifier | oai:union.ndltd.org:LACETR/oai:collectionscanada.gc.ca:SSU.etd-01122007-084418 |
Date | 15 January 2007 |
Creators | Cui, Ying |
Contributors | Takaya, Kunio, Ko, Seok-Bum, Karki, Rajesh, Gander, Robert, Chen, X. B. (Daniel) |
Publisher | University of Saskatchewan |
Source Sets | Library and Archives Canada ETDs Repository / Centre d'archives des thèses électroniques de Bibliothèque et Archives Canada |
Language | English |
Detected Language | English |
Type | text |
Format | application/pdf |
Source | http://library.usask.ca/theses/available/etd-01122007-084418/ |
Rights | unrestricted, I hereby certify that, if appropriate, I have obtained and attached hereto a written permission statement from the owner(s) of each third party copyrighted matter to be included in my thesis, dissertation, or project report, allowing distribution as specified below. I certify that the version I submitted is the same as that approved by my advisory committee. I hereby grant to University of Saskatchewan or its agents the non-exclusive license to archive and make accessible, under the conditions specified below, my thesis, dissertation, or project report in whole or in part in all forms of media, now or hereafter known. I retain all other ownership rights to the copyright of the thesis, dissertation or project report. I also retain the right to use in future works (such as articles or books) all or part of this thesis, dissertation, or project report. |