This thesis investigates design and implementation strategies for a German speech recognition system. Training and recognition are based on the speech features of 434 common German monosyllables. The training database was built by reading each monosyllable 12 times over 6 rounds: in each round, the monosyllable is read twice in succession with different tones, first with the high pitch of tone 1 and then with the falling pitch of tone 4. Mel-frequency cepstral coefficients (MFCC) and linear predictive cepstral coefficients (LPCC) serve as the two feature models, and a hidden Markov model (HMM) serves as the recognition model. On a personal computer with a 2.8 GHz AMD Athlon X2-240 processor running Ubuntu 9.04, the system reaches a correct phrase recognition rate of 84% on a database of 3,900 German phrases. The average computation time per phrase is within 1 second.
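As an illustration of the first feature model named in the abstract, the following is a minimal sketch of a textbook MFCC front end in Python with NumPy. It is not the thesis's implementation: the abstract does not give any signal-processing parameters, so the sample rate, frame length, hop size, FFT size, filter count, and number of coefficients below are all assumed values, and the function names are hypothetical.

```python
import numpy as np

def hz_to_mel(f):
    # Common mel-scale formula (one of several conventions in use).
    return 2595.0 * np.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mel_filterbank(n_filters, n_fft, sample_rate):
    """Triangular filters spaced evenly on the mel scale."""
    mel_points = np.linspace(hz_to_mel(0.0), hz_to_mel(sample_rate / 2.0),
                             n_filters + 2)
    bins = np.floor((n_fft + 1) * mel_to_hz(mel_points) / sample_rate).astype(int)
    fbank = np.zeros((n_filters, n_fft // 2 + 1))
    for i in range(1, n_filters + 1):
        left, center, right = bins[i - 1], bins[i], bins[i + 1]
        for k in range(left, center):      # rising edge of the triangle
            fbank[i - 1, k] = (k - left) / max(center - left, 1)
        for k in range(center, right):     # falling edge of the triangle
            fbank[i - 1, k] = (right - k) / max(right - center, 1)
    return fbank

def mfcc(signal, sample_rate=16000, frame_len=400, hop=160,
         n_fft=512, n_filters=26, n_ceps=13):
    """Return an (n_frames, n_ceps) array of MFCCs.

    All default parameter values are assumptions for illustration only.
    """
    signal = np.asarray(signal, dtype=float)
    if len(signal) < frame_len:
        signal = np.pad(signal, (0, frame_len - len(signal)))
    # Slice the signal into overlapping frames and apply a Hamming window.
    n_frames = 1 + (len(signal) - frame_len) // hop
    idx = np.arange(frame_len)[None, :] + hop * np.arange(n_frames)[:, None]
    frames = signal[idx] * np.hamming(frame_len)
    # Power spectrum of each frame (zero-padded to n_fft points).
    power = np.abs(np.fft.rfft(frames, n_fft, axis=1)) ** 2 / n_fft
    # Mel filterbank energies, then log compression.
    energies = power @ mel_filterbank(n_filters, n_fft, sample_rate).T
    log_energies = np.log(energies + 1e-10)
    # Unnormalized DCT-II decorrelates the log energies into cepstral coefficients.
    basis = np.cos(np.pi * np.outer(np.arange(n_ceps),
                                    np.arange(n_filters) + 0.5) / n_filters)
    return log_energies @ basis.T

# Usage: one second of a synthetic 440 Hz tone stands in for a recorded syllable.
t = np.arange(16000) / 16000.0
features = mfcc(np.sin(2 * np.pi * 440.0 * t))
```

In a recognizer of the kind the abstract describes, a per-frame feature matrix like `features` would be the observation sequence passed to the hidden Markov model; how the thesis combines MFCC and LPCC features is not stated in the abstract.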
Identifier | oai:union.ndltd.org:NSYSU/oai:NSYSU:etd-0824110-153157 |
Date | 24 August 2010 |
Creators | Lai, Shih-Sin |
Contributors | Er-Hui Lu, Chih-Chien Chen, Chii-Maw Uang, Xiao-Song Bo, Tsung Lee |
Publisher | NSYSU |
Source Sets | NSYSU Electronic Thesis and Dissertation Archive |
Language | Cholon |
Detected Language | English |
Type | text |
Format | application/pdf |
Source | http://etd.lib.nsysu.edu.tw/ETD-db/ETD-search/view_etd?URN=etd-0824110-153157 |
Rights | not_available, Copyright information available at source archive |