This thesis investigates design and implementation strategies for a Japanese speech recognition system. The system uses the speech features of the 188 common Japanese mono-syllables as the basis for both training and recognition. A training database of 10 utterances per mono-syllable is established by applying Japanese pronunciation rules. The 10 utterances are collected by reading the 188 mono-syllables in 5 rounds, with every mono-syllable read twice consecutively in each round. Mel-frequency cepstral coefficients and linear predictive cepstral coefficients are used as the two feature models, and a hidden Markov model is used as the recognition model. On a personal computer with a 2.4 GHz Pentium processor running Ubuntu 8.04, a correct phrase recognition rate of 87% is reached on a database of 34,000 Japanese phrases. The average computation time per phrase is about 1.5 seconds.
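The size of the training database follows directly from the recording scheme described in the abstract. The sketch below is only an illustration of that arithmetic, not the author's tooling: the function name `recording_schedule` and the use of integer indices in place of actual mono-syllables are assumptions made for this example.

```python
from collections import Counter

NUM_SYLLABLES = 188    # common Japanese mono-syllables (from the abstract)
NUM_ROUNDS = 5         # reading rounds (from the abstract)
READS_PER_ROUND = 2    # each mono-syllable is read twice in a row per round


def recording_schedule(num_syllables=NUM_SYLLABLES,
                       num_rounds=NUM_ROUNDS,
                       reads=READS_PER_ROUND):
    """Hypothetical helper: list (round, syllable_index, take) in reading order."""
    schedule = []
    for rnd in range(num_rounds):
        for syl in range(num_syllables):
            # The two takes of a syllable are adjacent, matching
            # "consecutively read twice in each round".
            for take in range(reads):
                schedule.append((rnd, syl, take))
    return schedule


schedule = recording_schedule()
per_syllable = Counter(syl for _, syl, _ in schedule)

print("total utterances:", len(schedule))
print("utterances per mono-syllable:", set(per_syllable.values()))
```

Under these assumptions the schedule contains 5 × 188 × 2 = 1,880 utterances, which gives the 10 utterances per mono-syllable stated above.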
Identifier | oai:union.ndltd.org:NSYSU/oai:NSYSU:etd-0824109-164940 |
Date | 24 August 2009 |
Creators | Chen, Meng-yang |
Contributors | Tsung Lee, Sheau-Shong Bor, Chii-Maw Uang, Chih-Chien Chen |
Publisher | NSYSU |
Source Sets | NSYSU Electronic Thesis and Dissertation Archive |
Language | Cholon |
Detected Language | English |
Type | text |
Format | application/pdf |
Source | http://etd.lib.nsysu.edu.tw/ETD-db/ETD-search/view_etd?URN=etd-0824109-164940 |
Rights | not_available, Copyright information available at source archive |