Global ETD Search

A Design of Trilingual Speech Recognition System for Chinese, Russian and Thai

Economic growth rate is an index of a nation's gross productivity. China, Russia and Thailand are among the nations whose economic growth rates exceed the global average. In recent years, China's rapid development, including its enhanced relations with Taiwan, has made it a member of the BRICS, the five leading emerging economies in the world. Russia has played an important role in international society over the past decades: it is a member of the G8, the group of eight major industrial nations, and its language, Russian, is one of the six official languages of the United Nations. According to the statistics of the Taiwan Funds, Russia and Thailand are the top two countries in investment growth. Thailand, located in the middle of the Southeast Asian peninsula, is, together with Malaysia and the Philippines, among the founding members of the ASEAN 10, the ten-member Association of Southeast Asian Nations. To meet industrial and household needs, Taiwan has offered job opportunities to foreign workers from Southeast Asian countries. Our objective is therefore to design a trilingual speech recognition system for Chinese, Russian and Thai that serves the needs of language learning and household living.
The system uses 404 Chinese, 611 Russian and 123 Thai common monosyllables, selected according to the pronunciation rules of each language, as the basic units for speech training and recognition. Mel-frequency cepstral coefficients and linear predicted cepstral coefficients serve as the two syllable feature models, and a hidden Markov model serves as the recognition model. On a personal computer with a 2.2 GHz AMD Athlon XP 2800+ processor running Ubuntu 9.04, applying phonotactic rules yields correct phrase recognition rates of 88.87%, 84.31% and 87.58% on databases of 82,000 Chinese, 31,883 Russian and 3,809 Thai phrases, respectively. Furthermore, a trilingual language-speech recognition system for 300 common words, 100 from each language, was developed; it achieves a correct language-phrase recognition rate of 98.66%.
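As an illustration of how a hidden Markov model can act as the recognition model, the sketch below scores an observation sequence against one model per syllable and picks the best-scoring label. It is not the thesis implementation: it assumes discrete (vector-quantized) observation symbols rather than raw cepstral vectors, and the names `forward_log_likelihood`, `recognize` and `MODELS`, as well as all probability values, are hypothetical.

```python
import math


def forward_log_likelihood(obs, start, trans, emit):
    """Return log P(obs | model) for a discrete HMM (scaled forward algorithm).

    obs   : list of symbol indices (assumed non-empty)
    start : start[i] is the initial probability of state i
    trans : trans[i][j] is the probability of moving from state i to j
    emit  : emit[i][k] is the probability that state i emits symbol k
    """
    n_states = len(start)
    # Initialisation: alpha_1(i) = start_i * b_i(o_1)
    alpha = [start[i] * emit[i][obs[0]] for i in range(n_states)]
    log_prob = 0.0
    for t in range(len(obs)):
        if t > 0:
            # Induction: alpha_t(j) = (sum_i alpha_{t-1}(i) * a_ij) * b_j(o_t)
            alpha = [
                sum(alpha[i] * trans[i][j] for i in range(n_states))
                * emit[j][obs[t]]
                for j in range(n_states)
            ]
        scale = sum(alpha)
        if scale == 0.0:
            return float("-inf")  # sequence is impossible under this model
        # Rescale to avoid numerical underflow; the log-likelihood is the
        # sum of the logs of the per-frame scale factors.
        alpha = [a / scale for a in alpha]
        log_prob += math.log(scale)
    return log_prob


def recognize(obs, models):
    """Return the label whose model gives obs the highest log-likelihood."""
    return max(
        models,
        key=lambda label: forward_log_likelihood(obs, *models[label]),
    )


# Two toy 2-state left-to-right models over a 2-symbol codebook.
# All numbers are made up for illustration only.
MODELS = {
    "syllable_a": ([1.0, 0.0], [[0.6, 0.4], [0.0, 1.0]], [[0.9, 0.1], [0.2, 0.8]]),
    "syllable_b": ([1.0, 0.0], [[0.6, 0.4], [0.0, 1.0]], [[0.1, 0.9], [0.8, 0.2]]),
}

print(recognize([0, 0, 1, 1], MODELS))
```

In a system of the kind described above, each frame's cepstral feature vector would first be mapped to an observation (here, a codebook index), and one model would be trained per syllable; the same maximum-likelihood selection then extends from syllables to phrases.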

http://etd.lib.nsysu.edu.tw/ETD-db/ETD-search/view_etd?URN=etd-0910112-144047

Hidden Markov model

Mel-frequency cepstral coefficients

Linear predicted cepstral coefficients

Phonotactic

Speech recognition

Identifier	oai:union.ndltd.org:NSYSU/oai:NSYSU:etd-0910112-144047
Date	10 September 2012
Creators	Pan, Hao-Ming
Contributors	Chii-Maw Uang, Chih-Chien Chen, Sheau-Shong Bor
Publisher	NSYSU
Source Sets	NSYSU Electronic Thesis and Dissertation Archive
Language	Cholon
Detected Language	English
Type	text
Format	application/pdf
Source	http://etd.lib.nsysu.edu.tw/ETD-db/ETD-search/view_etd?URN=etd-0910112-144047
Rights	user_define, Copyright information available at source archive
