
A Design of Russian Speech Recognition System

Language plays an important role in understanding a people, their history, their culture, and even their technology. Many countries have developed space technology in recent years, and Russia is at the forefront. In 1998 Russia launched Zarya, the first International Space Station (ISS) module, into orbit, and it was deeply involved in developing the ISS with the U.S. Since the end of World War II, Russia has been one of the five permanent members of the United Nations Security Council. It later became a member of the G8, an economic forum of eight industrially advanced nations. For these reasons, our objective is to build a language system that helps us learn Russian, appreciate the beauty of its culture, and broaden our view of technology.
This thesis investigates design and implementation strategies for a Russian speech recognition system. Training and recognition are based on the speech features of 514 common Russian monosyllables, which are established by applying Russian pronunciation rules. Twelve utterances are collected for each monosyllable by reading it in 6 rounds, twice per round with different tones: the first pronunciation uses the high pitch of tone 1, and the second uses the falling pitch of tone 4. Mel-frequency cepstral coefficients and linear predictive cepstral coefficients serve as the two syllable feature models, and a hidden Markov model serves as the recognition model. On an AMD 2.2 GHz Athlon XP 2800+ personal computer running Ubuntu 9.04, and with the help of phonotactic rules, the system reaches correct phrase recognition rates of 86.90% on a 3,900-phrase Russian vocabulary database for TORFL (Test of Russian as a Foreign Language) and 94.83% on a database of 600 Russian personal names. The average computation time for each system is less than 1.5 seconds, and the training time is about three hours.
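
The abstract does not spell out how the hidden Markov model scores an utterance, so the following is only a minimal sketch of one common approach: each syllable gets its own HMM, and the recognizer picks the model with the highest forward-algorithm likelihood. It assumes discrete observation symbols (for example, vector-quantized cepstral frames), which the thesis may or may not use. The toy models, the syllable labels "da" and "net", and all probabilities are hypothetical.

```python
import math

def safe_log(p):
    """log(p), mapping zero probability to -inf instead of raising."""
    return math.log(p) if p > 0 else float("-inf")

def log_sum_exp(values):
    """Numerically stable log(sum(exp(v))) for a list of log values."""
    m = max(values)
    if m == float("-inf"):
        return m
    return m + math.log(sum(math.exp(v - m) for v in values))

def forward_log_likelihood(obs, start, trans, emit):
    """Return log P(obs | model) for a discrete HMM (forward algorithm)."""
    n = len(start)
    # Initialization: start in state s and emit the first symbol.
    alpha = [safe_log(start[s]) + safe_log(emit[s][obs[0]]) for s in range(n)]
    # Induction: sum over all predecessor states, then emit the next symbol.
    for o in obs[1:]:
        alpha = [
            log_sum_exp([alpha[p] + safe_log(trans[p][s]) for p in range(n)])
            + safe_log(emit[s][o])
            for s in range(n)
        ]
    # Termination: sum over all final states.
    return log_sum_exp(alpha)

def recognize(obs, models):
    """Pick the label whose HMM assigns obs the highest likelihood."""
    return max(models, key=lambda name: forward_log_likelihood(obs, *models[name]))

# Hypothetical two-state left-to-right models over two symbols (0 and 1).
# Each entry is (start probabilities, transition matrix, emission matrix).
left_to_right = [[0.6, 0.4], [0.0, 1.0]]
MODELS = {
    "da": ([1.0, 0.0], left_to_right, [[0.9, 0.1], [0.2, 0.8]]),
    "net": ([1.0, 0.0], left_to_right, [[0.1, 0.9], [0.8, 0.2]]),
}

if __name__ == "__main__":
    print(recognize([0, 0, 1, 1], MODELS))
```

A system like the one described would train one such model per monosyllable from the recorded utterances. Continuous-density emissions over the cepstral features are a common alternative to the discrete symbols assumed here.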

Identifier: oai:union.ndltd.org:NSYSU/oai:NSYSU:etd-0819111-210242
Date: 19 August 2011
Creators: Wu, Yin-Jie
Contributors: Chii-Maw Uang, Tsung Lee, Chih-Chien Chen, Sheau-Shong Bor, Erl-Huei Lu
Publisher: NSYSU
Source Sets: NSYSU Electronic Thesis and Dissertation Archive
Language: Cholon
Detected Language: English
Type: text
Format: application/pdf
Source: http://etd.lib.nsysu.edu.tw/ETD-db/ETD-search/view_etd?URN=etd-0819111-210242
Rights: user_define, Copyright information available at source archive
