
A Design of Arabic Speech Recognition System

The Arab world is one of the most remarkable regions on earth, notable for its history of more than 2,800 years, the Islamic religion, and a magnificent culture. It comprises 24 countries and territories where Arabic is spoken. The Arabic-speaking population is approximately 221 million, ranking fourth in the world according to 2009 statistics from the Summer Institute of Linguistics, USA. Since 1973, petroleum embargoes imposed by the Arab world have influenced the global economy and seriously harmed national security. Fossil energy of this kind will remain irreplaceable until efficient green energy alternatives become feasible. Our objective is to build a language system that helps us learn Arabic, appreciate the beauty of Arab culture, and widen our view of religions.
This thesis investigates design and implementation strategies for an Arabic speech recognition system. It uses the speech features of 302 common Arabic monosyllables as the basis for training and recognition. A training database of 10 utterances per monosyllable is established by applying Arabic pronunciation rules. The 10 utterances are collected by reading each monosyllable in 5 rounds, twice per round with different tones. The first pronunciation uses the high pitch of tone 1, and the second uses the falling pitch of tone 4. Mel-frequency cepstral coefficients and linear predictive cepstral coefficients serve as the two syllable feature models, and a hidden Markov model serves as the recognition model. On a personal computer with an AMD Athlon XP 2800+ (2.2 GHz) processor running Ubuntu 9.04, the systems, using phonotactic rules, reach correct phrase recognition rates of 86.31% for an Arabic phrase database with a 3,600-item vocabulary and 93.90% for a database of 590 personal names of Arabic figures. The average computation time of each system is less than 1 second, and training takes about two hours.
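
The recognition step described above, scoring an utterance against one hidden Markov model per syllable and picking the best match, can be illustrated with a minimal sketch. This is not the thesis implementation. It assumes discrete observation symbols (for example, vector-quantized feature frames) in place of real cepstral features, and the syllable names "ba" and "ma", the model parameters, and the function names are all hypothetical.

```python
import math


def safe_log(p):
    """Natural log that maps zero probability to -inf instead of raising."""
    return math.log(p) if p > 0 else float("-inf")


def log_sum_exp(values):
    """Numerically stable log(sum(exp(v))) over a list of log values."""
    m = max(values)
    if m == float("-inf"):
        return m
    return m + math.log(sum(math.exp(v - m) for v in values))


def forward_log_likelihood(obs, start, trans, emit):
    """Return log P(obs | model) for a discrete HMM (forward algorithm).

    start[s]    : probability of starting in state s
    trans[p][s] : probability of moving from state p to state s
    emit[s][o]  : probability that state s emits symbol o
    """
    n = len(start)
    alpha = [safe_log(start[s]) + safe_log(emit[s][obs[0]]) for s in range(n)]
    for o in obs[1:]:
        alpha = [
            log_sum_exp([alpha[p] + safe_log(trans[p][s]) for p in range(n)])
            + safe_log(emit[s][o])
            for s in range(n)
        ]
    return log_sum_exp(alpha)


def recognize(obs, models):
    """Pick the model name with the highest likelihood for the sequence."""
    return max(models, key=lambda name: forward_log_likelihood(obs, *models[name]))


# Hypothetical two-state left-to-right models over three symbols (0, 1, 2).
LEFT_TO_RIGHT = [[0.6, 0.4], [0.0, 1.0]]
models = {
    "ba": ([1.0, 0.0], LEFT_TO_RIGHT, [[0.8, 0.1, 0.1], [0.1, 0.8, 0.1]]),
    "ma": ([1.0, 0.0], LEFT_TO_RIGHT, [[0.1, 0.1, 0.8], [0.1, 0.8, 0.1]]),
}

print(recognize([0, 0, 1, 1], models))
```

Working in the log domain avoids numeric underflow on long observation sequences. A real system of the kind described would typically train the model parameters from the recorded utterances rather than set them by hand.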

Identifier: oai:union.ndltd.org:NSYSU/oai:NSYSU:etd-0819111-212604
Date: 19 August 2011
Creators: Lee, Shih-Chung
Contributors: Chih-Chien Chen, Erl-Huei Lu, Chii-Maw Uang, Sheau-Shong Bor, Tsung Lee
Publisher: NSYSU
Source Sets: NSYSU Electronic Thesis and Dissertation Archive
Language: Cholon
Detected Language: English
Type: text
Format: application/pdf
Source: http://etd.lib.nsysu.edu.tw/ETD-db/ETD-search/view_etd?URN=etd-0819111-212604
Rights: user_define, Copyright information available at source archive
