• Refine Query
  • Source
  • Publication year
  • to
  • Language
  • 1
  • Tagged with
  • 1
  • 1
  • 1
  • 1
  • 1
  • 1
  • 1
  • 1
  • 1
  • 1
  • 1
  • 1
  • 1
  • 1
  • About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
1

A Design of Arabic Speech Recognition System

Lee, Shih-Chung 19 August 2011 (has links)
Arab world is one of the most spectacular regions in the earth, especially for her over 2,800 year history, Islamic religion and magnificent culture. She consists of 24 countries and territories where people speak Arabic. The population of Arabic speaking people is approximately 221 million, and ranked the fourth according to the 2009 statistics by Summer Institute of Linguistics, USA. Since 1973, petroleum embargoes, imposed by the Arab world, have influenced global economy and hurt national security seriously. This kind of fossil energy is still irreplaceable until efficient green energy alternative becomes feasible. It is our objective to build a language system that can help us to learn Arabic, to appreciate the beauty of her culture, and to widen our vision of religions. This thesis investigates the design and implementation strategies for an Arabic speech recognition system. It utilizes the speech features of the 302 common Arabic mono-syllables as the major training and recognition methodology. A training database of 10 utterances per mono-syllable is established by applying Arabic pronunciation rules. These 10 utterances are collected through reading 5 rounds of the same mono-syllables twice with different tones. The first pronounced pattern has high pitch of tone 1, while the second one has falling pitch of tone 4. Mel-frequency cepstral coefficients, linear predicted cepstral coefficients, and hidden Markov model are used as the two syllable feature models and the recognition model respectively. Under the AMD 2.2 GHz Athlon XP 2800+ personal computer and Ubuntu 9.04 operating system environment, correct phrase recognition rates of 86.31% and 93.90% can be reached respectively using phonotactical rules for a 3,600 vocabulary Arabic phrase database and a 590 person name database for Arabic figures. The average computation time for each system is less than 1 second, and the training time for the systems is about two hours.

Page generated in 0.0898 seconds