
A Design of Italian Speech Recognition System

The European Union (EU) was established on November 1, 1993, under the Maastricht Treaty signed on February 7, 1992. This economic and political community consists of 27 member states, primarily located in Europe. It operates through a supranational and intergovernmental system, including the European Commission, the Council, the Parliament and the Central Bank, to move from joint regional economic development toward a single market with economic and political integration. Italy is one of the six founding countries of the EU and a member of the G8, the group of eight industrially advanced nations, and it is a force to be reckoned with. Our objective is to build a language system that helps us learn Italian more effectively, improves our competency in intercultural understanding, and widens our horizons for travel and living.
This thesis investigates design and implementation strategies for an Italian speech recognition system. Training and recognition are based on the speech features of 370 common Italian monosyllables. A training database of 10 utterances per monosyllable is established by applying Italian pronunciation rules. The 10 utterances are collected by reading the same monosyllables in 5 rounds, twice per round with different tones: the first pronunciation uses the high pitch of tone 1, and the second uses the falling pitch of tone 4. Mel-frequency cepstral coefficients and linear predictive cepstral coefficients serve as the two syllable feature models, and a hidden Markov model serves as the recognition model. On a personal computer with a 2.2 GHz AMD Athlon XP 2800+ processor running Ubuntu 9.04, correct phrase recognition rates of 88.35% and 89.32% are reached using phonotactic rules for a 4,000-entry Italian phrase database and a 3,304-word database for the Italian Language Proficiency Test, respectively. The average computation time for each system is less than 1.5 seconds, and the training time for the systems is about two hours.
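To illustrate how a hidden Markov model can act as a recognition model, the following is a minimal sketch of recognition by maximum likelihood: one model per monosyllable, scored with the scaled forward algorithm. It is not the thesis implementation. The discrete observation symbols (standing in for quantized feature frames), the two-state left-to-right topology, the syllable names "ma" and "no", and all probabilities are assumptions chosen only for illustration.

```python
import math


def forward_log_likelihood(obs, start, trans, emit):
    """Return log P(obs | model) for a discrete HMM (scaled forward algorithm)."""
    n = len(start)
    # Initialise with the first observation symbol.
    alpha = [start[i] * emit[i][obs[0]] for i in range(n)]
    log_prob = 0.0
    for t in range(len(obs)):
        if t > 0:
            # Propagate through the transitions, then weight by the emission.
            alpha = [
                sum(alpha[i] * trans[i][j] for i in range(n)) * emit[j][obs[t]]
                for j in range(n)
            ]
        scale = sum(alpha)
        if scale == 0.0:
            return float("-inf")  # the model cannot produce this sequence
        # Rescale to avoid underflow; the log of the scales sums to log P.
        alpha = [a / scale for a in alpha]
        log_prob += math.log(scale)
    return log_prob


def recognize(obs, models):
    """Return the name of the model that gives obs the highest likelihood."""
    return max(models, key=lambda name: forward_log_likelihood(obs, *models[name]))


# Hypothetical toy models: (start, transition, emission) per syllable.
# Two states, left-to-right; two observation symbols (0 and 1).
LEFT_TO_RIGHT = [[0.6, 0.4], [0.0, 1.0]]
MODELS = {
    "ma": ([1.0, 0.0], LEFT_TO_RIGHT, [[0.9, 0.1], [0.2, 0.8]]),
    "no": ([1.0, 0.0], LEFT_TO_RIGHT, [[0.1, 0.9], [0.8, 0.2]]),
}

if __name__ == "__main__":
    print(recognize([0, 0, 1, 1], MODELS))
    print(recognize([1, 1, 0, 0], MODELS))
```

In a system like the one described, each observation would more plausibly be a vector of cepstral coefficients with continuous emission densities; the discrete symbols here only keep the sketch short and dependency-free.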

Identifier: oai:union.ndltd.org:NSYSU/oai:NSYSU:etd-0822111-020504
Date: 22 August 2011
Creators: Lin, Wei-cheng
Contributors: Erl-Huei Lu, Tsung Lee, Sheau-Shong Bor, Chii-Maw Uang, Chih-Chien Chen
Publisher: NSYSU
Source Sets: NSYSU Electronic Thesis and Dissertation Archive
Language: Cholon
Detected Language: English
Type: text
Format: application/pdf
Source: http://etd.lib.nsysu.edu.tw/ETD-db/ETD-search/view_etd?URN=etd-0822111-020504
Rights: user_define, Copyright information available at source archive
