Global ETD Search

1	A Design of Speech Inputting System for Chinese Resumes. Ciou, Jhao-dong. 06 September 2007. In this thesis, a hidden Markov model, a maximum likelihood ratio and a lexicon search strategy are used to build a speech inputting system for Chinese resumes. The resume contains five items: name introduction, gender, birth date, birth place and education. The system is developed on a PC with a 1.6 GHz Intel Pentium CPU running Red Hat Linux 9.0. For the speaker-dependent case, a resume can be completed within 45 seconds on average. Keywords: Hidden Markov model; Mel-frequency cepstrum coefficients
2	Feature Design for Text Independent Speaker Recognition in Numerous Speaker Cases. Huang, Chun-Hao. 28 June 2001. A Microsoft Windows program is designed to implement a text-independent speaker recognition system for numerous-speaker cases, based on Mel-cepstrum features, a hierarchical tree classifier and binary vector quantization. Experimental results show that the accuracy is barely affected by increasing population size, and that recognition is faster than with traditional methods. Keywords: Speaker Recognition; Mel-Cepstrum; Artificial Neural
3	A Design of Mandarin Keyword Spotting System. Wang, Yi-Lii. 07 February 2003. A Mandarin keyword spotting system based on LPC, VQ, a discrete-time HMM and the Viterbi algorithm is proposed in this thesis. Combined with a dialogue system, the keyword spotting platform is further refined into a prototype of a Taiwan Railway Natural Language Reservation System. During the reservation process, the computer dialogue attendant asks five questions: name and ID number, departure station, destination station, train type and number of tickets, and time schedule. Following the customer's spoken confirmation, electronic tickets can be correctly issued and printed within 90 seconds in a laboratory environment. Keywords: Speech Recognition; LPC; Hidden Markov Model; Cepstrum
4	A Design of French Speech Recognition System. Li, Chun-Ching. 24 August 2010. This thesis investigates design and implementation strategies for a French speech recognition system. Training and recognition are based on the speech features of 425 common French mono-syllables. A training database is established by reading each mono-syllable 12 times in 6 rounds. Every mono-syllable is read twice consecutively with different tones: the first utterance has the high pitch of tone 1, while the second has the falling pitch of tone 4. Mel-frequency cepstrum coefficients and linear predictive cepstrum coefficients are used as the two feature models, and a hidden Markov model is used as the recognition model. On a personal computer with an AMD Athlon XP 2800+ CPU (2.2 GHz clock rate) running Ubuntu 9.04, a correct phrase recognition rate of 86% is reached for a database of 3,850 French phrases. The average computation time for each phrase is about 1.5 seconds. Keywords: Linear predictive cepstrum coefficients; Mel-frequency cepstrum coefficients; Hidden Markov model
5	A Design of Mandarin Speech Recognition System for Addresses in Taiwan. Cheng, Chi-Feng. 31 August 2005. A Mandarin speech recognition system for addresses in Taiwan, based on end-point detection, MFCC and HMM, is proposed and implemented in this thesis. It includes both phrase and monosyllable recognition tasks. For phrase recognition, initial candidates are selected before the final recognition stage, which greatly reduces the computation time. For monosyllable recognition, the recognition details are further refined to improve the correct rate in easily confused cases. The final system achieves an 85% correct identification rate, and address recognition can be completed within 2 seconds in a laboratory environment for the speaker-dependent case. Keywords: Hidden Markov model (HMM); Mel-frequency cepstrum (MFCC); End-point detection; Mel-frequency cepstrum
6	A Feature Design System for Speaker Independent Phrase Recognition. Huang, Ming-Chong. 15 June 2001. A novel phrase recognition method is proposed. It eliminates intraspeaker and interspeaker speech differences by transforming phrases to a difference subspace. A new endpoint detection method is also proposed, which detects the human speech signal more effectively. All methods are tested and verified in a Microsoft Windows environment. Keywords: Signal Space; Difference Subspace; Cepstrum; Energy-Entropy Feature; Phrase Recognition; Mel-Cepstrum
7	A Feature Design of Multi-Language Identification System. Lin, Jun-Ching. 17 July 2003. A multi-language identification system covering 10 languages (Mandarin, Japanese, Korean, Tamil, Vietnamese, English, French, German, Spanish and Farsi) is built in this thesis. The system uses cepstrum coefficients, delta cepstrum coefficients and linear predictive coding coefficients to extract language features, and combines a Gaussian mixture model with an N-gram model to perform language classification. The feasibility of the system is demonstrated in this thesis. Keywords: Gaussian Mixture Model; Linear Predictive Coding; Cepstrum; Language Identification; Delta Cepstrum
8	A Design of Japanese Speech Recognition System. Chen, Meng-yang. 24 August 2009. This thesis investigates design and implementation strategies for a Japanese speech recognition system. Training and recognition are based on the speech features of 188 common Japanese mono-syllables. A training database of 10 utterances per mono-syllable is established by applying Japanese pronunciation rules. These 10 utterances are collected by reading the 188 mono-syllables in 5 rounds, with every mono-syllable read twice consecutively in each round. Mel-frequency cepstrum coefficients and linear predicted cepstrum coefficients are used as the two feature models, and a hidden Markov model is used as the recognition model. On a personal computer with a 2.4 GHz Pentium CPU running Ubuntu 8.04, a correct phrase recognition rate of 87% is reached for a database of 34,000 Japanese phrases. The average computation time for each phrase is about 1.5 seconds. Keywords: Linear predicted cepstrum coefficients; Hidden Markov model; Speech recognition; Mel-frequency cepstrum coefficients
9	A Design of English Speech Recognition System. Chen, Yung-ming. 24 August 2009. This thesis investigates design and implementation strategies for an English speech recognition system. Two speech inputting methods, spelling input and reading input, are implemented for English word recognition and query. Mel-frequency cepstrum coefficients and linear predicted cepstrum coefficients are used as the two feature models, and a hidden Markov model is used as the recognition model. On a personal computer with a 1.6 GHz Pentium CPU running Ubuntu 8.04, the spelling input method obtains a 95% correct recognition rate for a database of 110 thousand English words, and the reading input method achieves a 93% correct recognition rate for a database of 1,500 English words. The average computation time for each word with either input method is about 1.5 seconds. Keywords: Linear predicted cepstrum coefficients; Hidden Markov model; Mel frequency cepstrum coefficients; Speech recognition
10	Mutlimediální diff - audio dokumenty / Multimedia Diff - Audio Documents. Komadel, Michal. January 2011. This work describes the development of a diff tool for audio files containing general sound such as music, speech and other sounds. It presents facts from several fields of science related to sound, such as psychoacoustics, speech recognition and automatic music genre categorisation. It also describes some diff algorithms and the external tools needed to develop the target application. Finally, it introduces the design and implementation of the application, the settings used for sound feature extraction, and an evaluation of the results attained.
