Global ETD Search

71	A Study of the Automatic Speech Recognition Process and Speaker Adaptation Stokes-Rees, Ian James January 2000 This thesis considers the entire automated speech recognition process and presents a standardised approach to LVCSR experimentation with HMMs. It also discusses various approaches to speaker adaptation, such as MLLR and multiscale, and presents experimental results for cross-task speaker adaptation. An analysis of training parameters and of the data sufficiency needed for reasonable system performance estimates is also included. It is found that supervised Maximum Likelihood Linear Regression (MLLR) adaptation can yield a 6% absolute reduction in word error rate (WER) given only one minute of adaptation data, compared with an unadapted model set trained on a different task. The unadapted system performed at 24% WER and the adapted system at 18% WER. This is achieved with only 4 to 7 adaptation classes per speaker, as generated from a regression tree. Electrical & Computer Engineering automatic speech recognition speaker adaptation HTK HMM MLLR LVCSR
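The WER figures quoted in the abstract above are an absolute reduction; the same pair of numbers can also be read as a relative reduction. The following is only a minimal arithmetic sketch, and the function name `wer_reduction` is a hypothetical helper, not something taken from the thesis.

```python
def wer_reduction(baseline_wer, adapted_wer):
    """Return (absolute, relative) reduction in word error rate.

    Both inputs are fractions, e.g. 0.24 for 24% WER.
    """
    absolute = baseline_wer - adapted_wer   # percentage-point drop
    relative = absolute / baseline_wer      # drop as a share of the baseline
    return absolute, relative


# Figures quoted in the abstract: 24% WER unadapted, 18% WER adapted.
absolute, relative = wer_reduction(0.24, 0.18)
print(f"absolute: {absolute:.0%}, relative: {relative:.0%}")
```

Under these figures, the 6-point absolute drop corresponds to a relative reduction of one quarter of the baseline error.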
73	A Hidden Markov Model-Based Approach for Emotional Speech Synthesis Yang, Chih-Yung 30 August 2010 In this thesis, we describe two approaches to automatically synthesizing the emotional speech of a target speaker from a hidden Markov model of his or her neutral speech. The interpolation-based method interpolates between the neutral model of the target speaker and an emotional model selected from a candidate pool. Both the selection of the interpolation model and the computation of the interpolation weight are based on a model-distance measure, for which we propose a monophone-based Mahalanobis distance (MBMD). The parallel model combination (PMC) based method models the mismatch between the neutral model and the emotional model: we train a linear regression model to describe this mismatch and then combine it with the neutral model of the target speaker. We evaluate our approach on synthesized emotional speech expressing anger, happiness, and sadness in several subjective tests. Experimental results show that the implemented system is able to synthesize speech with the emotional expressiveness of the target speaker. speech synthesis HMM emotional expressiveness model combination linear regression model interpolation Mahalanobis distance
74	A Design of Mandarin Speech Recognition System for Addresses Chang, Ching-Yung 06 September 2004 A Mandarin speech recognition system for addresses, based on MFCC, the hidden Markov model (HMM) and the Viterbi algorithm, is proposed in this thesis. The HMM is a doubly stochastic process that describes pronunciation by recording state transitions according to the time-varying properties of the speech signal. To simplify the system design and reduce the computational cost, the monosyllable structure of Mandarin is exploited by incorporating both a monosyllable recognizer and HMMs into the system. For the speaker-dependent case, a Mandarin address can be input within 60 seconds, and a 98% correct identification rate can be achieved in the laboratory environment. Mel-frequency cepstrum coefficients Hidden Markov model (HMM) phrase recognition end-point detection
75	A Design of Mandarin Speech Recognition System for Addresses in Taiwan Cheng, Chi-Feng 31 August 2005 A Mandarin speech recognition system for addresses in Taiwan, based on end-point detection, MFCC and HMM, is proposed and implemented in this thesis. It includes both phrase and monosyllable recognition tasks. For phrase recognition, we select the initial candidates before the final recognition stage, which greatly reduces the computation time. For monosyllable recognition, we further refine the recognition details to improve the correct rate in easily confused cases. The final system achieves an 85% correct identification rate, and address recognition can be completed within 2 seconds in the laboratory environment for the speaker-dependent case. Hidden Markov model (HMM) Mel-frequency cepstrum (MFCC) End-point detection
76	A System Design of Chinese Resume by Speech Construction Chen, Yue-sheng 28 August 2006 A system for constructing Chinese resumes by speech is developed using a novel segmentation mechanism and the classical hidden Markov model. The recognition system is based on both monosyllable HMMs and speech-text alignment schemes. Experimental results indicate that the amount of training material used for feature extraction can be greatly reduced, and that the text content of the recorded training speech can differ from that of the recognition tasks. Each phrase in the resume can be identified within one second, approximately matching the result obtained by last year's graduate. Furthermore, the user interface of the resume system has been redesigned and polished with the GTK toolkit to enable event-driven X Window operations. Speech-text alignment Hidden Markov model (HMM)
77	A Design of Speech Recognition System for Chinese Names of Historical Figures Around the World Lin, Wei-Ci 07 September 2006 A speech recognition system for Chinese names of historical figures around the world is proposed in this thesis. A speech database of approximately forty-six thousand Chinese names was collected and recorded twice for system evaluation. The system applies Mel-frequency cepstrum coefficients, monosyllable HMMs and a speech-text alignment scheme to select initial candidates. A Mandarin pitch identification mechanism is then applied to increase the correct rate and obtain the final answer. The experimental results indicate that a 90% correct identification rate can be achieved when the first recording session is used for training and the second for testing. For the speaker-dependent case, the correct name can be recognized within 1.5 seconds on a PC with an Intel Celeron 2.4 GHz CPU running the RedHat Linux 9.0 operating system. Hidden Markov model (HMM) Endpoint detection
78	Kalbos atpažinimo priemonių tyrimas / Research of speech recognition methods Prokopovič, Valerij 15 June 2005 Two speech recognition methods, Dynamic Time Warping and a hidden Markov model based method, were investigated in this work. To estimate the efficiency of the methods, speaker-dependent and speaker-independent isolated word recognition experiments were performed. The experiments showed that the Dynamic Time Warping method is suitable only for speaker-dependent speech recognition, whereas the hidden Markov model based method is suitable for both speaker-dependent and speaker-independent speech recognition. Informatics LSK Dinaminis laiko skalės kraipymas (dynamic time warping) HTK HMM Paslėpti Markovo modeliai (hidden Markov models)
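For readers unfamiliar with Dynamic Time Warping, the following is a minimal sketch of the classic dynamic-programming distance between two sequences. It assumes one-dimensional numeric sequences and an absolute-difference local cost; the abstract does not say which features or local distance the thesis used, and `dtw_distance` is a hypothetical name.

```python
def dtw_distance(a, b):
    """Classic DTW distance between two 1-D numeric sequences.

    Assumption: local cost is |a[i] - b[j]|; real recognizers would
    typically compare feature vectors (e.g. cepstral frames) instead.
    """
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] = best accumulated cost aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            cost[i][j] = local + min(
                cost[i - 1][j],      # a advances, b frame is repeated
                cost[i][j - 1],      # b advances, a frame is repeated
                cost[i - 1][j - 1],  # both advance
            )
    return cost[n][m]


# A time-stretched copy of the same pattern aligns at zero cost.
print(dtw_distance([1, 2, 3], [1, 2, 2, 3]))
```

In isolated word recognition, a test utterance would be compared against stored templates in this way and assigned the label of the closest one.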
79	Optimalių parametrų parinkimas, automatizuotam garsyno anotavimui, taikant paslėptų Markovo modelių metodiką / Selecting the most suitable parameters for automatic sound annotation by using hidden Markov models method Štrimaitis, Kęstutis 11 August 2009 The aim of this master's thesis was to determine optimal parameter values for automated annotation of a speech corpus using hidden Markov models (HMMs). Recordings of 25 different speakers were examined. For each speaker, 60 minutes of recordings were used for training and one 2-minute recording was used for testing. Four types of experiments were carried out: combining HMM models, varying the number of HMM refinement passes, inserting all Gaussian mixtures at once, and inserting Gaussian mixtures by increasing the number of mixtures one at a time. To simplify running the experiments and analysing the results, two programs were created: a corpus comparison program and a result visualization program. The corpus comparison program compares the expert-annotated corpus with the corpus annotated by the automatic annotation system. The comparison results indicate how good the chosen parameter values are, and they can be displayed with the visualization program. Informatics Automatic sound annotation HMM HTK
80	An HMM/MRF-based stochastic framework for robust vehicle tracking Kato, Jien, Watanabe, Toyohide, Joga, Sébastien, Ying, Liu, Hase, Hiroyuki, 加藤, ジェーン, 渡邉, 豊英 09 1900 No description available. Hidden Markov model (HMM) image classification image segmentation Markov random field (MRF) traffic surveillance vehicle tracking
