41 |
Multi-resolution analysis based acoustic features for speech recognition = 基於多尺度分析的聲學特徵在語音識別中的應用 = Ji yu duo chi du fen xi de sheng xue te zheng zai yu yin shi bie zhong de ying yong. January 1999
Chan Chun Ping. / Thesis (M.Phil.)--Chinese University of Hong Kong, 1999. / Includes bibliographical references (leaves 134-137). / Text in English; abstracts in English and Chinese. / Chapter 1 --- Introduction --- p.1 / Chapter 1.1 --- Automatic Speech Recognition --- p.1 / Chapter 1.2 --- Review of Speech Recognition Techniques --- p.2 / Chapter 1.3 --- Review of Signal Representation --- p.4 / Chapter 1.4 --- Review of Wavelet Transform --- p.7 / Chapter 1.5 --- Objective of Thesis --- p.11 / Chapter 1.6 --- Thesis Outline --- p.11 / References --- p.13 / Chapter 2 --- Baseline Speech Recognition System --- p.17 / Chapter 2.1 --- Introduction --- p.17 / Chapter 2.2 --- Feature Extraction --- p.18 / Chapter 2.3 --- Hidden Markov Model for Speech Recognition --- p.24 / Chapter 2.3.1 --- The Principle of Using HMM in Speech Recognition --- p.24 / Chapter 2.3.2 --- Elements of an HMM --- p.27 / Chapter 2.3.3 --- Parameters Estimation and Recognition Algorithm --- p.30 / Chapter 2.3.4 --- Summary of HMM based Speech Recognition --- p.31 / Chapter 2.4 --- TIMIT Continuous Speech Corpus --- p.32 / Chapter 2.5 --- Baseline Speech Recognition Experiments --- p.36 / Chapter 2.6 --- Summary --- p.39 / References --- p.40 / Chapter 3 --- Multi-Resolution Based Acoustic Features --- p.42 / Chapter 3.1 --- Introduction --- p.42 / Chapter 3.2 --- Discrete Wavelet Transform --- p.43 / Chapter 3.3 --- Periodic Discrete Wavelet Transform --- p.47 / Chapter 3.4 --- Multi-Resolution Analysis on STFT Spectrum --- p.49 / Chapter 3.5 --- Principal Component Analysis --- p.52 / Chapter 3.5.1 --- Related Work --- p.52 / Chapter 3.5.2 --- Theoretical Background of PCA --- p.53 / Chapter 3.5.3 --- Examples of Basis Vectors Found by PCA --- p.57 / Chapter 3.6 --- Experiments for Multi-Resolution Based Feature --- p.60 / Chapter 3.6.1 --- Experiments with Clean Speech --- p.60 / Chapter 3.6.2 --- Experiments with Noisy Speech --- p.64 / Chapter 3.7 --- Summary --- p.69 / 
References --- p.70 / Chapter 4 --- Wavelet Packet Based Acoustic Features --- p.72 / Chapter 4.1 --- Introduction --- p.72 / Chapter 4.2 --- Wavelet Packet Filter-Bank --- p.74 / Chapter 4.3 --- Dimensionality Reduction --- p.76 / Chapter 4.4 --- Filter-Bank Parameters --- p.77 / Chapter 4.4.1 --- Mel-Scale Wavelet Packet Filter-Bank --- p.77 / Chapter 4.4.2 --- Effect of Down-Sampling --- p.78 / Chapter 4.4.3 --- Mel-Scale Wavelet Packet Tree --- p.81 / Chapter 4.4.4 --- Wavelet Filters --- p.84 / Chapter 4.5 --- Experiments Using Wavelet Packet Based Acoustic Features --- p.86 / Chapter 4.6 --- Broad Phonetic Class Analysis --- p.89 / Chapter 4.7 --- Discussion --- p.92 / Chapter 4.8 --- Summary --- p.99 / References --- p.100 / Chapter 5 --- De-Noising by Wavelet Transform --- p.101 / Chapter 5.1 --- Introduction --- p.101 / Chapter 5.2 --- De-Noising Capability of Wavelet Transform --- p.103 / Chapter 5.3 --- Wavelet Transform Based Wiener Filtering --- p.105 / Chapter 5.3.1 --- Sub-Band Position for Wiener Filtering --- p.107 / Chapter 5.3.2 --- Estimation of Short-Time Speech and Noise Power --- p.109 / Chapter 5.4 --- De-Noising Embedded in Wavelet Packet Filter-Bank --- p.115 / Chapter 5.5 --- Experiments Using Wavelet Build-in De-Noising Properties --- p.118 / Chapter 5.6 --- Discussion --- p.120 / Chapter 5.6.1 --- Broad Phonetic Class Analysis --- p.122 / Chapter 5.6.2 --- Distortion Measure --- p.124 / Chapter 5.7 --- Summary --- p.132 / References --- p.134 / Chapter 6 --- Conclusions and Future Work --- p.138 / Chapter 6.1 --- Conclusions --- p.138 / Chapter 6.2 --- Future Work --- p.140 / References --- p.142 / Appendix 1 Jacobi's Method --- p.143 / Appendix 2 Broad Phonetic Class --- p.148
|
42 |
Model-based speech separation and enhancement with single-microphone input. / CUHK electronic theses & dissertations collection. January 2008
This thesis focuses on the speech source separation problem in a single-microphone scenario. Possible applications of speech separation include recognition, auditory prostheses and surveillance systems. Sound signals typically reach our ears as a mixture of desired signals, other competing sounds and background noise. Example scenarios are talking with someone in a crowd while other people are speaking, or listening to an orchestra with a number of instruments playing concurrently. These sounds often overlap in time and frequency. While humans attend to individual sources remarkably well under these adverse conditions, even with a single ear, the performance of most speech processing systems is easily degraded. Therefore, modeling how the human auditory system performs is one viable way to extract target speech sources from the mixture before any vulnerable processes. / Our approach is based on the findings of psychoacoustics. To separate individual sound sources in a mixture signal, humans exploit perceptual cues such as harmonicity, continuity, context information and prior knowledge of familiar auditory patterns. Furthermore, the application of prior knowledge of speech for top-down separation (called schema-based grouping) is found to be powerful, yet unexplored. In this thesis, a bi-directional, model-based speech separation and enhancement algorithm is proposed that utilizes speech schemas in particular. As model patterns are employed to generate subsequent spectral envelopes in an utterance, output speech is expected to be natural and intelligible. / The proposed separation algorithm regenerates a target speech source by working out the corresponding spectral envelope and harmonic structure. In the first stage, an optimal sequence of Wiener filters is determined for subsequent interference removal. Specifically, acoustic models of speech schemas, represented by possible line spectrum pair (LSP) patterns, are manipulated to match the input mixture and the given transcription, if available, in a top-down manner. Specific LSP patterns are retrieved to constitute a spectral evolution that synchronizes with the target speech source. With this evolution, the mixture spectrum is then filtered to approximate the target source at an appropriate signal level. In the second stage, irrelevant harmonic structure from interfering sources is eliminated by comb filtering. These filters are designed according to the results of pitch tracking. / Experiments were carried out on continuous real speech mixed with either a competing speech source or broadband noise. Results show that the separation outputs bear spectral trajectories similar to those of the ideal source signals. For speech mixtures, the proposed algorithm is evaluated in two ways: segmental signal-to-interference ratio (segSIR) and Itakura-Saito distortion (dIS). It is found that (1) interference signal power is reduced, as shown by the segSIR improvement, even under the harsh condition of similar target speech and interference powers; and (2) dIS between the estimated source and the clean speech source is significantly smaller than before processing. These results demonstrate the capability of the proposed algorithm to extract individual sources from a mixture signal by reducing the interference signal and generating an appropriate spectral trajectory for each source estimate.
Lee, Siu Wa. / "April 2008." / Adviser: Chung Ching. / Source: Dissertation Abstracts International, Volume: 70-03, Section: B, page: 1846. / Thesis (Ph.D.)--Chinese University of Hong Kong, 2008. / Includes bibliographical references (p. 233-252). / Electronic reproduction. Hong Kong : Chinese University of Hong Kong, [2012] System requirements: Adobe Acrobat Reader. Available via World Wide Web. / Electronic reproduction. [Ann Arbor, MI] : ProQuest Information and Learning, [200-] System requirements: Adobe Acrobat Reader. Available via World Wide Web. / Abstracts in English and Chinese. / School code: 1307.
|
43 |
Clustering wide-contexts and HMM topologies for spontaneous speech recognition / Shafran, Izhak. January 2001
Thesis (Ph. D.)--University of Washington, 2001. / Includes bibliographical references (p. 80-95).
|
44 |
Speech recognition software : an alternative to reduce ship control manning / Kuffel, Robert F. January 2004
Thesis (M.S. in Information Systems and Operations)--Naval Postgraduate School, March 2004. / Thesis advisor(s): Russell Gottfried, Monique P. Fargues. Includes bibliographical references (p. 43-45). Also available online.
|
45 |
Effects of transcription errors on supervised learning in speech recognition / Sundaram, Ramasubramanian H. January 2003
Thesis (M.S.)--Mississippi State University. Department of Electrical and Computer Engineering. / Title from title screen. Includes bibliographical references.
|
46 |
Speaker-independent recognition of Putonghua finals / Chan, Chit-man. January 1987
Thesis (Ph. D.)--University of Hong Kong, 1988.
|
47 |
A study of some variations on the hidden Markov modelling approach to speaker independent isolated word speech recognition / 梁舜德, Leung, Shun Tak Albert. January 1990
Electrical and Electronic Engineering / Master / Master of Philosophy
|
48 |
Analysis and compensation of stressed and noisy speech with application to robust automatic recognition / Hansen, John H. L. 08 1900
No description available.
|
49 |
Modeling speech using a partially observable Markov decision process / Jonas, Michael. January 1900
Thesis (Ph.D.)--Tufts University, 2003. / Adviser: James G. Schmolze. Submitted to the Dept. of Computer Science. Includes bibliographical references (leaves 103-109). Access restricted to members of the Tufts University community. Also available via the World Wide Web.
|
50 |
Transformation sharing strategies for MLLR speaker adaptation / Mandal, Arindam. January 2007
Thesis (Ph. D.)--University of Washington, 2007. / Vita. Includes bibliographical references (p. 102-115).
|