81

Model-based speech separation and enhancement with single-microphone input. / CUHK electronic theses & dissertations collection

January 2008 (has links)
Experiments were carried out on continuous real speech mixed with either a competing speech source or broadband noise. Results show that the separation outputs bear spectral trajectories similar to those of the ideal source signals. For speech mixtures, the proposed algorithm is evaluated in two ways: segmental signal-to-interference ratio (segSIR) and Itakura-Saito distortion (dIS). It is found that (1) interference signal power is reduced in terms of segSIR improvement, even under the harsh condition of similar target-speech and interference powers; and (2) the dIS between the estimated source and the clean speech source is significantly smaller than before processing. These results confirm the capability of the proposed algorithm to extract individual sources from a mixture signal by reducing the interference signal and generating an appropriate spectral trajectory for each source estimate. / Our approach is based on findings from psychoacoustics. To separate individual sound sources in a mixture signal, humans exploit perceptual cues such as harmonicity, continuity, context information and prior knowledge of familiar auditory patterns. Furthermore, the application of prior knowledge of speech for top-down separation (called schema-based grouping) is found to be powerful, yet unexplored. In this thesis, a bi-directional, model-based speech separation and enhancement algorithm is proposed that utilizes speech schemas in particular. As model patterns are employed to generate the successive spectral envelopes of an utterance, the output speech is expected to be natural and intelligible. / The proposed separation algorithm regenerates a target speech source by working out the corresponding spectral envelope and harmonic structure. In the first stage, an optimal sequence of Wiener filters is determined for subsequent interference removal. Specifically, acoustic models of speech schemas, represented by possible line spectrum pair (LSP) patterns, are manipulated in a top-down manner to match the input mixture and the given transcription, if available. Specific LSP patterns are retrieved to constitute a spectral evolution that synchronizes with the target speech source. With this evolution, the mixture spectrum is then filtered to approximate the target source at an appropriate signal level. In the second stage, irrelevant harmonic structure from interfering sources is eliminated by comb filtering. These filters are designed according to the results of pitch tracking. / This thesis focuses on the speech source separation problem in a single-microphone scenario. Possible applications of speech separation include recognition, auditory prostheses and surveillance systems. Sound signals typically reach our ears as a mixture of desired signals, other competing sounds and background noise. Example scenarios are talking with someone in a crowd while other people are speaking, or listening to an orchestra with a number of instruments playing concurrently. These sounds often overlap in time and frequency. While humans attend to individual sources remarkably well under these adverse conditions, even with a single ear, the performance of most speech processing systems is easily degraded. Therefore, modeling how the human auditory system performs is one viable way to extract target speech sources from the mixture before any vulnerable processing. / Lee, Siu Wa. / "April 2008." / Adviser: Chung Ching. / Source: Dissertation Abstracts International, Volume: 70-03, Section: B, page: 1846. / Thesis (Ph.D.)--Chinese University of Hong Kong, 2008.
/ Includes bibliographical references (p. 233-252). / Electronic reproduction. Hong Kong : Chinese University of Hong Kong, [2012] System requirements: Adobe Acrobat Reader. Available via World Wide Web. / Electronic reproduction. [Ann Arbor, MI] : ProQuest Information and Learning, [200-] System requirements: Adobe Acrobat Reader. Available via World Wide Web. / Abstracts in English and Chinese. / School code: 1307.
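The two evaluation measures named in this abstract are standard and compact enough to sketch. The Python snippet below is not from the thesis; the frame length, hop size and the use of power spectra are illustrative assumptions. It shows one plausible way to compute a segmental SIR between the target and residual interference components of a separated signal, and the Itakura-Saito distortion between a reference and an estimated power spectrum.

```python
import numpy as np

def segmental_sir(target, interference, frame_len=256, hop=128, eps=1e-12):
    """Segmental SIR in dB: frame-wise ratio of target power to interference
    power, averaged over frames. Frame length and hop are illustrative."""
    sirs = []
    n = min(len(target), len(interference))
    for start in range(0, n - frame_len + 1, hop):
        t = target[start:start + frame_len]
        i = interference[start:start + frame_len]
        sirs.append(10.0 * np.log10((np.sum(t ** 2) + eps) / (np.sum(i ** 2) + eps)))
    return float(np.mean(sirs))

def itakura_saito(ref_power, est_power, eps=1e-12):
    """Itakura-Saito distortion between two power spectra of equal length."""
    r = (np.asarray(ref_power) + eps) / (np.asarray(est_power) + eps)
    return float(np.mean(r - np.log(r) - 1.0))
```

A higher segmental SIR after processing and a lower dIS against the clean source are the improvements the abstract reports.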
82

The Speech Recognition System using Neural Networks

Chen, Sung-Lin 06 July 2002 (has links)
This paper describes an isolated-word, speaker-independent Mandarin digit speech recognition system based on Backpropagation Neural Networks (BPNN). The recognition rate reaches up to 95%. When the system is applied to a new user with an adaptive modification method, the recognition rate exceeds 99%. In order to implement the speech recognition system on Digital Signal Processors (DSP), we use a neuron-cancellation rule in accordance with BPNN. Under this rule the system cancels about 1/3 of the neurons and reduces memory size by 20%~40%, while the recognition rate still reaches up to 85%. For the output structure of the BPNN, we present a binary code to supersede the one-to-one model. In addition, we use a new idea for the endpoint detection algorithm applied to the recorded signals, which avoids disturbance without complex computation.
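The binary-coded output structure mentioned in this abstract can be illustrated with a short sketch. The thesis text does not give the exact code, so the plain 4-bit encoding of the ten Mandarin digits below is an assumption; it shows how a binary code replaces the one-to-one model (one output neuron per class), cutting ten output neurons down to four.

```python
import numpy as np

def binary_targets(digit, n_bits=4):
    """Binary-coded training target for a digit 0-9: 4 output neurons
    instead of the 10 needed by a one-to-one (one-hot) output layer."""
    return np.array([(digit >> b) & 1 for b in reversed(range(n_bits))], dtype=float)

def decode_outputs(outputs):
    """Threshold the network outputs at 0.5 and map the bit pattern back
    to a digit label."""
    bits = (np.asarray(outputs) >= 0.5).astype(int)
    return int("".join(map(str, bits)), 2)
```

For example, `binary_targets(6)` gives `[0, 1, 1, 0]`, and `decode_outputs([0.1, 0.8, 0.9, 0.2])` recovers 6.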
83

Clustering wide-contexts and HMM topologies for spontaneous speech recognition /

Shafran, Izhak. January 2001 (has links)
Thesis (Ph. D.)--University of Washington, 2001. / Includes bibliographical references (p. 80-95).
84

Speech recognition software : an alternative to reduce ship control manning /

Kuffel, Robert F. January 2004 (has links) (PDF)
Thesis (M.S. in Information Systems and Operations)--Naval Postgraduate School, March 2004. / Thesis advisor(s): Russell Gottfried, Monique P. Fargues. Includes bibliographical references (p. 43-45). Also available online.
85

Effects of transcription errors on supervised learning in speech recognition

Sundaram, Ramasubramanian H. January 2003 (has links)
Thesis (M.S.)--Mississippi State University. Department of Electrical and Computer Engineering. / Title from title screen. Includes bibliographical references.
86

Speaker-independent recognition of Putonghua finals /

Chan, Chit-man. January 1987 (has links)
Thesis (Ph. D.)--University of Hong Kong, 1988.
87

A study of some variations on the hidden Markov modelling approach to speaker independent isolated word speech recognition

梁舜德, Leung, Shun Tak Albert. January 1990 (has links)
published_or_final_version / Electrical and Electronic Engineering / Master / Master of Philosophy
88

Analysis and compensation of stressed and noisy speech with application to robust automatic recognition

Hansen, John H. L. 08 1900 (has links)
No description available.
89

Evolutionary algorithms in artificial intelligence : a comparative study through applications

Nettleton, David John January 1994 (has links)
For many years, research in artificial intelligence followed a symbolic paradigm which required a level of knowledge described in terms of rules. More recently, subsymbolic approaches have been adopted as a suitable means for studying many problems. There are many search mechanisms which can be used to manipulate subsymbolic components, and in recent years general search methods based on models of natural evolution have become increasingly popular. This thesis examines a hybrid symbolic/subsymbolic approach and the application of evolutionary algorithms to a problem from each of the fields of shape representation (finding an iterated function system for an arbitrary shape), natural language dialogue (tuning parameters so that a particular behaviour can be achieved) and speech recognition (selecting the penalties used by a dynamic programming algorithm in creating a word lattice). These problems were selected on the basis that each should have fundamentally different interactions at the subsymbolic level. Results demonstrate that, for the experiments conducted, the evolutionary algorithms performed well in most cases. However, the type of subsymbolic interaction that may occur influences the relative performance of evolutionary algorithms which emphasise either top-down (evolutionary programming - EP) or bottom-up (genetic algorithm - GA) means of solution discovery. For the shape representation problem, EP is seen to perform significantly better than a GA, and reasons for this disparity are discussed. Furthermore, EP appears to offer a powerful means of finding solutions to this problem, and so the background and details of the problem are discussed at length. Some novel constraints on the problem's search space are also presented which could be used in related work. For the dialogue and speech recognition problems, both a GA and EP produce good results, with EP performing slightly better. Results achieved with EP have been used to improve the performance of a speech recognition system.
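As a rough illustration of the mutation-driven search the abstract contrasts with a GA, a minimal evolutionary-programming-style loop mutates real-valued parameter vectors (for example, the dynamic-programming penalties mentioned for the speech recognition task) without any crossover. The population size, mutation scale and truncation selection below are illustrative assumptions, not the thesis's settings.

```python
import random

def evolutionary_programming(fitness, dim, pop_size=20, generations=100, sigma=0.1):
    """Minimal EP sketch: Gaussian mutation of every parent, then keep the
    best pop_size individuals from parents plus offspring. `fitness` maps a
    parameter vector to a score to be maximised."""
    pop = [[random.uniform(0.0, 1.0) for _ in range(dim)] for _ in range(pop_size)]
    for _ in range(generations):
        offspring = [[g + random.gauss(0.0, sigma) for g in parent] for parent in pop]
        combined = pop + offspring
        combined.sort(key=fitness, reverse=True)
        pop = combined[:pop_size]
    return pop[0]
```

Here `fitness` would wrap whatever evaluates a candidate parameter set, such as scoring the word lattices produced with a given penalty vector.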
90

Research and simulation on speech recognition by Matlab

Pan, Linlin January 2014 (has links)
With the development of multimedia technology, speech recognition has increasingly become a research hotspot in recent years. It has a wide range of applications concerned with recognizing the identity of speakers, which can be classified into speech identification and speech verification according to the decision mode. The main work of this thesis is to study the techniques and algorithms of speech recognition and to create a feasible system to simulate speech recognition. The research work and achievements are as follows. First, the author carried out an extensive investigation of the field of speech recognition. There are many algorithms for speech recognition; broadly, they can be divided into two categories: direct speech recognition, in which words are recognized directly, and recognition based on a trained model. Second, a usable and reasonable algorithm was selected and studied. The author studied algorithms for extracting a word's characteristic parameters based on MFCC (Mel-frequency cepstral coefficients) and for training on those characteristic parameters with a GMM (Gaussian mixture model). Third, the author used MATLAB and wrote a program to implement the speech recognition algorithm, making use of a speech processing toolbox; the whole system comprises modules for signal processing, MFCC characteristic parameters and GMM training. Fourth, the results were simulated and analysed. The MATLAB system reads a WAV file, plays it first, then calculates the characteristic parameters automatically, and the content of the speech signal is distinguished in the last step. The author recorded speech from different people to test the system. The simulation results show that when the testing environment is quiet enough and the same speaker records 20 times, the performance of the algorithm approaches 100% for pairs of words with different or identical syllables. The result degrades when the testing signal is surrounded by a certain level of noise, and the system does not produce good output when the reference and testing signals are not recorded by the same speaker.
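The MFCC-plus-GMM pipeline described in this abstract is compact enough to sketch. The thesis implementation is in MATLAB with a speech processing toolbox; the Python sketch below (using librosa and scikit-learn, with hypothetical file lists and illustrative parameter values) mirrors the same structure: pool the MFCC frames of the training recordings for each word, fit one GMM per word, and recognise a test recording by picking the model with the highest average log-likelihood.

```python
import numpy as np
import librosa
from sklearn.mixture import GaussianMixture

def train_word_model(wav_paths, n_mfcc=13, n_components=8):
    """Fit one GMM on the pooled MFCC frames of all recordings of a word."""
    frames = []
    for path in wav_paths:
        y, sr = librosa.load(path, sr=None)
        mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=n_mfcc)  # shape (n_mfcc, n_frames)
        frames.append(mfcc.T)
    gmm = GaussianMixture(n_components=n_components, covariance_type="diag")
    gmm.fit(np.vstack(frames))
    return gmm

def recognise(wav_path, models, n_mfcc=13):
    """Return the label whose GMM gives the highest average log-likelihood."""
    y, sr = librosa.load(wav_path, sr=None)
    feats = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=n_mfcc).T
    return max(models, key=lambda label: models[label].score(feats))

# Hypothetical usage: models = {"one": train_word_model([...]), "two": train_word_model([...])}
# then recognise("test.wav", models) returns the best-matching word label.
```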
