91 |
Model-based speech separation and enhancement with single-microphone input. / CUHK electronic theses & dissertations collection, January 2008
Experiments were carried out for continuous real speech mixed with either a competing speech source or broadband noise. Results show that the separation outputs bear spectral trajectories similar to those of the ideal source signals. For speech mixtures, the proposed algorithm is evaluated in two ways: segmental signal-to-interference ratio (segSIR) and Itakura-Saito distortion (dIS). It is found that (1) interference signal power is reduced, as shown by the segSIR improvement, even under the harsh condition of similar target speech and interference powers; and (2) dIS between the estimated source and the clean speech source is significantly smaller than before processing. These results confirm the capability of the proposed algorithm to extract individual sources from a mixture signal by reducing the interference signal and generating an appropriate spectral trajectory for individual source estimates. / Our approach is based on the findings of psychoacoustics. To separate individual sound sources in a mixture signal, humans exploit perceptual cues like harmonicity, continuity, context information and prior knowledge of familiar auditory patterns. Furthermore, the application of prior knowledge of speech for top-down separation (called schema-based grouping) is found to be powerful, yet unexplored. In this thesis, a bi-directional, model-based speech separation and enhancement algorithm is proposed that utilizes speech schemas in particular. As model patterns are employed to generate subsequent spectral envelopes in an utterance, output speech is expected to be natural and intelligible. / The proposed separation algorithm regenerates a target speech source by working out the corresponding spectral envelope and harmonic structure. In the first stage, an optimal sequence of Wiener filters is determined for subsequent interference removal. 
Specifically, acoustic models of speech schemas, represented by possible line spectrum pair (LSP) patterns, are manipulated to match the input mixture and the given transcription, if available, in a top-down manner. Specific LSP patterns are retrieved to constitute a spectral evolution that synchronizes with the target speech source. With this evolution, the mixture spectrum is then filtered to approximate the target source at an appropriate signal level. In the second stage, irrelevant harmonic structure from interfering sources is eliminated by comb filtering. These filters are designed according to the results of pitch tracking. / This thesis focuses on the speech source separation problem in a single-microphone scenario. Possible applications of speech separation include recognition, auditory prostheses and surveillance systems. Sound signals typically reach our ears as a mixture of desired signals, other competing sounds and background noise. Example scenarios are talking with someone in a crowd with other people speaking, or listening to an orchestra with a number of instruments playing concurrently. These sounds often overlap in time and frequency. While humans attend to individual sources remarkably well under these adverse conditions, even with a single ear, the performance of most speech processing systems is easily degraded. Therefore, modeling how the human auditory system performs is one viable way to extract target speech sources from the mixture before any vulnerable processes. / Lee, Siu Wa. / "April 2008." / Adviser: Chung Ching. / Source: Dissertation Abstracts International, Volume: 70-03, Section: B, page: 1846. / Thesis (Ph.D.)--Chinese University of Hong Kong, 2008. / Includes bibliographical references (p. 233-252). / Electronic reproduction. Hong Kong : Chinese University of Hong Kong, [2012] System requirements: Adobe Acrobat Reader. Available via World Wide Web. / Electronic reproduction. 
[Ann Arbor, MI] : ProQuest Information and Learning, [200-] System requirements: Adobe Acrobat Reader. Available via World Wide Web. / Abstracts in English and Chinese. / School code: 1307.
|
92 |
Speech perception in Chinese: how are the different levels of ambiguity resolved? / CUHK electronic theses & dissertations collection, January 2009
Three experiments were conducted to provide a better understanding of the fundamental processes involved in Chinese speech recognition. Specifically, we intended to answer three questions. First, are subsyllabic units like individual phonemes or whole syllables the basic encoding units in Chinese speech recognition? Second, does tone play a significant role in generating candidate words before correct identification? Third, how can the different meanings of homophones be resolved? In Experiment 1, we used the gating paradigm to explore the three issues. Results suggested that both subsyllabic (onset) and syllabic representations were important in recognizing Chinese monosyllables. Tonal constraints emerged only when context was available, and context also facilitated homophone recognition. In Experiment 2, the visual-world paradigm was used to verify the major findings in gating. While the salience of the syllable and the absence of tonal constraints without context were replicated, the onset effect was greatly diminished. Further analyses suggested that acoustic similarity might also play a role in speech recognition. Experiment 3 also employed the visual-world paradigm. The resolution of Chinese homophones was found to be influenced by relative meaning frequency and context position. Based on these findings and those from related studies, we proposed a model of Chinese speech perception in which segmental and suprasegmental types of information were initially processed in separate but interacting pathways. Outputs from the two pathways were then combined at a later time point and jointly activated the corresponding morpheme. Implications of the model and its relations to previous findings are discussed. / Tsang, Yiu Kei. / Adviser: Hsuan-Chih Chen. / Source: Dissertation Abstracts International, Volume: 72-11, Section: B, page: . / Thesis (Ph.D.)--Chinese University of Hong Kong, 2009. / Includes bibliographical references (leaves 161-174). 
/ Electronic reproduction. Hong Kong : Chinese University of Hong Kong, [2012] System requirements: Adobe Acrobat Reader. Available via World Wide Web. / Electronic reproduction. [Ann Arbor, MI] : ProQuest Information and Learning, [201-] System requirements: Adobe Acrobat Reader. Available via World Wide Web. / Abstract also in Chinese; some appendices include Chinese characters.
|
93 |
Effects of speech and noise on Cantonese speech intelligibility / Mak, Cheuk-yan, Charin. January 2006
Thesis (M. Sc.)--University of Hong Kong, 2006. / Title proper from title frame. Also available in printed format.
|
94 |
Phonetic category learning / McGuire, Grant Leese, January 2007
Thesis (Ph. D.)--Ohio State University, 2007. / Title from first page of PDF file. Includes bibliographical references (p. 119-126).
|
95 |
The growth of phonological awareness : response to reading intervention by children with reading disabilities who exhibit typical or below-average language skills / Wise, Justin Coy, January 2005
Thesis (Ph.D.)--Georgia State University, 2005. / Title from title screen. Rose Sevcik, committee chair; Robin Morris, Mary Ann Romski, Byron Robinson, committee members. 194 p. [numbered xii, 180] ; ill. (some col.) Description based on contents viewed Feb. 26, 2007. Includes bibliographical references (p. 168-180).
|
96 |
Constraints on infant speech acquisition : a cross-language perspective / Gildersleeve-Neumann, Christina Elke. January 2001
Thesis (Ph. D.)--University of Texas at Austin, 2001. / Vita. Includes bibliographical references. Available also from UMI/Dissertation Abstracts International.
|
97 |
Intermodal perception of speech in Asperger syndrome / Schroeder, Jessica H. January 2008
Thesis (M.A.)--York University, 2008. Graduate Programme in Psychology. / Typescript. Includes bibliographical references (leaves 78-90). Also available on the Internet. MODE OF ACCESS via web browser by entering the following URL: http://gateway.proquest.com/openurl?url_ver=Z39.88-2004&res_dat=xri:pqdiss&rft_val_fmt=info:ofi/fmt:kev:mtx:dissertation&rft_dat=xri:pqdiss:MR45971
|
98 |
Tonal and segmental perception in native Cantonese-speaking musicians, amateur musicians and non-musicians / Pang, Ming-wai, 彭明慧 January 2013
Tone matching, judgment and segmental judgment tasks conducted in silent reading and listening conditions are devised to test the hypothesis that musical training improves tone and segmental (onset, rime) perception in a tone language, Cantonese, in native Cantonese-speaking individuals. Four-word sequences (in which two words are primes and two are targets, or three words are primes and one is target) were presented to three groups of participants: professional musicians, amateur musicians and non-musicians in the silent reading condition, whereas four sound stimuli of Chinese characters were presented in the listening condition, and their accuracy and response time were recorded. Musicians, both professional and amateur, performed significantly better in tone and segmental perception than their musically naïve counterparts. Moreover, the response time exhibited a contrastive pattern in the two conditions: musicians tended to respond faster in the silent reading condition, but took a longer time in the listening condition.
These results clearly demonstrate that musical training facilitated the perceptual processing of Cantonese tone and segmental phonemes by native Cantonese speakers. Music-to-language transfer effects are highlighted, and the non-significant differences between professional musicians and amateur musicians in five out of six tasks show that musical training need not be pursued to an advanced level for participants to gain perceptual benefits. The results shed light on possible forms of remedial programme development and interventions for children with language disorders such as dyslexia. / published_or_final_version / Linguistics / Master / Master of Philosophy
|
99 |
Constraints on infant speech acquisition : a cross-language perspective / Gildersleeve-Neumann, Christina Elke 14 March 2011
Not available / text
|
100 |
An upper bound for tactile recognition of speech / McClellan, Richard Paul, 1944- January 1967
No description available.
|