The first objective of this thesis is to apply psychoacoustic principles to the spatial processing of speech signals in noisy and reverberant environments. The key assumption adopted is that modern signal processing has failed to mimic the <i>cocktail party effect</i> because there has been no attempt to adequately incorporate the psychoacoustic phenomenon of audio masking to aid source separation. A quasi-linear mechanism for mimicking <i>simultaneous frequency</i> masking and <i>temporal masking</i> (post-masking) is developed. This framework is used to construct blind source separation algorithms that exploit audio masking before source separation (as a preprocessor) and after source separation (as a postprocessor). The final objective of this thesis is to exploit the perceptual irrelevancy of part of the input speech spectrum: perceptual masking techniques are applied before the subspace method, which serves as a preprocessor for frequency-domain ICA (FDICA). The subspace method reduces the effect of room reflections in advance, and the remaining direct sounds are then separated by ICA. Incorporating perceptual masking before FDICA with the subspace preprocessor not only reduces the computational complexity of the similarity measure used to resolve permutations but also avoids the so-called permutation problem by targeting a specific speech signal that is more intelligible than the available microphone signals.
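As a rough illustration of the masking idea described above, the following Python sketch marks time-frequency components of a spectrogram as perceptually relevant or masked. It is not the algorithm developed in the thesis: the function names, the triangular spreading slope, the per-frame decay, and the masker-to-threshold offset are all hypothetical values chosen only to make the example concrete, and a uniform bin spacing stands in for a proper critical-band scale.

```python
import numpy as np

def masking_threshold_db(level_db, slope_db_per_bin=10.0,
                         decay_db_per_frame=6.0, offset_db=12.0):
    """Toy masking threshold for a (frames, bins) array of levels in dB.

    All parameter values are illustrative assumptions, not values from the thesis.
    """
    n_frames, n_bins = level_db.shape
    bins = np.arange(n_bins)
    # dist[j, k] = distance in bins between masked bin j and masker bin k
    dist = np.abs(bins[:, None] - bins[None, :])
    # Simultaneous masking: each masker spreads to its neighbours with a
    # linear (in dB) roll-off; the strongest spread masker sets the threshold.
    spread = level_db[:, None, :] - slope_db_per_bin * dist[None, :, :]
    threshold = spread.max(axis=2) - offset_db
    # Temporal post-masking: the previous frame's threshold decays by a fixed
    # number of dB per frame and still masks the current frame if it is higher.
    for t in range(1, n_frames):
        threshold[t] = np.maximum(threshold[t], threshold[t - 1] - decay_db_per_frame)
    return threshold

def perceptually_relevant(level_db, **kwargs):
    """Boolean mask: True where a component exceeds the toy masking threshold."""
    return level_db > masking_threshold_db(level_db, **kwargs)
```

Under these assumptions, components flagged `False` could be discarded before a separation stage such as FDICA, which is one way to read the preprocessor role the abstract describes.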
Identifier | oai:union.ndltd.org:bl.uk/oai:ethos.bl.uk:660986 |
Date | January 2005 |
Creators | Guddeti, Ram Mohana Reddy |
Publisher | University of Edinburgh |
Source Sets | Ethos UK |
Detected Language | English |
Type | Electronic Thesis or Dissertation |
Source | http://hdl.handle.net/1842/12073 |