Perceptually motivated blind source separation of convolutive audio mixtures

The first objective of this thesis is to apply psychoacoustic principles to the spatial processing of speech signals in noisy and reverberant environments. The key assumption adopted is that modern signal processing has failed to mimic the <i>cocktail party effect</i> because there has been no attempt to adequately incorporate the psychoacoustic phenomenon of audio masking to aid source separation. Quasi-linear mechanisms for mimicking <i>simultaneous frequency</i> masking and <i>temporal masking</i> (post-masking) are developed. This framework is used to construct blind source separation algorithms that exploit audio masking either before source separation (as a preprocessor) or after it (as a postprocessor). The final objective of this thesis is to exploit the perceptual irrelevancy of parts of the input speech spectrum by applying the perceptual masking techniques before the subspace method, which serves as a preprocessor for frequency-domain independent component analysis (FDICA). The subspace method reduces the effect of room reflections in advance, and the remaining direct sounds are then separated by ICA. Applying the perceptual masking techniques before FDICA with the subspace preprocessor not only reduces the computational complexity of the similarity measure used to resolve permutations but also avoids the so-called permutation problem by targeting a specific speech signal that is more intelligible than the available microphone signals.

Identifier: oai:union.ndltd.org:bl.uk/oai:ethos.bl.uk:660986
Date: January 2005
Creators: Guddeti, Ram Mohana Reddy
Publisher: University of Edinburgh
Source Sets: Ethos UK
Detected Language: English
Type: Electronic Thesis or Dissertation
Source: http://hdl.handle.net/1842/12073