Auditory domain speech enhancement

Many speech enhancement algorithms suffer from musical noise, a residual estimation noise consisting of music-like, varying tones. To reduce this annoying noise, some speech enhancement algorithms require post-processing. However, the lack of auditory perception theories about musical noise limits the effectiveness of musical noise reduction methods.

Scientists now have some understanding of the human auditory system, thanks to advances in hearing research across multiple disciplines: anatomy, physiology, psychology, and neurophysiology. Auditory models, such as the gammatone filter bank and the Meddis inner hair cell model, have been developed to simulate the acoustic-to-neural transduction process. These models generate neuron firing signals, collectively called the cochleagram. Cochleagram analysis is a powerful tool for investigating musical noise.
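As a rough illustration of the cochlear decomposition mentioned above, the following minimal sketch builds a small real-valued gammatone filter bank and a cochleagram-like output. It is not the thesis's implementation: the thesis uses the Meddis inner hair cell model and a proposed complex gammatone filter bank, whereas this sketch assumes a standard fourth-order gammatone impulse response with the Glasberg and Moore ERB approximation and substitutes half-wave rectification as a crude stand-in for hair cell transduction. The function names, the 25 ms impulse response length, and the chosen centre frequencies are all hypothetical.

```python
import numpy as np

def erb(fc):
    """Equivalent rectangular bandwidth in Hz (Glasberg & Moore approximation)."""
    return 24.7 * (4.37 * fc / 1000.0 + 1.0)

def gammatone_ir(fc, fs, duration=0.025, order=4, b=1.019):
    """Peak-normalized gammatone impulse response centred at fc (Hz)."""
    t = np.arange(int(duration * fs)) / fs
    envelope = t ** (order - 1) * np.exp(-2 * np.pi * b * erb(fc) * t)
    ir = envelope * np.cos(2 * np.pi * fc * t)
    return ir / np.max(np.abs(ir))

def simple_cochleagram(x, fs, center_freqs):
    """One row per channel: band-pass filter, then half-wave rectify.

    Assumption: rectification is only a placeholder for an inner hair
    cell model such as Meddis's; it is not that model.
    """
    rows = []
    for fc in center_freqs:
        y = np.convolve(x, gammatone_ir(fc, fs), mode="same")
        rows.append(np.maximum(y, 0.0))
    return np.vstack(rows)

# Usage: a 100 ms, 1 kHz tone analysed by three illustrative channels.
fs = 16000
t = np.arange(fs // 10) / fs
tone = np.sin(2 * np.pi * 1000 * t)
center_freqs = np.array([250.0, 1000.0, 4000.0])
cg = simple_cochleagram(tone, fs, center_freqs)
strongest = int(np.argmax((cg ** 2).sum(axis=1)))  # channel with most energy
```

In this sketch the channel centred on the tone's frequency carries most of the energy, which is the frequency-place behaviour a cochleagram is meant to expose. A practical system would use many more channels, typically spaced on an ERB-rate scale.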

We use auditory perception theories in our musical noise investigations. Some auditory perception theories (e.g., volley theory and auditory scene analysis theories) suggest that speech perception is an auditory grouping process. Temporal properties of neuron firing signals, such as period and rhythm, play important roles in the grouping process. The grouping process generates a foreground speech stream, a background noise stream, and possibly additional streams.

We hypothesize that musical noise arises when neuron firing signals whose temporal properties differ from those grouped into the foreground stream are instead grouped into the background stream. Based on this hypothesis, we believe that a musical noise reduction method should increase the probability of grouping the enhanced neuron firing signals into the foreground speech stream, or decrease the probability of grouping them into the background stream. We propose a post-processing musical noise reduction method for the auditory Wiener filter speech enhancement method, in which we employ a proposed complex gammatone filter bank for the cochlear decomposition. The results of a subjective listening test of our speech enhancement system show that the proposed musical noise reduction method is effective.

Thesis (Master, Electrical & Computer Engineering) -- Queen's University, 2008-05-28 16:11:28.374

http://hdl.handle.net/1974/1229

Speech enhancement

Musical noise

Gammatone filter

Meddis inner hair cell model

Cochleagram

Auditory grouping

Perception

Identifier	oai:union.ndltd.org:LACETR/oai:collectionscanada.gc.ca:OKQ.1974/1229
Date	04 June 2008
Creators	Yang, Xiaofeng
Contributors	Queen's University (Kingston, Ont.). Theses (Queen's University (Kingston, Ont.))
Source Sets	Library and Archives Canada ETDs Repository / Centre d'archives des thèses électroniques de Bibliothèque et Archives Canada
Language	English
Detected Language	English
Type	Thesis
Format	2291892 bytes, application/pdf
Rights	This publication is made available by the authority of the copyright owner solely for the purpose of private study and research and may not be copied or reproduced except as permitted by the copyright laws without written authority from the copyright owner.
Relation	Canadian theses
