Global ETD Search

Spectral refinement to speech enhancement

The goal of a speech enhancement algorithm is to remove noise and recover the original signal with as little distortion and residual noise as possible. Most successful real-time algorithms operate in the frequency domain, where the spectral amplitude of clean speech is estimated for each short-time frame of the noisy signal. State-of-the-art short-time spectral amplitude estimators express the clean spectral amplitude in terms of the power spectral density (PSD) of the noisy signal. In principle, the PSD must be computed from a large ensemble of signal realizations; in practice, it can only be estimated from a finite-length sample of a single realization. The estimation errors introduced by these limitations cause the solution to deviate from the optimum. Various spectral estimation techniques, many with added spectral smoothing, have been investigated for decades to reduce these errors, but they do not significantly address speech quality as perceived by a human listener. This dissertation presents analysis and techniques that offer spectral refinements for speech enhancement. We present an analytical framework for the effect of spectral estimate variance on speech enhancement performance. We use the variance quality factor (VQF) as a quantitative measure of the estimated spectra and show that reducing the VQF of the spectral estimator significantly reduces the VQF of the enhanced speech. The Autoregressive Multitaper (ARMT) spectral estimate is proposed as a low-VQF spectral estimator for use in speech enhancement algorithms. An innovative method of incorporating a speech production model using multiband excitation is also presented as a technique to emphasize the harmonic components of the glottal speech input.
The preconditioning of the noisy estimates by exploiting other avenues of information, such as pitch estimation and the speech production model, effectively increases the localized narrow-band signal-to-noise ratio (SNR) of the noisy signal, which is subsequently denoised by the amplitude gain. Combined with voicing structure enhancement, the ARMT spectral estimate delivers enhanced speech with a clarity desirable to human listeners. The resulting improvements in the enhanced speech are observed to be significant in both objective and subjective measurements. / by Werayuth Charoenruengkit. / Vita. / Thesis (Ph.D.)--Florida Atlantic University, 2009. / Includes bibliography. / Electronic reproduction. Boca Raton, Fla., 2009. Mode of access: World Wide Web.
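
The abstract's point that a PSD estimated from one finite-length frame carries a large variance, and that a multitaper estimate reduces it, can be illustrated with a small numerical sketch. This is not the dissertation's ARMT estimator: it uses a plain multitaper average with sine tapers (chosen here only because they need nothing beyond NumPy), and it uses the normalized variance var/mean^2 as an assumed stand-in for the VQF, whose exact definition is not given in this record. The function names, frame length, and taper count are all hypothetical choices made for the illustration.

```python
import numpy as np

def periodogram(x):
    """Single-frame periodogram (rectangular window) of a real signal."""
    n = len(x)
    return np.abs(np.fft.rfft(x)) ** 2 / n

def sine_multitaper_psd(x, num_tapers=8):
    """Average of eigenspectra computed with orthonormal sine tapers.

    Illustrative stand-in only; not the ARMT estimator from the dissertation.
    """
    n = len(x)
    t = np.arange(1, n + 1)
    k = np.arange(1, num_tapers + 1)[:, None]
    # Each row is one taper with unit energy, so white noise keeps its level.
    tapers = np.sqrt(2.0 / (n + 1)) * np.sin(np.pi * k * t / (n + 1))
    eigenspectra = np.abs(np.fft.rfft(tapers * x, axis=1)) ** 2
    return eigenspectra.mean(axis=0)

def normalized_variance(estimates):
    """var/mean^2 per frequency bin across realizations, averaged over bins.

    Assumed proxy for a variance quality measure (lower is better).
    """
    mean = estimates.mean(axis=0)
    var = estimates.var(axis=0)
    return float(np.mean(var / mean ** 2))

# Many independent white-noise frames play the role of the ensemble that is
# unavailable in practice; each row is one finite-length realization.
rng = np.random.default_rng(0)
frames = rng.standard_normal((400, 256))

# Drop the DC and Nyquist bins, whose statistics differ from interior bins.
p_est = np.array([periodogram(f)[1:-1] for f in frames])
m_est = np.array([sine_multitaper_psd(f)[1:-1] for f in frames])

nv_periodogram = normalized_variance(p_est)
nv_multitaper = normalized_variance(m_est)
print("periodogram normalized variance:", nv_periodogram)
print("multitaper normalized variance: ", nv_multitaper)
```

Under these assumptions the multitaper figure should come out well below the periodogram figure, since averaging several nearly uncorrelated eigenspectra trades some frequency resolution for lower variance.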

Spectral theory (Mathematics)

Noise control

Fuzzy algorithms

Identifier	oai:union.ndltd.org:fau.edu/oai:fau.digital.flvc.org:fau_2875
Contributors	Charoenruengkit, Werayuth., College of Engineering and Computer Science, Department of Computer and Electrical Engineering and Computer Science
Publisher	Florida Atlantic University
Source Sets	Florida Atlantic University
Language	English
Detected Language	English
Type	Text, Electronic Thesis or Dissertation
Format	xiv, 124 p. : ill. (some col.)., electronic
Rights	http://rightsstatements.org/vocab/InC/1.0/
