  • About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
1

Enhancement of Speech in Highly Nonstationary Noise Conditions using Harmonic Reconstruction

Liu, Xin 01 January 2009
The quality and intelligibility of single-channel speech degraded by additive noise remains a challenging problem when only the noisy speech is available. An accurate estimate of the noise spectrum is important for the effective performance of speech enhancement algorithms, especially in nonstationary noise environments. This thesis addresses both issues. First, a speech enhancement algorithm using harmonic features is introduced. A spectral weighting function is derived by constrained optimization to suppress noise in the frequency domain. Two design parameters are included in the suppression gain, namely the frequency-dependent noise-flooring parameter (FDNFP) and the gain factor. The FDNFP controls the level of admissible residual noise in the enhanced speech, while further enhancement is achieved by adaptive comb filtering using the gain factor with a peak-picking algorithm. Second, a noise estimation algorithm is proposed for nonstationary noise conditions. The speech presence probability is updated by introducing a time-frequency-dependent threshold. The frequency-dependent smoothing factor for noise estimation is computed from the estimated speech presence probability in each frequency bin. This algorithm adapts quickly to nonstationary noise environments and preserves more information on weak speech phonemes. The performance of the proposed speech enhancement algorithm is evaluated in terms of Perceptual Evaluation of Speech Quality (ITU-PESQ) scores, Modified Bark Spectral Distortion (MBSD) measures, composite objective measures, and listening tests. Our listening tests indicate that 16 listeners on average preferred our harmonic enhanced speech over any of three other approaches about 73% of the time. The performance of the proposed noise estimation algorithm combined with the proposed speech enhancement method in nonstationary noise environments is also tested in terms of ITU-PESQ scores and MBSD measures.
Experimental results indicate that the proposed noise estimation algorithm, when integrated with the harmonic enhancement method, outperforms spectral subtraction, the signal subspace method, a perceptually based enhancement method with a constant noise-flooring parameter, and our original harmonic speech enhancement method in highly nonstationary noise environments.
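
The abstract above describes noise tracking in which a per-bin speech presence probability sets a frequency-dependent smoothing factor. The following is only a minimal sketch of that general idea, in the style of well-known recursive-averaging noise estimators, and is not the thesis's algorithm. The function name `update_noise_psd`, the constants `alpha` and `alpha_p`, and the use of a simple power-ratio test against a per-bin threshold are all assumptions made for illustration.

```python
def update_noise_psd(noisy_psd, noise_psd, spp, thresholds, alpha=0.85, alpha_p=0.2):
    """Update a noise power estimate for one frame (illustrative sketch only).

    noisy_psd  : power of the noisy speech in each frequency bin
    noise_psd  : previous noise power estimate in each bin
    spp        : previous speech presence probability in each bin
    thresholds : per-bin decision thresholds (assumed to be supplied by the caller)
    """
    new_noise, new_spp = [], []
    for y, n, p, thr in zip(noisy_psd, noise_psd, spp, thresholds):
        # Hard decision: assume speech is present when the noisy-to-noise
        # power ratio exceeds this bin's threshold.
        speech_present = 1.0 if y / max(n, 1e-12) > thr else 0.0
        # Smooth the hard decision over time to get a probability.
        p = alpha_p * p + (1.0 - alpha_p) * speech_present
        # Frequency-dependent smoothing factor: near 1 when speech is likely,
        # so the noise estimate is updated slowly in that bin.
        alpha_k = alpha + (1.0 - alpha) * p
        new_noise.append(alpha_k * n + (1.0 - alpha_k) * y)
        new_spp.append(p)
    return new_noise, new_spp
```

Calling the function once per frame and feeding its outputs back in as `noise_psd` and `spp` gives a running estimate. Bins judged to contain speech change slowly, while noise-only bins follow the input more closely.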
2

A mathematical model of noise in narrowband power line communication systems

Katayama, Masaaki, Yamazato, Takaya, Okada, Hiraku, 片山, 正昭, 山里, 敬也, 岡田, 啓 07 1900
No description available.
3

Knowledge-based speech enhancement

Srinivasan, Sriram January 2005
Speech is a fundamental means of human communication. In the last several decades, much effort has been devoted to the efficient transmission and storage of speech signals. With advances in technology making mobile communication ubiquitous, communication anywhere has become a reality. The freedom and flexibility offered by mobile technology bring with them new challenges, one of which is robustness to acoustic background noise. Speech enhancement systems form a vital front end for mobile telephony in noisy environments such as cars, cafeterias, and subway stations, for hearing aids, and for improving the performance of speech recognition systems. In this thesis, which consists of four research articles, we discuss both single- and multi-microphone approaches to speech enhancement. The main contribution of this thesis is a framework to exploit available prior knowledge about both speech and noise. The physiology of speech production places a constraint on the possible shapes of the speech spectral envelope, and this information is captured using codebooks of speech linear predictive (LP) coefficients obtained from a large training database. Similarly, information about commonly occurring noise types is captured using a set of noise codebooks, which can be combined with sound environment classification to treat different environments differently. In paper A, we introduce maximum-likelihood estimation of the speech and noise LP parameters using the codebooks. The codebooks capture only the spectral shape. The speech and noise gain factors are obtained through a frame-by-frame optimization, providing good performance in practical nonstationary noise environments. The estimated parameters are subsequently used in a Wiener filter. Paper B describes Bayesian minimum mean squared error estimation of the speech and noise LP parameters and functions thereof, while retaining the instantaneous gain computation. Both memoryless and memory-based estimators are derived.
While papers A and B describe single-channel techniques, paper C describes a multi-channel Bayesian speech enhancement approach, where, in addition to temporal processing, the spatial diversity provided by multiple microphones is also exploited. In paper D, we introduce a multi-channel noise reduction technique motivated by blind source separation (BSS) concepts. In contrast to standard BSS approaches, we use the knowledge that one of the signals is speech and that the other is noise, and exploit their different characteristics.
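
The abstract above states that LP coefficients describe only the spectral shape, that separate gain factors scale speech and noise, and that the result feeds a Wiener filter. As a rough illustration of that last step only, the sketch below builds a per-bin Wiener gain from one speech LP vector and one noise LP vector. It does not implement the thesis's codebook search or gain optimization. The function names `lp_envelope` and `wiener_gain`, the LP sign convention A(z) = 1 + a1 z^-1 + ..., and the bin count are assumptions for illustration.

```python
import cmath
import math


def lp_envelope(lp_coeffs, n_bins):
    """Return the spectral shape 1/|A(e^{jw})|^2 on n_bins frequencies in [0, pi).

    Assumes A(z) = 1 + a1*z^-1 + a2*z^-2 + ..., with lp_coeffs = [a1, a2, ...].
    """
    envelope = []
    for k in range(n_bins):
        w = math.pi * k / n_bins
        a = 1 + sum(c * cmath.exp(-1j * w * (i + 1)) for i, c in enumerate(lp_coeffs))
        envelope.append(1.0 / max(abs(a) ** 2, 1e-12))
    return envelope


def wiener_gain(speech_lp, noise_lp, speech_gain, noise_gain, n_bins=8):
    """Per-bin Wiener gain Ps / (Ps + Pn) from LP shapes scaled by their gains."""
    ps = [speech_gain * e for e in lp_envelope(speech_lp, n_bins)]
    pn = [noise_gain * e for e in lp_envelope(noise_lp, n_bins)]
    return [s / (s + n) for s, n in zip(ps, pn)]
```

Multiplying the noisy spectrum by these gains attenuates bins where the modeled noise power dominates the modeled speech power. In a codebook-driven system the LP vectors and gains would come from the estimation step rather than being passed in by hand.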
4

Zvyšování účinnosti strojového rozpoznávání řeči / Enhancing the effectiveness of automatic speech recognition

Zelinka, Petr January 2012
This work identifies the causes of the unsatisfactory reliability of contemporary automatic speech recognition systems when deployed in demanding conditions. The impact of the individual sources of performance degradation is documented, and a list of known methods for identifying them from the recognized signal is given. An overview of the usual methods for suppressing the impact of these disruptive influences on speech recognition performance is provided. The essential contribution of the work is the formulation of new approaches to constructing acoustic models of noisy speech and nonstationary noise that allow high recognition performance in challenging conditions. The viability of the proposed methods is verified on an isolated-word speech recognizer using a several-hour-long recording of real operating-room background noise made at the Uniklinikum Marburg in Germany. This work is the first to identify the impact of changes in a speaker’s vocal effort on the reliability of automatic speech recognition across the full vocal effort range (i.e., whispering through shouting). A new concept of a speech recognizer immune to changes in vocal effort is proposed. For the purposes of research on changes in vocal effort, a new speech database, BUT-VE1, was created.
