  • About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
1

A Novel Non-Acoustic Voiced Speech Sensor: Experimental Results and Characterization

Keenaghan, Kevin Michael 14 January 2004
Recovering clean speech from an audio signal with additive noise is a problem that has plagued the signal processing community for decades. One promising technique currently being utilized in speech-coding applications is a multi-sensor approach, in which a microphone is used in conjunction with optical, mechanical, and electrical non-acoustic speech sensors to provide greater versatility in signal processing algorithms. One such non-acoustic glottal waveform sensor is the Tuned Electromagnetic Resonator Collar (TERC) sensor, first developed in [BLP+02]. The sensor is based on Magnetic Resonance Imaging (MRI) concepts, and is designed to detect small changes in capacitance caused by changes to the state of the vocal cords (the glottal waveform). Although preliminary simulations in [BLP+02] have validated the basic theory governing the TERC sensor's operation, results from human subject testing are necessary to accurately characterize the sensor's performance in practice. To this end, a system was designed and developed to provide real-time audio recordings from the sensor while attached to a human test subject. From these recordings, executed in a variety of acoustic noise environments, the practical functionality of the TERC sensor was demonstrated. The sensor in its current evolution is able to detect a periodic waveform during voiced speech, with two clear harmonics and a fundamental frequency equal to that of the speech it is detecting. This waveform is representative of the glottal waveform, with little or no articulation, as initially hypothesized. Though statistically significant conclusions about the sensor's immunity to environmental noise are difficult to draw, the results suggest that the TERC sensor is considerably more resistant to the effects of noise than typical acoustic sensors, making it a valuable addition to the multi-sensor speech processing approach.
2

Estimation of glottal source features from the spectral envelope of the acoustic speech signal

Torres, Juan Félix 17 May 2010
Speech communication encompasses diverse types of information, including phonetics, affective state, voice quality, and speaker identity. From a speech production standpoint, the acoustic speech signal can be mainly divided into glottal source and vocal tract components, which play distinct roles in rendering the various types of information it contains. Most deployed speech analysis systems, however, do not explicitly represent these two components as distinct entities, as their joint estimation from the acoustic speech signal becomes an ill-defined blind deconvolution problem. Nevertheless, because of the desire to understand glottal behavior and how it relates to perceived voice quality, there has been continued interest in explicitly estimating the glottal component of the speech signal. To this end, several inverse filtering (IF) algorithms have been proposed, but they are unreliable in practice because of the blind formulation of the separation problem. In an effort to develop a method that can bypass the challenging IF process, this thesis proposes a new glottal source information extraction method that relies on supervised machine learning to transform smoothed spectral representations of speech, which are already used in some of the most widely deployed and successful speech analysis applications, into a set of glottal source features. A transformation method based on Gaussian mixture regression (GMR) is presented and compared to current IF methods in terms of feature similarity, reliability, and speaker discrimination capability on a large speech corpus, and potential representations of the spectral envelope of speech are investigated for their ability to represent glottal source variation in a predictable manner. The proposed system was found to produce glottal source features that reasonably matched their IF counterparts in many cases, while being less susceptible to spurious errors.
The development of the proposed method entailed a study into the aspects of glottal source information that are already contained within the spectral features commonly used in speech analysis, yielding an objective assessment regarding the expected advantages of explicitly using glottal information extracted from the speech signal via currently available IF methods, versus the alternative of relying on the glottal source information that is implicitly contained in spectral envelope representations.
3

Development of an Electromagnetic Glottal Waveform Sensor for Applications in High Acoustic Noise Environments

Pelteku, Altin E. 14 January 2004
The challenges of measuring speech signals in the presence of strong background noise cannot be easily addressed with traditional acoustic technology. A recent solution to the problem considers combining acoustic sensor measurements with real-time, non-acoustic detection of an aspect of the speech production process. While significant advancements have been made in that area using low-power radar-based techniques, drawbacks inherent to the operation of such sensors are yet to be surmounted. Therefore, one imperative scientific objective is to devise new, non-invasive, non-acoustic sensor topologies that offer improvements regarding sensitivity, robustness, and acoustic bandwidth. This project investigates a novel design that directly senses the glottal flow waveform by measuring variations in the electromagnetic properties of neck tissues during voiced segments of speech. The approach is to explore two distinct sensor configurations, namely the "six-element" and the "parallel-plate" resonator. The research focuses on the modeling aspect of the biological load and the resonator prototypes using multi-transmission line (MTL) and finite element (FE) simulation tools. Finally, bench tests performed with both prototypes on phantom loads as well as human subjects are presented.
