
General Auditory Model of Adaptive Perception of Speech

One of the fundamental challenges for communication by speech is the variability in speech production and acoustics. Talkers vary in the size and shape of their vocal tract, in dialect, and in speaking mannerisms. These differences all impact the acoustic output. Despite this lack of invariance in the acoustic signal, listeners can correctly perceive the speech of many different talkers. This ability to adapt one's perception to the particular acoustic structure of a talker has been investigated for over fifty years. The prevailing explanation for this phenomenon is that listeners construct talker-specific representations that can serve as referents for subsequent speech sounds. Specifically, it is thought that listeners may either be creating mappings between acoustics and phonemes or extracting the vocal tract anatomy and shape for each individual talker. This research focuses on an alternative explanation. A separate line of work has demonstrated that much of the variance between talkers' productions can be captured in their neutral vocal tract shape (that is, the average shape of their vocal tract across multiple vowel productions). The model tested here is that listeners compute an average spectrum (the long-term average spectrum, LTAS) of a talker's speech and use it as a referent. If this LTAS resembles the acoustic output of the neutral vocal tract shape (the neutral vowel), then it could accommodate some of the talker-based variability. The LTAS model yields four main hypotheses: 1) during carrier phrases, listeners compute an LTAS for the talker; 2) this LTAS resembles the spectrum of the neutral vowel; 3) listeners represent subsequent targets relative to this LTAS referent; 4) such a representation reduces talker-specific acoustic variability. The goal of this project was to further develop and test the predictions arising from these hypotheses.
Results suggest that the LTAS model needs to be further investigated, as the simple model proposed does not explain the effects found across all studies.
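
The core computation behind hypotheses 1 and 3 can be illustrated with a minimal sketch. It is not the dissertation's implementation: the function names (`power_spectrum`, `ltas_db`, `relative_to_ltas`), the Hann window, the frame and hop sizes, and the choice to express the target as a dB difference from the carrier's LTAS are all assumptions made here for illustration.

```python
import cmath
import math


def power_spectrum(frame):
    """Power at each non-negative frequency bin of one Hann-windowed frame (naive DFT)."""
    n = len(frame)
    windowed = [x * (0.5 - 0.5 * math.cos(2 * math.pi * i / (n - 1)))
                for i, x in enumerate(frame)]
    return [
        abs(sum(x * cmath.exp(-2j * math.pi * k * i / n)
                for i, x in enumerate(windowed))) ** 2
        for k in range(n // 2 + 1)
    ]


def ltas_db(signal, frame_len=64, hop=32):
    """Long-term average spectrum in dB: mean power per bin across overlapping frames."""
    starts = range(0, len(signal) - frame_len + 1, hop)
    spectra = [power_spectrum(signal[s:s + frame_len]) for s in starts]
    mean_power = [sum(column) / len(spectra) for column in zip(*spectra)]
    # A small floor avoids log10(0) in empty bins (illustrative choice).
    return [10 * math.log10(p + 1e-12) for p in mean_power]


def relative_to_ltas(target, carrier, frame_len=64, hop=32):
    """Assumed referent scheme: target spectrum minus carrier LTAS, bin by bin, in dB."""
    target_db = ltas_db(target, frame_len, hop)
    carrier_db = ltas_db(carrier, frame_len, hop)
    return [t - c for t, c in zip(target_db, carrier_db)]
```

In this sketch, a target that merely rescales the carrier's spectrum comes out as a flat offset relative to the referent, which is one simple way a talker-wide spectral characteristic could be factored out of a subsequent sound.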

Identifier: oai:union.ndltd.org:arizona.edu/oai:arizona.openrepository.com:10150/265343
Date: January 2012
Creators: Vitela, Antonia David
Contributors: Lotto, Andrew, Story, Brad, Bunton, Kate, Wilson, Stephen, Warner, Natasha, Lotto, Andrew
Publisher: The University of Arizona.
Source Sets: University of Arizona
Language: English
Detected Language: English
Type: text, Electronic Dissertation
Rights: Copyright © is held by the author. Digital access to this material is made possible by the University Libraries, University of Arizona. Further transmission, reproduction or presentation (such as public display or performance) of protected items is prohibited except with permission of the author.
