1.
Predicting Speech Intelligibility and Quality from Model Auditory Nerve Fiber Mean-rate and Spike-timing Activity. Wirtzfeld, Michael Roy. January 2017.
This dissertation examines the prediction of speech intelligibility and quality using simulated auditory nerve fiber activity. The relationship of neural mean-rate and spike-timing activity to the perceptual salience of the envelope (ENV) and temporal fine structure (TFS) of speech remains unclear. TFS affects neural temporal coding in two ways: it produces phase-locked spike-timing responses, and narrowband cochlear filtering of TFS generates recovered ENV. These processes, together with the direct encoding of ENV in mean-rate responses, constitute the established transduction processes. We postulate that models based on mean-rate cues (computed over time windows of approximately 6 to 16 ms) and spike-timing cues should produce accurate predictions of subjectively graded speech. Two studies are presented.
The first study examined the contribution of mean-rate and spike-timing cues to predicting intelligibility. The relative levels of mean-rate and spike-timing cues were manipulated using chimaerically vocoded speech. The Spectro-Temporal Modulation Index (STMI) and Neurogram SIMilarity (NSIM) were used to quantify the mean-rate and spike-timing activity, and linear regression models were developed using the STMI and NSIM. An interpretable model combining the STMI and the fine-timing NSIM produced the most accurate predictions of the graded speech.
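As an illustration, the sketch below fits a two-predictor linear regression of graded intelligibility on a mean-rate metric (STMI) and a spike-timing metric (fine-timing NSIM). The numerical values are hypothetical placeholders, not data from the dissertation; in the study, each metric would be computed from simulated auditory-nerve neurograms for every chimaerically vocoded sentence.

```python
# A minimal sketch, assuming synthetic placeholder values: combine a mean-rate
# metric (STMI) and a spike-timing metric (fine-timing NSIM) in a linear
# regression to predict graded intelligibility.
import numpy as np
from sklearn.linear_model import LinearRegression

stmi    = np.array([0.42, 0.55, 0.61, 0.70, 0.78, 0.85])   # hypothetical STMI per condition
nsim_ft = np.array([0.30, 0.41, 0.47, 0.58, 0.66, 0.74])   # hypothetical fine-timing NSIM
scores  = np.array([22.0, 41.0, 55.0, 68.0, 80.0, 91.0])   # hypothetical % words correct

X = np.column_stack([stmi, nsim_ft])
model = LinearRegression().fit(X, scores)
print(model.coef_, model.intercept_)       # relative weight of each neural cue
print(model.predict(X))                    # fitted intelligibility predictions
```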
The second study examined the contribution of mean-rate and spike-timing cues to predicting the quality of enhanced wideband speech. The mean-rate and fine-timing NSIM were used to quantify the mean-rate and spike-timing activity. Linear regression models were developed using the NSIM measures, and optimization of the NSIM was investigated. A quality-optimized model with intermediate temporal resolution had the best predictive performance.
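The sketch below illustrates, under simplifying assumptions, an NSIM-style similarity between a reference and a degraded neurogram. The published NSIM combines luminance and structure terms over a small Gaussian-weighted window; here a uniform window, placeholder constants, and synthetic neurograms are used for brevity. The choice of neurogram time-bin width (coarse for mean-rate, fine for spike timing) is the kind of temporal-resolution parameter that optimization of the measure can vary.

```python
# A simplified NSIM-style similarity between two neurograms (frequency x time
# matrices of binned spike activity). Not the exact published implementation:
# the window weighting and constants here are placeholder assumptions.
import numpy as np
from scipy.ndimage import uniform_filter

def nsim_like(ref, deg, win=3, c1=0.01, c3=0.005):
    mu_r, mu_d = uniform_filter(ref, win), uniform_filter(deg, win)
    var_r = np.clip(uniform_filter(ref**2, win) - mu_r**2, 0, None)
    var_d = np.clip(uniform_filter(deg**2, win) - mu_d**2, 0, None)
    cov = uniform_filter(ref * deg, win) - mu_r * mu_d
    luminance = (2*mu_r*mu_d + c1) / (mu_r**2 + mu_d**2 + c1)
    structure = (cov + c3) / (np.sqrt(var_r * var_d) + c3)
    return float(np.mean(luminance * structure))

rng = np.random.default_rng(0)
ref = rng.random((32, 200))                        # synthetic neurogram: 32 CFs x 200 time bins
deg = np.clip(ref + 0.1*rng.standard_normal(ref.shape), 0, None)
print(nsim_like(ref, deg))                         # closer to 1 = more similar
```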
The modelling approach used here allows for the study of normal and impaired hearing. It supports the design of hearing-aid processing algorithms and furthers the understanding of how TFS cues might be applied in cochlear implant stimulation schemes. / Thesis / Doctor of Philosophy (PhD) / This dissertation examines how auditory nerve fiber activity can be used to predict speech intelligibility and quality. A model of the cochlea is used to generate simulated auditory nerve fiber responses to speech stimuli, and the information conveyed by the corresponding spike events is quantified using different measures of neural activity. A set of predictive models is constructed in a systematic manner using these neural measures and used to estimate the perceptual scoring of intelligibility and quality by normal-hearing listeners for two speech datasets. The results indicate that a model combining a measure of average neural discharge activity with a measure of instantaneous activity provides the best prediction accuracy. This work contributes to the knowledge of neural coding in the cochlea and higher centers of the brain and facilitates the development of hearing-aid and cochlear implant processing strategies.
2.
The Role of Temporal Fine Structure Processing in “Listening in the Dips” of Noise. Draper, S., Smith, Sherri, Smurzynski, Jacek. 06 April 2011.
No description available.
3.
The Role of Temporal Fine Structure Cues in Speech Perception. Ibrahim, Rasha. 04 1900.
In this thesis, the importance of temporal fine structure (TFS) in speech perception is investigated. It is well accepted that TFS is important for sound localization and pitch perception, while the envelope (ENV) is primarily responsible for speech perception. Recently, a significant contribution of TFS to speech perception has been suggested. This was linked to the improved ability of normal-hearing subjects, compared with hearing-impaired listeners, to understand speech in background noise with fluctuating power. However, the accuracy of this claim is questionable, since TFS and ENV are correlated and one can recover ENV to some extent from TFS-only speech. In this work, we quantify the relative advantages of TFS and the possible influence of recovered ENV on speech recognition scores. We used a computational model of the cat auditory periphery, modified to match the available data on human cochlear tuning. The output of the model was analyzed with the spectro-temporal modulation index (STMI) metric to predict speech intelligibility. A speech recognition experiment was conducted on five normal-hearing subjects, and the STMI predictions were mapped to intelligibility using a specially constructed mapping function. The TFS role was quantified by examining the TFS intelligibility scores and the corresponding intelligibility predictions from ENV recovery. Our results show that although ENV recovery has some influence on the intelligibility results, it cannot account for the total reported intelligibility. / Doctor of Philosophy (PhD)
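For illustration, the sketch below shows one common way to separate a band-limited signal into its Hilbert envelope (ENV) and temporal fine structure (TFS), and how re-filtering a TFS-only band can partially recover the envelope. The band edges and the noise stand-in for a speech band are illustrative assumptions, not the thesis' exact processing.

```python
# A minimal sketch: Hilbert-based ENV/TFS split of one analysis band, plus the
# "recovered ENV" obtained by re-filtering the TFS-only band with a narrow filter.
import numpy as np
from scipy.signal import butter, sosfiltfilt, hilbert

fs = 16000
x = np.random.default_rng(1).standard_normal(fs // 2)        # stand-in for 0.5 s of speech

sos = butter(4, [500, 700], btype="bandpass", fs=fs, output="sos")
band = sosfiltfilt(sos, x)                                    # one narrow analysis band

analytic = hilbert(band)
env = np.abs(analytic)                                        # ENV: slow amplitude contour
tfs = np.cos(np.angle(analytic))                              # TFS: unit-amplitude carrier

recovered_env = np.abs(hilbert(sosfiltfilt(sos, tfs)))        # ENV partially reappears
print(np.corrcoef(env, recovered_env)[0, 1])                  # correlation with original ENV
```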
4.
Investigation of speech processing in frequency regions where absolute thresholds are normal for hearing-impaired listeners. Léger, Agnès. 30 November 2012.
Speech intelligibility is reduced for listeners with sensorineural hearing loss, especially for speech in noise. The extent to which this reduction is due to reduced audibility or to supra-threshold deficits is still debated. The main goal of this PhD work was to investigate the specific influence of supra-threshold deficits on speech intelligibility. The effect of audibility was controlled for by measuring speech intelligibility for hearing-impaired listeners using nonsense speech signals filtered in low- and mid-frequency regions where pure-tone sensitivity was normal. Hearing-impaired listeners with hearing loss in high-frequency regions showed mild to severe intelligibility deficits for speech both in quiet and in noise in these frequency regions of normal audibility. Similar deficits were obtained for speech in steady and fluctuating masking noises. This provides additional evidence that speech intelligibility may be strongly influenced by supra-threshold auditory deficits. The second aim of this PhD work was to investigate the origin of these supra-threshold deficits. Results showed that reduced frequency selectivity cannot entirely explain the speech intelligibility deficits of the hearing-impaired listeners. The influence of temporal fine structure sensitivity remained unclear.
5.
Frequency modulation coding in the auditory system. Paraouty, Nihaad. 27 November 2017.
This research aimed at clarifying the low-level mechanisms involved in frequency-modulation (FM) detection. Natural sounds convey salient amplitude- and frequency-modulation patterns crucial for communication. Results from single auditory neurons in the cochlear nucleus show that the spectro-temporal properties of low-rate FM stimuli are accurately represented by two distinct mechanisms based on neural phase-locking to temporal envelope (ENV) and temporal fine structure (TFS) cues. The relative contribution of each mechanism was found to be highly dependent on stimulus parameters (carrier frequency, modulation rate and modulation depth) and also on the type of neuron, with clear specializations for one type of representation or the other. The validity of those two neural encoding mechanisms was confirmed for human listeners using two psychophysical paradigms. Results from those studies also demonstrate that the TFS coding mechanism is efficient in adverse listening conditions, such as in the presence of interfering modulations. However, the TFS coding mechanism is prone to decline with age and even more with hearing loss, while the ENV coding mechanism seems relatively spared. Two computational models were developed to fully explain the contributions of ENV and TFS cues in the normal and impaired auditory system.
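As an illustration of the stimuli described above, the sketch below generates a low-rate sinusoidal FM tone with a given carrier frequency, modulation rate and frequency excursion, and adds an interfering amplitude modulation of the kind used to restrict listeners to TFS cues. All parameter values are illustrative assumptions, not the thesis' exact settings.

```python
# A minimal sketch of a sinusoidally frequency-modulated tone, optionally with an
# added interfering amplitude modulation. Parameter values are illustrative only.
import numpy as np

fs = 44100
fc, fm, df = 500.0, 2.0, 10.0                 # carrier (Hz), FM rate (Hz), peak excursion (Hz)
t = np.arange(0, 1.0, 1/fs)

# Instantaneous frequency fc + df*cos(2*pi*fm*t) corresponds to the phase below.
fm_tone = np.sin(2*np.pi*fc*t + (df/fm)*np.sin(2*np.pi*fm*t))

m = 0.333                                     # depth of the interfering AM
am_plus_fm = (1 + m*np.sin(2*np.pi*fm*t)) * fm_tone
```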
6.
Temporal integration of auditory temporal modulations: effects of age and hearing loss. Wallaert, Nicolas. 28 November 2017.
Communication sounds, including speech, contain relatively slow (<5-10 Hz) patterns of amplitude modulation (AM) and frequency modulation (FM) that play an important role in the discrimination and identification of sounds. The goal of this doctoral research program was to better understand the mechanisms involved in AM and FM perception and to clarify the effects of age and hearing loss on AM and FM perception. AM and FM detection thresholds were measured for young and older normal-hearing (NH) listeners and for older hearing-impaired (HI) listeners, using a low carrier frequency (500 Hz) and low modulation rates (2 and 20 Hz). The number of modulation cycles, N, varied from 2 to 9. FM detection thresholds were measured with and without an interfering AM to disrupt temporal-envelope cues. For all groups of listeners, AM and FM detection thresholds were lower for the 2-Hz than for the 20-Hz rate. AM and FM sensitivity improved with increasing N, demonstrating temporal integration for AM and FM detection. For AM thresholds, opposite effects of age and hearing loss were observed: AM sensitivity declines with age, but improves with hearing loss at both modulation rates. Temporal integration of AM cues was similar across NH listeners, but better for HI listeners. For FM sensitivity, ageing degrades FM thresholds at the low modulation rate only, whereas hearing loss has a deleterious effect at both modulation rates. Temporal integration of FM cues was similar across all groups. Two computational models (a single-band and a multi-band version) using the modulation filterbank concept and a template-matching decision strategy were developed in order to account for the data. Overall, the psychophysical and modeling data suggest that: 1) at high modulation rates, AM and FM detection are coded by a common underlying mechanism, possibly based on temporal-envelope cues, whereas at low modulation rates, AM and FM are coded by different mechanisms, possibly based on temporal-envelope cues and temporal-fine-structure cues, respectively; 2) ageing reduces sensitivity to both AM and FM (i.e., both temporal-envelope and temporal-fine-structure cues), but more so for the latter; 3) hearing loss does not affect sensitivity to AM (temporal-envelope cues) but impairs FM sensitivity at both rates; 4) the memory and decision processes involved in the temporal integration of AM and FM cues are preserved with age. With hearing loss, the temporal integration of AM cues is enhanced, probably due to the loss of cochlear compression, while the temporal integration of FM cues remains unchanged. Still, some aspects of processing efficiency (as modeled by internal noise) decline with age and even more following cochlear damage. The implications for the definition, diagnosis and rehabilitation of presbycusis are discussed.
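The sketch below outlines, under simplifying assumptions, the two ingredients named above: a bank of modulation filters applied to the stimulus envelope, and a template-matching decision that scores an observation by its correlation with a stored noise-free template. Filter shapes, center frequencies and the toy stimuli are assumptions for illustration, not the models actually fitted in the thesis.

```python
# A minimal sketch of a modulation filterbank plus template-matching decision.
import numpy as np
from scipy.signal import butter, sosfiltfilt, hilbert, resample_poly

fs, fs_env = 16000, 1000                       # audio rate and downsampled envelope rate

def modulation_filterbank(signal, centers=(2, 4, 8, 16, 32), q=1.0):
    env = np.abs(hilbert(signal))              # temporal envelope of the stimulus
    env = resample_poly(env, 1, fs // fs_env)  # envelopes are slow: downsample first
    bands = []
    for f in centers:
        bw = max(f / q, 1.0)
        sos = butter(2, [max(f - bw/2, 0.5), f + bw/2],
                     btype="bandpass", fs=fs_env, output="sos")
        bands.append(sosfiltfilt(sos, env))
    return np.stack(bands)                     # (n_modulation_filters, n_samples)

def template_match(observation, template):
    o = observation.ravel() - observation.mean()
    tpl = template.ravel() - template.mean()
    return float(o @ tpl / (np.linalg.norm(o) * np.linalg.norm(tpl) + 1e-12))

t = np.arange(0, 1.0, 1/fs)
target   = (1 + 0.5*np.sin(2*np.pi*4*t)) * np.sin(2*np.pi*500*t)   # 4-Hz AM tone
standard = np.sin(2*np.pi*500*t)                                    # unmodulated tone
template = modulation_filterbank(target)
# Decision rule: pick the interval whose filterbank output matches the template best.
print(template_match(modulation_filterbank(target), template),
      template_match(modulation_filterbank(standard), template))
```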
7.
Neural representations of natural speech in a chinchilla model of noise-induced hearing loss. Satyabrata Parida. 14 December 2020.
Hearing loss hinders the communication ability of many individuals despite state-of-the-art interventions. Animal models of different hearing-loss etiologies can help improve the clinical outcomes of these interventions; however, several gaps exist. First, translational aspects of animal models are currently limited because anatomically and physiologically specific data obtained from animals are analyzed differently compared to noninvasive evoked responses that can be recorded from humans. Second, we lack a comprehensive understanding of the neural representation of everyday sounds (e.g., naturally spoken speech) in real-life settings (e.g., in background noise). This is even true at the level of the auditory nerve, which is the first bottleneck of auditory information flow to the brain and the first neural site to exhibit crucial effects of hearing loss.

To address these gaps, we developed a unifying framework that allows direct comparison of invasive spike-train data and noninvasive far-field data in response to stationary and nonstationary sounds. We applied this framework to recordings from single auditory-nerve fibers and frequency-following responses from the scalp of anesthetized chinchillas with either normal hearing or noise-induced mild-moderate hearing loss in response to a speech sentence in noise. Key results for speech coding following hearing loss include: (1) coding deficits for voiced speech manifest as tonotopic distortions without a significant change in driven rate or spike-time precision, (2) linear amplification aimed at countering audiometric threshold shift is insufficient to restore neural activity for low-intensity consonants, (3) susceptibility to background noise increases as a direct result of distorted tonotopic mapping following acoustic trauma, and (4) temporal-place representation of pitch is also degraded. Finally, we developed a noninvasive metric to potentially diagnose distorted tonotopy in humans. These findings help explain the neural origins of common perceptual difficulties that listeners with hearing impairment experience, offer several insights to make hearing aids more individualized, and highlight the importance of better clinical diagnostics and noise-reduction algorithms.
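As one concrete example of the kind of spike-train analysis referred to above, the sketch below computes vector strength, a standard measure of how precisely spike times phase-lock to a given frequency (for instance a low-frequency TFS component or the fundamental of voiced speech). The spike times here are synthetic placeholders, not recorded chinchilla data.

```python
# A minimal sketch of vector strength, a standard phase-locking metric for spike trains.
import numpy as np

def vector_strength(spike_times, freq):
    """Return a value in [0, 1]; 1 means perfect phase locking at freq (Hz)."""
    phases = 2*np.pi*freq*np.asarray(spike_times)
    return float(np.abs(np.mean(np.exp(1j*phases))))

rng = np.random.default_rng(0)
f0 = 100.0                                                # hypothetical voice fundamental (Hz)
cycles = rng.integers(0, 100, size=500)                   # spikes scattered over 100 cycles
spikes = cycles / f0 + rng.normal(0, 0.5e-3, size=500)    # 0.5-ms timing jitter
print(vector_strength(spikes, f0))                        # near 1 = tight locking, near 0 = none
```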
8.
The Role of Temporal Fine Structure in Everyday Hearing. Agudemu Borjigin. 28 April 2022.
This thesis aims to investigate how one fundamental component of the inner-ear (cochlear) response to all sounds, the temporal fine structure (TFS), is used by the auditory system in everyday hearing. Although it is well known that neurons in the cochlea encode the TFS through exquisite phase locking, how this initial/peripheral temporal code contributes to everyday hearing and how its degradation contributes to perceptual deficits are foundational questions in auditory neuroscience and clinical audiology that remain unresolved despite extensive prior research. This is largely because the conventional approach to studying the role of TFS involves performing perceptual experiments with acoustic manipulations of stimuli (such as sub-band vocoding), rather than direct physiological or behavioral measurements of TFS coding, and hence is intrinsically limited. The present thesis addresses these gaps in three parts: 1) developing assays that can quantify TFS coding at the individual level, 2) comparing individual differences in TFS coding to differences in speech-in-noise perception across a range of real-world listening conditions, and 3) developing deep neural network (DNN) models of speech separation/enhancement to complement the individual-difference approach. By comparing behavioral and electroencephalogram (EEG)-based measures, Part 1 of this work identified a robust test battery that measures TFS processing in individual humans. Using this battery, Part 2 subdivided a large sample of listeners (N=200) into groups with “good” and “poor” TFS sensitivity. A comparison of speech-in-noise scores under a range of listening conditions between the groups revealed that good TFS coding reduces the negative impact of reverberation on speech intelligibility and leads to reduced reaction times, suggesting lessened listening effort. These results raise the possibility that cochlear implant (CI) sound coding strategies could be improved by attempting to provide usable TFS information, and that these individualized TFS assays can also help predict listening outcomes in reverberant, real-world listening environments. Finally, the DNN models (Part 3) introduced significant improvements in speech quality and intelligibility, as evidenced by all acoustic evaluation metrics and test results from CI listeners (N=8). These models can be incorporated as “front-end” noise-reduction algorithms in hearing assistive devices, and can complement other approaches by serving as a research tool to help generate and rapidly sub-select the most viable hypotheses about the role of TFS coding in complex listening scenarios.
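As a rough illustration of the Part 3 approach, the sketch below shows a generic mask-based speech-enhancement network in PyTorch: an STFT front end, a small recurrent network predicting a time-frequency gain mask, and an inverse STFT for resynthesis. The architecture and layer sizes are assumptions for illustration and are not the models developed in the thesis.

```python
# A minimal sketch (not the thesis' actual architecture) of mask-based DNN speech
# enhancement: predict a gain in [0, 1] per time-frequency bin and apply it to the STFT.
import torch
import torch.nn as nn

class MaskEnhancer(nn.Module):
    def __init__(self, n_fft=512, hop=128, hidden=256):
        super().__init__()
        self.n_fft, self.hop = n_fft, hop
        n_bins = n_fft // 2 + 1
        self.rnn = nn.LSTM(n_bins, hidden, num_layers=2, batch_first=True)
        self.out = nn.Sequential(nn.Linear(hidden, n_bins), nn.Sigmoid())

    def forward(self, wav):                        # wav: (batch, samples)
        window = torch.hann_window(self.n_fft, device=wav.device)
        spec = torch.stft(wav, self.n_fft, self.hop, window=window,
                          return_complex=True)     # (batch, bins, frames)
        mag = spec.abs().transpose(1, 2)           # (batch, frames, bins)
        mask, _ = self.rnn(mag)
        mask = self.out(mask).transpose(1, 2)      # gain per time-frequency bin
        enhanced = spec * mask                     # apply mask to complex STFT
        return torch.istft(enhanced, self.n_fft, self.hop, window=window,
                           length=wav.shape[-1])

noisy = torch.randn(1, 16000)                      # 1 s of fake noisy audio at 16 kHz
print(MaskEnhancer()(noisy).shape)                 # torch.Size([1, 16000])
```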