About

The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. The service is provided by the Networked Digital Library of Theses and Dissertations (NDLTD). Our metadata is collected from universities around the world. If you manage a university, consortium, or country archive and want to be added, details can be found on the NDLTD website.
1

Speech Segregation and Speech Unmasking in English- and Mandarin-Chinese-Speaking Listeners

Wang, Xianhui 16 September 2022 (has links)
No description available.
2

Binaural hearing and binaural masking release in humans

Lorenzi, Antoine 14 December 2016 (has links)
Background: Binaural unmasking is an essential process for speech understanding in noisy environments. It is thought to rely on the comparison of temporal and spectral cues along the auditory pathways, yet there is no real consensus on whether binaural masking release is processed at a subcortical and/or a cortical level. The purpose of this work was to investigate the temporal and spectral cues underlying binaural unmasking, first through a perceptual study and then through an electroencephalographic (EEG) study.

Materials and methods: Normal-hearing listeners were evaluated in a perceptual study estimating the amount of binaural unmasking as a function of 1) the spectral width of the contralateral noise (1 octave, 3 octaves, or broadband), 2) the temporal coherence of the bilateral noises (correlation of 0 or 1), and 3) the frequency of the target stimuli (0.5, 1, 2, and 4 kHz). Binaural unmasking was then evaluated with EEG by studying 1) early-latency auditory evoked potentials (<10 ms, PEA-P), 2) late-latency potentials (<50 ms, PEA-T), and 3) the mismatch negativity (PEA-MMN). In all three EEG studies, the influence of the temporal coherence of the bilateral noises was examined.

Results: The perceptual study shows increasing unmasking as the spectral width of the contralateral noise increases. Adding an uncorrelated contralateral noise (correlation = 0) yields a 1.28 dB improvement in detection regardless of the frequency of the target stimuli (antimasking), whereas adding a correlated contralateral noise (correlation = 1) yields an improvement that grows as the target frequency decreases (unmasking): 0.97 dB at 4 kHz and 9.25 dB at 0.5 kHz. In the early-latency recordings, the latencies of waves III and V shorten (by ≈0.1 ms) when a correlated or uncorrelated contralateral noise is added. In the late-latency recordings, the amplitudes of the P1 and N1 waves and of the P1N1 and N1P2 complexes increase when a correlated or uncorrelated contralateral noise is added. Finally, the MMN amplitude is larger when the added contralateral noise is correlated rather than uncorrelated.

Conclusion: The perceptual study highlights the importance of spectral cues (antimasking) and temporal cues (unmasking) in improving the perception of an initially masked signal. The EEG study suggests subcortical processing influenced only by spectral cues (antimasking) and more cortical processing influenced by temporal cues (unmasking).
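To make the main stimulus manipulation concrete, the sketch below builds a pair of bilateral noise maskers with a chosen interaural correlation (0 = independent at the two ears, 1 = identical), i.e., the "temporal coherence of the bilateral noises" varied in the perceptual study. This is a generic construction under assumed parameters, not code from the thesis.

```python
# Minimal sketch (illustrative, not from the thesis): bilateral Gaussian noise
# with a specified interaural correlation rho.
import numpy as np

def bilateral_noise(n_samples: int, rho: float, rng: np.random.Generator):
    """Return (left, right) noises whose expected interaural correlation is rho."""
    left = rng.standard_normal(n_samples)
    independent = rng.standard_normal(n_samples)
    # Mixing an independent noise into the right ear sets the correlation to rho.
    right = rho * left + np.sqrt(1.0 - rho ** 2) * independent
    return left, right

rng = np.random.default_rng(0)
l, r = bilateral_noise(44100, rho=1.0, rng=rng)   # correlated condition
print(round(np.corrcoef(l, r)[0, 1], 2))          # ~1.0
l, r = bilateral_noise(44100, rho=0.0, rng=rng)   # uncorrelated condition
print(round(np.corrcoef(l, r)[0, 1], 2))          # ~0.0
```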
3

Speech masking release in hybrid cochlear implant users: roles of spectral and temporal cues in residual acoustic hearing

Tejani, Viral Dinesh 01 December 2018 (has links)
Improved cochlear implant (CI) designs and surgical techniques have allowed CI patients to retain acoustic hearing in the implanted ear post-operatively. These EAS (electric-acoustic stimulation) CI users listen with a combination of acoustic and electric hearing in the same ear. While electric hearing alone improves speech recognition in quiet, preserved acoustic hearing allows EAS CI users to outperform traditional CI users in speech recognition in noise and to demonstrate "speech masking release," an improvement in speech recognition in temporally fluctuating noise relative to steady noise. Masking release is arguably an ecologically valid metric, as listeners often attend to target speech embedded in fluctuating competing speech. Improved speech recognition outcomes have been attributed to the spectral and temporal resolution provided by acoustic hearing. However, the relationship between spectral and temporal resolution and outcomes in EAS CI users is not clear.

This study evaluated speech masking release, spectral ripple density discrimination thresholds, and fundamental frequency difference limens (f0DLs) in EAS CI users. The ripple and f0DL tasks are thought to reflect underlying spectral resolution and sensitivity to temporal fine structure, respectively. EAS CI subjects underwent testing in three listening modes: acoustic-only, electric-only, and acoustic+electric. Comparisons across listening modes allowed the benefit provided by acoustic hearing to be quantified. It was hypothesized that speech masking release, spectral ripple density discrimination thresholds, and f0DLs would be poorest with electric-only hearing and would improve in the acoustic-only and acoustic+electric listening modes, reflecting the benefit of preserved acoustic hearing. It was also hypothesized that speech masking release would correlate with spectral ripple density discrimination thresholds and f0DLs, reflecting the roles of spectral and temporal fine structure cues. Lastly, it was hypothesized that EAS CI users with more residual hearing (lower audiometric thresholds) would perform better on all three tasks.

Speech masking release was evaluated using a 12-alternative forced-choice (12-AFC) spondee-recognition-in-noise task. The maskers were two-talker and ten-talker babble presented at -5 dB SNR, and masking release was quantified as the difference in spondee recognition in two-talker babble relative to ten-talker babble. Spectral ripple density discrimination thresholds were assessed in a 3-AFC task using a broadband stimulus containing spectral peaks and valleys logarithmically spaced on the frequency axis; the spacing between spectral peaks (ripple density) was varied to determine the threshold at which listeners could no longer resolve the individual peaks. F0DLs were assessed in a 3-AFC task using a broadband harmonic complex with a baseline f0 of 110 Hz; the f0 of the test intervals was varied to determine the smallest change in f0 that the listener could detect.

Results showed that performance on all three measures was poorest when EAS CI users were tested with electric hearing only, with significant improvements in the acoustic-only and acoustic+electric listening modes. F0DLs, but not spectral ripple density discrimination thresholds or audiometric thresholds, correlated significantly with speech masking release. Speech masking release also correlated significantly with open-set AzBio sentence-recognition-in-noise scores obtained from clinical records.

These results indicate that preservation of residual acoustic hearing allows for speech masking release, likely because residual hearing provides access to temporal fine structure cues. The significant correlation between speech masking release and sentence recognition in noise indicates that the ability to extract target speech embedded in temporally fluctuating competing speech is important for speech recognition in noise. Funded by National Institutes of Health/National Institute on Deafness and Other Communication Disorders (NIH/NIDCD) P50 DC000242, an American Speech-Language-Hearing Foundation Student Research Grant, and an American Academy of Audiology Student Investigator Research Grant.
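For readers unfamiliar with the metric, the sketch below scores speech masking release exactly as described above: the gain in spondee recognition in two-talker babble relative to ten-talker babble at a fixed SNR. The function name and the trial counts are illustrative assumptions, not values from the study.

```python
# Minimal sketch (illustrative, not from the thesis): masking release as the
# difference in percent-correct spondee recognition between a fluctuating
# masker (two-talker babble) and a steadier masker (ten-talker babble).
def masking_release(correct_two_talker: int, correct_ten_talker: int, n_trials: int) -> float:
    """Positive values indicate a benefit from the masker's temporal fluctuations."""
    pc_two = 100.0 * correct_two_talker / n_trials
    pc_ten = 100.0 * correct_ten_talker / n_trials
    return pc_two - pc_ten

# Hypothetical example: 30/40 spondees correct in two-talker babble versus
# 18/40 in ten-talker babble gives a 30-percentage-point masking release.
print(masking_release(30, 18, 40))
```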
4

Audio-visual speech integration: does the visual weight depend on age and language development?

Huyse, Aurélie 03 May 2012 (has links)
During face-to-face conversation, the perception of auditory speech is influenced by the visual speech cues contained in lip movements. Previous research has highlighted the ability of lip-reading to enhance and even modify speech perception; this phenomenon is known as audio-visual integration. The aim of this doctoral thesis was to study whether this integration process varies as a function of different variables. The work thus lies at the heart of a long-standing debate between the hypothesis of universal, invariant audio-visual integration and the hypothesis of context-dependent integration. Within this framework, the five studies making up the thesis each investigate the impact of a specific variable on bimodal integration: the quality of the visual input, the age of the participants, the use of a cochlear implant, the age at cochlear implantation, and the presence of specific language impairment.

The experimental paradigm always consisted of a syllable identification task in which syllables were presented in three modalities: auditory only, visual only, and audio-visual (congruent and incongruent). All five studies also included a condition in which the quality of the visual input was reduced, in order to prevent good-quality lip-reading. The aim of each study was not only to examine whether performance varied with the variable under investigation, but also to determine whether any differences arose from the integration process itself rather than solely from differences in unimodal perception. To this end, participants' scores were compared with scores predicted by a model that takes individual variation in auditory and visual weights into account, the weighted fuzzy-logical model of perception.

Taken together, the results, discussed in the final part of this work, tip the balance in favor of context-dependent integration. A new architecture for bimodal fusion is therefore proposed to accommodate these findings. Finally, the implications are also practical, suggesting the need to incorporate both auditory and visual assessment and training into rehabilitation programs for older adults and for children with cochlear implants or specific language impairment.
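The weighted fuzzy-logical model of perception mentioned above combines unimodal "support" values multiplicatively, with exponents acting as auditory and visual weights, and then normalizes across response alternatives. The sketch below shows one common formulation of that idea; it is an illustration with invented support values, not the model code used in the thesis.

```python
# Minimal sketch (one common formulation, illustrative only): weighted
# fuzzy-logical model of perception (FLMP) for audio-visual syllable identification.
import numpy as np

def wflmp(auditory_support: np.ndarray, visual_support: np.ndarray,
          w_aud: float = 1.0, w_vis: float = 1.0) -> np.ndarray:
    """Predicted audio-visual response probabilities for each alternative."""
    combined = (auditory_support ** w_aud) * (visual_support ** w_vis)
    return combined / combined.sum()

# Invented values: auditory evidence favors /ba/, visual evidence favors /ga/.
a = np.array([0.8, 0.1, 0.1])   # support for /ba/, /da/, /ga/ from audition
v = np.array([0.1, 0.2, 0.7])   # support from lip-reading
print(wflmp(a, v))                 # balanced weights
print(wflmp(a, v, w_vis=0.3))      # reduced visual weight (degraded visual input)
```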
5

The Role of Temporal Fine Structure in Everyday Hearing

Agudemu Borjigin (12468234) 28 April 2022 (has links)
This thesis aims to investigate how one fundamental component of the inner-ear (cochlear) response to all sounds, the temporal fine structure (TFS), is used by the auditory system in everyday hearing. Although it is well known that neurons in the cochlea encode the TFS through exquisite phase locking, how this initial/peripheral temporal code contributes to everyday hearing, and how its degradation contributes to perceptual deficits, are foundational questions in auditory neuroscience and clinical audiology that remain unresolved despite extensive prior research. This is largely because the conventional approach to studying the role of TFS involves perceptual experiments with acoustic manipulations of stimuli (such as sub-band vocoding), rather than direct physiological or behavioral measurements of TFS coding, and hence is intrinsically limited. The present thesis addresses these gaps in three parts: 1) developing assays that can quantify TFS coding at the individual level, 2) comparing individual differences in TFS coding with differences in speech-in-noise perception across a range of real-world listening conditions, and 3) developing deep neural network (DNN) models of speech separation/enhancement to complement the individual-differences approach.

By comparing behavioral and electroencephalogram (EEG)-based measures, Part 1 of this work identified a robust test battery that measures TFS processing in individual humans. Using this battery, Part 2 subdivided a large sample of listeners (N=200) into groups with "good" and "poor" TFS sensitivity. A comparison of speech-in-noise scores between the groups under a range of listening conditions revealed that good TFS coding reduces the negative impact of reverberation on speech intelligibility and leads to reduced reaction times, suggesting lessened listening effort. These results raise the possibility that cochlear implant (CI) sound coding strategies could be improved by attempting to provide usable TFS information, and that these individualized TFS assays could help predict listening outcomes in reverberant, real-world environments. Finally, the DNN models (Part 3) yielded significant improvements in speech quality and intelligibility, as evidenced by all acoustic evaluation metrics and by test results from CI listeners (N=8). These models can be incorporated as "front-end" noise-reduction algorithms in hearing assistive devices, and can also complement other approaches by serving as a research tool to help generate and rapidly sub-select the most viable hypotheses about the role of TFS coding in complex listening scenarios.
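For context, the temporal fine structure discussed throughout this abstract is conventionally separated from the temporal envelope using the analytic signal of a band-limited waveform. The sketch below shows that standard decomposition; it is a generic illustration with assumed signal parameters, not one of the thesis assays.

```python
# Minimal sketch (illustrative, not from the thesis): envelope/TFS decomposition
# of a narrowband signal via the Hilbert transform.
import numpy as np
from scipy.signal import hilbert

def envelope_and_tfs(band_signal: np.ndarray):
    """Return (envelope, tfs): slow amplitude modulation and rapid carrier fluctuation."""
    analytic = hilbert(band_signal)
    envelope = np.abs(analytic)
    tfs = np.cos(np.angle(analytic))
    return envelope, tfs

# Example: a 1-kHz tone with 40-Hz amplitude modulation, sampled at 16 kHz.
fs = 16000
t = np.arange(0, 0.5, 1 / fs)
x = (1 + 0.8 * np.sin(2 * np.pi * 40 * t)) * np.sin(2 * np.pi * 1000 * t)
env, tfs = envelope_and_tfs(x)
```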
6

Neurophysiological Mechanisms of Speech Intelligibility under Masking and Distortion

Vibha Viswanathan (11189856) 29 July 2021 (has links)
Difficulty understanding speech in background noise is the most common hearing complaint. Elucidating the neurophysiological mechanisms underlying speech intelligibility in everyday environments with multiple sound sources and distortions is hence important for any technology that aims to improve real-world listening. Using a combination of behavioral, electroencephalography (EEG), and computational modeling experiments, this dissertation provides insight into how the brain analyzes such complex scenes, and what roles different acoustic cues play in facilitating this process and in conveying phonetic content. Experiment #1 showed that brain oscillations selectively track the temporal envelopes (i.e., modulations) of attended speech in a mixture of competing talkers, and that the strength and pattern of this attention effect differ between individuals. Experiment #2 showed that the fidelity of neural tracking of attended-speech envelopes is strongly shaped by the modulations in interfering sounds as well as by the temporal fine structure (TFS) conveyed by the cochlea, and predicts speech intelligibility in diverse listening environments. Results from Experiments #1 and #2 support the theory that the temporal coherence of sound elements across envelopes and/or TFS shapes scene analysis and speech intelligibility. Experiment #3 tested this theory further by measuring and computationally modeling consonant categorization behavior in a range of background noises and distortions. We found that a physiologically plausible model incorporating temporal-coherence effects predicted consonant confusions better than conventional speech-intelligibility models, providing independent evidence that temporal coherence influences scene analysis. Finally, results from Experiment #3 also showed that TFS is used to extract speech content (voicing) for consonant categorization even when intact envelope cues are available. Together, the novel insights provided by our results can guide future models of speech intelligibility and scene analysis, clinical diagnostics, improved assistive listening devices, and other audio technologies.
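As a rough operational picture of "neural tracking of attended-speech envelopes," the sketch below correlates an EEG trace with the temporal envelope of a talker. Published analyses typically use regularized regression (temporal response functions) rather than a single correlation, and this sketch assumes the EEG and audio are already aligned and at the same sampling rate; it is not code from the dissertation.

```python
# Minimal sketch (illustrative, not from the dissertation): a simple
# envelope-tracking index as the correlation between EEG and a speech envelope.
import numpy as np
from scipy.signal import hilbert

def tracking_score(eeg: np.ndarray, speech: np.ndarray) -> float:
    """Pearson correlation between an EEG channel and the speech envelope."""
    env = np.abs(hilbert(speech))               # broadband temporal envelope
    env = (env - env.mean()) / env.std()
    eeg = (eeg - eeg.mean()) / eeg.std()
    return float(np.mean(eeg * env))

# A higher score for the attended than for the ignored talker's envelope would
# indicate selective neural tracking of the attended stream.
```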
