1.
Music expert-novice differences in speech perception. Vassallo, Juan Sebastian. 22 August 2019.
It has been demonstrated that early, formal, and extensive musical training induces changes in the brain at both the structural and functional levels. Previous evidence suggests that musicians are particularly skilled in auditory analysis tasks. In this study, I aimed to find evidence that musical training affects the perception of acoustic cues in audiovisual speech processing for native English speakers. Using the McGurk paradigm (an experimental procedure based on the perceptual illusion that occurs when an auditory speech message is paired with incongruent visual facial gestures), participants were required to identify the auditory component of an audiovisual speech presentation in four conditions: (1) congruent auditory and visual modalities, (2) incongruent modalities, (3) auditory only, and (4) visual only. The data showed no significant differences in accuracy between groups differentiated by musical training. These findings have theoretical implications, suggesting that auditory cues for speech and music are processed by separable cognitive domains and that musical training might not have a positive effect on speech perception.
2.
Non-auditory Influences on the Auditory Periphery. Gruters, Kurtis G. January 2016.
Once thought to be predominantly the domain of cortex, multisensory integration has now been found at numerous sub-cortical locations in the auditory pathway. Prominent ascending and descending connections within the pathway suggest that the system may use non-auditory activity to help filter incoming sounds as they first enter the ear. Active mechanisms in the periphery, particularly the outer hair cells (OHCs) of the cochlea and the middle ear muscles (MEMs), are capable of modulating the sensitivity of other peripheral mechanisms involved in the transduction of sound into the system. Through indirect mechanical coupling of the OHCs and MEMs to the eardrum, the motion of these mechanisms can be recorded as acoustic signals in the ear canal. Here, we use this recording technique in three experiments that demonstrate novel multisensory interactions occurring at the level of the eardrum. 1) In the first experiment, measurements in humans and monkeys performing a saccadic eye movement task to visual targets indicate that the eardrum oscillates in conjunction with eye movements. The amplitude and phase of the eardrum movement, which we dub the Oscillatory Saccadic Eardrum Associated Response or OSEAR, depended on the direction and horizontal amplitude of the saccade and occurred in the absence of any externally delivered sounds. 2) In the second experiment, we use an audiovisual cueing task to demonstrate a dynamic change in pressure levels in the ear when a sound is expected versus when one is not. Specifically, we observe a drop in frequency power and variability from 0.1 to 4 kHz around the time when the sound is expected to occur, in contrast to a slight increase in power at both lower and higher frequencies. 3) In the third experiment, we show that seeing a speaker say a syllable that is incongruent with the accompanying audio can alter the response patterns of the auditory periphery, particularly during the most relevant moments in the speech stream. These visually influenced changes may contribute to the altered percept of the speech sound. Collectively, we presume that these findings represent the combined effect of OHCs and MEMs acting in tandem in response to various non-auditory signals in order to manipulate the receptive properties of the auditory system. These influences may have a profound, and previously unrecognized, impact on how the auditory system processes sounds from initial sensory transduction all the way to perception and behavior. Moreover, we demonstrate that the entire auditory system is, fundamentally, a multisensory system.
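To make the band-power comparison described in the second experiment concrete, here is a minimal sketch of that kind of analysis, not the dissertation's actual code: it assumes ear-canal microphone trials sampled at 48 kHz and known sample indices for the cue and the expected sound; the window lengths, sampling rate, variable names, and synthetic data are illustrative assumptions.

```python
# Sketch: compare 0.1-4 kHz power in ear-canal recordings around the expected
# sound time against a pre-cue baseline. Sampling rate, window lengths, and
# variable names are illustrative assumptions.
import numpy as np
from scipy.signal import welch

def band_power(x, fs, f_lo=100.0, f_hi=4000.0):
    """Mean power spectral density between f_lo and f_hi (Hz)."""
    freqs, psd = welch(x, fs=fs, nperseg=min(len(x), 1024))
    band = (freqs >= f_lo) & (freqs <= f_hi)
    return psd[band].mean()

def expectation_effect(trials, fs, cue_idx, expected_idx, win=0.2):
    """Difference in 0.1-4 kHz band power: expected-sound window minus
    pre-cue baseline, averaged over trials (negative => power drop)."""
    half = int(win * fs / 2)
    baseline = [band_power(t[cue_idx - 2 * half:cue_idx], fs) for t in trials]
    expected = [band_power(t[expected_idx - half:expected_idx + half], fs)
                for t in trials]
    return np.mean(expected) - np.mean(baseline)

# Synthetic usage: 20 one-second "recordings" at 48 kHz, cue at 0.5 s,
# expected sound at 0.75 s.
fs = 48_000
rng = np.random.default_rng(0)
trials = rng.normal(size=(20, fs))
print(expectation_effect(trials, fs, cue_idx=24_000, expected_idx=36_000))
```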
3.
Noise reduction limits the McGurk Effect. Deonarine, Justin. January 2011.
In the McGurk Effect (McGurk & MacDonald, 1976), a visual depiction of a speaker silently mouthing the syllable [ga]/[ka] is presented concurrently with the auditory input [ba]/[pa], resulting in a "fused" [da]/[ta] being heard. Deonarine (2010) found that increasing the intensity (volume) of the auditory input changes its perception from [ga] (at quiet volumes) to [da], and then to [ba] (at loud volumes). The present experiments show that reducing both ambient noise (additional frequencies in the environment) and stimulus noise (excess frequencies in the sound wave that accompany the intended auditory signal) prevents the illusory percept. This suggests that noise is crucial to audiovisual integration and that the McGurk effect depends on the existence of auditory ambiguity.
4.
Analyse de scènes de parole multisensorielle : mise en évidence et caractérisation d'un processus de liage audiovisuel préalable à la fusion / Analysis of multisensory speech scenes: behavioral demonstration and characterization of the audiovisual binding system. Nahorna, Olha. 02 October 2013.
In audiovisual speech, the coherent auditory and visual streams are generally fused into a single percept. This results in enhanced intelligibility in noise, or in visual modification of the auditory percept in the famous "McGurk effect" (the dubbing of the sound "ba" onto the image of a speaker uttering "ga" is often perceived as "da"). It is classically considered that processing is done independently in the auditory and visual systems before interaction occurs at a certain representational stage, resulting in an integrated percept. However, some behavioral and neurophysiological data suggest the existence of a two-stage process. A first stage would involve binding together the appropriate pieces of audio and video information, before fusion in a second stage. To demonstrate the existence of this first stage, we designed an original paradigm aiming at possibly "unbinding" the audio and visual streams. Our paradigm consists in presenting, before a McGurk stimulus (used as an indicator of audiovisual fusion), an audiovisual context that is either coherent or incoherent. In the case of an incoherent context, we observe a significant decrease in the McGurk effect, implying a reduction in the amount of audiovisual fusion. Various kinds of incoherence (acoustic syllables dubbed onto video sentences, phonetic or temporal modifications of the acoustic content of a regular sequence of audiovisual syllables) can significantly reduce the McGurk effect. The unbinding process is fast, since a single incoherent syllable is enough to produce maximal unbinding. By contrast, the inverse process of "rebinding" by a coherent context following unbinding is progressive, since at least three coherent syllables are needed to completely recover from unbinding. The subject can also be "frozen" in an unbound state by adding a pause between an incoherent context and the McGurk target. In total, seven experiments were performed to demonstrate and describe the binding process in audiovisual speech perception. The data are interpreted in the framework of a two-stage "binding and fusion" model.
5.
Characterization of audiovisual binding and fusion in the framework of audiovisual speech scene analysis / Caractérisation du liage et de la fusion audiovisuels dans le cadre de l'analyse de la scène audiovisuelle. Attigodu Chandrashekara, Ganesh. 29 February 2016.
The present doctoral work is focused on a tentative fusion between two separate concepts: Auditory Scene Analysis (ASA) and Audiovisual (AV) fusion in speech perception. We introduce "Audio Visual Speech Scene Analysis" (AVSSA) as an extension of the two-stage ASA model towards AV scenes, and we propose that a coherence index between the auditory and the visual input is computed prior to AV fusion, making it possible to determine whether the sensory inputs should be bound together. This is the "two-stage model of AV fusion". Previous experiments on the modulation of the McGurk effect by coherent vs. incoherent AV contexts presented before the McGurk target have provided experimental evidence supporting the two-stage model.
In this doctoral work, we further evaluate the AVSSA process within the two-stage architecture along several dimensions: introducing noise, considering multiple sources, assessing neurophysiological correlates, and testing different populations. A first set of experiments in younger adults focused on behavioral characterization of the AV binding process by introducing noise; the results showed that participants were able to evaluate both the level of acoustic noise and the AV coherence, and to adjust AV fusion accordingly. In a second set of behavioral experiments involving competing AV sources, we showed that the AVSSA process makes it possible to evaluate the coherence between auditory and visual features within a complex scene, in order to properly associate the adequate components of a given AV speech source and to provide the fusion process with an assessment of the AV coherence of the extracted source. It also appears that the modulation of fusion depends on the attentional focus on one source or the other. An EEG experiment then aimed to identify a neurophysiological marker of the binding and unbinding process and showed that an incoherent AV context can modulate the effect of the visual input on the N1/P2 component. The last set of experiments focused on measuring AV binding and its dynamics in an older population, and provided results similar to those in younger adults, though with a greater amount of unbinding. Together, these results enabled a better characterization of the AVSSA process and were embedded in the proposal of an improved neurocognitive architecture for AV fusion in speech perception.
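As an illustration of what a pre-fusion coherence index could look like, the snippet below is a sketch under assumptions, not the implementation used in the thesis: it correlates the acoustic amplitude envelope with a lip-aperture time series for a coherent and an incoherent audiovisual pairing. The envelope method, sampling rates, syllabic rate, and synthetic signals are all assumptions.

```python
# Sketch of a pre-fusion audiovisual coherence index: correlation between the
# acoustic amplitude envelope and a lip-aperture time series. The envelope
# method, sampling rates, and synthetic signals are assumptions.
import numpy as np
from scipy.signal import hilbert, resample

def av_coherence(audio, lip_aperture):
    """Correlation between the audio amplitude envelope and lip aperture."""
    envelope = np.abs(hilbert(audio))                 # acoustic envelope
    envelope = resample(envelope, len(lip_aperture))  # match the video frame rate
    return np.corrcoef(envelope, lip_aperture)[0, 1]

# Synthetic coherent vs. incoherent pairings: a 200 Hz tone amplitude-modulated
# at a syllabic rate, and a lip-aperture signal with the same (or shuffled) rhythm.
fs_audio, fs_video, dur, syll_rate = 16_000, 50, 2.0, 4.0
t_a = np.arange(int(fs_audio * dur)) / fs_audio
t_v = np.arange(int(fs_video * dur)) / fs_video
mouth = 0.5 + 0.5 * np.sin(2 * np.pi * syll_rate * t_v)
audio = np.sin(2 * np.pi * 200 * t_a) * (0.5 + 0.5 * np.sin(2 * np.pi * syll_rate * t_a))

rng = np.random.default_rng(0)
print("coherent:  ", av_coherence(audio, mouth))                    # close to 1
print("incoherent:", av_coherence(audio, rng.permutation(mouth)))   # near 0
```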
6.
MUSIC TO OUR EYES: ASSESSING THE ROLE OF EXPERIENCE FOR MULTISENSORY INTEGRATION IN MUSIC PERCEPTION. Graham, Robert Edward. 01 December 2017.
Based on research on the "McGurk Effect" (McGurk & MacDonald, 1976) in speech perception, some researchers (e.g., Liberman & Mattingly, 1985) have argued that humans uniquely interpret auditory and visual (motor) speech signals as a single intended audiovisual articulatory gesture, and that such multisensory integration is innate and specific to language. Our goal for the present study was to determine whether a McGurk-like effect holds for music perception as well, a domain in which innateness and experience can be disentangled more easily than in language. We sought to investigate the effects of visual musical information on auditory music perception and judgment, the impact of music experience on such audiovisual integration, and the possible role of eye gaze patterns as a mediator between music experience and the extent of visual influence on auditory judgments. A total of 108 participants (ages 18-40) completed a questionnaire and melody/rhythm perception tasks to determine music experience and abilities, and then completed speech and musical McGurk tasks. Stimuli were recorded from five sounds produced by a speaker or musician (cellist and trombonist) that ranged incrementally along a continuum from one type to another (e.g., non-vibrato to strong vibrato). In the audiovisual condition, these sounds were paired with videos of the speaker/performer producing one type of sound or the other (representing either end of the continuum), such that the audio and video matched or mismatched to varying degrees. Participants indicated, on a 100-point scale, the extent to which the auditory presentation represented one end of the continuum or the other. Auditory judgments for each sound were then compared across visual pairings to determine the impact of visual cues on auditory judgments. Additionally, several types of music experience were evaluated as potential predictors of the degree to which visual stimuli influenced auditory judgments. Finally, eye gaze patterns were measured in a separate sample of 15 participants to assess relationships between music experience and eye gaze patterns, and between eye gaze patterns and the extent of visual influence on auditory judgments. Results indicated a reliable "musical McGurk effect" for cello vibrato sounds, but weaker overall effects for trombone vibrato sounds and for cello pluck and bow sounds. Limited evidence was found to suggest that music experience affects the extent to which individuals are influenced by visual stimuli when making auditory judgments; the support that was obtained, however, pointed to diminished visual influence on auditory judgments associated with music "production" experience. Potential relationships between music experience and eye gaze patterns were identified. Implications for audiovisual integration in the context of speech and music perception are discussed, and future directions are suggested.
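For illustration only, here is a small sketch of how the visual influence on the 100-point auditory ratings could be quantified: the mean rating shift for identical audio tokens under the two video pairings. The synthetic data, column names, and effect size are assumptions, not the study's actual data or analysis code.

```python
# Sketch (not the dissertation's analysis code): quantify how much the video
# pairing shifts 100-point auditory ratings of the same audio token. The
# synthetic data, column names, and effect size are assumptions.
import numpy as np
import pandas as pd

rng = np.random.default_rng(1)
rows = []
for subj in range(20):                       # hypothetical participants
    for audio_step in range(1, 6):           # 5-step audio continuum
        for video_end in ("A", "B"):         # video from either end of the continuum
            base = 20 * audio_step           # rating tracks the audio
            shift = 8 if video_end == "B" else -8   # assumed visual pull
            rating = np.clip(base + shift + rng.normal(0, 10), 0, 100)
            rows.append((subj, audio_step, video_end, rating))
df = pd.DataFrame(rows, columns=["participant", "audio_step", "video_end", "rating"])

# Mean rating per audio step under each video pairing; their difference is the
# visual influence on judgments of identical audio.
by_pairing = df.groupby(["audio_step", "video_end"])["rating"].mean().unstack("video_end")
by_pairing["visual_shift"] = by_pairing["B"] - by_pairing["A"]
print(by_pairing)

# Per-participant shift, usable as a dependent measure against music experience.
per_subj = df.pivot_table(index="participant", columns="video_end", values="rating")
per_subj["visual_shift"] = per_subj["B"] - per_subj["A"]
print(per_subj["visual_shift"].describe())
```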
7.
Neural indices and looking behaviors of audiovisual speech processing in infancy and early childhood. Finch, Kayla. 12 November 2019.
Language is a multimodal process with visual and auditory cues playing important roles in understanding speech. A well-controlled paradigm with audiovisually matched and mismatched syllables is often used to capture audiovisual (AV) speech processing. The ability to detect and integrate mismatching cues shows large individual variability across development and is linked to later language in typical development (TD) and social abilities in autism spectrum disorder (ASD). However, no study has used a multimethod approach to better understand AV speech processing in early development. The studies’ aims were to examine behavioral performance, gaze patterns, and neural indices of AV speech in: 1) TD preschoolers (N=60; females=35) and 2) infants at risk for developing ASD (high-risk, HR; N=37; females=10) and TD controls (low-risk, LR; N=42; females=21).
In Study 1, I investigated preschoolers' gaze patterns and behavioral performance when presented with matched and mismatched AV speech and visual-only (lipreading) speech. As hypothesized, lipreading abilities were associated with children's ability to integrate mismatching AV cues, and children looked towards the mouth when visual cues were helpful, specifically in lipreading conditions. Unexpectedly, looking time towards the mouth was not associated with the children's ability to integrate mismatching AV cues. Study 2 examined how visual cues of AV speech modulated auditory event-related potentials (ERPs), and associations between ERPs and preschoolers' behavioral performance during an AV speech task. As hypothesized, auditory ERPs were attenuated during AV speech compared to auditory-only speech. Additionally, individual differences in neural processing of auditory and visual cues predicted which cue the child attended to in mismatched AV speech. In Study 3, I investigated ERPs of AV speech in LR and HR 12-month-olds and their association with language abilities at 18 months. Unexpectedly, I found no group differences: all infants were able to detect mismatched AV speech, as measured through a more negative ERP response. As hypothesized, more mature neural processing of AV speech integration, measured as a more positive ERP response to fusible AV cues, predicted later language across all infants. These results highlight the importance of using multimethod approaches to understand variability in AV speech processing at two developmental stages.