
The facilitatory crossmodal effect of auditory stimuli on visual perception

The aim of the experiments reported in this thesis was to investigate the multisensory interactions taking place between vision and audition. The focus is on the modulatory role of the temporal coincidence and semantic congruency of pairs of auditory and visual stimuli. With regard to the temporal coincidence factor, whether, and how, the presentation of a simultaneous sound facilitates visual target perception was tested using the equivalent noise paradigm (Chapter 3) and the backward masking paradigm (Chapter 4). The results demonstrate that crossmodal facilitation can be observed in both visual detection and identification tasks. Importantly, however, the results also reveal that the sound not only had to be presented simultaneously, but also reliably, with the visual target. The suggestion is made that the reliable co-occurrence of the auditory and visual stimuli provides observers with the statistical regularity needed to assume that the visual and auditory stimuli likely originate from the same perceptual event (i.e., that they in some sense 'belong together'). The experiments reported in Chapters 5 through 8 were designed to investigate the role of semantic congruency in audiovisual interactions. The results of the experiments reported in Chapter 5 revealed that the semantic context provided by the soundtrack that a person happens to be listening to can modulate their conscious visual perception under conditions of binocular rivalry. In Chapters 6-8, the time course of audiovisual semantic interactions was investigated using picture categorization, detection, and identification tasks. The results suggested that when the presentation of the sound leads the presentation of a picture by more than 240 ms, it induces a crossmodal semantic priming effect. In addition, when the presentation of the sound lags a semantically-congruent picture by about 300 ms, it enhances performance, presumably by helping to maintain the visual representation in short-term memory. These results indicate that audiovisual semantic interactions constitute a heterogeneous group of phenomena. A crossmodal type-token binding framework is proposed to account for the parallel processing of the spatiotemporal and semantic interactions of multisensory inputs. The suggestion is that congruent information in the type and token representation systems is integrated and ultimately bound into a unified multisensory object representation.

Identifier: oai:union.ndltd.org:bl.uk/oai:ethos.bl.uk:543556
Date: January 2011
Creators: Chen, Yi-Chuan
Contributors: Spence, Charles
Publisher: University of Oxford
Source Sets: Ethos UK
Detected Language: English
Type: Electronic Thesis or Dissertation
Source: http://ora.ox.ac.uk/objects/uuid:36dcc0ec-d655-423d-8191-a83d9fd76886
