The present research investigated the correspondence between auditory and visual feature dimensions and used this knowledge to inform the design of audio-visual mappings for visual control of sound synthesis. The first stage of the research involved the design and implementation of Morpheme, a novel interface for interaction with corpus-based concatenative synthesis. Morpheme uses sketching as a model for interaction between the user and the computer. The purpose of the system is to facilitate the expression of sound design ideas by describing the qualities of the sound to be synthesised in visual terms, using a set of perceptually meaningful audio-visual feature associations. The second stage of the research involved the preparation of two multidimensional mappings for the association between auditory and visual dimensions. The third stage involved the evaluation of the Audio-Visual (A/V) mappings and of Morpheme's user interface. The evaluation comprised two controlled experiments, an online study and a user study. Our findings suggest that the strength of the perceived correspondence between the A/V associations prevails over the timbre characteristics of the sounds used to render the complementary polar features. Hence, the empirical evidence gathered by previous research is generalisable to different contexts, and the overall dimensionality of the sound used for rendering should not have a very significant effect on the comprehensibility and usability of an A/V mapping. However, the findings of the present research also show that there is a non-linear interaction between the harmonicity of the corpus and the perceived correspondence of the audio-visual associations. For example, strongly correlated cross-modal cues such as size-loudness or vertical position-pitch are affected less by the harmonicity of the audio corpus than more weakly correlated dimensions (e.g. texture granularity-sound dissonance).
No significant differences were revealed as a result of musical/audio training. The third study consisted of an evaluation of Morpheme's user interface, in which participants were asked to use the system to design a sound for given video footage. The usability of the system was found to be satisfactory. An interface for drawing visual queries was developed for high-level control of the retrieval and signal processing algorithms of concatenative sound synthesis. This thesis elaborates on previous research findings and proposes two methods for empirically driven validation of audio-visual mappings for sound synthesis. These methods could be applied to a wide range of contexts in order to inform the design of cognitively useful multimodal interfaces and the representation and rendering of multimodal data. Moreover, this research contributes to the broader understanding of multimodal perception by gathering empirical evidence about the correspondence between auditory and visual feature dimensions and by investigating which factors affect the perceived congruency between aural and visual structures.
Identifier | oai:union.ndltd.org:bl.uk/oai:ethos.bl.uk:725116 |
Date | January 2016 |
Creators | Tsiros, Augoustinos |
Contributors | Leplâtre, Grégory ; Smyth, Michael |
Publisher | Edinburgh Napier University |
Source Sets | Ethos UK |
Detected Language | English |
Type | Electronic Thesis or Dissertation |
Source | http://researchrepository.napier.ac.uk/Output/463438 |