Sensory information processing is an important feature of robotic agents that must interact with humans or the environment. For example, numerous attempts have been made to develop robots that have the capability of performing interactive communication. In most cases, individual sensory information is processed and based on this, an output action is performed. In many robotic applications, visual and audio sensors are used to emulate human-like communication. The Superior Colliculus, located in the mid-brain region of the nervous system, carries out similar functionality of audio and visual stimuli integration in both humans and animals. In recent years numerous researchers have attempted integration of sensory information using biological inspiration. A common focus lies in generating a single output state (i.e. a multimodal output) that can localize the source of the audio and visual stimuli. This research addresses the problem and attempts to find an effective solution by investigating various computational and biological mechanisms involved in the generation of multimodal output. A primary goal is to develop a biologically inspired computational architecture using artificial neural networks. The advantage of this approach is that it mimics the behaviour of the Superior Colliculus, which has the potential of enabling more effective human-like communication with robotic agents. The thesis describes the design and development of the architecture, which is constructed from artificial neural networks using radial basis functions. The primary inspiration for the architecture came from emulating the function top and deep layers of the Superior Colliculus, due to their visual and audio stimuli localization mechanisms, respectively. The integration experimental results have successfully demonstrated the key issues, including low-level multimodal stimuli localization, dimensionality reduction of audio and visual input-space without affecting stimuli strength, and stimuli localization with enhancement and depression phenomena. Comparisons have been made between computational and neural network based methods, and unimodal verses multimodal integrated outputs in order to determine the effectiveness of the approach.
Identifer | oai:union.ndltd.org:bl.uk/oai:ethos.bl.uk:574246 |
Date | January 2012 |
Creators | Ravulakollu, Kiran Kumar |
Publisher | University of Sunderland |
Source Sets | Ethos UK |
Detected Language | English |
Type | Electronic Thesis or Dissertation |
Source | http://sure.sunderland.ac.uk/3759/ |
Page generated in 0.0301 seconds