3D Object Representation and Recognition Based on Biologically Inspired Combined Use of Visual and Tactile Data

Recent research uses biologically inspired computation and artificial intelligence as efficient means of solving real-world problems. Humans are remarkably good at extracting and interpreting visual information. When visual data is not available, or when it fails to provide comprehensive information, for example because of occlusions, tactile exploration assists in interpreting and better understanding the environment. This cooperation between human senses can serve as an inspiration for embedding a higher level of intelligence in computational models.
In the context of this research, computational models of visual attention are first explored to determine salient regions on the surface of objects. Two approaches are proposed. The first integrates a series of features known to guide human visual attention, namely color, contrast, curvature, edge, entropy, intensity, orientation, and symmetry, to identify salient features on the surface of 3D objects. This model also learns to adaptively weight each feature based on ground-truth data to ensure better compatibility with human visual exploration capabilities. The second approach uses a deep Convolutional Neural Network (CNN) to extract features from images collected from 3D objects and formulates saliency as a fusion map of the regions the CNN looks at while classifying the objects based on their geometrical and semantic characteristics. The main difference between the outcomes of the two algorithms is that the first yields saliency spread over the surface of the objects, while the second highlights one or two regions of concentrated saliency. Therefore, the first approach is an appropriate simulation of visual exploration of objects, while the second simulates eye fixation locations on objects.
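The adaptive feature weighting of the first approach can be illustrated with a minimal sketch. It assumes that saliency is a non-negative weighted sum of per-vertex feature values and that the weights are fitted to ground-truth saliency by projected gradient descent on a squared error. The abstract does not state the fusion rule or the learning procedure, so both are assumptions, and the function names `fuse_saliency` and `learn_weights` are hypothetical.

```python
def fuse_saliency(features, weights):
    """Combine per-vertex feature values into one saliency score per vertex.

    features: list of rows, one row per vertex, one value per feature channel
              (e.g. color, contrast, curvature, ...), each assumed in [0, 1].
    weights:  one non-negative weight per feature channel.
    """
    return [sum(w * f for w, f in zip(weights, row)) for row in features]


def learn_weights(features, target, steps=500, lr=0.1):
    """Fit channel weights to ground-truth saliency (assumed procedure).

    Minimizes the mean squared error between fused and target saliency with
    gradient descent, clipping weights at zero after every step.
    """
    n_vertices = len(features)
    n_channels = len(features[0])
    weights = [1.0 / n_channels] * n_channels  # start from uniform weighting
    for _ in range(steps):
        grad = [0.0] * n_channels
        for row, t in zip(features, target):
            err = sum(w * f for w, f in zip(weights, row)) - t
            for j, f in enumerate(row):
                grad[j] += 2.0 * err * f / n_vertices
        weights = [max(0.0, w - lr * g) for w, g in zip(weights, grad)]
    return weights


# Toy data: two feature channels, ground truth equal to the first channel.
toy_features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.5, 0.2]]
toy_target = [1.0, 0.0, 1.0, 0.5]
toy_weights = learn_weights(toy_features, toy_target)
```

On this toy data the fitted weights favor the first channel, since the ground truth depends on it alone. A real model would operate on the eight features listed above and on human ground-truth data.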
In the second step, the first computational model of visual attention is used to determine scattered salient points on the surface of objects. Based on these points, simplified versions of 3D object models are constructed that preserve the important visual characteristics of the objects. Subsequently, the thesis focuses on the topic of tactile object recognition, leveraging the proposed model of visual attention. Beyond the sensor technologies that are instrumental in ensuring data quality, biological models can also help guide the placement of sensors and support selective data sampling strategies that allow an object’s surface to be explored faster. Therefore, the possibility of guiding the acquisition of tactile data based on the identified visually salient features is tested and validated in this research. Different object exploration and data processing approaches were used to identify the most promising solution.
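The idea of saliency-guided data selection can be sketched as follows. The sketch assumes that each vertex already has a saliency score and simply keeps the most salient fraction of vertices as candidate points to preserve during simplification or to probe with a tactile sensor. The thesis may use a different selection rule; the function `select_salient_points` and its `keep_ratio` parameter are hypothetical.

```python
def select_salient_points(saliency, keep_ratio=0.25):
    """Return the indices of the most salient vertices, in ascending index order.

    saliency:   one score per vertex (higher means more salient).
    keep_ratio: fraction of vertices to keep, in (0, 1] (assumed parameter).
    """
    if not 0.0 < keep_ratio <= 1.0:
        raise ValueError("keep_ratio must be in (0, 1]")
    k = max(1, round(len(saliency) * keep_ratio))
    # Rank vertex indices from most to least salient, then keep the top k.
    ranked = sorted(range(len(saliency)), key=lambda i: saliency[i], reverse=True)
    return sorted(ranked[:k])


# Four vertices; keeping half of them selects the two highest scores.
selected = select_salient_points([0.1, 0.9, 0.3, 0.8], keep_ratio=0.5)
```

A fixed ratio keeps the example short. A threshold on the saliency score or a spatial spreading constraint would be equally plausible choices.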
Our experiments confirm the effectiveness of computational models of visual attention as a guide for data selection, both for simplifying the 3D representation of objects and for enhancing tactile object recognition. In particular, the current research demonstrates that: (1) simplified object representations that preserve visually salient characteristics are more compatible with human visual capabilities than uniformly simplified models, and (2) tactile data acquired based on salient visual features are more informative about the objects’ characteristics and can be employed in tactile object manipulation and recognition scenarios.
In the last section, the thesis addresses the transfer of learning from vision to touch. Inspired by biological studies that attest to similarities between the processing of visual and tactile stimuli in the human brain, the thesis studies the possibility of transferring learning from vision to touch using deep learning architectures and proposes a hybrid CNN that handles both visual and tactile object recognition.

Identifier: oai:union.ndltd.org:uottawa.ca/oai:ruor.uottawa.ca:10393/42122
Date: 13 May 2021
Creators: Rouhafzay, Ghazal
Contributors: Payeur, Pierre; Cretu, Ana-Maria
Publisher: Université d'Ottawa / University of Ottawa
Source Sets: Université d’Ottawa
Language: English
Detected Language: English
Type: Thesis
Format: application/pdf