Multiple cue object recognition

Nature is rich in examples of how vision can be used to sense and perceive the world, and of how the gathered information can serve a variety of objectives. The key to successful vision is the internal representations of the visual agent, which enable the agent to perceive properties of the world. Humans perceive a multitude of properties of the world through the visual sense, such as motion, shape, texture, and color. In addition, we perceive the world as structured into objects, which are clustered into different classes, or categories. Such a rich perception of the world requires many different internal representations that can be combined in different ways. So far, much work in computer vision has focused on finding new and, from some perspective, better descriptors, and little work has been done on how to combine different representations.

In this thesis, a purposive approach to object recognition is taken, in the context of a visual agent. When object recognition is considered from this viewpoint, situatedness, in the form of the context and task of the agent, becomes central. Further, a multiple feature representation of objects is proposed, since a single feature might be neither pertinent to the task at hand nor robust in a given context.

The first contribution of this thesis is an evaluation of single feature object representations that have previously been used in computer vision for object recognition. In the evaluation, different interest operators combined with different photometric descriptors are tested together with a shape representation and a statistical representation of the whole appearance. Further, a color representation inspired by human color perception is presented and used in combination with the shape descriptor to increase the robustness of object recognition in cluttered scenes.

The last part of this thesis contains the second contribution: a vision system for object recognition based on a multiple feature object representation, presented together with an architecture of the agent that utilizes the proposed representation. By taking a system perspective on object recognition, we consider the representation's performance under a given context and task. The scenario considered here is derived from a fetch scenario performed by a service robot.

Datavetenskap

Computer science

Identifier	oai:union.ndltd.org:UPSALLA/oai:DiVA.org:kth-277
Date	January 2005
Creators	Furesjö, Fredrik
Publisher	KTH, Numerical Analysis and Computer Science, NADA
Source Sets	DiVA Archive at Uppsala University
Language	English
Detected Language	English
Type	Licentiate thesis, monograph, text
Relation	Trita-NA, 0348-2952 ; 0504
