Knowing the semantic category of a robot's current position not only facilitates the robot's navigation, but also greatly improves its ability to serve human needs and to interpret the scene. Visual Place Categorization (VPC) is addressed in this dissertation, which refers to the problem of predicting the semantic category of a place using visual information collected from an autonomous robot platform.
Census Transform (CT) histogram and Histogram Intersection Kernel (HIK) based visual codebooks are proposed to represent an image. CT histogram encodes the stable spatial structure of an image that reflects the functionality of a location. It is suitable for categorizing places and has shown better performance than commonly used descriptors such as SIFT or Gist in the VPC task.
HIK has been shown to work better than the Euclidean distance in classifying histograms. We extend it in an unsupervised manner to generate visual codebooks for the CT histogram descriptor. HIK codebooks help CT histogram to deal with the huge variations in VPC and improve system accuracy. A computational method is also proposed to generate HIK codebooks in an efficient way.
The first significant VPC dataset in home environments is collected and is made publicly available, which is also used to evaluate the VPC system based on the proposed techniques. The VPC system achieves promising results for this challenging problem, especially for important categories such as bedroom, bathroom, and kitchen. The proposed techniques achieved higher accuracies than competing descriptors and visual codebook generation methods.
Identifer | oai:union.ndltd.org:GATECH/oai:smartech.gatech.edu:1853/29784 |
Date | 06 July 2009 |
Creators | Wu, Jianxin |
Publisher | Georgia Institute of Technology |
Source Sets | Georgia Tech Electronic Thesis and Dissertation Archive |
Detected Language | English |
Type | Dissertation |
Page generated in 0.0016 seconds