A critical practical problem in object recognition is the shortage of labeled training images, since manually labeling images is a time-consuming task. For this reason, unsupervised learning techniques are used to take advantage of unlabeled training images to extract image representations that are useful for classification. However, unsupervised learning is difficult in general. We propose simplifying the unsupervised training problem considerably by taking advantage of motion information. The output of our method is a model that can generate a vector representation from any static image; the model itself, however, is trained using images with additional motion information. To demonstrate the flobject analysis framework, we extend the latent Dirichlet allocation model to account for word-specific flow vectors. We show that the static image representations extracted using our model achieve higher classification rates and better generalization than standard topic models, spatial pyramid matching, and Gist descriptors.
Identifier | oai:union.ndltd.org:TORONTO/oai:tspace.library.utoronto.ca:1807/31310 |
Date | 14 December 2011 |
Creators | Li, Patrick |
Contributors | Frey, Brendan J. |
Source Sets | University of Toronto |
Language | en_ca |
Detected Language | English |
Type | Thesis |