Return to search

Discovering object categories in image collections

Given a set of images containing multiple object categories,we seek to discover those categories and their image locations withoutsupervision. We achieve this using generative modelsfrom the statistical text literature: probabilistic Latent SemanticAnalysis (pLSA), and Latent Dirichlet Allocation (LDA). In text analysisthese are used to discover topics in a corpus using the bag-of-wordsdocument representation. Here we discover topics as object categories, sothat an image containing instances of several categories is modelled as amixture of topics.The models are applied to images by using avisual analogue of a word, formed by vector quantizing SIFT like regiondescriptors. We investigate a set of increasingly demanding scenarios,starting with image sets containing only two object categories through tosets containing multiple categories (including airplanes, cars, faces,motorbikes, spotted cats) and background clutter. The object categoriessample both intra-class and scale variation, and both the categories andtheir approximate spatial layout are found without supervision.We also demonstrate classification of unseen images and images containingmultiple objects. Performance of the proposed unsupervised method is compared tothe semi-supervised approach of Fergus et al.

Identiferoai:union.ndltd.org:MIT/oai:dspace.mit.edu:1721.1/30525
Date25 February 2005
CreatorsSivic, Josef, Russell, Bryan C., Efros, Alexei A., Zisserman, Andrew, Freeman, William T.
Source SetsM.I.T. Theses and Dissertation
Languageen_US
Detected LanguageEnglish
Format0 p., 29582654 bytes, 2849946 bytes, application/postscript, application/pdf
RelationMassachusetts Institute of Technology Computer Science and Artificial Intelligence Laboratory

Page generated in 0.002 seconds