Visual Object Recognition Using Generative Models of Images

Visual object recognition is one of the key human capabilities that we would like machines to have. The problem is the following: given an image of an object (e.g. someone's face), predict its label (e.g. that person's name) from a set of possible object labels. The predominant approach to solving the recognition problem has been to learn a discriminative model, i.e. a model of the conditional probability $P(l|v)$ over possible object labels $l$ given an image $v$.

Here we consider an alternative class of models, broadly referred to as \emph{generative models}, that learns the latent structure of the image so as to explain how it was generated. This is in contrast to discriminative models, which dedicate their parameters exclusively to representing the conditional distribution $P(l|v)$. Making finer distinctions among generative models, we consider a supervised generative model of the joint distribution $P(v,l)$ over image-label pairs, an unsupervised generative model of the distribution $P(v)$ over images alone, and an unsupervised \emph{reconstructive} model, which includes models such as autoencoders that can reconstruct a given image, but do not define a proper distribution over images. The goal of this thesis is to empirically demonstrate various ways of using these models for object recognition. Its main conclusion is that such models are not only useful for recognition, but can even outperform purely discriminative models on difficult recognition tasks.
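
As a brief illustration of how the supervised generative model relates to the discriminative one (this is standard probability, not a formula quoted from the thesis, and the summation index $l'$ is an assumed symbol ranging over the same set of possible object labels), the conditional distribution can be recovered from the joint distribution by normalizing over labels:

$$P(l|v) = \frac{P(v,l)}{\sum_{l'} P(v,l')}$$

Under this reading, a model of $P(v,l)$ can be used to predict a label for a given image by choosing the $l$ that maximizes $P(v,l)$, since the denominator does not depend on $l$.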

We explore four types of applications of generative/reconstructive models for recognition: 1) incorporating complex domain knowledge into the learning by inverting a synthesis model, 2) using the latent image representations of generative/reconstructive models for recognition, 3) optimizing a hybrid generative-discriminative loss function, and 4) creating additional synthetic data for training more accurate discriminative models. Taken together, the results for these applications support the idea that generative/reconstructive models and unsupervised learning have a key role to play in building object recognition systems.

http://hdl.handle.net/1807/24839

machine learning

computer vision

0800

Identifier	oai:union.ndltd.org:TORONTO/oai:tspace.library.utoronto.ca:1807/24839
Date	01 September 2010
Creators	Nair, Vinod
Contributors	Hinton, Geoffrey
Source Sets	University of Toronto
Language	en_ca
Detected Language	English
Type	Thesis
