Visual search and recognition of objects, scenes and people

The objective of this work is to take a step towards an artificial system with human-like visual intelligence capabilities. We consider the following three visual recognition problems. First, we show how to identify the same object or scene instance in a large database of images despite significant changes in appearance due to viewpoint and illumination, but also aging, seasonal changes, or depiction style. Second, we consider recognition of object classes such as "chairs" or "windows" (as opposed to a specific instance of a chair or a window). We investigate how to name the object classes present in an image, identify their locations, and predict their approximate 3D model and fine-grained style ("Is this a bar stool or a folding chair?"; "Is this a bay window or a French window?"). In particular, we investigate different levels of supervision for this task, ranging from observing images without any supervision to having millions of labelled images or a set of full 3D models. Finally, we consider recognition of people and their actions in unconstrained videos such as TV or feature-length films. In detail, we investigate how to identify individual people in the video using their faces ("Who is this?") as well as recognize what they are doing ("Is this person walking or sitting?").

Identifier: oai:union.ndltd.org:CCSD/oai:tel.archives-ouvertes.fr:tel-01064559
Date: 13 February 2014
Creators: Sivic, Josef
Publisher: Ecole Normale Supérieure de Paris - ENS Paris
Source Sets: CCSD theses-EN-ligne, France
Language: English
Detected Language: English
Type: habilitation à diriger des recherches
