Recognizing objects through vision is an important part of our lives: we recognize people when we talk to them, we recognize our cup on the breakfast table, our car in a parking lot, and so on. While humans perform this task with great accuracy and apparently little effort, it is still unclear how this performance is achieved. Creating computer methods for automatic object recognition gives rise to challenging theoretical problems: how to model the visual appearance of the objects or categories we want to recognize, so that the resulting algorithm performs robustly in realistic scenarios; to this end, how to use multiple cues effectively (such as shape, color, textural properties and many others), so that the algorithm uses the best subset of cues in the most effective manner; and how to use specific features and/or specific strategies for different classes. The present work is devoted to these issues. We propose to model the visual appearance of objects and visual categories via probability density functions. The model is developed on the basis of concepts and results obtained in three different research areas: computer vision, machine learning and statistical physics of spin glasses. It consists of a fully connected Markov random field with an energy function derived from results of the statistical physics of spin glasses. Markov random fields and spin glass energy functions are combined via nonlinear kernel functions; we call the model Spin Glass-Markov Random Fields. Full connectivity makes it possible to take into account the global appearance of the object and its specific local characteristics at the same time, resulting in robustness to noise, occlusions and cluttered background. Because of properties of some classes of spin glass-like energy functions, our model makes it easy to use multiple cues effectively and to employ class-specific strategies.
We show with theoretical analysis and experiments that this new model is competitive with state-of-the-art algorithms for object recognition.
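As a rough illustration of the idea summarized above, the following Python sketch scores a feature vector against per-class stored patterns using a kernel-based, Hopfield-like energy and picks the class with the lowest energy. Everything in it is an assumption made for illustration and is not taken from the thesis: the Gaussian kernel, the specific energy form (negative mean of squared kernel values), the function names, and the minimum-energy decision rule, which ignores the normalizing constants of the corresponding probability densities.

```python
import math

def gaussian_kernel(x, y, gamma=0.5):
    """Gaussian (RBF) kernel between two equal-length feature vectors (assumed choice)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq_dist)

def energy(x, prototypes, kernel=gaussian_kernel):
    """Hypothetical Hopfield-like energy in kernel space.

    Lower energy means x is closer to the stored patterns (prototypes).
    The form -(1/N) * sum_mu K(x, xi_mu)^2, with N = len(x), is an
    illustrative assumption, not the thesis' definition.
    """
    n = len(x)
    return -sum(kernel(x, p) ** 2 for p in prototypes) / n

def classify(x, class_prototypes, kernel=gaussian_kernel):
    """Return the label whose stored patterns give x the lowest energy.

    Simplification: compares energies directly, i.e. treats the
    partition functions of the class densities as equal.
    """
    return min(class_prototypes,
               key=lambda label: energy(x, class_prototypes[label], kernel))

if __name__ == "__main__":
    # Toy two-dimensional "feature vectors"; the values are made up.
    stored = {
        "cup": [[0.0, 0.0], [0.1, 0.0]],
        "car": [[5.0, 5.0], [5.0, 5.2]],
    }
    print(classify([0.05, 0.0], stored))
```

In this sketch, multiple cues could be handled by passing a different `kernel` per cue and summing the resulting energies, but that combination rule is likewise only an assumption.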
Identifier | oai:union.ndltd.org:UPSALLA1/oai:DiVA.org:kth-58 |
Date | January 2004 |
Creators | Caputo, Barbara |
Publisher | KTH, Numerisk analys och datalogi, NADA, Stockholm : Numerisk analys och datalogi |
Source Sets | DiVA Archive at Uppsala University |
Language | English |
Type | Doctoral thesis, monograph, info:eu-repo/semantics/doctoralThesis, text |
Format | application/pdf |
Rights | info:eu-repo/semantics/openAccess |
Relation | Trita-NA, 0348-2952 ; 0430 |