Visual motion carries information about the dynamics of a scene. Automatic interpretation of this information is important when designing computer systems for visual navigation, surveillance, human-computer interaction, browsing of video databases and other growing applications. In this thesis, we address the issue of motion representation for the purpose of detecting and recognizing motion patterns in video sequences. We localize the motion in space and time and propose to use local spatio-temporal image features as primitives when representing and recognizing motions. To detect such features, we propose to maximize a measure of local variation of the image function over space and time and show that such a method detects meaningful events in image sequences. Due to its local nature, the proposed method avoids the influence of global variations in the scene and overcomes the need for spatial segmentation and tracking prior to motion recognition. These properties are shown to be highly useful when recognizing human actions in complex scenes. Variations in scale and in relative motions of the camera may strongly influence the structure of image sequences and therefore the performance of recognition schemes. To address this problem, we develop a theory of local spatio-temporal adaptation and show that this approach provides invariance when analyzing image sequences under scaling and velocity transformations. To obtain discriminative representations of motion patterns, we also develop several types of motion descriptors and use them for classifying and matching local features in image sequences. An extensive evaluation of this approach is performed and results in the context of the problem of human action recognition are presented.
In summary, this thesis provides the following contributions: (i) it introduces the notion of local features in space-time and demonstrates the successful application of such features for motion interpretation; (ii) it presents a theory and an evaluation of methods for local adaptation with respect to scale and velocity transformations in image sequences; and (iii) it presents and evaluates a set of local motion descriptors, which in combination with methods for feature detection and feature adaptation allow for robust recognition of human actions in complex scenes with cluttered and non-stationary backgrounds as well as camera motion.
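The abstract describes detecting features by maximizing a measure of local variation of the image function over space and time, but it does not say which measure is used. As a rough illustration only, the sketch below assumes one plausible choice: a Harris-style corner function computed on the 3x3 spatio-temporal second-moment matrix of a video volume indexed as (t, y, x). The function names and the parameters `sigma`, `tau`, `s` and `k` are hypothetical and are not taken from the thesis. Smoothing uses zero padding at the borders, so responses near the edges of the volume are unreliable.

```python
import numpy as np


def gaussian_kernel(sigma):
    """1-D Gaussian kernel truncated at about 3 sigma, normalised to sum to 1."""
    radius = max(1, int(3 * sigma + 0.5))
    x = np.arange(-radius, radius + 1, dtype=float)
    kernel = np.exp(-x ** 2 / (2 * sigma ** 2))
    return kernel / kernel.sum()


def smooth(volume, sigmas):
    """Separable Gaussian smoothing along the (t, y, x) axes, zero-padded borders.

    Each axis of `volume` must be at least as long as the kernel for that axis.
    """
    out = np.asarray(volume, dtype=float)
    for axis, sigma in enumerate(sigmas):
        kernel = gaussian_kernel(sigma)
        out = np.apply_along_axis(
            lambda v: np.convolve(v, kernel, mode="same"), axis, out
        )
    return out


def spacetime_response(volume, sigma=1.0, tau=1.0, s=2.0, k=0.005):
    """Harris-style response det(mu) - k * trace(mu)**3 at every voxel.

    Assumed measure, for illustration: mu is the second-moment matrix of the
    spatio-temporal gradient, averaged with a Gaussian window of scale
    (s * tau, s * sigma, s * sigma).
    """
    smoothed = smooth(volume, (tau, sigma, sigma))
    # np.gradient returns one derivative per axis, in axis order (t, y, x).
    d_t, d_y, d_x = np.gradient(smoothed)
    grads = (d_x, d_y, d_t)

    mu = np.empty(smoothed.shape + (3, 3))
    for i in range(3):
        for j in range(i, 3):
            averaged = smooth(grads[i] * grads[j], (s * tau, s * sigma, s * sigma))
            mu[..., i, j] = averaged
            mu[..., j, i] = averaged

    det = np.linalg.det(mu)
    trace = np.trace(mu, axis1=-2, axis2=-1)
    return det - k * trace ** 3
```

Under these assumptions, candidate space-time features would be the local maxima of the returned array. The single strongest voxel can be located with `np.unravel_index(np.argmax(H), H.shape)`, where `H` is the output of `spacetime_response`.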
Identifier | oai:union.ndltd.org:UPSALLA1/oai:DiVA.org:kth-3797 |
Date | January 2004 |
Creators | Laptev, Ivan |
Publisher | KTH, Numerisk analys och datalogi, NADA, Stockholm : Numerisk analys och datalogi |
Source Sets | DiVA Archive at Uppsala University |
Language | English |
Detected Language | English |
Type | Doctoral thesis, monograph, info:eu-repo/semantics/doctoralThesis, text |
Format | application/pdf |
Rights | info:eu-repo/semantics/openAccess |
Relation | Trita-NA, 0348-2952 ; 0413 |