Thesis (Ph.D.)--Massachusetts Institute of Technology, Dept. of Architecture, 2000. / Includes bibliographical references (p. 279-297). / This thesis presents a computational framework for the automatic recognition and prediction of different kinds of human behaviors from video cameras and other sensors, via perceptually intelligent systems that automatically sense and correctly classify human behaviors, by means of Machine Perception and Machine Learning techniques. In the thesis I develop the statistical machine learning algorithms (dynamic graphical models) necessary for detecting and recognizing individual and interactive behaviors. In the case of the interactions two Hidden Markov Models (HMMs) are coupled in a novel architecture called Coupled Hidden Markov Models (CHMMs) that explicitly captures the interactions between them. The algorithms for learning the parameters from data as well as for doing inference with those models are developed and described. Four systems that experimentally evaluate the proposed paradigm are presented: (1) LAFTER, an automatic face detection and tracking system with facial expression recognition; (2) a Tai-Chi gesture recognition system; (3) a pedestrian surveillance system that recognizes typical human to human interactions; (4) and a SmartCar for driver maneuver recognition. These systems capture human behaviors of different nature and increasing complexity: first, isolated, single-user facial expressions, then, two-hand gestures and human-to-human interactions, and finally complex behaviors where human performance is mediated by a machine, more specifically, a car. The metric that is used for quantifying the quality of the behavior models is their accuracy: how well they are able to recognize the behaviors on testing data. Statistical machine learning usually suffers from lack of data for estimating all the parameters in the models. In order to alleviate this problem, synthetically generated data are used to bootstrap the models creating 'prior models' that are further trained using much less real data than otherwise it would be required. The Bayesian nature of the approach let us do so. The predictive power of these models lets us categorize human actions very soon after the beginning of the action. Because of the generic nature of the typical behaviors of each of the implemented systems there is a reason to believe that this approach to modeling human behavior would generalize to other dynamic human-machine systems. This would allow us to recognize automatically people's intended action, and thus build control systems that dynamically adapt to suit the human's purposes better. / by Nuria M. Oliver. / Ph.D.
Identifer | oai:union.ndltd.org:MIT/oai:dspace.mit.edu:1721.1/69423 |
Date | January 2000 |
Creators | Oliver, Nuria, 1970- |
Contributors | Alex P. Pentland., Massachusetts Institute of Technology. Dept. of Architecture., Massachusetts Institute of Technology. Dept. of Architecture. |
Publisher | Massachusetts Institute of Technology |
Source Sets | M.I.T. Theses and Dissertation |
Language | English |
Detected Language | English |
Type | Thesis |
Format | 297 p., application/pdf |
Rights | M.I.T. theses are protected by copyright. They may be viewed from this source for any purpose, but reproduction or distribution in any format is prohibited without written permission. See provided URL for inquiries about permission., http://dspace.mit.edu/handle/1721.1/7582 |
Page generated in 0.0029 seconds