Advancing human pose and gesture recognition

This thesis presents new methods in two closely related areas of computer vision: human pose estimation, and gesture recognition in videos. In human pose estimation, we show that random forests can be used to estimate human pose in monocular videos. To this end, we propose a co-segmentation algorithm for segmenting humans out of videos, and an evaluator that predicts whether the estimated poses are correct or not. We further extend this pose estimator to new domains (with a transfer learning approach), and enhance its predictions by predicting the joint positions sequentially (rather than independently) in an image, and using temporal information in the videos (rather than predicting the poses from a single frame). Finally, we go beyond random forests, and show that convolutional neural networks can be used to estimate human pose even more accurately and efficiently. We propose two new convolutional neural network architectures, and show how optical flow can be employed in convolutional nets to further improve the predictions.

In gesture recognition, we explore the idea of using weak supervision to learn gestures. We show that we can learn sign language automatically from signed TV broadcasts with subtitles by letting algorithms 'watch' the TV broadcasts and 'match' the signs with the subtitles. We further show that if even a small amount of strong supervision is available (as there is for sign language, in the form of sign language video dictionaries), this strong supervision can be combined with weak supervision to learn even better models.

http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.658521

006.3

Identifier	oai:union.ndltd.org:bl.uk/oai:ethos.bl.uk:658521
Date	January 2015
Creators	Pfister, Tomas
Contributors	Zisserman, Andrew
Publisher	University of Oxford
Source Sets	Ethos UK
Detected Language	English
Type	Electronic Thesis or Dissertation
Source	http://ora.ox.ac.uk/objects/uuid:64e5b1be-231e-49ed-b385-e87db6dbeed8
