Implicit shape representation for 2D/3D tracking and reconstruction

This thesis develops methods for real-time tracking, segmentation and three-dimensional (3D) model acquisition, in the context of developing games for stroke patients who are rehabilitating at home. Real-time tracking and reconstruction of a stroke patient's feet, hands and the control objects they are touching not only enables graphical visualization of the virtual avatar in the rehabilitation games, but also permits measurement of the patient's performance. Depth imagery, or combined colour and depth imagery, from a Kinect sensor is used as input data. The 3D signed distance function (SDF) is used as the implicit shape representation, and a series of probabilistic graphical models are developed for three problems: model-based 3D tracking, simultaneous 3D tracking and reconstruction, and 3D tracking of multiple objects with identical appearance. The work is based on the assumption that the observed imagery is generated jointly by the pose(s) and the shape(s). The depth of each pixel is randomly and independently sampled from the likelihood of the pose(s) and the shape(s). Pose tracking and 3D shape reconstruction are then cast as maximum likelihood (ML) or maximum a posteriori (MAP) estimation of the pose(s) or 3D shape. This methodology first leads to a novel probabilistic model for tracking rigid 3D objects with only depth data. For a known 3D shape, optimization aims to find the pose that back-projects all object region pixels onto the zero level set of the 3D shape, thus effectively maximising the likelihood of the pose. The method is extended to consider colour information for more robust tracking in the presence of outliers and occlusions. Initialised with a coarse 3D model, the extended method is also able to simultaneously reconstruct and track an unknown 3D object in real time. Finally, the concept of 'shape union' is introduced to solve the problem of tracking multiple 3D objects with identical appearance. The shape union is formulated as the minimum value of all SDFs in camera coordinates, which (i) leads to a per-pixel soft membership weight for each object, thus providing an elegant solution to data association in multi-target tracking, and (ii) allows probabilistic physical constraints that avoid collisions between objects to be naturally enforced.

The thesis also explores the possibility of using implicit shape representation for online shape learning. The harmonics of the 2D discrete cosine transform (DCT) are used to represent 2D shapes. High-frequency harmonics are decoupled from low-frequency ones, so that the low-frequency harmonics capture the coarse shape and the high-frequency harmonics capture its details. A regression model relating the high- and low-frequency harmonics is learnt online using Locally Weighted Projection Regression (LWPR). The learned regression model is shown to detect occlusions and to recover the complete shape from occluded observations.
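
As a rough illustration of the shape-union idea in the abstract, the Python sketch below takes the minimum over per-object SDFs and derives a per-point soft membership weight. It is not the thesis's formulation: the spheres standing in for object shapes, the softmax weighting, the `sharpness` parameter and all function names are assumptions made for this example only.

```python
import math


def sphere_sdf(point, centre, radius):
    """Signed distance to a sphere: negative inside, zero on the surface."""
    return math.dist(point, centre) - radius


def union_sdf(point, spheres):
    """Shape union: the minimum of the individual SDFs at this point."""
    return min(sphere_sdf(point, c, r) for c, r in spheres)


def soft_membership(point, spheres, sharpness=50.0):
    """Assumed soft assignment: softmax over negative absolute distances.

    A point close to one object's surface receives a weight near 1 for
    that object. The thesis may define the weight differently.
    """
    scores = [-sharpness * abs(sphere_sdf(point, c, r)) for c, r in spheres]
    peak = max(scores)  # subtract the maximum for numerical stability
    weights = [math.exp(s - peak) for s in scores]
    total = sum(weights)
    return [w / total for w in weights]


def surface_misfit(points, spheres):
    """Sum of squared union-SDF values over back-projected points.

    The value is zero when every point lies on the zero level set of
    some object, which is the condition the pose optimization targets.
    """
    return sum(union_sdf(p, spheres) ** 2 for p in points)
```

With two identical spheres placed side by side, a point on the first sphere's surface gets a membership weight close to 1 for that sphere, and `surface_misfit` grows as soon as the assumed object positions move away from the observed points.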

Identifier: oai:union.ndltd.org:bl.uk/oai:ethos.bl.uk:640073
Date: January 2014
Creators: Ren, Yuheng
Contributors: Murray, David; Reid, Ian
Publisher: University of Oxford
Source Sets: Ethos UK
Detected Language: English
Type: Electronic Thesis or Dissertation
Source: http://ora.ox.ac.uk/objects/uuid:c70dc663-ee7c-4100-b492-3a85bf8640d1