Living in a dynamic world: semantic segmentation of large scale 3D environments

As we navigate the world, for example when driving a car from our home to the workplace, we continuously perceive the 3D structure of our surroundings and intuitively recognise the objects we see. Such capabilities help us in our everyday lives and enable free and accurate movement even in completely unfamiliar places. We largely take these abilities for granted, but for robots, the task of understanding large outdoor scenes remains extremely challenging. In this thesis, I develop novel algorithms for (near) real-time dense 3D reconstruction and semantic segmentation of large-scale outdoor scenes from passive cameras. Motivated by "smart glasses" for partially sighted users, I show how such modelling can be integrated into an interactive augmented reality system which puts the user in the loop and allows her to physically interact with the world to learn personalised, semantically segmented, dense 3D models.

In the next part, I show how sparse but very accurate 3D measurements can be incorporated directly into the dense depth estimation process, and I propose a probabilistic model for incremental dense scene reconstruction. To relax the assumption of a stereo camera, I address dense 3D reconstruction in its monocular form and show how the local model can be improved by joint optimisation over depth and pose.

The world around us, however, is not stationary, and reconstructing dynamically moving and potentially non-rigidly deforming, texture-less objects typically requires "contour correspondences" for shape-from-silhouette techniques. Hence, I propose a video segmentation model which encodes a single object instance as a closed curve, maintains correspondences across time, and provides highly accurate segmentation close to object boundaries.

Finally, instead of evaluating performance in an isolated setup (IoU scores), which does not measure the impact on decision-making, I show how semantic 3D reconstruction can be incorporated into standard Deep Q-learning to improve the decision-making of agents navigating complex 3D environments.
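For reference, the IoU score mentioned above is the standard intersection-over-union overlap metric for evaluating segmentation masks. Below is a minimal sketch of its per-mask computation on boolean arrays; the function name and the empty-mask convention are illustrative assumptions for this sketch, not details taken from the thesis:

```python
import numpy as np

def iou(pred, gt):
    """Intersection-over-Union of two boolean segmentation masks.

    Assumption (not from the thesis): two empty masks score 1.0;
    published benchmarks differ on this edge case.
    """
    pred = np.asarray(pred, dtype=bool)
    gt = np.asarray(gt, dtype=bool)
    union = np.logical_or(pred, gt).sum()
    if union == 0:
        return 1.0  # both masks empty: treat as perfect agreement
    return float(np.logical_and(pred, gt).sum()) / float(union)
```

Computing this score per class and averaging over classes yields the mean IoU commonly reported in semantic segmentation benchmarks.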

Identifier oai:union.ndltd.org:bl.uk/oai:ethos.bl.uk:748719
Date January 2017
Creators Miksik, Ondrej
Contributors Perez, Patrick; Torr, Philip H.
Publisher University of Oxford
Source Sets Ethos UK
Detected Language English
Type Electronic Thesis or Dissertation
Source http://ora.ox.ac.uk/objects/uuid:28050b9e-5e42-46b5-9a54-004450f812ec
