In this work we develop optical flow templates. In doing so, we introduce a practical tool for inferring robot egomotion and semantic superpixel labeling using optical flow in imaging systems with arbitrary optics. In order to do this we develop valuable understanding of geometric relationships and mathematical methods that are useful in interpreting optical flow to the robotics and computer vision communities.
This work is motivated by what we perceive as directions for advancing the current state of the art in obstacle detection and scene understanding for mobile robots. Specifically, many existing methods build 3D point clouds, which are not directly useful for autonomous navigation and require further processing. Both the step of building the point clouds and the later processing steps are challenging and computationally intensive. Additionally, many current methods require a calibrated camera, which introduces calibration challenges and places limitations on the types of camera optics that may be used. Wide-angle lenses, systems with mirrors, and multiple cameras all require different calibration models and can be difficult or impossible to calibrate at all. Finally, current pixel and superpixel obstacle labeling algorithms typically rely on image appearance. While image appearance is informative, image motion is a direct effect of the scene structure that determines whether a region of the environment is an obstacle.
The egomotion estimation and obstacle labeling methods we develop here based on optical flow templates require very little computation per frame and do not require building point clouds. Additionally, they do not require any specific type of camera optics, nor a calibrated camera. Finally, they label obstacles using optical flow alone without image appearance.
In this thesis we start with optical flow subspaces for egomotion estimation and detection of “motion anomalies”. We then extend this to multiple subspaces and develop mathematical reasoning to select between them, comprising optical flow templates. Using these we classify environment shapes and label superpixels. Finally, we show how performing all learning and inference directly from image spatio-temporal gradients greatly improves computation time and accuracy.
Identifer | oai:union.ndltd.org:GATECH/oai:smartech.gatech.edu:1853/53473 |
Date | 08 June 2015 |
Creators | Roberts, Richard Joseph William |
Contributors | Dellaert, Frank |
Publisher | Georgia Institute of Technology |
Source Sets | Georgia Tech Electronic Thesis and Dissertation Archive |
Language | en_US |
Detected Language | English |
Type | Dissertation |
Format | application/pdf |
Page generated in 0.0119 seconds