Return to search

Modeling of structured 3-D environments from monocular image sequences

Abstract
The purpose of this research has been to show with applications that polyhedral scenes can be modeled in real time with a single video camera. Sometimes this can be done very efficiently without any special image processing hardware. The developed vision sensor estimates its three-dimensional position with respect to the environment and models it simultaneously. Estimates become recursively more accurate when objects are approached and observed from different viewpoints.

The modeling process starts by extracting interesting tokens, like lines and corners, from the first image. Those features are then tracked in subsequent image frames. Also some previously taught patterns can be used in tracking. A few features in the same image are extracted. By this way the processing can be done at a video frame rate. New features appearing can also be added to the environment structure.

Kalman filtering is used in estimation. The parameters in motion estimation are location and orientation and their first derivates. The environment is considered a rigid object in respect to the camera. The environment structure consists of 3-D coordinates of the tracked features. The initial model lacks depth information. The relational depth is obtained by utilizing facts such as closer points move faster on the image plane than more distant ones during translational motion. Additional information is needed to obtain absolute coordinates.

Special attention has been paid to modeling uncertainties. Measurements with high uncertainty get less weight when updating the motion and environment model. The rigidity assumption is utilized by using shapes of a thin pencil for initial model structure uncertainties. By observing continuously motion uncertainties, the performance of the modeler can be monitored.

In contrast to the usual solution, the estimations are done in separate state vectors, which allows motion and 3-D structure to be estimated asynchronously. In addition to having a more distributed solution, this technique provides an efficient failure detection mechanism. Several trackers can estimate motion simultaneously, and only those with the most confident estimates are allowed to update the common environment model.

Tests showed that motion with six degrees of freedom can be estimated in an unknown environment. The 3-D structure of the environment is estimated simultaneously. The achieved accuracies were millimeters at a distance of 1-2 meters, when simple toy-scenes and more demanding industrial pallet scenes were used in tests. This is enough to manipulate objects when the modeler is used to offer visual feedback.

Identiferoai:union.ndltd.org:oulo.fi/oai:oulu.fi:isbn951-42-6857-1
Date08 November 2002
CreatorsRepo, T. (Tapio)
PublisherUniversity of Oulu
Source SetsUniversity of Oulu
LanguageEnglish
Detected LanguageEnglish
Typeinfo:eu-repo/semantics/doctoralThesis, info:eu-repo/semantics/publishedVersion
Formatapplication/pdf
Rightsinfo:eu-repo/semantics/openAccess, © University of Oulu, 2002
Relationinfo:eu-repo/semantics/altIdentifier/pissn/0355-3213, info:eu-repo/semantics/altIdentifier/eissn/1796-2226

Page generated in 0.002 seconds