1 |
Broadband World Modeling and Scene ReconstructionGoldman, Benjamin Joseph 24 May 2013 (has links)
Perception is a key feature in how any creature or autonomous system relates to its environment. While there are many types of perception, this thesis focuses on the improvement of the visual robotics perception systems. By implementing a broadband passive sensing system in conjunction with current perception algorithms, this thesis explores scene reconstruction and world modeling.
The process involves two main steps. The first is stereo correspondence using block matching algorithms with filtering to improve the quality of this matching process. The disparity maps are then transformed into 3D point clouds. These point clouds are filtered again before the registration process is done. The registration uses a SAC-IA matching technique to align the point clouds with minimum error. The registered final cloud is then filtered again to smooth and down sample the large amount of data. This process was implemented through software architecture that utilizes Qt, OpenCV, and Point Cloud Library. It was tested using a variety of experiments on each of the components of the process. It shows promise for being able to replace or augment existing UGV perception systems in the future. / Master of Science
|
2 |
From shape-based object recognition and discovery to 3D scene interpretationPayet, Nadia 12 May 2011 (has links)
This dissertation addresses a number of inter-related and fundamental problems in computer vision. Specifically, we address object discovery, recognition, segmentation, and 3D pose estimation in images, as well as 3D scene reconstruction and scene interpretation. The key ideas behind our approaches include using shape as a basic object feature, and using structured prediction modeling paradigms for representing objects and scenes.
In this work, we make a number of new contributions both in computer vision and machine learning. We address the vision problems of shape matching, shape-based mining of objects in arbitrary image collections, context-aware object recognition, monocular estimation of 3D object poses, and monocular 3D scene reconstruction using shape from texture. Our work on shape-based object discovery is the first to show that meaningful objects can be extracted from a collection of arbitrary images, without any human supervision, by shape matching. We also show that a spatial repetition of objects in images (e.g., windows on a building facade, or cars lined up along a street) can be used for 3D scene reconstruction from a single image. The aforementioned topics have never been addressed in the literature.
The dissertation also presents new algorithms and object representations for the aforementioned vision problems. We fuse two traditionally different modeling paradigms Conditional Random Fields (CRF) and Random Forests (RF) into a unified framework, referred to as (RF)^2. We also derive theoretical error bounds of estimating distribution ratios by a two-class RF, which is then used to derive the theoretical performance bounds of a two-class
(RF)^2.
Thorough experimental evaluation of individual aspects of all our approaches is presented. In general, the experiments demonstrate that we outperform the state of the art on the benchmark datasets, without increasing complexity and supervision in training. / Graduation date: 2011 / Access restricted to the OSU Community at author's request from May 12, 2011 - May 12, 2012
|
3 |
Robust Extraction Of Sparse 3d Points From Image SequencesVural, Elif 01 September 2008 (has links) (PDF)
In this thesis, the extraction of sparse 3D points from calibrated image sequences is studied. The presented method for sparse 3D reconstruction is examined in two steps, where the first part addresses the problem of two-view reconstruction, and the second part is the extension of the two-view reconstruction algorithm to include multiple views. The examined two-view reconstruction method consists of some basic building blocks, such as feature detection and matching, epipolar geometry estimation, and the reconstruction of cameras and scene structure. Feature detection and matching is achieved by Scale Invariant Feature Transform (SIFT) method. For the estimation of epipolar geometry, the 7-point and 8-point algorithms are examined for Fundamental matrix (F-matrix) computation, while RANSAC and PROSAC are utilized for the robustness and accuracy for model estimation. In the final stage of two-view reconstruction, the camera projection matrices are computed from the F-matrix, and the locations of 3D scene points are estimated by triangulation / hence, determining the scene structure and cameras up to a projective transformation. The extension of the two-view reconstruction to multiple views is achieved by estimating the camera projection matrix of each additional view from the already reconstructed matches, and then adding new points to the scene structure by triangulating the unreconstructed matches. Finally, the reconstruction is upgraded from projective to metric by a rectifying homography computed from the camera calibration information. In order to obtain a refined reconstruction, two different methods are suggested for the removal of erroneous points from the scene structure. In addition to the examination of the solution to the reconstruction problem, experiments have been conducted that compare the performances of competing algorithms used in various stages of reconstruction. In connection with sparse reconstruction, a rate-distortion efficient piecewise planar scene representation algorithm that generates mesh models of scenes from reconstructed point clouds is examined, and its performance is evaluated through experiments.
|
4 |
Multiview 3d Reconstruction Of A Scene Containing Independently Moving ObjectsTola, Engin 01 August 2005 (has links) (PDF)
In this thesis, the structure from motion problem for calibrated scenes containing independently moving objects (IMO) has been studied. For this purpose, the overall reconstruction process is partitioned into various stages. The first stage deals with the fundamental problem of estimating structure and motion by using only two views. This process starts with finding some salient features using a sub-pixel version of the Harris corner detector. The features are matched by the help of a similarity and neighborhood-based matcher. In order to reject the outliers and estimate the fundamental matrix of the two images, a robust estimation is performed via RANSAC and normalized 8-point algorithms. Two-view reconstruction is finalized by decomposing the fundamental matrix and estimating the 3D-point locations as a result of triangulation. The second stage of the reconstruction is the generalization of the two-view algorithm for the N-view case. This goal is accomplished by first reconstructing an initial framework from the first stage and then relating the additional views by finding correspondences between the new view and already reconstructed views. In this way, 3D-2D projection pairs are determined and the projection matrix of this new view is estimated by using a robust procedure. The final section deals with scenes containing IMOs. In order to reject the correspondences due to moving objects, parallax-based rigidity constraint is used. In utilizing this constraint, an automatic background pixel selection algorithm is developed and an IMO rejection algorithm is also proposed. The results of the proposed algorithm are compared against that of a robust outlier rejection algorithm and found to be quite promising in terms of execution time vs. reconstruction quality.
|
5 |
Sensor Fused Scene Reconstruction and Surface InspectionMoodie, Daniel Thien-An 17 April 2014 (has links)
Optical three dimensional (3D) mapping routines are used in inspection robots to detect faults by creating 3D reconstructions of environments. To detect surface faults, sub millimeter depth resolution is required to determine minute differences caused by coating loss and pitting. Sensors that can detect these small depth differences cannot quickly create contextual maps of large environments.
To solve the 3D mapping problem, a sensor fused approach is proposed that can gather contextual information about large environments with one depth sensor and a SLAM routine; while local surface defects can be measured with an actuated optical profilometer. The depth sensor uses a modified Kinect Fusion to create a contextual map of the environment. A custom actuated optical profilometer is created and then calibrated. The two systems are then registered to each other to place local surface scans from the profilometer into a scene context created by Kinect Fusion.
The resulting system can create a contextual map of large scale features (0.4 m) with less than 10% error while the optical profilometer can create surface reconstructions with sub millimeter resolution. The combination of the two allows for the detection and quantification of surface faults with the profilometer placed in a contextual reconstruction. / Master of Science
|
6 |
Dense 3D Point Cloud Representation of a Scene Using Uncalibrated Monocular VisionDiskin, Yakov 23 May 2013 (has links)
No description available.
|
Page generated in 0.0853 seconds