Global ETD Search

81	Structureless Camera Motion Estimation of Unordered Omnidirectional Images Sastuba, Mark 08 August 2022 (has links) This work aims at providing a novel camera motion estimation pipeline from large collections of unordered omnidirectional images. In oder to keep the pipeline as general and flexible as possible, cameras are modelled as unit spheres, allowing to incorporate any central camera type. For each camera an unprojection lookup is generated from intrinsics, which is called P2S-map (Pixel-to-Sphere-map), mapping pixels to their corresponding positions on the unit sphere. Consequently the camera geometry becomes independent of the underlying projection model. The pipeline also generates P2S-maps from world map projections with less distortion effects as they are known from cartography. Using P2S-maps from camera calibration and world map projection allows to convert omnidirectional camera images to an appropriate world map projection in oder to apply standard feature extraction and matching algorithms for data association. The proposed estimation pipeline combines the flexibility of SfM (Structure from Motion) - which handles unordered image collections - with the efficiency of PGO (Pose Graph Optimization), which is used as back-end in graph-based Visual SLAM (Simultaneous Localization and Mapping) approaches to optimize camera poses from large image sequences. SfM uses BA (Bundle Adjustment) to jointly optimize camera poses (motion) and 3d feature locations (structure), which becomes computationally expensive for large-scale scenarios. On the contrary PGO solves for camera poses (motion) from measured transformations between cameras, maintaining optimization managable. The proposed estimation algorithm combines both worlds. It obtains up-to-scale transformations between image pairs using two-view constraints, which are jointly scaled using trifocal constraints. A pose graph is generated from scaled two-view transformations and solved by PGO to obtain camera motion efficiently even for large image collections. Obtained results can be used as input data to provide initial pose estimates for further 3d reconstruction purposes e.g. to build a sparse structure from feature correspondences in an SfM or SLAM framework with further refinement via BA. The pipeline also incorporates fixed extrinsic constraints from multi-camera setups as well as depth information provided by RGBD sensors. The entire camera motion estimation pipeline does not need to generate a sparse 3d structure of the captured environment and thus is called SCME (Structureless Camera Motion Estimation).:1 Introduction 1.1 Motivation 1.1.1 Increasing Interest of Image-Based 3D Reconstruction 1.1.2 Underground Environments as Challenging Scenario 1.1.3 Improved Mobile Camera Systems for Full Omnidirectional Imaging 1.2 Issues 1.2.1 Directional versus Omnidirectional Image Acquisition 1.2.2 Structure from Motion versus Visual Simultaneous Localization and Mapping 1.3 Contribution 1.4 Structure of this Work 2 Related Work 2.1 Visual Simultaneous Localization and Mapping 2.1.1 Visual Odometry 2.1.2 Pose Graph Optimization 2.2 Structure from Motion 2.2.1 Bundle Adjustment 2.2.2 Structureless Bundle Adjustment 2.3 Corresponding Issues 2.4 Proposed Reconstruction Pipeline 3 Cameras and Pixel-to-Sphere Mappings with P2S-Maps 3.1 Types 3.2 Models 3.2.1 Unified Camera Model 3.2.2 Polynomal Camera Model 3.2.3 Spherical Camera Model 3.3 P2S-Maps - Mapping onto Unit Sphere via Lookup Table 3.3.1 Lookup Table as Color Image 3.3.2 Lookup Interpolation 3.3.3 Depth Data Conversion 4 Calibration 4.1 Overview of Proposed Calibration Pipeline 4.2 Target Detection 4.3 Intrinsic Calibration 4.3.1 Selected Examples 4.4 Extrinsic Calibration 4.4.1 3D-2D Pose Estimation 4.4.2 2D-2D Pose Estimation 4.4.3 Pose Optimization 4.4.4 Uncertainty Estimation 4.4.5 PoseGraph Representation 4.4.6 Bundle Adjustment 4.4.7 Selected Examples 5 Full Omnidirectional Image Projections 5.1 Panoramic Image Stitching 5.2 World Map Projections 5.3 World Map Projection Generator for P2S-Maps 5.4 Conversion between Projections based on P2S-Maps 5.4.1 Proposed Workflow 5.4.2 Data Storage Format 5.4.3 Real World Example 6 Relations between Two Camera Spheres 6.1 Forward and Backward Projection 6.2 Triangulation 6.2.1 Linear Least Squares Method 6.2.2 Alternative Midpoint Method 6.3 Epipolar Geometry 6.4 Transformation Recovery from Essential Matrix 6.4.1 Cheirality 6.4.2 Standard Procedure 6.4.3 Simplified Procedure 6.4.4 Improved Procedure 6.5 Two-View Estimation 6.5.1 Evaluation Strategy 6.5.2 Error Metric 6.5.3 Evaluation of Estimation Algorithms 6.5.4 Concluding Remarks 6.6 Two-View Optimization 6.6.1 Epipolar-Based Error Distances 6.6.2 Projection-Based Error Distances 6.6.3 Comparison between Error Distances 6.7 Two-View Translation Scaling 6.7.1 Linear Least Squares Estimation 6.7.2 Non-Linear Least Squares Optimization 6.7.3 Comparison between Initial and Optimized Scaling Factor 6.8 Homography to Identify Degeneracies 6.8.1 Homography for Spherical Cameras 6.8.2 Homography Estimation 6.8.3 Homography Optimization 6.8.4 Homography and Pure Rotation 6.8.5 Homography in Epipolar Geometry 7 Relations between Three Camera Spheres 7.1 Three View Geometry 7.2 Crossing Epipolar Planes Geometry 7.3 Trifocal Geometry 7.4 Relation between Trifocal, Three-View and Crossing Epipolar Planes 7.5 Translation Ratio between Up-To-Scale Two-View Transformations 7.5.1 Structureless Determination Approaches 7.5.2 Structure-Based Determination Approaches 7.5.3 Comparison between Proposed Approaches 8 Pose Graphs 8.1 Optimization Principle 8.2 Solvers 8.2.1 Additional Graph Solvers 8.2.2 False Loop Closure Detection 8.3 Pose Graph Generation 8.3.1 Generation of Synthetic Pose Graph Data 8.3.2 Optimization of Synthetic Pose Graph Data 9 Structureless Camera Motion Estimation 9.1 SCME Pipeline 9.2 Determination of Two-View Translation Scale Factors 9.3 Integration of Depth Data 9.4 Integration of Extrinsic Camera Constraints 10 Camera Motion Estimation Results 10.1 Directional Camera Images 10.2 Omnidirectional Camera Images 11 Conclusion 11.1 Summary 11.2 Outlook and Future Work Appendices A.1 Additional Extrinsic Calibration Results A.2 Linear Least Squares Scaling A.3 Proof Rank Deficiency A.4 Alternative Derivation Midpoint Method A.5 Simplification of Depth Calculation A.6 Relation between Epipolar and Circumferential Constraint A.7 Covariance Estimation A.8 Uncertainty Estimation from Epipolar Geometry A.9 Two-View Scaling Factor Estimation: Uncertainty Estimation A.10 Two-View Scaling Factor Optimization: Uncertainty Estimation A.11 Depth from Adjoining Two-View Geometries A.12 Alternative Three-View Derivation A.12.1 Second Derivation Approach A.12.2 Third Derivation Approach A.13 Relation between Trifocal Geometry and Alternative Midpoint Method A.14 Additional Pose Graph Generation Examples A.15 Pose Graph Solver Settings A.16 Additional Pose Graph Optimization Examples Bibliography info:eu-repo/classification/ddc/004 ddc:004 Mobiler Roboter Kamera Panoramafotografie Bergwerk Dreidimensionale Rekonstruktion SLAM-Verfahren
82	A Novel Approach for Spherical Stereo Vision Findeisen, Michel 23 April 2015 (has links) The Professorship of Digital Signal Processing and Circuit Technology of Chemnitz University of Technology conducts research in the field of three-dimensional space measurement with optical sensors. In recent years this field has made major progress. For example innovative, active techniques such as the “structured light“-principle are able to measure even homogeneous surfaces and find its way into the consumer electronic market in terms of Microsoft’s Kinect® at the present time. Furthermore, high-resolution optical sensors establish powerful, passive stereo vision systems in the field of indoor surveillance. Thereby they induce new application domains such as security and assistance systems for domestic environments. However, the constraint field of view can be still considered as an essential characteristic of all these technologies. For instance, in order to measure a volume in size of a living space, two to three deployed 3D sensors have to be applied nowadays. This is due to the fact that the commonly utilized perspective projection principle constrains the visible area to a field of view of approximately 120°. On the contrary, novel fish-eye lenses allow the realization of omnidirectional projection models. Therewith, the visible field of view can be enlarged up to more than 180°. In combination with a 3D measurement approach, thus, the number of required sensors for entire room coverage can be reduced considerably. Motivated by the requirements of the field of indoor surveillance, the present work focuses on the combination of the established stereo vision principle and omnidirectional projection methods. The entire 3D measurement of a living space by means of one single sensor can be considered as major objective. As a starting point for this thesis chapter 1 discusses the underlying requirement, referring to various relevant fields of application. Based on this, the distinct purpose for the present work is stated. The necessary mathematical foundations of computer vision are reflected in Chapter 2 subsequently. Based on the geometry of the optical imaging process, the projection characteristics of relevant principles are discussed and a generic method for modeling fish-eye cameras is selected. Chapter 3 deals with the extraction of depth information using classical (perceptively imaging) binocular stereo vision configurations. In addition to a complete recap of the processing chain, especially occurring measurement uncertainties are investigated. In the following, Chapter 4 addresses special methods to convert different projection models. The example of mapping an omnidirectional to a perspective projection is employed, in order to develop a method for accelerating this process and, hereby, for reducing the computational load associated therewith. Any errors that occur, as well as the necessary adjustment of image resolution, are an integral part of the investigation. As a practical example, an application for person tracking is utilized in order to demonstrate to which extend the usage of “virtual views“ can increase the recognition rate for people detectors in the context of omnidirectional monitoring. Subsequently, an extensive search with respect to omnidirectional imaging stereo vision techniques is conducted in chapter 5. It turns out that the complete 3D capture of a room is achievable by the generation of a hemispherical depth map. Therefore, three cameras have to be combined in order to form a trinocular stereo vision system. As a basis for further research, a known trinocular stereo vision method is selected. Furthermore, it is hypothesized that, applying a modified geometric constellation of cameras, more precisely in the form of an equilateral triangle, and using an alternative method to determine the depth map, the performance can be increased considerably. A novel method is presented, which shall require fewer operations to calculate the distance information and which is to avoid a computational costly step for depth map fusion as necessary in the comparative method. In order to evaluate the presented approach as well as the hypotheses, a hemispherical depth map is generated in Chapter 6 by means of the new method. Simulation results, based on artificially generated 3D space information and realistic system parameters, are presented and subjected to a subsequent error estimate. A demonstrator for generating real measurement information is introduced in Chapter 7. In addition, the methods that are applied for calibrating the system intrinsically as well as extrinsically are explained. It turns out that the calibration procedure utilized cannot estimate the extrinsic parameters sufficiently. Initial measurements present a hemispherical depth map and thus con.rm the operativeness of the concept, but also identify the drawbacks of the calibration used. The current implementation of the algorithm shows almost real-time behaviour. Finally, Chapter 8 summarizes the results obtained along the studies and discusses them in the context of comparable binocular and trinocular stereo vision approaches. For example the results of the simulations carried out produced a saving of up to 30% in terms of stereo correspondence operations in comparison with a referred trinocular method. Furthermore, the concept introduced allows the avoidance of a weighted averaging step for depth map fusion based on precision values that have to be calculated costly. The achievable accuracy is still comparable for both trinocular approaches. In summary, it can be stated that, in the context of the present thesis, a measurement system has been developed, which has great potential for future application fields in industry, security in public spaces as well as home environments.:Abstract 7 Zusammenfassung 11 Acronyms 27 Symbols 29 Acknowledgement 33 1 Introduction 35 1.1 Visual Surveillance . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 35 1.2 Challenges in Visual Surveillance . . . . . . . . . . . . . . . . . . . . . . . 38 1.3 Outline of the Thesis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 41 2 Fundamentals of Computer Vision Geometry 43 2.1 Projective Geometry . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 43 2.1.1 Euclidean Space . . . . . . . . . . . . . . . . . . . . . . . . . . . . 43 2.1.2 Projective Space . . . . . . . . . . . . . . . . . . . . . . . . . . . . 44 2.2 Camera Geometry . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 45 2.2.1 Geometrical Imaging Process . . . . . . . . . . . . . . . . . . . . . 45 2.2.1.1 Projection Models . . . . . . . . . . . . . . . . . . . . . . 46 2.2.1.2 Intrinsic Model . . . . . . . . . . . . . . . . . . . . . . . . 47 2.2.1.3 Extrinsic Model . . . . . . . . . . . . . . . . . . . . . . . 50 2.2.1.4 Distortion Models . . . . . . . . . . . . . . . . . . . . . . 51 2.2.2 Pinhole Camera Model . . . . . . . . . . . . . . . . . . . . . . . . . 51 2.2.2.1 Complete Forward Model . . . . . . . . . . . . . . . . . . 52 2.2.2.2 Back Projection . . . . . . . . . . . . . . . . . . . . . . . 53 2.2.3 Equiangular Camera Model . . . . . . . . . . . . . . . . . . . . . . 54 2.2.4 Generic Camera Models . . . . . . . . . . . . . . . . . . . . . . . . 55 2.2.4.1 Complete Forward Model . . . . . . . . . . . . . . . . . . 56 2.2.4.2 Back Projection . . . . . . . . . . . . . . . . . . . . . . . 58 2.3 Camera Calibration Methods . . . . . . . . . . . . . . . . . . . . . . . . . 58 2.3.1 Perspective Camera Calibration . . . . . . . . . . . . . . . . . . . . 59 2.3.2 Omnidirectional Camera Calibration . . . . . . . . . . . . . . . . . 59 2.4 Two-View Geometry . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 60 2.4.1 Epipolar Geometry . . . . . . . . . . . . . . . . . . . . . . . . . . . 61 2.4.2 The Fundamental Matrix . . . . . . . . . . . . . . . . . . . . . . . 63 2.4.3 Epipolar Curves . . . . . . . . . . . . . . . . . . . . . . . . . . . . 64 3 Fundamentals of Stereo Vision 67 3.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 67 3.1.1 The Concept Stereo Vision . . . . . . . . . . . . . . . . . . . . . . 67 3.1.2 Overview of a Stereo Vision Processing Chain . . . . . . . . . . . . 68 3.2 Stereo Calibration . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 69 3.2.1 Extrinsic Stereo Calibration With Respect to the Projective Error 70 3.3 Stereo Rectification . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 72 3.3.1 A Compact Algorithm for Rectification of Stereo Pairs . . . . . . . 73 3.4 Stereo Correspondence . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 76 3.4.1 Disparity Computation . . . . . . . . . . . . . . . . . . . . . . . . 76 3.4.2 The Correspondence Problem . . . . . . . . . . . . . . . . . . . . . 77 3.5 Triangulation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 79 3.5.1 Depth Measurement . . . . . . . . . . . . . . . . . . . . . . . . . . 79 3.5.2 Range Field of Measurement . . . . . . . . . . . . . . . . . . . . . 80 3.5.3 Measurement Accuracy . . . . . . . . . . . . . . . . . . . . . . . . 80 3.5.4 Measurement Errors . . . . . . . . . . . . . . . . . . . . . . . . . . 81 3.5.4.1 Quantization Error . . . . . . . . . . . . . . . . . . . . . 82 3.5.4.2 Statistical Distribution of Quantization Errors . . . . . . 83 4 Virtual Cameras 87 4.1 Introduction and Related Works . . . . . . . . . . . . . . . . . . . . . . . 88 4.2 Omni to Perspective Vision . . . . . . . . . . . . . . . . . . . . . . . . . . 90 4.2.1 Forward Mapping . . . . . . . . . . . . . . . . . . . . . . . . . . . 90 4.2.2 Backward Mapping . . . . . . . . . . . . . . . . . . . . . . . . . . . 93 4.2.3 Fast Backward Mapping . . . . . . . . . . . . . . . . . . . . . . . . 96 4.3 Error Analysis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 99 4.4 Accuracy Analysis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 101 4.4.1 Intrinsics of the Source Camera . . . . . . . . . . . . . . . . . . . . 102 4.4.2 Intrinsics of the Target Camera . . . . . . . . . . . . . . . . . . . . 102 4.4.3 Marginal Virtual Pixel Size . . . . . . . . . . . . . . . . . . . . . . 104 4.5 Performance Measurements . . . . . . . . . . . . . . . . . . . . . . . . . . 109 4.6 Virtual Perspective Views for Real-Time People Detection . . . . . . . . . 110 5 Omnidirectional Stereo Vision 113 5.1 Introduction and Related Works . . . . . . . . . . . . . . . . . . . . . . . 113 5.1.1 Geometrical Configuration . . . . . . . . . . . . . . . . . . . . . . . 116 5.1.1.1 H-Binocular Omni-Stereo with Panoramic Views . . . . . 117 5.1.1.2 V-Binocular Omnistereo with Panoramic Views . . . . . 119 5.1.1.3 Binocular Omnistereo with Hemispherical Views . . . . . 120 5.1.1.4 Trinocular Omnistereo . . . . . . . . . . . . . . . . . . . 122 5.1.1.5 Miscellaneous Configurations . . . . . . . . . . . . . . . . 125 5.2 Epipolar Rectification . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 126 5.2.1 Cylindrical Rectification . . . . . . . . . . . . . . . . . . . . . . . . 127 5.2.2 Epipolar Equi-Distance Rectification . . . . . . . . . . . . . . . . . 128 5.2.3 Epipolar Stereographic Rectification . . . . . . . . . . . . . . . . . 128 5.2.4 Comparison of Rectification Methods . . . . . . . . . . . . . . . . 129 5.3 A Novel Spherical Stereo Vision Setup . . . . . . . . . . . . . . . . . . . . 129 5.3.1 Physical Omnidirectional Camera Configuration . . . . . . . . . . 131 5.3.2 Virtual Rectified Cameras . . . . . . . . . . . . . . . . . . . . . . . 131 6 A Novel Spherical Stereo Vision Algorithm 135 6.1 Matlab Simulation Environment . . . . . . . . . . . . . . . . . . . . . . . 135 6.2 Extrinsic Configuration . . . . . . . . . . . . . . . . . . . . . . . . . . . . 136 6.3 Physical Camera Configuration . . . . . . . . . . . . . . . . . . . . . . . . 137 6.4 Virtual Camera Configuration . . . . . . . . . . . . . . . . . . . . . . . . . 137 6.4.1 The Focal Length . . . . . . . . . . . . . . . . . . . . . . . . . . . 138 6.4.2 Prediscussion of the Field of View . . . . . . . . . . . . . . . . . . 138 6.4.3 Marginal Virtual Pixel Sizes . . . . . . . . . . . . . . . . . . . . . . 139 6.4.4 Calculation of the Field of View . . . . . . . . . . . . . . . . . . . 142 6.4.5 Calculation of the Virtual Pixel Size Ratios . . . . . . . . . . . . . 143 6.4.6 Results of the Virtual Camera Parameters . . . . . . . . . . . . . . 144 6.5 Spherical Depth Map Generation . . . . . . . . . . . . . . . . . . . . . . . 147 6.5.1 Omnidirectional Imaging Process . . . . . . . . . . . . . . . . . . . 148 6.5.2 Rectification Process . . . . . . . . . . . . . . . . . . . . . . . . . . 148 6.5.3 Rectified Depth Map Generation . . . . . . . . . . . . . . . . . . . 150 6.5.4 Spherical Depth Map Generation . . . . . . . . . . . . . . . . . . . 151 6.5.5 3D Reprojection . . . . . . . . . . . . . . . . . . . . . . . . . . . . 153 6.6 Error Analysis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 154 7 Stereo Vision Demonstrator 163 7.1 Physical System Setup . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 163 7.2 System Calibration Strategy . . . . . . . . . . . . . . . . . . . . . . . . . . 165 7.2.1 Intrinsic Calibration of the Physical Cameras . . . . . . . . . . . . 165 7.2.2 Extrinsic Calibration of the Physical and the Virtual Cameras . . 166 7.2.2.1 Extrinsic Initialization of the Physical Cameras . . . . . 167 7.2.2.2 Extrinsic Initialization of the Virtual Cameras . . . . . . 167 7.2.2.3 Two-View Stereo Calibration and Rectification . . . . . . 167 7.2.2.4 Three-View Stereo Rectification . . . . . . . . . . . . . . 168 7.2.2.5 Extrinsic Calibration Results . . . . . . . . . . . . . . . . 169 7.3 Virtual Camera Setup . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 171 7.4 Software Realization . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 172 7.5 Experimental Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 172 7.5.1 Qualitative Assessment . . . . . . . . . . . . . . . . . . . . . . . . 172 7.5.2 Performance Measurements . . . . . . . . . . . . . . . . . . . . . . 174 8 Discussion and Outlook 177 8.1 Discussion of the Current Results and Further Need for Research . . . . . 177 8.1.1 Assessment of the Geometrical Camera Configuration . . . . . . . 178 8.1.2 Assessment of the Depth Map Computation . . . . . . . . . . . . . 179 8.1.3 Assessment of the Depth Measurement Error . . . . . . . . . . . . 182 8.1.4 Assessment of the Spherical Stereo Vision Demonstrator . . . . . . 183 8.2 Review of the Different Approaches for Hemispherical Depth Map Generation184 8.2.1 Comparison of the Equilateral and the Right-Angled Three-View Approach . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 184 8.2.2 Review of the Three-View Approach in Comparison with the Two- View Method . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 185 8.3 A Sample Algorithm for Human Behaviour Analysis . . . . . . . . . . . . 187 8.4 Closing Remarks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 189 A Relevant Mathematics 191 A.1 Cross Product by Skew Symmetric Matrix . . . . . . . . . . . . . . . . . . 191 A.2 Derivation of the Quantization Error . . . . . . . . . . . . . . . . . . . . . 191 A.3 Derivation of the Statistical Distribution of Quantization Errors . . . . . . 192 A.4 Approximation of the Quantization Error for Equiangular Geometry . . . 194 B Further Relevant Publications 197 B.1 H-Binocular Omnidirectional Stereo Vision with Panoramic Views . . . . 197 B.2 V-Binocular Omnidirectional Stereo Vision with Panoramic Views . . . . 198 B.3 Binocular Omnidirectional Stereo Vision with Hemispherical Views . . . . 200 B.4 Trinocular Omnidirectional Stereo Vision . . . . . . . . . . . . . . . . . . 201 B.5 Miscellaneous Configurations . . . . . . . . . . . . . . . . . . . . . . . . . 202 Bibliography 209 List of Figures 223 List of Tables 229 Affidavit 231 Theses 233 Thesen 235 Curriculum Vitae 237 info:eu-repo/classification/ddc/004 ddc:004 info:eu-repo/classification/ddc/005 ddc:005 info:eu-repo/classification/ddc/006 ddc:006 info:eu-repo/classification/ddc/600 ddc:600

Search results

Structureless Camera Motion Estimation of Unordered Omnidirectional Images

A Novel Approach for Spherical Stereo Vision