51.
A Novel Approach for Spherical Stereo Vision. Findeisen, Michel, 23 April 2015.
The Professorship of Digital Signal Processing and Circuit Technology of Chemnitz University of Technology conducts research in the field of three-dimensional space measurement with optical sensors. In recent years this field has made major progress.
For example, innovative active techniques such as the structured-light principle can measure even homogeneous surfaces and have found their way into the consumer electronics market, currently in the form of Microsoft's Kinect®. Furthermore, high-resolution optical sensors enable powerful passive stereo vision systems for indoor surveillance, opening up new application domains such as security and assistance systems for domestic environments.
However, the constrained field of view can still be considered an essential limitation of all these technologies. For instance, in order to measure a volume the size of a living room, two to three 3D sensors have to be deployed nowadays. This is due to the fact that the commonly utilized perspective projection principle restricts the visible area to a field of view of approximately 120°. In contrast, novel fish-eye lenses allow the realization of omnidirectional projection models, with which the visible field of view can be enlarged to more than 180°. Combined with a 3D measurement approach, the number of sensors required for complete room coverage can thus be reduced considerably.
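The gain in field of view can be illustrated with a small sketch comparing the perspective (pinhole) projection with an equiangular fish-eye projection; the focal length and angles below are illustrative values, not parameters from the thesis:

```python
import math

def perspective_radius(f_px, theta_rad):
    # Perspective (pinhole) projection: r = f * tan(theta).
    # r diverges as theta approaches 90 degrees, so the usable field
    # of view is necessarily below 180 degrees.
    return f_px * math.tan(theta_rad)

def equiangular_radius(f_px, theta_rad):
    # Equiangular fish-eye projection: r = f * theta.
    # r stays finite even for theta >= 90 degrees, which is what
    # allows omnidirectional lenses to cover more than a hemisphere.
    return f_px * theta_rad

f = 300.0  # illustrative focal length in pixels
for deg in (30, 60, 85, 95):
    theta = math.radians(deg)
    if deg < 90:
        print(f"{deg:3d} deg: perspective r = {perspective_radius(f, theta):8.1f} px, "
              f"equiangular r = {equiangular_radius(f, theta):6.1f} px")
    else:
        print(f"{deg:3d} deg: perspective r undefined, "
              f"equiangular r = {equiangular_radius(f, theta):6.1f} px")
```

Near 90° the perspective image radius grows without bound, while the equiangular radius grows only linearly with the angle, which is why a single fish-eye camera can cover what would otherwise require several perspective sensors.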
Motivated by the requirements of indoor surveillance, the present work focuses on combining the established stereo vision principle with omnidirectional projection methods. The major objective is the complete 3D measurement of a living space by means of one single sensor.
As a starting point for this thesis, Chapter 1 discusses the underlying requirements with reference to various relevant fields of application. Based on this, the specific purpose of the present work is stated.
Chapter 2 subsequently reviews the necessary mathematical foundations of computer vision. Based on the geometry of the optical imaging process, the projection characteristics of the relevant principles are discussed, and a generic method for modeling fish-eye cameras is selected.
Chapter 3 deals with the extraction of depth information using classical (perspectively imaging) binocular stereo vision configurations. In addition to a complete recapitulation of the processing chain, the measurement uncertainties that occur are investigated in particular.
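The depth measurement and its quantization-induced uncertainty can be sketched for a rectified binocular setup as follows; the formulas are the standard textbook relations, and the numeric parameters are illustrative assumptions, not the thesis's system values:

```python
def depth_from_disparity(f_px, baseline_m, disparity_px):
    # Rectified binocular stereo: depth Z = f * b / d,
    # with focal length f in pixels, baseline b in metres,
    # and disparity d in pixels.
    return f_px * baseline_m / disparity_px

def depth_quantization_error(f_px, baseline_m, z_m, delta_d_px=1.0):
    # First-order error caused by a disparity quantized to delta_d pixels:
    # |dZ/dd| * delta_d = Z^2 / (f * b) * delta_d,
    # i.e. the depth uncertainty grows quadratically with the range.
    return z_m ** 2 / (f_px * baseline_m) * delta_d_px

# Illustrative setup: f = 800 px, baseline = 0.2 m, disparity = 16 px.
z = depth_from_disparity(800.0, 0.2, 16.0)     # -> 10.0 m
err = depth_quantization_error(800.0, 0.2, z)  # -> 0.625 m per pixel of disparity error
```

The quadratic growth of the quantization error with depth is what makes such uncertainty analyses relevant for room-scale measurement volumes.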
Chapter 4 then addresses methods for converting between different projection models. Using the example of mapping an omnidirectional onto a perspective projection, a method is developed for accelerating this process and thereby reducing the associated computational load. The errors that occur, as well as the necessary adjustment of the image resolution, are an integral part of the investigation. As a practical example, a person-tracking application is used to demonstrate to what extent the use of "virtual views" can increase the recognition rate of people detectors in the context of omnidirectional monitoring.
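A generic backward mapping from an omnidirectional (here: equiangular) image to a virtual perspective view can be sketched as below; precomputing the lookup table once is one common way to accelerate the conversion. All parameter names and values are illustrative assumptions, not the specific method developed in the thesis:

```python
import numpy as np

def build_backward_lut(w_v, h_v, f_v, f_o, cx_o, cy_o):
    # Backward mapping: for every pixel of the virtual perspective view,
    # compute where to sample in the omnidirectional (equiangular) image.
    xs, ys = np.meshgrid(np.arange(w_v) - w_v / 2.0,
                         np.arange(h_v) - h_v / 2.0)
    norm = np.sqrt(xs**2 + ys**2 + f_v**2)
    theta = np.arccos(f_v / norm)   # angle of the viewing ray to the optical axis
    phi = np.arctan2(ys, xs)        # azimuth around the optical axis
    r = f_o * theta                 # equiangular model: r = f * theta
    map_u = cx_o + r * np.cos(phi)
    map_v = cy_o + r * np.sin(phi)
    return map_u, map_v

def render_virtual_view(omni_img, map_u, map_v):
    # Nearest-neighbour lookup; precomputing the LUT once amortizes the
    # trigonometry over all subsequent frames.
    h_o, w_o = omni_img.shape[:2]
    u = np.clip(np.rint(map_u).astype(int), 0, w_o - 1)
    v = np.clip(np.rint(map_v).astype(int), 0, h_o - 1)
    return omni_img[v, u]
```

With the lookup table cached, rendering a virtual view reduces to a per-pixel table access, which is the kind of saving a fast backward mapping aims for.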
Subsequently, Chapter 5 conducts an extensive survey of omnidirectional stereo vision techniques. It turns out that the complete 3D capture of a room is achievable by generating a hemispherical depth map, for which three cameras have to be combined into a trinocular stereo vision system. A known trinocular stereo vision method is selected as a basis for further research. Furthermore, it is hypothesized that the performance can be increased considerably by applying a modified geometric constellation of the cameras, more precisely in the form of an equilateral triangle, and by using an alternative method to determine the depth map. A novel method is presented that requires fewer operations to calculate the distance information and avoids the computationally costly depth map fusion step necessary in the comparative method.
In order to evaluate the presented approach as well as the hypotheses, a hemispherical depth map is generated in Chapter 6 by means of the new method. Simulation results, based on artificially generated 3D space information and realistic system parameters, are presented and subjected to a subsequent error estimate.
A demonstrator for generating real measurement data is introduced in Chapter 7. In addition, the methods applied for calibrating the system intrinsically as well as extrinsically are explained. It turns out that the calibration procedure utilized cannot estimate the extrinsic parameters with sufficient accuracy. Initial measurements produce a hemispherical depth map and thus confirm the operativeness of the concept, but they also reveal the drawbacks of the calibration used. The current implementation of the algorithm shows almost real-time behaviour.
Finally, Chapter 8 summarizes the results obtained in the course of the studies and discusses them in the context of comparable binocular and trinocular stereo vision approaches. For example, the simulations carried out showed a saving of up to 30% in stereo correspondence operations in comparison with the referenced trinocular method. Furthermore, the concept introduced avoids a weighted-averaging step for depth map fusion based on precision values that are costly to calculate, while the achievable accuracy remains comparable for both trinocular approaches.
In summary, within the scope of the present thesis a measurement system has been developed that has great potential for future fields of application in industry, security in public spaces, and home environments.

Abstract
Zusammenfassung
Acronyms
Symbols
Acknowledgement
1 Introduction
1.1 Visual Surveillance
1.2 Challenges in Visual Surveillance
1.3 Outline of the Thesis
2 Fundamentals of Computer Vision Geometry
2.1 Projective Geometry
2.1.1 Euclidean Space
2.1.2 Projective Space
2.2 Camera Geometry
2.2.1 Geometrical Imaging Process
2.2.1.1 Projection Models
2.2.1.2 Intrinsic Model
2.2.1.3 Extrinsic Model
2.2.1.4 Distortion Models
2.2.2 Pinhole Camera Model
2.2.2.1 Complete Forward Model
2.2.2.2 Back Projection
2.2.3 Equiangular Camera Model
2.2.4 Generic Camera Models
2.2.4.1 Complete Forward Model
2.2.4.2 Back Projection
2.3 Camera Calibration Methods
2.3.1 Perspective Camera Calibration
2.3.2 Omnidirectional Camera Calibration
2.4 Two-View Geometry
2.4.1 Epipolar Geometry
2.4.2 The Fundamental Matrix
2.4.3 Epipolar Curves
3 Fundamentals of Stereo Vision
3.1 Introduction
3.1.1 The Concept Stereo Vision
3.1.2 Overview of a Stereo Vision Processing Chain
3.2 Stereo Calibration
3.2.1 Extrinsic Stereo Calibration With Respect to the Projective Error
3.3 Stereo Rectification
3.3.1 A Compact Algorithm for Rectification of Stereo Pairs
3.4 Stereo Correspondence
3.4.1 Disparity Computation
3.4.2 The Correspondence Problem
3.5 Triangulation
3.5.1 Depth Measurement
3.5.2 Range Field of Measurement
3.5.3 Measurement Accuracy
3.5.4 Measurement Errors
3.5.4.1 Quantization Error
3.5.4.2 Statistical Distribution of Quantization Errors
4 Virtual Cameras
4.1 Introduction and Related Works
4.2 Omni to Perspective Vision
4.2.1 Forward Mapping
4.2.2 Backward Mapping
4.2.3 Fast Backward Mapping
4.3 Error Analysis
4.4 Accuracy Analysis
4.4.1 Intrinsics of the Source Camera
4.4.2 Intrinsics of the Target Camera
4.4.3 Marginal Virtual Pixel Size
4.5 Performance Measurements
4.6 Virtual Perspective Views for Real-Time People Detection
5 Omnidirectional Stereo Vision
5.1 Introduction and Related Works
5.1.1 Geometrical Configuration
5.1.1.1 H-Binocular Omni-Stereo with Panoramic Views
5.1.1.2 V-Binocular Omnistereo with Panoramic Views
5.1.1.3 Binocular Omnistereo with Hemispherical Views
5.1.1.4 Trinocular Omnistereo
5.1.1.5 Miscellaneous Configurations
5.2 Epipolar Rectification
5.2.1 Cylindrical Rectification
5.2.2 Epipolar Equi-Distance Rectification
5.2.3 Epipolar Stereographic Rectification
5.2.4 Comparison of Rectification Methods
5.3 A Novel Spherical Stereo Vision Setup
5.3.1 Physical Omnidirectional Camera Configuration
5.3.2 Virtual Rectified Cameras
6 A Novel Spherical Stereo Vision Algorithm
6.1 Matlab Simulation Environment
6.2 Extrinsic Configuration
6.3 Physical Camera Configuration
6.4 Virtual Camera Configuration
6.4.1 The Focal Length
6.4.2 Prediscussion of the Field of View
6.4.3 Marginal Virtual Pixel Sizes
6.4.4 Calculation of the Field of View
6.4.5 Calculation of the Virtual Pixel Size Ratios
6.4.6 Results of the Virtual Camera Parameters
6.5 Spherical Depth Map Generation
6.5.1 Omnidirectional Imaging Process
6.5.2 Rectification Process
6.5.3 Rectified Depth Map Generation
6.5.4 Spherical Depth Map Generation
6.5.5 3D Reprojection
6.6 Error Analysis
7 Stereo Vision Demonstrator
7.1 Physical System Setup
7.2 System Calibration Strategy
7.2.1 Intrinsic Calibration of the Physical Cameras
7.2.2 Extrinsic Calibration of the Physical and the Virtual Cameras
7.2.2.1 Extrinsic Initialization of the Physical Cameras
7.2.2.2 Extrinsic Initialization of the Virtual Cameras
7.2.2.3 Two-View Stereo Calibration and Rectification
7.2.2.4 Three-View Stereo Rectification
7.2.2.5 Extrinsic Calibration Results
7.3 Virtual Camera Setup
7.4 Software Realization
7.5 Experimental Results
7.5.1 Qualitative Assessment
7.5.2 Performance Measurements
8 Discussion and Outlook
8.1 Discussion of the Current Results and Further Need for Research
8.1.1 Assessment of the Geometrical Camera Configuration
8.1.2 Assessment of the Depth Map Computation
8.1.3 Assessment of the Depth Measurement Error
8.1.4 Assessment of the Spherical Stereo Vision Demonstrator
8.2 Review of the Different Approaches for Hemispherical Depth Map Generation
8.2.1 Comparison of the Equilateral and the Right-Angled Three-View Approach
8.2.2 Review of the Three-View Approach in Comparison with the Two-View Method
8.3 A Sample Algorithm for Human Behaviour Analysis
8.4 Closing Remarks
A Relevant Mathematics
A.1 Cross Product by Skew Symmetric Matrix
A.2 Derivation of the Quantization Error
A.3 Derivation of the Statistical Distribution of Quantization Errors
A.4 Approximation of the Quantization Error for Equiangular Geometry
B Further Relevant Publications
B.1 H-Binocular Omnidirectional Stereo Vision with Panoramic Views
B.2 V-Binocular Omnidirectional Stereo Vision with Panoramic Views
B.3 Binocular Omnidirectional Stereo Vision with Hemispherical Views
B.4 Trinocular Omnidirectional Stereo Vision
B.5 Miscellaneous Configurations
Bibliography
List of Figures
List of Tables
Affidavit
Theses
Thesen
Curriculum Vitae
52.
Bayes Filters with Improved Measurements for Visual Object Tracking / Bayes Filter mit verbesserter Messung für das Tracken visueller Objekte. Liu, Guoliang, 20 March 2012.
No description available.
53.
Visual attention in primates and for machines - neuronal mechanisms. Beuth, Frederik, 09 December 2020.
Visual attention is an important cognitive concept for the daily life of humans, but it is still not fully understood; for this reason, it is also rarely utilized in computer vision systems. Understanding visual attention is challenging, as it has many and seemingly different aspects at both the neuronal and the behavioral level. It is therefore very hard to give a uniform explanation of visual attention that can account for all aspects. To tackle this problem, this thesis aims to identify a common set of neuronal mechanisms that underlie both the neuronal and the behavioral aspects. The mechanisms are simulated by neuro-computational models, resulting in a single modeling approach that explains a wide range of phenomena at once. The aspects chosen here are multiple neurophysiological effects, real-world object localization, and a visual masking paradigm (object substitution masking, OSM). In each of the considered fields, the work also advances the current state of the art, in order to better understand that aspect of attention itself. The three chosen aspects highlight that the approach can account for crucial neurophysiological, functional, and behavioral properties, so the mechanisms might constitute the general neuronal substrate of visual attention in the cortex. As an outlook, this work provides computer vision with a deeper understanding and a concrete prototype of attention for incorporating this crucial aspect of human perception into future systems.

1. General introduction
2. The state-of-the-art in modeling visual attention
3. Microcircuit model of attention
4. Object localization with a model of visual attention
5. Object substitution masking
6. General conclusion

Visual attention is an important cognitive concept for humans' daily life, yet it is still not completely understood, so that fundamentally explaining the phenomenon remains a long-standing goal of neuroscience. At the same time, because of this lack of understanding, it is only rarely employed in machine vision systems in computer science. Understanding visual attention is, however, a complex challenge, since attention has extremely diverse and seemingly different aspects: it alters both neuronal firing rates and human behaviour in multiple ways. It is therefore very difficult to find a uniform explanation of visual attention that holds equally for all aspects. To address this problem, this thesis aims to identify a common set of neuronal mechanisms that underlie both the neuronal and the behavioural aspects. The mechanisms are simulated in neuro-computational models, yielding a single modelling framework that can, for the first time, explain many and highly diverse phenomena of visual attention at once. The aspects chosen in this dissertation are multiple neurophysiological effects, real-world object localization, and a visual masking paradigm (OSM). In each of these fields, the state of the art is advanced at the same time, in order to better understand that particular aspect of attention itself. The three chosen fields show that the approach can explain fundamental neurophysiological, functional, and behavioural properties of visual attention. Since the identified mechanisms thus suffice to explain the phenomenon this comprehensively, they might even constitute the essential neuronal substrate of visual attention in the cortex.

For computer science, the thesis thus provides a deeper understanding of visual attention. Moreover, with its neuronal mechanisms, the framework even offers a reference implementation for integrating attention into future systems. According to the present research, attention could be very useful for such systems, since in the brain it provides a task-specific optimization of the visual system. This aspect of human perception is largely missing from today's powerful computer vision systems, so integrating it into current systems should increase their performance considerably and define a new class of systems.