Global ETD Search

51	Design and Calibration of a Network of RGB-D Sensors for Robotic Applications over Large Workspaces Rizwan, Macknojia 21 March 2013 (has links) This thesis presents an approach for configuring and calibrating a network of RGB-D sensors used to guide a robotic arm to interact with objects that get rapidly modeled in 3D. The system is based on Microsoft Kinect sensors for 3D data acquisition. The work presented here also details an analysis and experimental study of the Kinect’s depth sensor capabilities and performance. The study comprises examination of the resolution, quantization error, and random distribution of depth data. In addition, the effects of color and reflectance characteristics of an object are also analyzed. The study examines two versions of Kinect sensors, one dedicated to operate with the Xbox 360 video game console and the more recent Microsoft Kinect for Windows version. The study of the Kinect sensor is extended to the design of a rapid acquisition system dedicated to large workspaces by the linkage of multiple Kinect units to collect 3D data over a large object, such as an automotive vehicle. A customized calibration method for this large workspace is proposed which takes advantage of the rapid 3D measurement technology embedded in the Kinect sensor and provides registration accuracy between local sections of point clouds that is within the range of the depth measurements accuracy permitted by the Kinect technology. The method is developed to calibrate all Kinect units with respect to a reference Kinect. The internal calibration of the sensor in between the color and depth measurements is also performed to optimize the alignment between the modalities. The calibration of the 3D vision system is also extended to formally estimate its configuration with respect to the base of a manipulator robot, therefore allowing for seamless integration between the proposed vision platform and the kinematic control of the robot. 
The resulting vision-robotic system defines the comprehensive calibration of reference Kinect with the robot. The latter can then be used to interact under visual guidance with large objects, such as vehicles, that are positioned within a significantly enlarged field of view created by the network of RGB-D sensors. The proposed design and calibration method is validated in a real world scenario where five Kinect sensors operate collaboratively to rapidly and accurately reconstruct a 180 degrees coverage of the surface shape of various types of vehicles from a set of individual acquisitions performed in a semi-controlled environment, that is an underground parking garage. The vehicle geometrical properties generated from the acquired 3D data are compared with the original dimensions of the vehicle. camera calibration RGB-D imaging 3D reconstruction 3D profiling Kinect vehicle inspection robotic guidance point cloud registration OpenNI depth measurement multi camera system
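The calibration chain described in this abstract (each Kinect calibrated against a reference Kinect, and the reference Kinect calibrated against the robot base) amounts to composing rigid transforms. The following is a minimal sketch of that composition in plain Python, not the thesis's implementation; the function names and the numeric transforms are hypothetical values chosen only for illustration.

```python
import math

def matmul4(a, b):
    """Multiply two 4x4 homogeneous transforms given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

def translation(tx, ty, tz):
    """Homogeneous transform for a pure translation."""
    return [[1.0, 0.0, 0.0, tx],
            [0.0, 1.0, 0.0, ty],
            [0.0, 0.0, 1.0, tz],
            [0.0, 0.0, 0.0, 1.0]]

def rot_z(theta):
    """Homogeneous transform for a rotation of theta radians about z."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s, 0.0, 0.0],
            [s, c, 0.0, 0.0],
            [0.0, 0.0, 1.0, 0.0],
            [0.0, 0.0, 0.0, 1.0]]

def apply(transform, point):
    """Apply a 4x4 transform to a 3D point."""
    v = (point[0], point[1], point[2], 1.0)
    return tuple(sum(transform[i][k] * v[k] for k in range(4)) for i in range(3))

# Hypothetical calibration results (assumed values, not from the thesis):
# pose of Kinect 2 expressed in the reference Kinect frame ...
T_ref_from_k2 = matmul4(translation(2.0, 0.0, 0.0), rot_z(math.pi / 2))
# ... and pose of the reference Kinect expressed in the robot base frame.
T_robot_from_ref = translation(0.0, -1.0, 0.5)

# Chaining the two gives Kinect 2 directly in robot base coordinates.
T_robot_from_k2 = matmul4(T_robot_from_ref, T_ref_from_k2)

p_k2 = (1.0, 0.0, 0.0)               # a point measured by Kinect 2
p_robot = apply(T_robot_from_k2, p_k2)
```

With every sensor expressed relative to one reference, only a single sensor-to-robot calibration is needed to bring all point clouds into the robot's coordinate frame.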
52	Design and Calibration of a Network of RGB-D Sensors for Robotic Applications over Large Workspaces Macknojia, Rizwan 21 March 2013 (has links) This thesis presents an approach for configuring and calibrating a network of RGB-D sensors used to guide a robotic arm to interact with objects that get rapidly modeled in 3D. The system is based on Microsoft Kinect sensors for 3D data acquisition. The work presented here also details an analysis and experimental study of the Kinect’s depth sensor capabilities and performance. The study comprises examination of the resolution, quantization error, and random distribution of depth data. In addition, the effects of color and reflectance characteristics of an object are also analyzed. The study examines two versions of Kinect sensors, one dedicated to operate with the Xbox 360 video game console and the more recent Microsoft Kinect for Windows version. The study of the Kinect sensor is extended to the design of a rapid acquisition system dedicated to large workspaces by the linkage of multiple Kinect units to collect 3D data over a large object, such as an automotive vehicle. A customized calibration method for this large workspace is proposed which takes advantage of the rapid 3D measurement technology embedded in the Kinect sensor and provides registration accuracy between local sections of point clouds that is within the range of the depth measurements accuracy permitted by the Kinect technology. The method is developed to calibrate all Kinect units with respect to a reference Kinect. The internal calibration of the sensor in between the color and depth measurements is also performed to optimize the alignment between the modalities. The calibration of the 3D vision system is also extended to formally estimate its configuration with respect to the base of a manipulator robot, therefore allowing for seamless integration between the proposed vision platform and the kinematic control of the robot. 
The resulting vision-robotic system defines the comprehensive calibration of reference Kinect with the robot. The latter can then be used to interact under visual guidance with large objects, such as vehicles, that are positioned within a significantly enlarged field of view created by the network of RGB-D sensors. The proposed design and calibration method is validated in a real world scenario where five Kinect sensors operate collaboratively to rapidly and accurately reconstruct a 180 degrees coverage of the surface shape of various types of vehicles from a set of individual acquisitions performed in a semi-controlled environment, that is an underground parking garage. The vehicle geometrical properties generated from the acquired 3D data are compared with the original dimensions of the vehicle. camera calibration RGB-D imaging 3D reconstruction 3D profiling Kinect vehicle inspection robotic guidance point cloud registration OpenNI depth measurement multi camera system
53	3D detection and pose estimation of medical staff in operating rooms using RGB-D images / Détection et estimation 3D de la pose des personnes dans la salle opératoire à partir d'images RGB-D Kadkhodamohammadi, Abdolrahim 01 December 2016 (has links) In this thesis, we address the two problems of person detection and pose estimation in Operating Rooms (ORs), which are key ingredients in the development of surgical assistance applications. We perceive the OR using compact RGB-D cameras that can be conveniently integrated in the room. These sensors provide complementary information about the scene, which enables us to develop methods that can cope with numerous challenges present in the OR, e.g. clutter, textureless surfaces and occlusions. We present novel part-based approaches that take advantage of depth, multi-view and temporal information to construct robust human detection and pose estimation models. 
Evaluation is performed on new single- and multi-view datasets recorded in operating rooms. We demonstrate very promising results and show that our approaches outperform state-of-the-art methods on this challenging data acquired during real surgeries. Medical imaging Operating room 3D pose estimation RGB image Medical computer vision Pictorial structures Surgical workflow analysis 3D Body pose estimation Person detection Multi-view RGB-D data 006.4 621.36
54	Unsupervised construction of 4D semantic maps in a long-term autonomy scenario Ambrus, Rares January 2017 (has links) Robots are operating for longer times and collecting much more data than just a few years ago. In this setting we are interested in exploring ways of modeling the environment, segmenting out areas of interest and keeping track of the segmentations over time, with the purpose of building 4D models (i.e. space and time) of the relevant parts of the environment. Our approach relies on repeatedly observing the environment and creating local maps at specific locations. The first question we address is how to choose where to build these local maps. Traditionally, an operator defines a set of waypoints on a pre-built map of the environment which the robot visits autonomously. Instead, we propose a method to automatically extract semantically meaningful regions from a point cloud representation of the environment. The resulting segmentation is purely geometric, and in the context of mobile robots operating in human environments, the semantic label associated with each segment (i.e. kitchen, office) can be of interest for a variety of applications. We therefore also look at how to obtain per-pixel semantic labels given the geometric segmentation, by fusing probabilistic distributions over scene and object types in a Conditional Random Field. For most robotic systems, the elements of interest in the environment are the ones which exhibit some dynamic properties (such as people, chairs, cups, etc.), and the ability to detect and segment such elements provides a very useful initial segmentation of the scene. We propose a method to iteratively build a static map from observations of the same scene acquired at different points in time. Dynamic elements are obtained by computing the difference between the static map and new observations. 
We address the problem of clustering together dynamic elements which correspond to the same physical object, observed at different points in time and in significantly different circumstances. To address some of the inherent limitations in the sensors used, we autonomously plan, navigate around and obtain additional views of the segmented dynamic elements. We look at methods of fusing the additional data and we show that both a combined point cloud model and a fused mesh representation can be used to more robustly recognize the dynamic object in future observations. In the case of the mesh representation, we also show how a Convolutional Neural Network can be trained for recognition by using mesh renderings. Finally, we present a number of methods to analyse the data acquired by the mobile robot autonomously and over extended time periods. First, we look at how the dynamic segmentations can be used to derive a probabilistic prior which can be used in the mapping process to further improve and reinforce the segmentation accuracy. We also investigate how to leverage spatial-temporal constraints in order to cluster dynamic elements observed at different points in time and under different circumstances. We show that by making a few simple assumptions we can increase the clustering accuracy even when the object appearance varies significantly between observations. The result of the clustering is a spatial-temporal footprint of the dynamic object, defining an area where the object is likely to be observed spatially as well as a set of time stamps corresponding to when the object was previously observed. Using this data, predictive models can be created and used to infer future times when the object is more likely to be observed. In an object search scenario, this model can be used to decrease the search time when looking for specific objects. 
Mobile robotics autonomous systems perception computer vision RGB-D object segmentation modelling and recognition semantic segmentation long-term autonomy mapping temporal modeling
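The abstract states that dynamic elements are obtained as the difference between the static map and a new observation. The sketch below illustrates that idea with a simple voxel-grid set difference in plain Python. This is a substituted, simplified technique and not the thesis's method; the function names, the voxel size, and the sample points are assumptions made for illustration.

```python
import math

def voxel_key(point, voxel_size):
    """Map a 3D point to the integer index of the voxel that contains it."""
    return tuple(math.floor(c / voxel_size) for c in point)

def dynamic_points(static_points, observation, voxel_size=0.05):
    """Return the observed points that fall in voxels the static map leaves empty."""
    static_voxels = {voxel_key(p, voxel_size) for p in static_points}
    return [p for p in observation
            if voxel_key(p, voxel_size) not in static_voxels]

# Hypothetical data: two static surface points and a new observation that
# sees both of them again (with slight noise) plus one new object point.
static_map = [(0.02, 0.02, 0.02), (1.02, 0.02, 0.02)]
new_scan = [(0.03, 0.01, 0.04), (1.03, 0.03, 0.01), (0.52, 0.52, 0.02)]

moved = dynamic_points(static_map, new_scan)
```

A real system would also need to handle occlusion and sensor noise, which a plain set difference ignores.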
55	Design and Calibration of a Network of RGB-D Sensors for Robotic Applications over Large Workspaces Macknojia, Rizwan January 2013 (has links) This thesis presents an approach for configuring and calibrating a network of RGB-D sensors used to guide a robotic arm to interact with objects that get rapidly modeled in 3D. The system is based on Microsoft Kinect sensors for 3D data acquisition. The work presented here also details an analysis and experimental study of the Kinect’s depth sensor capabilities and performance. The study comprises examination of the resolution, quantization error, and random distribution of depth data. In addition, the effects of color and reflectance characteristics of an object are also analyzed. The study examines two versions of Kinect sensors, one dedicated to operate with the Xbox 360 video game console and the more recent Microsoft Kinect for Windows version. The study of the Kinect sensor is extended to the design of a rapid acquisition system dedicated to large workspaces by the linkage of multiple Kinect units to collect 3D data over a large object, such as an automotive vehicle. A customized calibration method for this large workspace is proposed which takes advantage of the rapid 3D measurement technology embedded in the Kinect sensor and provides registration accuracy between local sections of point clouds that is within the range of the depth measurements accuracy permitted by the Kinect technology. The method is developed to calibrate all Kinect units with respect to a reference Kinect. The internal calibration of the sensor in between the color and depth measurements is also performed to optimize the alignment between the modalities. The calibration of the 3D vision system is also extended to formally estimate its configuration with respect to the base of a manipulator robot, therefore allowing for seamless integration between the proposed vision platform and the kinematic control of the robot. 
The resulting vision-robotic system defines the comprehensive calibration of reference Kinect with the robot. The latter can then be used to interact under visual guidance with large objects, such as vehicles, that are positioned within a significantly enlarged field of view created by the network of RGB-D sensors. The proposed design and calibration method is validated in a real world scenario where five Kinect sensors operate collaboratively to rapidly and accurately reconstruct a 180 degrees coverage of the surface shape of various types of vehicles from a set of individual acquisitions performed in a semi-controlled environment, that is an underground parking garage. The vehicle geometrical properties generated from the acquired 3D data are compared with the original dimensions of the vehicle. camera calibration RGB-D imaging 3D reconstruction 3D profiling Kinect vehicle inspection robotic guidance point cloud registration OpenNI depth measurement multi camera system
56	Detekce objektů pomocí Kinectu / Object Detection Using Kinect Řehánek, Martin January 2012 (has links) The release of the Kinect device opened new possibilities by making depth information easy to use in image processing. The aim of this thesis is to propose a method for object detection and recognition in a depth map. The well-known Bag of Words method and a descriptor based on the Spin Image method are used for object recognition. The Spin Image method is one of several existing approaches to depth-map processing that are described in this thesis. Objects are located in the image with the sliding window technique, which is improved and accelerated by using the depth information.
57	Terrain Mapping for Autonomous Vehicles / Terrängkartläggning för autonoma fordon Pedreira Carabel, Carlos Javier January 2015 (has links) Autonomous vehicles are now at the forefront of the automotive industry, which is looking for safer and more efficient transportation systems. One of the main issues for every autonomous vehicle is being aware of its position and of the presence of obstacles along its path. The current project addresses the pose estimation and terrain mapping problem by integrating a visual odometry method and a mapping technique. An RGB-D camera, the Kinect v2 from Microsoft, was chosen as the sensor for capturing information from the environment. It was connected to an Intel mini-PC for real-time processing. Both pieces of hardware were mounted on board a four-wheeled research concept vehicle (RCV) to test the feasibility of the current solution at outdoor locations. The Robot Operating System (ROS) was used as the development environment, with C++ as the programming language. The visual odometry strategy consisted of a frame registration algorithm called Adaptive Iterative Closest Keypoint (AICK), based on Iterative Closest Point (ICP) and using Oriented FAST and Rotated BRIEF (ORB) as the image keypoint extractor. A grid-based local costmap of the rolling-window type was implemented to obtain a two-dimensional representation of the obstacles close to the vehicle within a predefined area, in order to allow further path planning applications. Experiments were performed both offline and in real time to test the system in indoor and outdoor scenarios. The results confirmed the viability of using the designed framework to track the pose of the camera and detect objects in indoor environments. However, outdoor environments exposed the limitations of the RGB-D sensor, making the current system configuration unfeasible for outdoor purposes. 
mapping costmap ROS ORB registration obstacles detection visual odometry adaptive iterative keypoint RCV vehicle Kinect RGB-D AICK ICP gridmap autonomous vehicle pose estimation estimate point-cloud Computer Sciences
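The rolling-window local costmap mentioned in this abstract can be pictured as a fixed-size grid that stays centered on the vehicle while obstacles are binned into its cells. The following plain-Python sketch shows only that idea; it does not use the ROS costmap API and is not the thesis's C++ implementation. The function name, the window size, the cell resolution, and the cost value `OCCUPIED` are all assumptions.

```python
import math

OCCUPIED = 100  # assumed cost value for a cell that contains an obstacle

def build_local_costmap(vehicle_xy, obstacles_xy, width_m=4.0, resolution=0.5):
    """Build a square grid centered on the vehicle and mark obstacle cells.

    vehicle_xy   -- (x, y) position of the vehicle in world coordinates
    obstacles_xy -- iterable of (x, y) obstacle positions in world coordinates
    width_m      -- side length of the window in metres
    resolution   -- side length of one cell in metres
    """
    n = int(round(width_m / resolution))
    grid = [[0] * n for _ in range(n)]
    # World coordinates of the window's lower-left corner. Recomputing this
    # from the current vehicle position is what makes the window "roll".
    origin_x = vehicle_xy[0] - width_m / 2.0
    origin_y = vehicle_xy[1] - width_m / 2.0
    for ox, oy in obstacles_xy:
        col = math.floor((ox - origin_x) / resolution)
        row = math.floor((oy - origin_y) / resolution)
        if 0 <= row < n and 0 <= col < n:   # obstacles outside the window are ignored
            grid[row][col] = OCCUPIED
    return grid

obstacles = [(10.6, 5.2), (20.0, 20.0)]            # hypothetical detections
grid = build_local_costmap((10.0, 5.0), obstacles)
grid_after_move = build_local_costmap((11.0, 5.0), obstacles)
```

After the vehicle moves one metre in x, the same obstacle appears two cells further left in the grid, while the distant obstacle stays outside the window in both cases.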
58	A Novel Approach for Spherical Stereo Vision Findeisen, Michel 23 April 2015 (has links) The Professorship of Digital Signal Processing and Circuit Technology of Chemnitz University of Technology conducts research in the field of three-dimensional space measurement with optical sensors. In recent years this field has made major progress. For example, innovative active techniques such as the “structured light” principle are able to measure even homogeneous surfaces and have found their way into the consumer electronics market in the form of Microsoft’s Kinect®. Furthermore, high-resolution optical sensors enable powerful passive stereo vision systems in the field of indoor surveillance, opening up new application domains such as security and assistance systems for domestic environments. However, the constrained field of view remains an essential characteristic of all these technologies. For instance, in order to measure a volume the size of a living space, two to three 3D sensors currently have to be deployed. This is because the commonly used perspective projection principle constrains the visible area to a field of view of approximately 120°. In contrast, novel fish-eye lenses allow the realization of omnidirectional projection models, with which the visible field of view can be enlarged to more than 180°. In combination with a 3D measurement approach, the number of sensors required for entire room coverage can thus be reduced considerably. Motivated by the requirements of indoor surveillance, the present work focuses on the combination of the established stereo vision principle and omnidirectional projection methods. The major objective is the complete 3D measurement of a living space by means of a single sensor. As a starting point for this thesis, Chapter 1 discusses the underlying requirements, referring to various relevant fields of application. 
Based on this, the distinct purpose of the present work is stated. The necessary mathematical foundations of computer vision are then reviewed in Chapter 2. Based on the geometry of the optical imaging process, the projection characteristics of relevant principles are discussed and a generic method for modeling fish-eye cameras is selected. Chapter 3 deals with the extraction of depth information using classical (perspectively imaging) binocular stereo vision configurations. In addition to a complete recap of the processing chain, the measurement uncertainties that occur are investigated in particular. Chapter 4 then addresses special methods for converting between different projection models. The example of mapping an omnidirectional to a perspective projection is used to develop a method for accelerating this process and thereby reducing the associated computational load. The errors that occur, as well as the necessary adjustment of image resolution, are an integral part of the investigation. As a practical example, a person tracking application is used to demonstrate to what extent the use of “virtual views” can increase the recognition rate of people detectors in the context of omnidirectional monitoring. Subsequently, an extensive survey of omnidirectional stereo vision techniques is conducted in Chapter 5. It turns out that the complete 3D capture of a room is achievable by generating a hemispherical depth map. To this end, three cameras have to be combined to form a trinocular stereo vision system. As a basis for further research, a known trinocular stereo vision method is selected. Furthermore, it is hypothesized that the performance can be increased considerably by applying a modified geometric constellation of cameras, more precisely in the form of an equilateral triangle, and by using an alternative method to determine the depth map. 
A novel method is presented which is intended to require fewer operations to calculate the distance information and to avoid the computationally costly depth map fusion step that is necessary in the comparative method. In order to evaluate the presented approach as well as the hypotheses, a hemispherical depth map is generated in Chapter 6 by means of the new method. Simulation results, based on artificially generated 3D space information and realistic system parameters, are presented and subjected to a subsequent error estimate. A demonstrator for generating real measurement information is introduced in Chapter 7. In addition, the methods applied for calibrating the system intrinsically as well as extrinsically are explained. It turns out that the calibration procedure used cannot estimate the extrinsic parameters sufficiently. Initial measurements yield a hemispherical depth map and thus confirm that the concept works, but also identify the drawbacks of the calibration used. The current implementation of the algorithm shows almost real-time behaviour. Finally, Chapter 8 summarizes the results obtained in the course of the studies and discusses them in the context of comparable binocular and trinocular stereo vision approaches. For example, the simulations carried out showed a saving of up to 30% in stereo correspondence operations in comparison with the reference trinocular method. Furthermore, the concept introduced avoids a weighted averaging step for depth map fusion based on precision values that are costly to calculate. The achievable accuracy is still comparable for both trinocular approaches. 
In summary, it can be stated that, in the context of the present thesis, a measurement system has been developed, which has great potential for future application fields in industry, security in public spaces as well as home environments.
Contents:
Abstract
Zusammenfassung
Acronyms
Symbols
Acknowledgement
1 Introduction
1.1 Visual Surveillance
1.2 Challenges in Visual Surveillance
1.3 Outline of the Thesis
2 Fundamentals of Computer Vision Geometry
2.1 Projective Geometry
2.1.1 Euclidean Space
2.1.2 Projective Space
2.2 Camera Geometry
2.2.1 Geometrical Imaging Process
2.2.1.1 Projection Models
2.2.1.2 Intrinsic Model
2.2.1.3 Extrinsic Model
2.2.1.4 Distortion Models
2.2.2 Pinhole Camera Model
2.2.2.1 Complete Forward Model
2.2.2.2 Back Projection
2.2.3 Equiangular Camera Model
2.2.4 Generic Camera Models
2.2.4.1 Complete Forward Model
2.2.4.2 Back Projection
2.3 Camera Calibration Methods
2.3.1 Perspective Camera Calibration
2.3.2 Omnidirectional Camera Calibration
2.4 Two-View Geometry
2.4.1 Epipolar Geometry
2.4.2 The Fundamental Matrix
2.4.3 Epipolar Curves
3 Fundamentals of Stereo Vision
3.1 Introduction
3.1.1 The Concept Stereo Vision
3.1.2 Overview of a Stereo Vision Processing Chain
3.2 Stereo Calibration
3.2.1 Extrinsic Stereo Calibration With Respect to the Projective Error
3.3 Stereo Rectification
3.3.1 A Compact Algorithm for Rectification of Stereo Pairs
3.4 Stereo Correspondence
3.4.1 Disparity Computation
3.4.2 The Correspondence Problem
3.5 Triangulation
3.5.1 Depth Measurement
3.5.2 Range Field of Measurement
3.5.3 Measurement Accuracy
3.5.4 Measurement Errors
3.5.4.1 Quantization Error
3.5.4.2 Statistical Distribution of Quantization Errors
4 Virtual Cameras
4.1 Introduction and Related Works
4.2 Omni to Perspective Vision
4.2.1 Forward Mapping
4.2.2 Backward Mapping
4.2.3 Fast Backward Mapping
4.3 Error Analysis
4.4 Accuracy Analysis
4.4.1 Intrinsics of the Source Camera
4.4.2 Intrinsics of the Target Camera
4.4.3 Marginal Virtual Pixel Size
4.5 Performance Measurements
4.6 Virtual Perspective Views for Real-Time People Detection
5 Omnidirectional Stereo Vision
5.1 Introduction and Related Works
5.1.1 Geometrical Configuration
5.1.1.1 H-Binocular Omni-Stereo with Panoramic Views
5.1.1.2 V-Binocular Omnistereo with Panoramic Views
5.1.1.3 Binocular Omnistereo with Hemispherical Views
5.1.1.4 Trinocular Omnistereo
5.1.1.5 Miscellaneous Configurations
5.2 Epipolar Rectification
5.2.1 Cylindrical Rectification
5.2.2 Epipolar Equi-Distance Rectification
5.2.3 Epipolar Stereographic Rectification
5.2.4 Comparison of Rectification Methods
5.3 A Novel Spherical Stereo Vision Setup
5.3.1 Physical Omnidirectional Camera Configuration
5.3.2 Virtual Rectified Cameras
6 A Novel Spherical Stereo Vision Algorithm
6.1 Matlab Simulation Environment
6.2 Extrinsic Configuration
6.3 Physical Camera Configuration
6.4 Virtual Camera Configuration
6.4.1 The Focal Length
6.4.2 Prediscussion of the Field of View
6.4.3 Marginal Virtual Pixel Sizes
6.4.4 Calculation of the Field of View
6.4.5 Calculation of the Virtual Pixel Size Ratios
6.4.6 Results of the Virtual Camera Parameters
6.5 Spherical Depth Map Generation
6.5.1 Omnidirectional Imaging Process
6.5.2 Rectification Process
6.5.3 Rectified Depth Map Generation
6.5.4 Spherical Depth Map Generation
6.5.5 3D Reprojection
6.6 Error Analysis
7 Stereo Vision Demonstrator
7.1 Physical System Setup
7.2 System Calibration Strategy
7.2.1 Intrinsic Calibration of the Physical Cameras
7.2.2 Extrinsic Calibration of the Physical and the Virtual Cameras
7.2.2.1 Extrinsic Initialization of the Physical Cameras
7.2.2.2 Extrinsic Initialization of the Virtual Cameras
7.2.2.3 Two-View Stereo Calibration and Rectification
7.2.2.4 Three-View Stereo Rectification
7.2.2.5 Extrinsic Calibration Results
7.3 Virtual Camera Setup
7.4 Software Realization
7.5 Experimental Results
7.5.1 Qualitative Assessment
7.5.2 Performance Measurements
8 Discussion and Outlook
8.1 Discussion of the Current Results and Further Need for Research
8.1.1 Assessment of the Geometrical Camera Configuration
8.1.2 Assessment of the Depth Map Computation
8.1.3 Assessment of the Depth Measurement Error
8.1.4 Assessment of the Spherical Stereo Vision Demonstrator
8.2 Review of the Different Approaches for Hemispherical Depth Map Generation
8.2.1 Comparison of the Equilateral and the Right-Angled Three-View Approach
8.2.2 Review of the Three-View Approach in Comparison with the Two-View Method
8.3 A Sample Algorithm for Human Behaviour Analysis
8.4 Closing Remarks
A Relevant Mathematics
A.1 Cross Product by Skew Symmetric Matrix
A.2 Derivation of the Quantization Error
A.3 Derivation of the Statistical Distribution of Quantization Errors
A.4 Approximation of the Quantization Error for Equiangular Geometry
B Further Relevant Publications
B.1 H-Binocular Omnidirectional Stereo Vision with Panoramic Views
B.2 V-Binocular Omnidirectional Stereo Vision with Panoramic Views
B.3 Binocular Omnidirectional Stereo Vision with Hemispherical Views
B.4 Trinocular Omnidirectional Stereo Vision
B.5 Miscellaneous Configurations
Bibliography
List of Figures
List of Tables
Affidavit
Theses
Thesen
Curriculum Vitae
info:eu-repo/classification/ddc/004 ddc:004 info:eu-repo/classification/ddc/005 ddc:005 info:eu-repo/classification/ddc/006 ddc:006 info:eu-repo/classification/ddc/600 ddc:600
