• Refine Query
  • Source
  • Publication year
  • to
  • Language
  • 6
  • Tagged with
  • 7
  • 7
  • 7
  • 3
  • 2
  • 2
  • 2
  • 2
  • 2
  • 2
  • 2
  • 2
  • 2
  • 2
  • 2
  • About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
1

Capture and generalisation of close interaction with objects

Sandilands, Peter James January 2015 (has links)
Robust manipulation capture and retargeting has been a longstanding goal in both the fields of animation and robotics. In this thesis I describe a new approach to capture both the geometry and motion of interactions with objects, dealing with the problems of occlusion by the use of magnetic systems, and performing the reconstruction of the geometry by an RGB-D sensor alongside visual markers. This ‘interaction capture’ allows the scene to be described in terms of the spatial relationships between the character and the object using novel topological representations such as the Electric Parameters, which parametrise the outer space of an object using properties of the surface of the object. I describe the properties of these representations for motion generalisation and discuss how they can be applied to the problems of human-like motion generation and programming by demonstration. These generalised interactions are shown to be valid by demonstration of retargeting grasping and manipulation to robots with dissimilar kinematics and morphology using only local, gradient-based planning.
2

Object detection and pose estimation from rectification of natural features using consumer RGB-D sensors

Lima, João Paulo Silva do Monte 31 January 2014 (has links)
Submitted by Nayara Passos (nayara.passos@ufpe.br) on 2015-03-12T13:53:20Z No. of bitstreams: 2 license_rdf: 1232 bytes, checksum: 66e71c371cc565284e70f40736c94386 (MD5) TESE João Paulo Silva do Monte Lima.pdf: 5689244 bytes, checksum: e6e2c3f1da85d18b6bb7049e458f50e7 (MD5) / Made available in DSpace on 2015-03-12T13:53:20Z (GMT). No. of bitstreams: 2 license_rdf: 1232 bytes, checksum: 66e71c371cc565284e70f40736c94386 (MD5) TESE João Paulo Silva do Monte Lima.pdf: 5689244 bytes, checksum: e6e2c3f1da85d18b6bb7049e458f50e7 (MD5) Previous issue date: 2014 / CAPES , CNPq / Sistemas de Realidade Aumentada são capazes de realizar registro 3D em tempo real de objetos virtuais e reais, o que consiste em posicionar corretamente os objetos virtuais em relação aos reais de forma que os elementos virtuais pareçam ser reais. Uma maneira bastante popular de realizar esse registro é usando detecção e rastreamento de objetos baseado em vídeo a partir de marcadores fiduciais planares. Outra maneira de sensoriar o mundo real usando vídeo é utilizando características naturais do ambiente, o que é mais complexo que usar marcadores planares artificiais. Entretanto, detecção e rastreamento de características naturais é mandatório ou desejável em alguns cenários de aplicação de Realidade Aumentada. A detecção e o rastreamento de objetos a partir de características naturais pode fazer uso de um modelo 3D do objeto obtido a priori. Se tal modelo não está disponível, ele pode ser adquirido usando reconstrução 3D, por exemplo. Nesse caso, um sensor RGB-D pode ser usado, que se tornou nos últimos anos um produto de fácil acesso aos usuários em geral. Ele provê uma imagem em cores e uma imagem de profundidade da cena e, além de ser usado para modelagem de objetos, também pode oferecer informações importantes para a detecção e o rastreamento de objetos em tempo real. Nesse contexto, o trabalho proposto neste documento tem por finalidade investigar o uso de sensores RGB-D de consumo para detecção e estimação de pose de objetos a partir de características naturais, com o propósito de usar tais técnicas para desenvolver aplicações de Realidade Aumentada. Dois métodos baseados em retificação auxiliada por profundidade são propostos, que transformam características extraídas de uma imagem em cores para uma vista canônica usando dados de profundidade para obter uma representação invariante a rotação, escala e distorções de perspectiva. Enquanto um método é adequado a objetos texturizados, tanto planares como não-planares, o outro método foca em objetos planares não texturizados. Avaliações qualitativas e quantitativas dos métodos propostos são realizadas, mostrando que eles podem obter resultados melhores que alguns métodos existentes para detecção e estimação de pose de objetos, especialmente ao lidar com poses oblíquas. / Augmented Reality systems are able to perform real-time 3D registration of virtual and real objects, which consists in correctly positioning the virtual objects with respect to the real ones such that the virtual elements seem to be real. A very popular way to perform this registration is using video based object detection and tracking with planar fiducial markers. Another way of sensing the real world using video is by relying on natural features of the environment, which is more complex than using artificial planar markers. Nevertheless, natural feature detection and tracking is mandatory or desirable in some Augmented Reality application scenarios. Object detection and tracking from natural features can make use of a 3D model of the object which was obtained a priori. If such model is not available, it can be acquired using 3D reconstruction. In this case, an RGB-D sensor can be used, which has become in recent years a product of easy access to general users. It provides both a color image and a depth image of the scene and, besides being used for object modeling, it can also offer important cues for object detection and tracking in real-time. In this context, the work proposed in this document aims to investigate the use of consumer RGB-D sensors for object detection and pose estimation from natural features, with the purpose of using such techniques for developing Augmented Reality applications. Two methods based on depth-assisted rectification are proposed, which transform features extracted from the color image to a canonical view using depth data in order to obtain a representation invariant to rotation, scale and perspective distortions. While one method is suitable for textured objects, either planar or non-planar, the other method focuses on texture-less planar objects. Qualitative and quantitative evaluations of the proposed methods are performed, showing that they can obtain better results than some existing methods for object detection and pose estimation, especially when dealing with oblique poses.
3

Adaptive registration using 2D and 3D features for indoor scene reconstruction. / Registro adaptativo usando características 2D e 3D para reconstrução de cenas em ambientes internos.

Perafán Villota, Juan Carlos 27 October 2016 (has links)
Pairwise alignment between point clouds is an important task in building 3D maps of indoor environments with partial information. The combination of 2D local features with depth information provided by RGB-D cameras are often used to improve such alignment. However, under varying lighting or low visual texture, indoor pairwise frame registration with sparse 2D local features is not a particularly robust method. In these conditions, features are hard to detect, thus leading to misalignment between consecutive pairs of frames. The use of 3D local features can be a solution as such features come from the 3D points themselves and are resistant to variations in visual texture and illumination. Because varying conditions in real indoor scenes are unavoidable, we propose a new framework to improve the pairwise frame alignment using an adaptive combination of sparse 2D and 3D features based on both the levels of geometric structure and visual texture contained in each scene. Experiments with datasets including unrestricted RGB-D camera motion and natural changes in illumination show that the proposed framework convincingly outperforms methods using 2D or 3D features separately, as reflected in better level of alignment accuracy. / O alinhamento entre pares de nuvens de pontos é uma tarefa importante na construção de mapas de ambientes em 3D. A combinação de características locais 2D com informação de profundidade fornecida por câmeras RGB-D são frequentemente utilizadas para melhorar tais alinhamentos. No entanto, em ambientes internos com baixa iluminação ou pouca textura visual o método usando somente características locais 2D não é particularmente robusto. Nessas condições, as características 2D são difíceis de serem detectadas, conduzindo a um desalinhamento entre pares de quadros consecutivos. A utilização de características 3D locais pode ser uma solução uma vez que tais características são extraídas diretamente de pontos 3D e são resistentes a variações na textura visual e na iluminação. Como situações de variações em cenas reais em ambientes internos são inevitáveis, essa tese apresenta um novo sistema desenvolvido com o objetivo de melhorar o alinhamento entre pares de quadros usando uma combinação adaptativa de características esparsas 2D e 3D. Tal combinação está baseada nos níveis de estrutura geométrica e de textura visual contidos em cada cena. Esse sistema foi testado com conjuntos de dados RGB-D, incluindo vídeos com movimentos irrestritos da câmera e mudanças naturais na iluminação. Os resultados experimentais mostram que a nossa proposta supera aqueles métodos que usam características 2D ou 3D separadamente, obtendo uma melhora da precisão no alinhamento de cenas em ambientes internos reais.
4

Adaptive registration using 2D and 3D features for indoor scene reconstruction. / Registro adaptativo usando características 2D e 3D para reconstrução de cenas em ambientes internos.

Juan Carlos Perafán Villota 27 October 2016 (has links)
Pairwise alignment between point clouds is an important task in building 3D maps of indoor environments with partial information. The combination of 2D local features with depth information provided by RGB-D cameras are often used to improve such alignment. However, under varying lighting or low visual texture, indoor pairwise frame registration with sparse 2D local features is not a particularly robust method. In these conditions, features are hard to detect, thus leading to misalignment between consecutive pairs of frames. The use of 3D local features can be a solution as such features come from the 3D points themselves and are resistant to variations in visual texture and illumination. Because varying conditions in real indoor scenes are unavoidable, we propose a new framework to improve the pairwise frame alignment using an adaptive combination of sparse 2D and 3D features based on both the levels of geometric structure and visual texture contained in each scene. Experiments with datasets including unrestricted RGB-D camera motion and natural changes in illumination show that the proposed framework convincingly outperforms methods using 2D or 3D features separately, as reflected in better level of alignment accuracy. / O alinhamento entre pares de nuvens de pontos é uma tarefa importante na construção de mapas de ambientes em 3D. A combinação de características locais 2D com informação de profundidade fornecida por câmeras RGB-D são frequentemente utilizadas para melhorar tais alinhamentos. No entanto, em ambientes internos com baixa iluminação ou pouca textura visual o método usando somente características locais 2D não é particularmente robusto. Nessas condições, as características 2D são difíceis de serem detectadas, conduzindo a um desalinhamento entre pares de quadros consecutivos. A utilização de características 3D locais pode ser uma solução uma vez que tais características são extraídas diretamente de pontos 3D e são resistentes a variações na textura visual e na iluminação. Como situações de variações em cenas reais em ambientes internos são inevitáveis, essa tese apresenta um novo sistema desenvolvido com o objetivo de melhorar o alinhamento entre pares de quadros usando uma combinação adaptativa de características esparsas 2D e 3D. Tal combinação está baseada nos níveis de estrutura geométrica e de textura visual contidos em cada cena. Esse sistema foi testado com conjuntos de dados RGB-D, incluindo vídeos com movimentos irrestritos da câmera e mudanças naturais na iluminação. Os resultados experimentais mostram que a nossa proposta supera aqueles métodos que usam características 2D ou 3D separadamente, obtendo uma melhora da precisão no alinhamento de cenas em ambientes internos reais.
5

[en] GRAPH OPTIMIZATION AND PROBABILISTIC SLAM OF MOBILE ROBOTS USING AN RGB-D SENSOR / [pt] OTIMIZAÇÃO DE GRAFOS E SLAM PROBABILÍSTICO DE ROBÔS MÓVEIS USANDO UM SENSOR RGB-D

23 March 2021 (has links)
[pt] Robôs móveis têm uma grande gama de aplicações, incluindo veículos autônomos, robôs industriais e veículos aéreos não tripulados. Navegação móvel autônoma é um assunto desafiador devido à alta incerteza e nãolinearidade inerente a ambientes não estruturados, locomoção e medições de sensores. Para executar navegação autônoma, um robô precisa de um mapa do ambiente e de uma estimativa de sua própria localização e orientação em relação ao sistema de referência global. No entando, geralmente o robô não possui informações prévias sobre o ambiente e deve criar o mapa usando informações de sensores e se localizar ao mesmo tempo, um problema chamado Mapeamento e Localização Simultâneos (SLAM). As formulações de SLAM usam algoritmos probabilísticos para lidar com as incertezas do problema, e a abordagem baseada em grafos é uma das soluções estado-da-arte para SLAM. Por muitos anos os sensores LRF (laser range finders) eram as escolhas mais populares de sensores para SLAM. No entanto, sensores RGB-D são uma alternativa interessante, devido ao baixo custo. Este trabalho apresenta uma implementação de RGB-D SLAM com uma abordagem baseada em grafos. A metodologia proposta usa o Sistema Operacional de Robôs (ROS) como middleware do sistema. A implementação é testada num robô de baixo custo e com um conjunto de dados reais obtidos na literatura. Também é apresentada a implementação de uma ferramenta de otimização de grafos para MATLAB. / [en] Mobile robots have a wide range of applications, including autonomous vehicles, industrial robots and unmanned aerial vehicles. Autonomous mobile navigation is a challenging subject due to the high uncertainty and nonlinearity inherent to unstructured environments, robot motion and sensor measurements. To perform autonomous navigation, a robot need a map of the environment and an estimation of its own pose with respect to the global coordinate system. However, usually the robot has no prior knowledge about the environment, and has to create a map using sensor information and localize itself at the same time, a problem called Simultaneous Localization and Mapping (SLAM). The SLAM formulations use probabilistic algorithms to handle the uncertainties of the problem, and the graph-based approach is one of the state-of-the-art solutions for SLAM. For many years, the LRF (laser range finders) were the most popular sensor choice for SLAM. However, RGB-D sensors are an interesting alternative, due to their low cost. This work presents an RGB-D SLAM implementation with a graph-based probabilistic approach. The proposed methodology uses the Robot Operating System (ROS) as middleware. The implementation is tested in a low cost robot and with real-world datasets from literature. Also, it is presented the implementation of a pose-graph optimization tool for MATLAB.
6

A Novel Approach for Spherical Stereo Vision / Ein Neuer Ansatz für Sphärisches Stereo Vision

Findeisen, Michel 27 April 2015 (has links) (PDF)
The Professorship of Digital Signal Processing and Circuit Technology of Chemnitz University of Technology conducts research in the field of three-dimensional space measurement with optical sensors. In recent years this field has made major progress. For example innovative, active techniques such as the “structured light“-principle are able to measure even homogeneous surfaces and find its way into the consumer electronic market in terms of Microsoft’s Kinect® at the present time. Furthermore, high-resolution optical sensors establish powerful, passive stereo vision systems in the field of indoor surveillance. Thereby they induce new application domains such as security and assistance systems for domestic environments. However, the constraint field of view can be still considered as an essential characteristic of all these technologies. For instance, in order to measure a volume in size of a living space, two to three deployed 3D sensors have to be applied nowadays. This is due to the fact that the commonly utilized perspective projection principle constrains the visible area to a field of view of approximately 120°. On the contrary, novel fish-eye lenses allow the realization of omnidirectional projection models. Therewith, the visible field of view can be enlarged up to more than 180°. In combination with a 3D measurement approach, thus, the number of required sensors for entire room coverage can be reduced considerably. Motivated by the requirements of the field of indoor surveillance, the present work focuses on the combination of the established stereo vision principle and omnidirectional projection methods. The entire 3D measurement of a living space by means of one single sensor can be considered as major objective. As a starting point for this thesis chapter 1 discusses the underlying requirement, referring to various relevant fields of application. Based on this, the distinct purpose for the present work is stated. The necessary mathematical foundations of computer vision are reflected in Chapter 2 subsequently. Based on the geometry of the optical imaging process, the projection characteristics of relevant principles are discussed and a generic method for modeling fish-eye cameras is selected. Chapter 3 deals with the extraction of depth information using classical (perceptively imaging) binocular stereo vision configurations. In addition to a complete recap of the processing chain, especially occurring measurement uncertainties are investigated. In the following, Chapter 4 addresses special methods to convert different projection models. The example of mapping an omnidirectional to a perspective projection is employed, in order to develop a method for accelerating this process and, hereby, for reducing the computational load associated therewith. Any errors that occur, as well as the necessary adjustment of image resolution, are an integral part of the investigation. As a practical example, an application for person tracking is utilized in order to demonstrate to which extend the usage of “virtual views“ can increase the recognition rate for people detectors in the context of omnidirectional monitoring. Subsequently, an extensive search with respect to omnidirectional imaging stereo vision techniques is conducted in chapter 5. It turns out that the complete 3D capture of a room is achievable by the generation of a hemispherical depth map. Therefore, three cameras have to be combined in order to form a trinocular stereo vision system. As a basis for further research, a known trinocular stereo vision method is selected. Furthermore, it is hypothesized that, applying a modified geometric constellation of cameras, more precisely in the form of an equilateral triangle, and using an alternative method to determine the depth map, the performance can be increased considerably. A novel method is presented, which shall require fewer operations to calculate the distance information and which is to avoid a computational costly step for depth map fusion as necessary in the comparative method. In order to evaluate the presented approach as well as the hypotheses, a hemispherical depth map is generated in Chapter 6 by means of the new method. Simulation results, based on artificially generated 3D space information and realistic system parameters, are presented and subjected to a subsequent error estimate. A demonstrator for generating real measurement information is introduced in Chapter 7. In addition, the methods that are applied for calibrating the system intrinsically as well as extrinsically are explained. It turns out that the calibration procedure utilized cannot estimate the extrinsic parameters sufficiently. Initial measurements present a hemispherical depth map and thus con.rm the operativeness of the concept, but also identify the drawbacks of the calibration used. The current implementation of the algorithm shows almost real-time behaviour. Finally, Chapter 8 summarizes the results obtained along the studies and discusses them in the context of comparable binocular and trinocular stereo vision approaches. For example the results of the simulations carried out produced a saving of up to 30% in terms of stereo correspondence operations in comparison with a referred trinocular method. Furthermore, the concept introduced allows the avoidance of a weighted averaging step for depth map fusion based on precision values that have to be calculated costly. The achievable accuracy is still comparable for both trinocular approaches. In summary, it can be stated that, in the context of the present thesis, a measurement system has been developed, which has great potential for future application fields in industry, security in public spaces as well as home environments.
7

A Novel Approach for Spherical Stereo Vision

Findeisen, Michel 23 April 2015 (has links)
The Professorship of Digital Signal Processing and Circuit Technology of Chemnitz University of Technology conducts research in the field of three-dimensional space measurement with optical sensors. In recent years this field has made major progress. For example innovative, active techniques such as the “structured light“-principle are able to measure even homogeneous surfaces and find its way into the consumer electronic market in terms of Microsoft’s Kinect® at the present time. Furthermore, high-resolution optical sensors establish powerful, passive stereo vision systems in the field of indoor surveillance. Thereby they induce new application domains such as security and assistance systems for domestic environments. However, the constraint field of view can be still considered as an essential characteristic of all these technologies. For instance, in order to measure a volume in size of a living space, two to three deployed 3D sensors have to be applied nowadays. This is due to the fact that the commonly utilized perspective projection principle constrains the visible area to a field of view of approximately 120°. On the contrary, novel fish-eye lenses allow the realization of omnidirectional projection models. Therewith, the visible field of view can be enlarged up to more than 180°. In combination with a 3D measurement approach, thus, the number of required sensors for entire room coverage can be reduced considerably. Motivated by the requirements of the field of indoor surveillance, the present work focuses on the combination of the established stereo vision principle and omnidirectional projection methods. The entire 3D measurement of a living space by means of one single sensor can be considered as major objective. As a starting point for this thesis chapter 1 discusses the underlying requirement, referring to various relevant fields of application. Based on this, the distinct purpose for the present work is stated. The necessary mathematical foundations of computer vision are reflected in Chapter 2 subsequently. Based on the geometry of the optical imaging process, the projection characteristics of relevant principles are discussed and a generic method for modeling fish-eye cameras is selected. Chapter 3 deals with the extraction of depth information using classical (perceptively imaging) binocular stereo vision configurations. In addition to a complete recap of the processing chain, especially occurring measurement uncertainties are investigated. In the following, Chapter 4 addresses special methods to convert different projection models. The example of mapping an omnidirectional to a perspective projection is employed, in order to develop a method for accelerating this process and, hereby, for reducing the computational load associated therewith. Any errors that occur, as well as the necessary adjustment of image resolution, are an integral part of the investigation. As a practical example, an application for person tracking is utilized in order to demonstrate to which extend the usage of “virtual views“ can increase the recognition rate for people detectors in the context of omnidirectional monitoring. Subsequently, an extensive search with respect to omnidirectional imaging stereo vision techniques is conducted in chapter 5. It turns out that the complete 3D capture of a room is achievable by the generation of a hemispherical depth map. Therefore, three cameras have to be combined in order to form a trinocular stereo vision system. As a basis for further research, a known trinocular stereo vision method is selected. Furthermore, it is hypothesized that, applying a modified geometric constellation of cameras, more precisely in the form of an equilateral triangle, and using an alternative method to determine the depth map, the performance can be increased considerably. A novel method is presented, which shall require fewer operations to calculate the distance information and which is to avoid a computational costly step for depth map fusion as necessary in the comparative method. In order to evaluate the presented approach as well as the hypotheses, a hemispherical depth map is generated in Chapter 6 by means of the new method. Simulation results, based on artificially generated 3D space information and realistic system parameters, are presented and subjected to a subsequent error estimate. A demonstrator for generating real measurement information is introduced in Chapter 7. In addition, the methods that are applied for calibrating the system intrinsically as well as extrinsically are explained. It turns out that the calibration procedure utilized cannot estimate the extrinsic parameters sufficiently. Initial measurements present a hemispherical depth map and thus con.rm the operativeness of the concept, but also identify the drawbacks of the calibration used. The current implementation of the algorithm shows almost real-time behaviour. Finally, Chapter 8 summarizes the results obtained along the studies and discusses them in the context of comparable binocular and trinocular stereo vision approaches. For example the results of the simulations carried out produced a saving of up to 30% in terms of stereo correspondence operations in comparison with a referred trinocular method. Furthermore, the concept introduced allows the avoidance of a weighted averaging step for depth map fusion based on precision values that have to be calculated costly. The achievable accuracy is still comparable for both trinocular approaches. In summary, it can be stated that, in the context of the present thesis, a measurement system has been developed, which has great potential for future application fields in industry, security in public spaces as well as home environments.:Abstract 7 Zusammenfassung 11 Acronyms 27 Symbols 29 Acknowledgement 33 1 Introduction 35 1.1 Visual Surveillance . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 35 1.2 Challenges in Visual Surveillance . . . . . . . . . . . . . . . . . . . . . . . 38 1.3 Outline of the Thesis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 41 2 Fundamentals of Computer Vision Geometry 43 2.1 Projective Geometry . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 43 2.1.1 Euclidean Space . . . . . . . . . . . . . . . . . . . . . . . . . . . . 43 2.1.2 Projective Space . . . . . . . . . . . . . . . . . . . . . . . . . . . . 44 2.2 Camera Geometry . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 45 2.2.1 Geometrical Imaging Process . . . . . . . . . . . . . . . . . . . . . 45 2.2.1.1 Projection Models . . . . . . . . . . . . . . . . . . . . . . 46 2.2.1.2 Intrinsic Model . . . . . . . . . . . . . . . . . . . . . . . . 47 2.2.1.3 Extrinsic Model . . . . . . . . . . . . . . . . . . . . . . . 50 2.2.1.4 Distortion Models . . . . . . . . . . . . . . . . . . . . . . 51 2.2.2 Pinhole Camera Model . . . . . . . . . . . . . . . . . . . . . . . . . 51 2.2.2.1 Complete Forward Model . . . . . . . . . . . . . . . . . . 52 2.2.2.2 Back Projection . . . . . . . . . . . . . . . . . . . . . . . 53 2.2.3 Equiangular Camera Model . . . . . . . . . . . . . . . . . . . . . . 54 2.2.4 Generic Camera Models . . . . . . . . . . . . . . . . . . . . . . . . 55 2.2.4.1 Complete Forward Model . . . . . . . . . . . . . . . . . . 56 2.2.4.2 Back Projection . . . . . . . . . . . . . . . . . . . . . . . 58 2.3 Camera Calibration Methods . . . . . . . . . . . . . . . . . . . . . . . . . 58 2.3.1 Perspective Camera Calibration . . . . . . . . . . . . . . . . . . . . 59 2.3.2 Omnidirectional Camera Calibration . . . . . . . . . . . . . . . . . 59 2.4 Two-View Geometry . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 60 2.4.1 Epipolar Geometry . . . . . . . . . . . . . . . . . . . . . . . . . . . 61 2.4.2 The Fundamental Matrix . . . . . . . . . . . . . . . . . . . . . . . 63 2.4.3 Epipolar Curves . . . . . . . . . . . . . . . . . . . . . . . . . . . . 64 3 Fundamentals of Stereo Vision 67 3.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 67 3.1.1 The Concept Stereo Vision . . . . . . . . . . . . . . . . . . . . . . 67 3.1.2 Overview of a Stereo Vision Processing Chain . . . . . . . . . . . . 68 3.2 Stereo Calibration . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 69 3.2.1 Extrinsic Stereo Calibration With Respect to the Projective Error 70 3.3 Stereo Rectification . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 72 3.3.1 A Compact Algorithm for Rectification of Stereo Pairs . . . . . . . 73 3.4 Stereo Correspondence . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 76 3.4.1 Disparity Computation . . . . . . . . . . . . . . . . . . . . . . . . 76 3.4.2 The Correspondence Problem . . . . . . . . . . . . . . . . . . . . . 77 3.5 Triangulation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 79 3.5.1 Depth Measurement . . . . . . . . . . . . . . . . . . . . . . . . . . 79 3.5.2 Range Field of Measurement . . . . . . . . . . . . . . . . . . . . . 80 3.5.3 Measurement Accuracy . . . . . . . . . . . . . . . . . . . . . . . . 80 3.5.4 Measurement Errors . . . . . . . . . . . . . . . . . . . . . . . . . . 81 3.5.4.1 Quantization Error . . . . . . . . . . . . . . . . . . . . . 82 3.5.4.2 Statistical Distribution of Quantization Errors . . . . . . 83 4 Virtual Cameras 87 4.1 Introduction and Related Works . . . . . . . . . . . . . . . . . . . . . . . 88 4.2 Omni to Perspective Vision . . . . . . . . . . . . . . . . . . . . . . . . . . 90 4.2.1 Forward Mapping . . . . . . . . . . . . . . . . . . . . . . . . . . . 90 4.2.2 Backward Mapping . . . . . . . . . . . . . . . . . . . . . . . . . . . 93 4.2.3 Fast Backward Mapping . . . . . . . . . . . . . . . . . . . . . . . . 96 4.3 Error Analysis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 99 4.4 Accuracy Analysis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 101 4.4.1 Intrinsics of the Source Camera . . . . . . . . . . . . . . . . . . . . 102 4.4.2 Intrinsics of the Target Camera . . . . . . . . . . . . . . . . . . . . 102 4.4.3 Marginal Virtual Pixel Size . . . . . . . . . . . . . . . . . . . . . . 104 4.5 Performance Measurements . . . . . . . . . . . . . . . . . . . . . . . . . . 109 4.6 Virtual Perspective Views for Real-Time People Detection . . . . . . . . . 110 5 Omnidirectional Stereo Vision 113 5.1 Introduction and Related Works . . . . . . . . . . . . . . . . . . . . . . . 113 5.1.1 Geometrical Configuration . . . . . . . . . . . . . . . . . . . . . . . 116 5.1.1.1 H-Binocular Omni-Stereo with Panoramic Views . . . . . 117 5.1.1.2 V-Binocular Omnistereo with Panoramic Views . . . . . 119 5.1.1.3 Binocular Omnistereo with Hemispherical Views . . . . . 120 5.1.1.4 Trinocular Omnistereo . . . . . . . . . . . . . . . . . . . 122 5.1.1.5 Miscellaneous Configurations . . . . . . . . . . . . . . . . 125 5.2 Epipolar Rectification . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 126 5.2.1 Cylindrical Rectification . . . . . . . . . . . . . . . . . . . . . . . . 127 5.2.2 Epipolar Equi-Distance Rectification . . . . . . . . . . . . . . . . . 128 5.2.3 Epipolar Stereographic Rectification . . . . . . . . . . . . . . . . . 128 5.2.4 Comparison of Rectification Methods . . . . . . . . . . . . . . . . 129 5.3 A Novel Spherical Stereo Vision Setup . . . . . . . . . . . . . . . . . . . . 129 5.3.1 Physical Omnidirectional Camera Configuration . . . . . . . . . . 131 5.3.2 Virtual Rectified Cameras . . . . . . . . . . . . . . . . . . . . . . . 131 6 A Novel Spherical Stereo Vision Algorithm 135 6.1 Matlab Simulation Environment . . . . . . . . . . . . . . . . . . . . . . . 135 6.2 Extrinsic Configuration . . . . . . . . . . . . . . . . . . . . . . . . . . . . 136 6.3 Physical Camera Configuration . . . . . . . . . . . . . . . . . . . . . . . . 137 6.4 Virtual Camera Configuration . . . . . . . . . . . . . . . . . . . . . . . . . 137 6.4.1 The Focal Length . . . . . . . . . . . . . . . . . . . . . . . . . . . 138 6.4.2 Prediscussion of the Field of View . . . . . . . . . . . . . . . . . . 138 6.4.3 Marginal Virtual Pixel Sizes . . . . . . . . . . . . . . . . . . . . . . 139 6.4.4 Calculation of the Field of View . . . . . . . . . . . . . . . . . . . 142 6.4.5 Calculation of the Virtual Pixel Size Ratios . . . . . . . . . . . . . 143 6.4.6 Results of the Virtual Camera Parameters . . . . . . . . . . . . . . 144 6.5 Spherical Depth Map Generation . . . . . . . . . . . . . . . . . . . . . . . 147 6.5.1 Omnidirectional Imaging Process . . . . . . . . . . . . . . . . . . . 148 6.5.2 Rectification Process . . . . . . . . . . . . . . . . . . . . . . . . . . 148 6.5.3 Rectified Depth Map Generation . . . . . . . . . . . . . . . . . . . 150 6.5.4 Spherical Depth Map Generation . . . . . . . . . . . . . . . . . . . 151 6.5.5 3D Reprojection . . . . . . . . . . . . . . . . . . . . . . . . . . . . 153 6.6 Error Analysis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 154 7 Stereo Vision Demonstrator 163 7.1 Physical System Setup . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 163 7.2 System Calibration Strategy . . . . . . . . . . . . . . . . . . . . . . . . . . 165 7.2.1 Intrinsic Calibration of the Physical Cameras . . . . . . . . . . . . 165 7.2.2 Extrinsic Calibration of the Physical and the Virtual Cameras . . 166 7.2.2.1 Extrinsic Initialization of the Physical Cameras . . . . . 167 7.2.2.2 Extrinsic Initialization of the Virtual Cameras . . . . . . 167 7.2.2.3 Two-View Stereo Calibration and Rectification . . . . . . 167 7.2.2.4 Three-View Stereo Rectification . . . . . . . . . . . . . . 168 7.2.2.5 Extrinsic Calibration Results . . . . . . . . . . . . . . . . 169 7.3 Virtual Camera Setup . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 171 7.4 Software Realization . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 172 7.5 Experimental Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 172 7.5.1 Qualitative Assessment . . . . . . . . . . . . . . . . . . . . . . . . 172 7.5.2 Performance Measurements . . . . . . . . . . . . . . . . . . . . . . 174 8 Discussion and Outlook 177 8.1 Discussion of the Current Results and Further Need for Research . . . . . 177 8.1.1 Assessment of the Geometrical Camera Configuration . . . . . . . 178 8.1.2 Assessment of the Depth Map Computation . . . . . . . . . . . . . 179 8.1.3 Assessment of the Depth Measurement Error . . . . . . . . . . . . 182 8.1.4 Assessment of the Spherical Stereo Vision Demonstrator . . . . . . 183 8.2 Review of the Different Approaches for Hemispherical Depth Map Generation184 8.2.1 Comparison of the Equilateral and the Right-Angled Three-View Approach . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 184 8.2.2 Review of the Three-View Approach in Comparison with the Two- View Method . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 185 8.3 A Sample Algorithm for Human Behaviour Analysis . . . . . . . . . . . . 187 8.4 Closing Remarks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 189 A Relevant Mathematics 191 A.1 Cross Product by Skew Symmetric Matrix . . . . . . . . . . . . . . . . . . 191 A.2 Derivation of the Quantization Error . . . . . . . . . . . . . . . . . . . . . 191 A.3 Derivation of the Statistical Distribution of Quantization Errors . . . . . . 192 A.4 Approximation of the Quantization Error for Equiangular Geometry . . . 194 B Further Relevant Publications 197 B.1 H-Binocular Omnidirectional Stereo Vision with Panoramic Views . . . . 197 B.2 V-Binocular Omnidirectional Stereo Vision with Panoramic Views . . . . 198 B.3 Binocular Omnidirectional Stereo Vision with Hemispherical Views . . . . 200 B.4 Trinocular Omnidirectional Stereo Vision . . . . . . . . . . . . . . . . . . 201 B.5 Miscellaneous Configurations . . . . . . . . . . . . . . . . . . . . . . . . . 202 Bibliography 209 List of Figures 223 List of Tables 229 Affidavit 231 Theses 233 Thesen 235 Curriculum Vitae 237

Page generated in 0.4121 seconds