31

Etalonnage de caméras à champs disjoints et reconstruction 3D : Application à un robot mobile / Non-overlapping camera calibration and 3D reconstruction : Application to Vision-Based Robotics

Lébraly, Pierre 18 January 2012
This work was carried out within the VIPA ("Automatic Electric Vehicle for Passenger Transportation") project, in which LASMEA and its partners developed vehicles able to navigate autonomously, without any dedicated external infrastructure, in urban environments (parking lots, pedestrian areas, airports). Two cameras are rigidly mounted on the vehicle: one at the front and one at the back. Before it can be used for autonomous navigation, the vehicle has to be calibrated and driven manually in order to build the visual 3D landmark map in which it will later navigate automatically (calibration and learning steps). The goal of this thesis is to develop and apply user-friendly methods for calibrating this set of cameras with completely non-overlapping fields of view. After a preliminary intrinsic calibration step and a state of the art on multi-camera rigs, we develop and test several methods for extrinsic calibration of non-overlapping cameras (i.e., estimation of the relative camera poses). The first method uses a planar mirror to create an overlap between the views of the different cameras. The second manoeuvres the vehicle while each camera observes a static scene composed of targets detected with sub-pixel accuracy. In the third, we solve the 3D reconstruction and extrinsic calibration problems simultaneously (the learning step can be used for this purpose), relying on visual features such as interest points; to this end, a multi-camera bundle adjustment is proposed and implemented with sparse data structures. Lastly, we present a calibration that determines the orientation of the multi-camera rig relative to the vehicle.
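Illustrative sketch (not the thesis's implementation): a toy, dense version of a multi-camera bundle adjustment in which every frame shares one rig pose and each camera keeps a fixed camera-to-rig extrinsic, so the non-overlapping extrinsics are refined jointly with the structure. The function names, parameter layout and distortion-free pinhole model are assumptions for the example; the cost can be passed to scipy.optimize.least_squares, which can also be told about the sparsity such a problem exhibits.

```python
import numpy as np
from scipy.spatial.transform import Rotation as R

def pose(rvec, tvec):
    """4x4 transform from an axis-angle vector (3,) and a translation (3,)."""
    T = np.eye(4)
    T[:3, :3] = R.from_rotvec(rvec).as_matrix()
    T[:3, 3] = tvec
    return T

def project(K, T_cam_world, X_world):
    """Pinhole projection (no distortion) of a world point into pixels."""
    Xc = T_cam_world[:3, :3] @ X_world + T_cam_world[:3, 3]
    uvw = K @ Xc
    return uvw[:2] / uvw[2]

def rig_ba_residuals(params, K, obs, n_frames, n_cams, n_pts):
    """Toy multi-camera bundle adjustment cost: params stacks one 6-DoF rig pose
    per frame, one fixed 6-DoF camera-to-rig extrinsic per camera, and the 3D
    points; obs is a list of (frame, cam, point, u, v) image observations."""
    rig = params[:6 * n_frames].reshape(n_frames, 6)
    ext = params[6 * n_frames:6 * (n_frames + n_cams)].reshape(n_cams, 6)
    pts = params[6 * (n_frames + n_cams):].reshape(n_pts, 3)
    res = []
    for f, c, p, u, v in obs:
        T_world_cam = pose(rig[f, :3], rig[f, 3:]) @ pose(ext[c, :3], ext[c, 3:])
        res.append(project(K, np.linalg.inv(T_world_cam), pts[p]) - (u, v))
    return np.concatenate(res)
```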
32

Approches 2D/2D pour le SFM à partir d'un réseau de caméras asynchrones / 2D/2D approaches for SFM using an asynchronous multi-camera network

Mhiri, Rawia 14 December 2015
Driver assistance systems and autonomous vehicles have reached a certain maturity in recent years thanks to advanced technologies. A fundamental step for these systems is the estimation of motion and structure (Structure from Motion), which supports several tasks, including obstacle and road-marking detection, localization and mapping. To estimate their motion, such systems typically rely on relatively expensive sensors; to be marketed on a large scale, applications must instead work with low-cost devices, and vision systems are a good alternative in this context. A new method based on 2D/2D approaches over an asynchronous multi-camera network is presented to recover motion and 3D structure at absolute scale, with particular care taken to estimate the scale factors. The proposed method, called the triangle method, uses three images forming a triangle: two images from the same camera and one image from a neighbouring camera. The algorithm makes three assumptions: the cameras share pairwise common fields of view, the path between two consecutive images from the same camera is approximated by a line segment, and the cameras are calibrated. The extrinsic calibration between two cameras, combined with the assumption of rectilinear motion of the system, makes it possible to estimate the absolute scale factors. The proposed method is accurate and robust for straight trajectories and gives satisfactory results in curves. To refine the initial estimate, errors due to inaccuracies in the scale estimation are reduced by an optimization step: a local bundle adjustment applied only to the absolute scale factors and the 3D points. The approach is validated on sequences of real road scenes and evaluated against ground truth obtained with a differential GPS. Finally, a fundamental application in driver assistance and automated driving is road and obstacle detection; a first approach for an asynchronous system, based on sparse disparity maps, is presented.
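A schematic reading of the scale-recovery step (an illustration, not the thesis's exact formulation): the scale-free translation directions recovered from 2D/2D epipolar geometry, the straight-line-motion assumption and the known metric baseline between the two cameras combine into a small linear system whose unknowns are the absolute scales. All names below are hypothetical.

```python
import numpy as np

def triangle_scales(u_motion, u_cross, R_t, baseline):
    """Schematic scale recovery for the 'triangle' configuration.

    u_motion : unit direction of camera-1 displacement between its two frames
               (from essential-matrix decomposition, scale-free).
    u_cross  : unit direction from camera 1 (first frame) to the neighbouring
               camera's frame (also scale-free).
    R_t      : rotation of the rig at the neighbouring camera's timestamp,
               expressed in the first frame (close to identity for straight motion).
    baseline : known metric lever arm from camera 1 to camera 2 (extrinsics).

    Solves  s_cross * u_cross - m * u_motion = R_t @ baseline  in the least-squares
    sense, where m is the metric length of the portion of camera-1 motion already
    travelled; the known metric baseline is what injects the absolute scale.
    """
    A = np.column_stack((u_cross, -u_motion))      # 3 x 2 system matrix
    b = R_t @ baseline                             # right-hand side (metric)
    (s_cross, m), *_ = np.linalg.lstsq(A, b, rcond=None)
    return s_cross, m
```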
33

SPATIAL AND TEMPORAL SYSTEM CALIBRATION OF GNSS/INS-ASSISTED FRAME AND LINE CAMERAS ONBOARD UNMANNED AERIAL VEHICLES

Lisa Marie Laforest (9188615) 31 July 2020
Unmanned aerial vehicles (UAVs) equipped with imaging systems and integrated global navigation satellite system/inertial navigation system (GNSS/INS) are used for a variety of applications. Disaster relief, infrastructure monitoring, precision agriculture, and ecological forestry growth monitoring are among the applications that utilize UAV imaging systems. For most applications, accurate 3D spatial information from the UAV imaging system is required. Deriving reliable 3D coordinates is conditioned on accurate geometric calibration. Geometric calibration entails both spatial and temporal calibration. Spatial calibration consists of obtaining accurate internal characteristics of the imaging sensor as well as estimating the mounting parameters between the imaging and the GNSS/INS units. Temporal calibration ensures that there is little to no time delay between the image timestamps and the corresponding GNSS/INS position and orientation timestamps. Manual and automated spatial calibration have been successfully accomplished on a variety of platforms and sensors, including UAVs equipped with frame and push-broom line cameras. However, manual and automated temporal calibration has not been demonstrated on both frame and line camera systems without the use of ground control points (GCPs). This research focuses on manual and automated spatial and temporal system calibration for UAVs equipped with GNSS/INS frame and line camera systems. For frame cameras, the research introduces two approaches (direct and indirect) to correct for the time delay between GNSS/INS-recorded event markers and the actual times of image exposure. To ensure the best estimates of system parameters without the use of ground control points, an optimal flight configuration for system calibration while estimating time delay is rigorously derived. For line camera systems, this research presents the direct approach for estimating system calibration parameters, including time delay, within the bundle block adjustment. The optimal flight configuration is also rigorously derived for line camera systems, and a bias impact analysis is carried out; it shows that the indirect approach is not a feasible solution for push-broom line cameras onboard UAVs, owing to the limited ability of line cameras to decouple system parameters, a finding confirmed by experimental results. Lastly, this research demonstrates that for frame and line camera systems, the direct approach can be fully automated by incorporating structure from motion (SfM)-based tie point features. Methods for feature detection and matching for frame and line camera systems are presented. This research also presents the necessary changes in the bundle adjustment with self-calibration to successfully incorporate a large number of automatically derived tie points. For frame cameras, the results show that the direct and indirect approaches are capable of estimating and correcting this time delay. When a time delay exists and the direct or indirect approach is applied, horizontal accuracy of 1–3 times the ground sampling distance (GSD) can be achieved without the use of any GCPs. For line camera systems, the results of the direct approach show that when a time delay exists and spatial and temporal calibration is performed, vertical and horizontal accuracies are approximately equal to the GSD of the sensor. Furthermore, when a large artificial time delay is introduced for line camera systems, the direct approach still achieves accuracy below the GSD of the system and performs 2.5–8 times better in the horizontal components and up to 18 times better in the vertical component than when temporal calibration is not performed. Lastly, the results show that automated tie points can be successfully extracted for frame and line camera systems and that those tie point features can be incorporated into a fully automated bundle adjustment with self-calibration, including time delay estimation. This fully automated calibration accurately estimates the system parameters and demonstrates absolute accuracy similar to that of manually measured tie/checkpoints without the use of GCPs.
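A minimal sketch of the "direct" idea of treating the time delay as an extra unknown (an assumed reading, not the dissertation's full bundle block adjustment): the GNSS/INS trajectory is re-sampled at the image timestamp shifted by the unknown delay dt before the reprojection residual is formed, so dt can be estimated alongside the other parameters. All names and the simple interpolation scheme are illustrative.

```python
import numpy as np
from scipy.spatial.transform import Rotation as R, Slerp

def pose_at(t, traj_t, traj_q, traj_p):
    """Interpolate GNSS/INS orientation (slerp) and position (linear) at time t."""
    slerp = Slerp(traj_t, R.from_quat(traj_q))
    p = np.array([np.interp(t, traj_t, traj_p[:, k]) for k in range(3)])
    return slerp([t])[0].as_matrix(), p

def reprojection_residual(dt, img_time, traj_t, traj_q, traj_p,
                          K, R_cam_body, t_cam_body, X_world, uv_obs):
    """Residual for one image observation with an unknown time delay dt:
    the body pose is sampled at (image timestamp + dt) before projecting."""
    R_wb, p_wb = pose_at(img_time + dt, traj_t, traj_q, traj_p)
    R_wc = R_wb @ R_cam_body                 # camera orientation in world frame
    p_wc = p_wb + R_wb @ t_cam_body          # camera position (lever arm applied)
    Xc = R_wc.T @ (X_world - p_wc)           # point in the camera frame
    uvw = K @ Xc
    return uvw[:2] / uvw[2] - uv_obs
```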
34

Direction estimation using visual odometry / Uppskattning av riktning med visuell odometri

Masson, Clément January 2015
This Master's thesis tackles the problem of measuring the directions of objects from a motionless observation point. A new method is proposed, based on a single rotating camera and requiring knowledge of the directions of only two (or more) landmarks. In a first phase, multi-view geometry is used to estimate camera rotations and the directions of key elements from a set of overlapping images. In a second phase, the direction of any object can then be estimated by resectioning the camera associated with a picture showing that object. A detailed description of the algorithmic chain is given, along with test results on both synthetic data and real images taken with an infrared camera.
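The two-phase idea lends itself to a classical formulation; below is a hedged sketch using the standard Wahba/Kabsch solution for the camera rotation from two or more known landmark directions, after which any pixel can be mapped to a world direction. This is an illustration in the spirit of the abstract, not the thesis's algorithmic chain.

```python
import numpy as np

def rotation_from_directions(world_dirs, cam_bearings):
    """Wahba/Kabsch solution: rotation R (camera -> world) such that
    world_dirs[i] ~= R @ cam_bearings[i], given two or more unit vectors."""
    H = np.asarray(cam_bearings).T @ np.asarray(world_dirs)   # 3x3 correlation
    U, _, Vt = np.linalg.svd(H)
    D = np.diag([1.0, 1.0, np.sign(np.linalg.det(Vt.T @ U.T))])
    return Vt.T @ D @ U.T

def pixel_to_world_direction(K, R_cw, uv):
    """Back-project a pixel to a unit direction expressed in the world frame."""
    ray_cam = np.linalg.inv(K) @ np.array([uv[0], uv[1], 1.0])
    d = R_cw @ ray_cam
    return d / np.linalg.norm(d)
```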
35

Registration and Localization of Unknown Moving Objects in Markerless Monocular SLAM

Troutman, Blake 05 1900
Indiana University-Purdue University Indianapolis (IUPUI) / Simultaneous localization and mapping (SLAM) is a general device localization technique that uses realtime sensor measurements to develop a virtualization of the sensor's environment while also using this growing virtualization to determine the position and orientation of the sensor. This is useful for augmented reality (AR), in which a user looks through a head-mounted display (HMD) or viewfinder to see virtual components integrated into the real world. Visual SLAM (i.e., SLAM in which the sensor is an optical camera) is used in AR to determine the exact device/headset movement so that the virtual components can be accurately redrawn to the screen, matching the perceived motion of the world around the user as the user moves the device/headset. However, many potential AR applications may need access to more than device localization data in order to be useful; they may need to leverage environment data as well. Additionally, most SLAM solutions make the naive assumption that the environment surrounding the system is completely static (non-moving). Given these circumstances, it is clear that AR may benefit substantially from utilizing a SLAM solution that detects objects that move in the scene and ultimately provides localization data for each of these objects. This problem is known as the dynamic SLAM problem. Current attempts to address the dynamic SLAM problem often use machine learning to develop models that identify the parts of the camera image that belong to one of many classes of potentially-moving objects. The limitation with these approaches is that it is impractical to train models to identify every possible object that moves; additionally, some potentially-moving objects may be static in the scene, which these approaches often do not account for. Some other attempts to address the dynamic SLAM problem also localize the moving objects they detect, but these systems almost always rely on depth sensors or stereo camera configurations, which have significant limitations in real-world use cases. This dissertation presents a novel approach for registering and localizing unknown moving objects in the context of markerless, monocular, keyframe-based SLAM with no required prior information about object structure, appearance, or existence. This work also details a novel deep learning solution for determining SLAM map initialization suitability in structure-from-motion-based initialization approaches. This dissertation goes on to validate these approaches by implementing them in a markerless, monocular SLAM system called LUMO-SLAM, which is built from the ground up to demonstrate this approach to unknown moving object registration and localization. Results are collected for the LUMO-SLAM system, which address the accuracy of its camera localization estimates, the accuracy of its moving object localization estimates, and the consistency with which it registers moving objects in the scene. These results show that this solution to the dynamic SLAM problem, though it does not act as a practical solution for all use cases, has an ability to accurately register and localize unknown moving objects in such a way that makes it useful for some applications of AR without thwarting the system's ability to also perform accurate camera localization.
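For illustration only (this is a generic cue, not the LUMO-SLAM registration method): one common way to obtain candidate moving-object points in monocular SLAM is to flag map points whose reprojection error under the static-world camera pose remains large.

```python
import numpy as np

def flag_dynamic_points(K, T_cam_world, pts_world, uv_obs, px_thresh=4.0):
    """Generic dynamic-point cue: points whose reprojection error under the
    static-world camera pose exceeds a pixel threshold are flagged as candidates
    belonging to an independently moving object."""
    Rcw, tcw = T_cam_world[:3, :3], T_cam_world[:3, 3]
    Xc = (Rcw @ pts_world.T).T + tcw          # N x 3 points in the camera frame
    uv = (K @ Xc.T).T
    uv = uv[:, :2] / uv[:, 2:3]               # perspective division
    err = np.linalg.norm(uv - uv_obs, axis=1)
    return err > px_thresh                    # boolean mask of candidates
```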
36

Structure from Motion with Unstructured RGBD Data

Svensson, Niclas January 2021
This thesis covers the topic of depth-assisted Structure from Motion (SfM). In classic SfM, the goal is to reconstruct a 3D scene using only a set of unstructured RGB images. This thesis adds the depth dimension to the problem formulation and consequently builds a system that accepts a set of RGBD images. The problem is addressed by modifying an existing SfM pipeline, in particular its Bundle Adjustment (BA) step. Comparisons between the modified and baseline frameworks lead to two main conclusions. First, the accuracy of the framework increases in most situations; the difference is most significant when the captured scene is covered only from a small sector, although noisy data can degrade the modified pipeline's performance. Secondly, the run time of the framework is significantly reduced. A discussion of how to modify other parts of the pipeline is given in the conclusion of the report.
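A minimal sketch of how a depth measurement can enter the bundle adjustment cost (an assumption about the kind of modification described, not the thesis's exact formulation): each observation contributes the usual reprojection residual plus a weighted residual between the point's camera-frame depth and the RGB-D measurement. Stacking such residuals over all observations and minimizing with a non-linear least-squares solver gives a depth-assisted BA.

```python
import numpy as np

def rgbd_ba_residual(K, R_cw, t_cw, X_world, uv_obs, depth_obs, w_depth=1.0):
    """One observation's residual for a depth-assisted bundle adjustment:
    a 2-vector reprojection error, plus a weighted residual between the
    point's depth in the camera frame and the RGB-D depth measurement."""
    Xc = R_cw @ X_world + t_cw
    uvw = K @ Xc
    r_reproj = uvw[:2] / uvw[2] - uv_obs
    r_depth = w_depth * (Xc[2] - depth_obs)
    return np.concatenate([r_reproj, [r_depth]])
```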
37

3D real time object recognition

Amplianitis, Konstantinos 01 March 2017
Object recognition is a natural process of the human brain, performed in the visual cortex, and relies on binocular depth perception to render a three-dimensional representation of the objects in a scene. Computer and software systems have been used to simulate the perception of three-dimensional environments with the aid of sensors that capture real-time images. Such images serve as input data for the analysis and development of algorithms, an essential ingredient for simulating the complexity of human vision and achieving scene interpretation for object recognition, similar to the way the human brain perceives it. The rapid pace of technological advancement in hardware and software is continuously bringing machine-based object recognition closer to the human vision prototype. The key in this field is the development of algorithms that achieve robust scene interpretation. Considerable and significant effort has been successfully devoted over the years to 2D object recognition, as opposed to 3D. The scope of this dissertation is therefore to contribute to the enhancement of 3D object recognition, that is, a better interpretation and understanding of reality and of the relationships between objects in a scene. Using low-cost commodity sensors such as the Microsoft Kinect, RGB and depth data of a scene are retrieved and manipulated to generate human-like visual perception data. The goal is to show how RGB and depth information can be used to develop a new class of 3D object recognition algorithms, analogous to the perception processed by the human brain.
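As a small, hedged example of turning Kinect-style RGB-D data into 3D input for such algorithms: back-projecting the depth image through an assumed pinhole model yields a camera-frame point cloud. The intrinsics in the usage line are nominal Kinect-like values, not calibrated ones.

```python
import numpy as np

def depth_to_pointcloud(depth_m, fx, fy, cx, cy):
    """Back-project a depth image (metres) into camera-frame 3D points
    using a pinhole model."""
    h, w = depth_m.shape
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    z = depth_m
    x = (u - cx) * z / fx
    y = (v - cy) * z / fy
    return np.dstack([x, y, z]).reshape(-1, 3)

# e.g. cloud = depth_to_pointcloud(depth, fx=525.0, fy=525.0, cx=319.5, cy=239.5)
```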
38

使用光束調整法與多張影像做相機效正與三維模型重建 / Using bundle adjustment for camera Calibration and 3D reconstruction from multiple images

蔡政君, Tsai, Jeng Jiun Unknown Date
Automated 3D modeling requires accurate 3D points, and the positions of those points depend on the accuracy of the corresponding image points, so finding reliable correspondences has long been a research topic in this area. Using Sparse Bundle Adjustment (SBA) to optimize the camera parameters is also common practice; however, a few 3D points with large errors can strongly affect the SBA result. In this work, we establish corresponding points and their geometric relationships from multi-view images, and 3D patches are used to refine the point positions. We translate each 3D point along its normal to obtain candidate patches and project them into the visible images; the Normalized Cross Correlation (NCC) values between patches in the reference image and patches in the visible images are used to estimate the best corresponding points. These refined correspondences are then used to obtain better camera parameters through SBA. Because outliers usually exist in the observed data and would bias the SBA result, we use a robust estimation method to resist them: in our experiments, outliers are filtered out to reduce the reprojection error. With the resulting, more precise camera parameters, we can automatically reconstruct a 3D model whose shape is closer to the real object.
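A short sketch of the photometric score used to compare projected patches (standard NCC; variable names are illustrative): values close to 1 indicate that the patch obtained for a candidate 3D position matches the reference patch well, so sweeping the point along its normal and keeping the best-scoring candidate refines the correspondence.

```python
import numpy as np

def ncc(patch_a, patch_b, eps=1e-9):
    """Normalized cross-correlation between two equally sized patches;
    values near 1 indicate a good photometric match."""
    a = patch_a.astype(np.float64).ravel()
    b = patch_b.astype(np.float64).ravel()
    a -= a.mean()
    b -= b.mean()
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + eps))
```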
39

SLAM temporel à contraintes multiples / Multiple constraints and temporal SLAM

Ramadasan, Datta 15 December 2015
This report describes my doctoral work, conducted within the ComSee (Computers that See) team of the ISPR axis (Image, Perception Systems and Robotics) of Institut Pascal and financed by the Auvergne Région and the European Regional Development Fund. The thesis was motivated by localization problems in Augmented Reality and autonomous navigation. The framework developed during this thesis is a generic approach for implementing SLAM (Simultaneous Localization And Mapping) algorithms. The proposed approach uses multiple constraints in the localization and mapping processes; these constraints come from sensor data and also from prior knowledge given by the application context. Each constraint is incorporated into a single optimization algorithm in order to improve the motion estimate and the accuracy of the map. Three problems are tackled. The first concerns constraints on the map, used to accurately estimate the pose of partially known 3D objects in the environment. The second is the fusion of heterogeneous and asynchronous data coming from multiple sensors within a single optimization algorithm. The last is the automatic and efficient generation of multi-constraint optimization algorithms, with the objective of providing a real-time solution to multi-constraint SLAM problems. A generic approach is used to design the framework and to generate the different configurations, according to the constraints, of each SLAM problem. Particular attention is paid to low computational requirements (in terms of memory and CPU) while maintaining high portability. Moreover, meta-programming techniques are used to automatically generate the most complex parts of the code for a given problem. The optimization library LMA, developed during this thesis, is made available to the community as open source. Experiments are carried out on both synthetic and real data: an exhaustive benchmark shows the performance of the LMA library compared with the most widely used state-of-the-art alternatives, and the SLAM framework is applied to problems of increasing difficulty and numbers of constraints. Augmented Reality and autonomous navigation applications demonstrate real-time performance and an accuracy that grows with the number of constraints used.
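For orientation, here is a bare-bones dense Levenberg-Marquardt loop over a stack of heterogeneous residual functions. It is a didactic sketch only: the LMA library described above relies on sparse, meta-programmed solvers and analytic structure that this toy version does not have.

```python
import numpy as np

def levenberg_marquardt(residual_fns, x0, n_iter=50, lam=1e-3):
    """Dense Levenberg-Marquardt over a stack of residual functions, each taking
    the parameter vector x and returning a residual vector; Jacobians are taken
    by forward differences."""
    x = np.asarray(x0, dtype=float)

    def stack(x):
        return np.concatenate([f(x) for f in residual_fns])

    for _ in range(n_iter):
        r = stack(x)
        J = np.empty((r.size, x.size))
        for j in range(x.size):                      # forward-difference Jacobian
            dx = np.zeros_like(x)
            dx[j] = 1e-6
            J[:, j] = (stack(x + dx) - r) / 1e-6
        H = J.T @ J + lam * np.eye(x.size)           # damped normal equations
        step = np.linalg.solve(H, -J.T @ r)
        if np.linalg.norm(stack(x + step)) < np.linalg.norm(r):
            x, lam = x + step, lam * 0.5             # accept step, relax damping
        else:
            lam *= 10.0                              # reject step, increase damping
    return x
```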
40

L'ajustement de faisceaux contraint comme cadre d'unification des méthodes de localisation : application à la réalité augmentée sur des objets 3D / Constrained bundle adjustment as a framework for unifying localization methods: application to augmented reality on 3D objects

Tamaazousti, Mohamed 13 March 2013
This thesis tackles the problem of real-time localization of a monocular camera. The methods in the literature can be classified into three categories. The first considers a camera moving in a completely unknown environment (SLAM): primitives observed in the images are reconstructed online and this reconstruction is used to localize the camera. The other two categories localize the camera with respect to a 3D object in the scene, relying on a priori knowledge of a model of that object (model-based tracking). One of them uses only the 3D model of the object to localize the camera; the other can be considered intermediate between SLAM and model-based tracking, localizing the camera with respect to the object of interest by using both the 3D model of the object and an online reconstruction of its primitives. This online reconstruction can be regarded as an update of the initial 3D model (model-based tracking with update). Each of these methods has advantages and drawbacks. In this thesis, we propose a solution that unifies all of these localization methods in a single framework, referred to as constrained SLAM, retaining their benefits while limiting their respective disadvantages. In particular, we consider a camera moving in a partially known environment, i.e. one for which a 3D model (geometric or photometric) of a static object in the scene is available. The objective is then to accurately estimate the pose (position and orientation) of the camera with respect to this object. The absolute information provided by the 3D model is used to improve SLAM localization by including it directly in the bundle adjustment process. To handle a wide range of 3D objects and scenes, several types of constraints are proposed and grouped into two approaches. The first unifies SLAM and model-based tracking by constraining the camera trajectory through the projection, in the images, of existing 3D primitives extracted from the model. The second unifies SLAM and model-based tracking with update by constraining the 3D primitives reconstructed by SLAM to lie on the surface of the model (unified SLAM and model update). The benefits of these constrained bundle adjustments, in terms of accuracy, registration stability and robustness to occlusions, are demonstrated on a large set of synthetic and real data. Real-time augmented reality applications are also presented on different types of 3D objects. This work has resulted in four international publications, two national publications and one patent filing.
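A toy version of the second family of constraints (an assumed rendering of the idea, not the thesis's formulation): the bundle adjustment residual for a point associated with the object gains a term that pulls the reconstructed point onto a locally planar piece of the known model surface.

```python
import numpy as np

def constrained_point_residual(K, R_cw, t_cw, X_world, uv_obs,
                               plane_n, plane_d, w_model=1.0):
    """Reprojection residual plus a model constraint: the reconstructed point is
    pulled onto a locally planar face of the known 3D object, written as a
    signed point-to-plane distance n.X + d = 0."""
    Xc = R_cw @ X_world + t_cw
    uvw = K @ Xc
    r_reproj = uvw[:2] / uvw[2] - uv_obs
    r_model = w_model * (plane_n @ X_world + plane_d)
    return np.concatenate([r_reproj, [r_model]])
```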
