101

Learning objects model and context for recognition and localisation / Apprentissage de modèles et contextes d'objets pour la reconnaissance et la localisation

Manfredi, Guido 18 September 2015 (has links)
This thesis addresses the modeling, recognition and localization of objects, and the use of context, for object manipulation by a robot. We start by presenting the modeling process and its components: the real system, the sensors' data, the properties to reproduce, and the model. We show how, by specifying each of them, one can define a modeling process adapted to the problem at hand, namely object manipulation by a robot. This analysis leads us to the adoption of local textured descriptors for object modeling. Modeling with local textured descriptors is not a new concept; it is the subject of many Structure from Motion (SfM) and Simultaneous Localization and Mapping (SLAM) works. Existing methods include Bundler, the RoboEarth modeler and 123DCatch. Still, no method has gained widespread adoption. By implementing a similar approach, we show that these tools are hard to use even for expert users and that they produce highly complex models. Such complexity is necessary to guarantee the robustness of the model to viewpoint changes. There are two ways to handle the problem: the multiple-views paradigm and the robust-features paradigm. The multiple-views paradigm advocates building the model from a large number of views of the object; the robust-features paradigm relies on descriptors able to resist large viewpoint changes. We present a set of experiments to provide insight into the right balance between the two. By varying the number of views and using different features, we show that small, fast models can provide robustness to viewpoint changes up to bounded blind spots, which a mobile robot can compensate for by adopting new viewpoints. We then propose four methods to build simple models from images only, with as little a priori information as possible. The first applies to planar or piecewise-planar objects and relies on homographies for localization (see the sketch below). The second is applicable to objects with simple geometry, such as cylinders or spheres, but requires many measures of the object. The third requires a calibrated 3D sensor but no additional information. The fourth needs no a priori information at all; we apply it to autonomous modeling of grocery objects, building models from images automatically retrieved from a grocery store website that allow recognition and localization for tracking. Even with light models, real situations require numerous object models to be stored and processed. This raises two problems: complexity (processing many models quickly) and ambiguity (distinguishing similar objects). We propose to solve both by using contextual information, that is, any information that helps recognition but is not directly provided by the sensors. We focus on two contextual cues: the place and the surrounding objects. Some objects are mainly found in particular places; knowing the current place therefore restricts the possible identities of an observed object. We propose a method by which a robot autonomously explores a previously labeled environment and learns the correspondence between objects and places. This information can then be used in a cascade combining simple visual descriptors and context; the experiments show that, for some objects, recognition can be achieved with as few as two simple features when the location is used as context. The objects surrounding a given object can also serve as context: a keyboard, a mouse and a monitor are often found close together. We use qualitative spatial descriptors to describe the position of objects with respect to their neighbors and, using a Markov Logic Network, learn patterns in object arrangement. This information can then be used to recognize an object once its surrounding objects have been identified. This thesis stresses the good match between robotics, context and object recognition.
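The abstract only names the homography-based method for planar objects; as a rough illustration of that idea (a minimal OpenCV sketch under our own assumptions, not the author's implementation; file names are placeholders):

```python
# Hedged sketch: planar-object localization via local features and a
# homography, in the spirit of the first modeling method described above.
import cv2
import numpy as np

model = cv2.imread("model.png", cv2.IMREAD_GRAYSCALE)  # one view of the planar object
scene = cv2.imread("scene.png", cv2.IMREAD_GRAYSCALE)  # current camera image

orb = cv2.ORB_create(nfeatures=1000)
kp_m, des_m = orb.detectAndCompute(model, None)
kp_s, des_s = orb.detectAndCompute(scene, None)

matcher = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
matches = sorted(matcher.match(des_m, des_s), key=lambda m: m.distance)[:100]

src = np.float32([kp_m[m.queryIdx].pt for m in matches]).reshape(-1, 1, 2)
dst = np.float32([kp_s[m.trainIdx].pt for m in matches]).reshape(-1, 1, 2)

# RANSAC rejects mismatches; H maps model-plane points into the scene.
H, inliers = cv2.findHomography(src, dst, cv2.RANSAC, 5.0)

h, w = model.shape
corners = np.float32([[0, 0], [w, 0], [w, h], [0, h]]).reshape(-1, 1, 2)
print(cv2.perspectiveTransform(corners, H))  # object outline in the scene
```

With the camera intrinsics and the object's physical dimensions, the same homography can be decomposed into a full 6-DoF pose, which is what grasping ultimately requires.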
102

Using Structure-from-Motion Technology to Compare Coral Coverage on Restored vs. Unrestored Reefs

Rosing, Trina 17 June 2021 (has links)
No description available.
103

Mobilní aplikace pro 3D rekonstrukci / Mobile application for 3D reconstruction

Krátký, Martin January 2021 (has links)
The aim of this master's thesis is to create a mobile application for spatial reconstruction. The thesis describes image processing methods suitable for solving this problem and surveys the platforms available for building the application. The parameters of the measured scenes are defined, a mobile application implementing the described methods is created, and the application is tested by reconstructing objects under different conditions.
104

Approches 2D/2D pour le SFM à partir d'un réseau de caméras asynchrones / 2D/2D approaches for SFM using an asynchronous multi-camera network

Mhiri, Rawia 14 December 2015 (has links)
Driver assistance systems and autonomous vehicles have reached a certain maturity in recent years through the use of advanced technologies. A fundamental step for these systems is estimating the motion and the structure of the environment (Structure from Motion), which supports several tasks, including obstacle and road-marking detection, localization and mapping. To estimate their motion, such systems use relatively expensive sensors; to be marketed on a large scale, applications must instead be built on low-cost devices, and in this context vision systems are a good alternative. A new method based on 2D/2D approaches using an asynchronous multi-camera network is presented to recover the motion and the 3D structure at absolute scale, focusing on estimating the scale factors. The proposed method, called the triangle method, is based on three images forming a triangle: two images from the same camera and one image from a neighboring camera. The algorithm makes three assumptions: the cameras share common fields of view (two by two), the path between two consecutive images from a single camera can be approximated by a line segment, and the cameras are calibrated. The extrinsic calibration between two cameras, combined with the assumption of rectilinear motion of the system, makes it possible to estimate the absolute scale factors. The proposed method is accurate and robust for straight trajectories and gives satisfactory results in curves. To refine the initial estimate, errors due to inaccuracies in the scale-factor estimation are reduced by an optimization step: a local bundle adjustment applied only to the absolute scale factors and the 3D points. The approach is validated on sequences of real road scenes and evaluated against ground truth obtained from a differential GPS. Finally, for another fundamental application in driver assistance and automated driving, road and obstacle detection, a first method for an asynchronous system based on sparse disparity maps is presented.
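The abstract names the triangle construction without spelling it out; the following is our hedged reconstruction of how a known baseline can fix the monocular scale (our notation and simplification, not the thesis' derivation):

```latex
% Epipolar geometry between each image pair of the triangle yields the
% direction of each side, but not its length: side i has a unit
% direction u_i and an unknown length s_i > 0. Expressed in a common
% frame, the three camera centres form a closed triangle:
\[
  s_1\,\mathbf{u}_1 + s_2\,\mathbf{u}_2 + s_3\,\mathbf{u}_3 = \mathbf{0}.
\]
% The extrinsic calibration between the two cameras, together with the
% rectilinear-motion assumption, supplies one side's metric length
% (say s_3 = b, the known baseline), turning the homogeneous relation
% into an overdetermined linear system for the remaining scales:
\[
  \begin{bmatrix} \mathbf{u}_1 & \mathbf{u}_2 \end{bmatrix}
  \begin{bmatrix} s_1 \\ s_2 \end{bmatrix}
  = -\,b\,\mathbf{u}_3 ,
\]
% solved in least squares; the recovered s_1, s_2 are then in metres.
```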
105

Comparing Structure from Motion Photogrammetry and Computer Vision for Low-Cost 3D Cave Mapping: Tipton-Haynes Cave, Tennessee

Elmore, Clinton 01 August 2019 (has links)
Natural caves represent one of the most difficult environments to map with modern 3D technologies. In this study I tested two relatively new methods for 3D mapping in Tipton-Haynes Cave near Johnson City, Tennessee: Structure from Motion photogrammetry and computer vision using Tango, an RGB-D (Red Green Blue and Depth) technology. Many different aspects of these two methods were analyzed with respect to the needs of average cave explorers. Major considerations were cost, time, accuracy, durability, simplicity, lighting setup, and drift. The 3D maps were compared to a conventional cave map drafted with measurements from a modern digital survey instrument called the DistoX2, a clinometer, and a measuring tape. Both 3D mapping methods worked, but photogrammetry proved too time-consuming and laborious for capturing more than a few meters of passage. RGB-D was faster, more accurate, and showed promise for the future of low-cost 3D cave mapping.
106

Potentialities of Unmanned Aerial Vehicles in Hydraulic Modelling : Drone remote sensing through photogrammetry for 1D flow numerical modelling

Reali, Andrea January 2018 (has links)
In civil and environmental engineering, numerous applications require the prior collection of data on the ground. When it comes to hydraulic modelling, topographic and morphological features of the region are among the most useful of these data, yet they are often unavailable, expensive or difficult to obtain. In the last few years, UAVs have entered the scene of remote sensing tools used to deliver such information, and their applications, connected to various photo-analysis techniques, have been tested in specific engineering fields with promising results. This thesis aims to contribute to the growing literature on the topic by assessing the potential of UAV and SfM photogrammetry analysis in developing terrain elevation models to be used as input data for numerical flood modelling. The thesis covers all phases of the engineering process, from the survey to the implementation of a 1D hydraulic model based on the photogrammetry-derived topography. The area chosen for the study was the Limpopo river. The challenging environment of the Mozambican inland showed the great advantages of this technology, which allowed a precise and fast survey, easily overcoming risks and difficulties. The field test was also useful to expose the current limits of the drone tool: its high susceptibility to weather conditions, wind and temperature, and the restricted battery capacity, which did not allow flights longer than 20 minutes. The subsequent photogrammetry analysis showed a high degree of dependency on the number of ground control points and the need for laborious post-processing in order to obtain a reliable DEM and avoid the emergence of doming effects. It thereby revealed the importance of understanding the drone and the photogrammetry software as a single instrument for delivering a quality DEM, and consequently the importance of planning a photogrammetry-oriented survey with specific precautions. Nevertheless, the DEM we produced presented a degree of spatial resolution comparable to that of high-precision topography sources. Finally, considering four different topography sources (SRTM DEM, 30 m; lidar DEM, 1 m; drone DEM, 0.6 m; total station and RTK bathymetric cross sections, 0.5 m), the relationship between spatial accuracy and water depth estimation was tested through 1D steady-flow models in HEC-RAS. The performance of each model was expressed as the mean absolute error (MAE) of its water depth estimates relative to the model based on the bathymetric cross sections. The results confirmed the potential of the drone for hydraulic engineering applications, with MAE differences between lidar, bathymetry and drone within 1 m. Calibrating the SRTM-, lidar- and drone-based models against the bathymetry-based one demonstrated the relationship between the geometric detail and the roughness of the cross sections, with a global improvement in the MAE, more pronounced for the coarse SRTM geometry.
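For concreteness, the comparison metric used above can be written out (standard definition; the indexing is ours):

```latex
% Mean absolute error of a model's water depths against the reference
% (bathymetry-based) model, over N compared cross sections:
\[
  \mathrm{MAE} \;=\; \frac{1}{N} \sum_{i=1}^{N}
  \left|\, h_i^{\mathrm{model}} - h_i^{\mathrm{ref}} \,\right| ,
\]
% where h_i denotes the simulated water depth at cross section i.
```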
107

Registration and Localization of Unknown Moving Objects in Markerless Monocular SLAM

Troutman, Blake 05 1900 (has links)
Indiana University-Purdue University Indianapolis (IUPUI) / Simultaneous localization and mapping (SLAM) is a general device localization technique that uses real-time sensor measurements to develop a virtualization of the sensor's environment while also using this growing virtualization to determine the position and orientation of the sensor. This is useful for augmented reality (AR), in which a user looks through a head-mounted display (HMD) or viewfinder to see virtual components integrated into the real world. Visual SLAM (i.e., SLAM in which the sensor is an optical camera) is used in AR to determine the exact device/headset movement so that the virtual components can be accurately redrawn to the screen, matching the perceived motion of the world around the user as the user moves the device/headset. However, many potential AR applications may need access to more than device localization data in order to be useful; they may need to leverage environment data as well. Additionally, most SLAM solutions make the naive assumption that the environment surrounding the system is completely static (non-moving). Given these circumstances, it is clear that AR may benefit substantially from a SLAM solution that detects objects that move in the scene and ultimately provides localization data for each of these objects. This is known as the dynamic SLAM problem. Current attempts to address it often use machine learning to develop models that identify the parts of the camera image that belong to one of many classes of potentially-moving objects. The limitation of these approaches is that it is impractical to train models to identify every possible object that moves; additionally, some potentially-moving objects may be static in the scene, which these approaches often do not account for. Some other attempts to address the dynamic SLAM problem also localize the moving objects they detect, but these systems almost always rely on depth sensors or stereo camera configurations, which have significant limitations in real-world use cases. This dissertation presents a novel approach for registering and localizing unknown moving objects in the context of markerless, monocular, keyframe-based SLAM, with no prior information required about object structure, appearance, or existence. This work also details a novel deep learning solution for determining SLAM map initialization suitability in structure-from-motion-based initialization approaches. The dissertation goes on to validate these approaches by implementing them in a markerless, monocular SLAM system called LUMO-SLAM, built from the ground up to demonstrate this approach to unknown moving object registration and localization. Results are collected for the LUMO-SLAM system, addressing the accuracy of its camera localization estimates, the accuracy of its moving-object localization estimates, and the consistency with which it registers moving objects in the scene. These results show that this solution to the dynamic SLAM problem, though not a practical solution for all use cases, can register and localize unknown moving objects accurately enough to be useful for some applications of AR without compromising the system's ability to perform accurate camera localization.
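The dissertation's registration algorithm is its own; purely to illustrate the underlying intuition — that features inconsistent with the dominant camera motion betray moving objects — here is a generic hedged sketch using OpenCV (function name, parameters and thresholds are ours, not LUMO-SLAM's):

```python
# Hedged illustration (not LUMO-SLAM's algorithm): features whose motion
# disagrees with the dominant epipolar geometry are flagged as candidate
# moving-object points.
import cv2
import numpy as np

def flag_moving_points(prev_gray, curr_gray, K):
    """Return (static_pts, moving_pts) in the current frame.

    K is the 3x3 camera intrinsic matrix; images are grayscale uint8.
    """
    p0 = cv2.goodFeaturesToTrack(prev_gray, maxCorners=500,
                                 qualityLevel=0.01, minDistance=7)
    p1, status, _ = cv2.calcOpticalFlowPyrLK(prev_gray, curr_gray, p0, None)
    ok = status.ravel() == 1
    p0, p1 = p0[ok].reshape(-1, 2), p1[ok].reshape(-1, 2)

    # RANSAC fits the essential matrix to the dominant (camera) motion;
    # outliers are points that moved differently, i.e. on dynamic objects.
    E, inliers = cv2.findEssentialMat(p0, p1, K,
                                      method=cv2.RANSAC, threshold=1.0)
    inliers = inliers.ravel().astype(bool)
    return p1[inliers], p1[~inliers]
```

A real system, the abstract notes, must go further: it must group such points into objects and track each object's pose over time, which is exactly the registration-and-localization problem the dissertation tackles.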
108

Structure from Motion with Unstructured RGBD Data

Svensson, Niclas January 2021 (has links)
This thesis covers the topic of depth-assisted Structure from Motion (SfM). In classic SfM, the goal is to reconstruct a 3D scene using only a set of unstructured RGB images. This thesis adds the depth dimension to the problem formulation and consequently builds a system that can receive a set of RGBD images. The problem has been addressed by modifying an existing SfM pipeline, in particular its Bundle Adjustment (BA) step. Comparisons between the modified framework and the baseline framework led to conclusions about the impact of the modifications. The results show mainly two things. First, the accuracy of the framework increases in most situations; the difference is most significant when the captured scene is covered only from a small sector, although noisy data can cause the modified pipeline to perform worse. Second, the run time of the framework is significantly reduced. A discussion of how to modify other parts of the pipeline is given in the conclusion of the report.
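The abstract does not spell out the modified objective; one common way to fold measured depth into BA — our hedged guess at the general form, not necessarily the thesis' exact cost — is to add a depth residual alongside the reprojection term:

```latex
% Depth-augmented bundle adjustment objective (a hedged general form,
% not necessarily the thesis' exact cost function):
\[
  E \;=\; \sum_{i,j}
     \bigl\| \pi\!\left( R_i X_j + t_i \right) - x_{ij} \bigr\|^2
  \;+\; \lambda \sum_{i,j}
     \bigl( d_{ij} - \left[ R_i X_j + t_i \right]_z \bigr)^2 ,
\]
% where (R_i, t_i) is camera i's pose, X_j a 3D point, \pi the projection
% onto the image, x_ij the observed pixel, d_ij the measured RGBD depth,
% [.]_z the depth coordinate, and \lambda a weight balancing the terms.
```

The extra term both anchors the otherwise scale-ambiguous monocular reconstruction and constrains points seen from few viewpoints, which is consistent with the reported accuracy gain for scenes covered from only a small sector.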
109

Modeling Smooth Time-Trajectories for Camera and Deformable Shape in Structure from Motion with Occlusion

Gotardo, Paulo Fabiano Urnau 28 September 2010 (has links)
No description available.
110

Pose Estimation and Structure Analysis of Image Sequences

Hedborg, Johan January 2009 (has links)
Autonomous navigation for ground vehicles has many challenges. Autonomous systems must be able to self-localise, avoid obstacles and determine navigable surfaces. This thesis studies several aspects of autonomous navigation with a particular emphasis on vision, motivated by vision being a primary component of navigation in many high-level biological organisms. The key problem of self-localisation, or pose estimation, can be solved through analysis of the changes in appearance of rigid objects observed from different viewpoints. We therefore describe a system for structure and motion estimation for real-time navigation and obstacle avoidance. Under the explicit assumption of a calibrated camera, we have studied several schemes for increasing the accuracy and speed of the estimation. The basis of most structure-and-motion pose estimation algorithms is a good point tracker. However, point tracking is computationally expensive and can occupy a large portion of the CPU resources. In this thesis we show how a point tracker can be implemented efficiently on the graphics processor, which results in faster tracking of points while leaving the CPU available for additional processing tasks. In addition, we propose a novel view-interpolation approach that can be used effectively for pose estimation given previously seen views. In this way, a vehicle can estimate its location by interpolating previously seen data. Navigation and obstacle avoidance may be carried out efficiently using structure and motion, but only within a limited range from the camera. In order to increase this effective range, additional information needs to be incorporated, more specifically the location of objects in the image. For this, we propose a real-time object recognition method based on P-channel matching, which may be used to improve navigation accuracy at distances where structure estimation is unreliable. / Diplecs
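The point tracker in question is of the pyramidal KLT family; per feature and per pyramid level it reduces to a 2x2 linear system (textbook form, our notation, not the thesis' code), which is exactly why it maps so well onto a GPU — each feature can be solved by an independent thread:

```latex
% Lucas-Kanade step for one feature: over a window W around the point,
% with spatial gradients (I_x, I_y) and temporal difference I_t, the
% displacement d solves (up to sign conventions)
\[
  \underbrace{\sum_{W}
    \begin{pmatrix} I_x^2 & I_x I_y \\ I_x I_y & I_y^2 \end{pmatrix}}_{G}
  \,\mathbf{d}
  \;=\;
  \underbrace{-\sum_{W} I_t
    \begin{pmatrix} I_x \\ I_y \end{pmatrix}}_{\mathbf{e}} .
\]
% G is a 2x2 matrix and e a 2-vector per feature, so thousands of
% features can be solved independently and in parallel on a GPU.
```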
