  • About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
111

Comparing Structure from Motion Photogrammetry and Computer Vision for Low-Cost 3D Cave Mapping: Tipton-Haynes Cave, Tennessee

Elmore, Clinton 01 August 2019
Natural caves represent one of the most difficult environments to map with modern 3D technologies. In this study I tested two relatively new methods for 3D mapping in Tipton-Haynes Cave near Johnson City, Tennessee: Structure from Motion Photogrammetry and Computer Vision using Tango, an RGB-D (Red Green Blue and Depth) technology. Many different aspects of these two methods were analyzed with respect to the needs of average cave explorers. Major considerations were cost, time, accuracy, durability, simplicity, lighting setup, and drift. The 3D maps were compared to a conventional cave map drafted with measurements from a modern digital survey instrument called the DistoX2, a clinometer, and a measuring tape. Both 3D mapping methods worked, but photogrammetry proved to be too time consuming and laborious for capturing more than a few meters of passage. RGB-D was faster, more accurate, and showed promise for the future of low-cost 3D cave mapping.
112

Potentialities of Unmanned Aerial Vehicles in Hydraulic Modelling : Drone remote sensing through photogrammetry for 1D flow numerical modelling

Reali, Andrea January 2018
In civil and environmental engineering, numerous applications require prior collection of data on the ground. In hydraulic modelling, the topographic and morphological features of the region are among the most useful such data, yet they are often unavailable, expensive or difficult to obtain. In the last few years UAVs have joined the remote sensing tools used to deliver such information, and their applications in combination with various photo-analysis techniques have been tested in specific engineering fields, with promising results. This thesis aims to contribute to the growing literature on the topic by assessing the potential of UAV and SfM photogrammetry analysis for developing terrain elevation models to be used as input data for numerical flood modelling. The thesis covers all phases of the engineering process, from the survey to the implementation of a 1D hydraulic model based on the photogrammetry-derived topography. The area chosen for the study was the Limpopo river. The challenging environment of the Mozambican inland showed the great advantages of this technology, which allowed a precise and fast survey that easily overcame risks and difficulties. The field test also exposed the current limits of the drone: its high susceptibility to weather conditions, wind and temperature, and its restricted battery capacity, which did not allow flights longer than 20 minutes. The subsequent photogrammetry analysis showed a high degree of dependency on the number of ground control points and the need for laborious post-processing in order to obtain a reliable DEM and avoid doming effects. It thereby revealed the importance of treating the drone and the photogrammetry software as a single instrument for delivering a quality DEM, and consequently the importance of planning a photogrammetry-oriented survey that adopts specific precautions.
Nevertheless, the DEM we produced had a spatial resolution comparable to that of high-precision topography sources. Finally, considering four different topography sources (SRTM DEM 30 m, lidar DEM 1 m, drone DEM 0.6 m, total station & RTK bathymetric cross-sections 0.5 m), the relationship between spatial accuracy and water depth estimation was tested through 1D steady-flow models in HEC-RAS. The performance of each model was expressed as the mean absolute error (MAE) of its water depth estimates relative to those of the model based on the bathymetric cross-sections. The results confirmed the potential of the drone for hydraulic engineering applications, with MAE differences between lidar, bathymetry and drone within 1 m. Calibrating the SRTM-, lidar- and drone-based models to the bathymetry-based one demonstrated the relationship between geometric detail and roughness of the cross-sections, with an overall improvement in the MAE that was more pronounced for the coarse geometry of SRTM.
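The MAE comparison described in the abstract above can be illustrated with a minimal sketch. The function name and the depth values are hypothetical and chosen only for illustration; they are not data from the thesis, which used HEC-RAS model outputs.

```python
def mean_absolute_error(estimated, reference):
    """Mean absolute error between two equally long sequences of water depths (m)."""
    if len(estimated) != len(reference) or not estimated:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(e - r) for e, r in zip(estimated, reference)) / len(estimated)


# Hypothetical water depths (m) at four cross-sections; illustrative values only.
reference_depths = [2.0, 3.5, 4.0, 1.5]   # model built on bathymetric cross-sections
drone_depths = [2.5, 3.0, 4.5, 1.0]       # model built on the drone DEM

mae_drone = mean_absolute_error(drone_depths, reference_depths)
print(f"MAE of drone-based model: {mae_drone:.2f} m")
```

Computing the same quantity for each topography source against the bathymetry-based reference gives the per-model scores that the abstract compares.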
113

Registration and Localization of Unknown Moving Objects in Markerless Monocular SLAM

Troutman, Blake 05 1900
Indiana University-Purdue University Indianapolis (IUPUI) / Simultaneous localization and mapping (SLAM) is a general device localization technique that uses realtime sensor measurements to develop a virtualization of the sensor's environment while also using this growing virtualization to determine the position and orientation of the sensor. This is useful for augmented reality (AR), in which a user looks through a head-mounted display (HMD) or viewfinder to see virtual components integrated into the real world. Visual SLAM (i.e., SLAM in which the sensor is an optical camera) is used in AR to determine the exact device/headset movement so that the virtual components can be accurately redrawn to the screen, matching the perceived motion of the world around the user as the user moves the device/headset. However, many potential AR applications may need access to more than device localization data in order to be useful; they may need to leverage environment data as well. Additionally, most SLAM solutions make the naive assumption that the environment surrounding the system is completely static (non-moving). Given these circumstances, it is clear that AR may benefit substantially from utilizing a SLAM solution that detects objects that move in the scene and ultimately provides localization data for each of these objects. This problem is known as the dynamic SLAM problem. Current attempts to address the dynamic SLAM problem often use machine learning to develop models that identify the parts of the camera image that belong to one of many classes of potentially-moving objects. The limitation with these approaches is that it is impractical to train models to identify every possible object that moves; additionally, some potentially-moving objects may be static in the scene, which these approaches often do not account for. 
Some other attempts to address the dynamic SLAM problem also localize the moving objects they detect, but these systems almost always rely on depth sensors or stereo camera configurations, which have significant limitations in real-world use cases. This dissertation presents a novel approach for registering and localizing unknown moving objects in the context of markerless, monocular, keyframe-based SLAM with no required prior information about object structure, appearance, or existence. This work also details a novel deep learning solution for determining SLAM map initialization suitability in structure-from-motion-based initialization approaches. This dissertation goes on to validate these approaches by implementing them in a markerless, monocular SLAM system called LUMO-SLAM, which is built from the ground up to demonstrate this approach to unknown moving object registration and localization. Results are collected for the LUMO-SLAM system, which address the accuracy of its camera localization estimates, the accuracy of its moving object localization estimates, and the consistency with which it registers moving objects in the scene. These results show that this solution to the dynamic SLAM problem, though it does not act as a practical solution for all use cases, has an ability to accurately register and localize unknown moving objects in such a way that makes it useful for some applications of AR without thwarting the system's ability to also perform accurate camera localization.
114

Structure from Motion with Unstructured RGBD Data

Svensson, Niclas January 2021
This thesis covers the topic of depth-assisted Structure from Motion (SfM). In classic SfM, the goal is to reconstruct a 3D scene using only a set of unstructured RGB images. This thesis adds the depth dimension to the problem formulation and thereby creates a system that accepts a set of RGBD images. The problem has been addressed by modifying an existing SfM pipeline, in particular its Bundle Adjustment (BA) process. Comparisons between the modified framework and the baseline framework led to conclusions regarding the impact of the modifications. The results show mainly two things. First, the accuracy of the framework increases in most situations. The difference is most significant when the captured scene is covered only from a small sector. However, noisy data can cause the modified pipeline to perform slightly worse than the baseline. Second, the run time of the framework is significantly reduced. The conclusion of the report discusses how other parts of the pipeline could be modified to improve the results further.
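One common way to bring depth into bundle adjustment is to append a depth term to the usual reprojection residual. The sketch below shows that idea for a single observation under a pinhole camera model. It is an assumption for illustration and not necessarily the formulation used in the thesis; the function name, the `depth_weight` parameter and the convention that depth equals the z-coordinate in the camera frame are all hypothetical.

```python
def rgbd_residual(point_cam, observed_uv, observed_depth,
                  fx, fy, cx, cy, depth_weight=1.0):
    """Residual of one RGBD observation for a 3D point given in camera coordinates.

    Returns (du, dv, dd): the two reprojection errors in pixels plus a
    weighted depth error. Depth is assumed to be the z-coordinate.
    """
    x, y, z = point_cam
    if z <= 0:
        raise ValueError("point must lie in front of the camera")
    # Pinhole projection of the 3D point into the image.
    u = fx * x / z + cx
    v = fy * y / z + cy
    return (u - observed_uv[0],
            v - observed_uv[1],
            depth_weight * (z - observed_depth))


# Hypothetical intrinsics and observation, for illustration only.
residual = rgbd_residual((0.5, -0.25, 2.0), (445.0, 177.5), 1.5,
                         fx=500.0, fy=500.0, cx=320.0, cy=240.0,
                         depth_weight=2.0)
print(residual)
```

A solver would minimise the sum of squares of such residuals over all cameras and points; the weight balances pixel errors against metric depth errors, which matters when the depth data are noisy.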
115

Modeling Smooth Time-Trajectories for Camera and Deformable Shape in Structure from Motion with Occlusion

Gotardo, Paulo Fabiano Urnau 28 September 2010
No description available.
116

Pose Estimation and Structure Analysis of Image Sequences

Hedborg, Johan January 2009
Autonomous navigation for ground vehicles has many challenges. Autonomous systems must be able to self-localise, avoid obstacles and determine navigable surfaces. This thesis studies several aspects of autonomous navigation with a particular emphasis on vision, motivated by it being a primary component for navigation in many high-level biological organisms. The key problem of self-localisation or pose estimation can be solved through analysis of the changes in appearance of rigid objects observed from different viewpoints. We therefore describe a system for structure and motion estimation for real-time navigation and obstacle avoidance. With the explicit assumption of a calibrated camera, we have studied several schemes for increasing the accuracy and speed of the estimation. The basis of most structure and motion pose estimation algorithms is a good point tracker. However, point tracking is computationally expensive and can occupy a large portion of the CPU resources. In this thesis we show how a point tracker can be implemented efficiently on the graphics processor, which results in faster tracking of points and leaves the CPU available to carry out additional processing tasks. In addition, we propose a novel view interpolation approach that can be used effectively for pose estimation given previously seen views. In this way, a vehicle will be able to estimate its location by interpolating previously seen data. Navigation and obstacle avoidance may be carried out efficiently using structure and motion, but only within a limited range from the camera. In order to increase this effective range, additional information needs to be incorporated, more specifically the location of objects in the image. For this, we propose a real-time object recognition method based on P-channel matching, which may be used for improving navigation accuracy at distances where structure estimation is unreliable. / Diplecs
117

Use of consumer grade small unmanned aerial systems (sUAS) for mapping storm damage in forested environments

Cox, James Dewey 13 May 2022
Storm damage to forested environments poses significant challenges to landowners, land managers, and conservationists alike. Assessing the scope and scale of damage can be difficult, costly, and time-consuming with conventional pedestrian survey techniques. Consumer grade sUAS technology offers an efficient, cost-effective way to accurately assess storm damage in small to moderate sized survey areas (less than 10 km²). Data were collected over a 0.195 km² area of damaged timber within the Kisatchie National Forest in Central Louisiana using a DJI Mavic 2 Pro drone. Collected imagery was processed into an orthomosaic using Agisoft Metashape Professional with a resulting ground sampling distance of 2.58 cm per pixel. Combined X and Y ground distance accuracy r was calculated as 1.39230 meters and a combined horizontal error was calculated as 0.810455526 meters. From the generated orthomosaic, the total storm damage area was estimated as 2.68 ha, or 6.63 ac, based on digitized polygon area calculations.
118

Contributions to Monocular Deformable 3D Reconstruction : Curvilinear Objects and Multiple Visual Cues / Contributions à la reconstruction 3D déformable monoculaire : objets curvilinéaires et indices visuels multiples

Gallardo, Mathias 20 September 2018
Monocular deformable 3D reconstruction is the general problem of recovering the 3D shape of a deformable object from monocular 2D images. Several scenarios have emerged: Shape-from-Template (SfT) and Non-Rigid Structure-from-Motion (NRSfM) are two approaches that have been studied intensively for their practicability. The former uses a single image depicting the deforming object and a template (a textured 3D shape of this object in a reference pose). The latter does not use a template, but uses several images and recovers the 3D shape in each image. Both approaches rely on the motion of correspondences between the images and on deformation priors, which restricts their use to well-textured surfaces that deform smoothly. This thesis advances the state of the art in SfT and NRSfM in two main directions. The first direction is to study SfT for the case of 1D templates (i.e. curved, thin structures such as ropes and cables). The second direction is to develop algorithms in SfT and NRSfM that exploit multiple visual cues and can solve complex, real-world cases which were previously unsolved. We focus on isometric deformations and reconstruct the outer part of the object. The technical and scientific contributions of this thesis are divided into four parts. The first part of this thesis studies the case of a curvilinear template embedded in 2D or 3D space, referred to as Curve SfT. We propose a thorough theoretical analysis and practical solutions for Curve SfT. Despite its apparent simplicity, Curve SfT appears to be a complex problem: it cannot be solved locally using an exact non-holonomic partial differential equation and is only solvable up to a finite number of ambiguous solutions. A major technical contribution is a computational solution based on our theory, which generates all the ambiguous solutions. The second part of this thesis deals with a limitation of SfT methods: reconstructing creases. This is due to the sparsity of the motion constraint and regularization.
We propose two contributions which rely on a non-convex energy minimization framework. First, we complement the motion constraint with a robust boundary contour constraint. Second, we implicitly model creases with a dense mesh-based surface representation and an associated robust smoothing constraint, which deactivates curvature smoothing automatically where needed, without knowing the crease location a priori. The third part of this thesis is dedicated to another limitation of SfT: reconstructing poorly-textured surfaces. This is because correspondences (either sparse or dense) cannot easily be obtained on poorly-textured surfaces. As shading reveals details on poorly-textured surfaces, we propose to combine shading and SfT. We make two contributions. The first is a cascaded initialization which sequentially estimates the surface's deformation, the scene illumination, the camera response and then the surface albedos from monocular images of the deforming surface. The second is to integrate shading into our previous energy minimization framework for simultaneously refining deformation and photometric parameters. The last part of this thesis relaxes the knowledge of the template and addresses two limitations of NRSfM: reconstructing poorly-textured surfaces with creases. Our major contribution is an extension of the second framework to jointly recover the 3D shapes of all input images and the surface albedos without any template.
119

Entwicklung und Validierung methodischer Konzepte einer kamerabasierten Durchfahrtshöhenerkennung für Nutzfahrzeuge

Hänert, Stephan 03 July 2020
The present work deals with the conception and development of a novel advanced driver assistance system for commercial vehicles, which estimates the clearance height of obstacles in front of the vehicle and determines the passability by comparison with the adjustable vehicle height. The image sequences captured by a mono camera are used to create a 3D representation of the driving environment using indirect and direct reconstruction methods. The 3D representation is scaled and a prediction of the longitudinal and lateral movement of the vehicle is determined with the aid of a wheel odometry-based estimation of the vehicle's own movement.
Based on the vertical elevation plan of the road surface, which is modelled by attaching several surfaces together, the 3D space is classified into driving surface, structure and potential obstacles. The obstacles within the predicted driving tube are evaluated with regard to their distance and height. A warning concept derived from this serves to visually and acoustically signal the obstacle in the vehicle's instrument cluster. If the driver does not respond accordingly, emergency braking will be applied at critical obstacle heights. The estimated vehicle movement and calculated obstacle parameters are evaluated with the aid of reference sensors. A dGPS-supported inertial measurement unit and a terrestrial as well as a mobile laser scanner are used. Within the scope of the work, different environmental situations and obstacle types in urban and rural areas are investigated and statements on the accuracy and reliability of the implemented function are made. A major factor influencing the density and accuracy of 3D reconstruction is uniform ambient lighting within the image sequence. In this context, the use of an automotive camera is mandatory. The inherent motion determined by wheel odometry is suitable for scaling the 3D point space in the slow speed range. The 3D representation however, should be created by a combination of indirect and direct point reconstruction methods. The indirect part supports the initialization phase of the function and enables a robust camera estimation. The direct method enables the reconstruction of a large number of 3D points on the obstacle outlines, which usually contain the lower edge. The lower edge can be detected and tracked up to 20 m away. The biggest factor influencing the accuracy of the calculation of the clearance height of obstacles is the modelling of the driving surface. To reduce outliers in the height calculation, the method can be stabilized by using calculations from older time steps. 
As a further stabilization measure, it is also recommended to support the obstacle output to the driver and the automatic emergency brake assistant by means of hysteresis. The system presented here is suitable for parking and maneuvering operations and is interesting as a cost-effective driver assistance system for cars with superstructures and light commercial vehicles.
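The hysteresis recommended in the abstract above can be sketched as a two-threshold switch on the clearance margin. This is a minimal illustration under assumed values: the class name and both margin thresholds are hypothetical and are not taken from the thesis.

```python
class HysteresisWarning:
    """Two-threshold warning switch on the clearance margin (hypothetical values)."""

    def __init__(self, on_margin_m=0.10, off_margin_m=0.30):
        if off_margin_m <= on_margin_m:
            raise ValueError("off_margin_m must be larger than on_margin_m")
        self.on_margin_m = on_margin_m    # warn when the margin drops below this
        self.off_margin_m = off_margin_m  # clear only when the margin exceeds this
        self.active = False

    def update(self, clearance_m, vehicle_height_m):
        """Feed one clearance-height estimate and return the warning state."""
        margin = clearance_m - vehicle_height_m
        if not self.active and margin < self.on_margin_m:
            self.active = True
        elif self.active and margin > self.off_margin_m:
            self.active = False
        return self.active


# Noisy clearance estimates (m) for a 3.0 m high vehicle; illustrative values only.
warning = HysteresisWarning()
states = [warning.update(c, 3.0) for c in [3.5, 3.05, 3.2, 3.25, 3.4]]
print(states)
```

Because the warning is cleared only once the margin exceeds the larger threshold, estimates that fluctuate between the two thresholds do not make the output to the driver or the emergency brake assistant flicker.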
120

Structureless Camera Motion Estimation of Unordered Omnidirectional Images

Sastuba, Mark 08 August 2022
This work aims at providing a novel camera motion estimation pipeline for large collections of unordered omnidirectional images. In order to keep the pipeline as general and flexible as possible, cameras are modelled as unit spheres, which allows any central camera type to be incorporated. For each camera an unprojection lookup is generated from intrinsics, which is called P2S-map (Pixel-to-Sphere-map), mapping pixels to their corresponding positions on the unit sphere. Consequently the camera geometry becomes independent of the underlying projection model. The pipeline also generates P2S-maps from world map projections with less distortion effects as they are known from cartography. Using P2S-maps from camera calibration and world map projection makes it possible to convert omnidirectional camera images to an appropriate world map projection in order to apply standard feature extraction and matching algorithms for data association. The proposed estimation pipeline combines the flexibility of SfM (Structure from Motion) - which handles unordered image collections - with the efficiency of PGO (Pose Graph Optimization), which is used as back-end in graph-based Visual SLAM (Simultaneous Localization and Mapping) approaches to optimize camera poses from large image sequences. SfM uses BA (Bundle Adjustment) to jointly optimize camera poses (motion) and 3d feature locations (structure), which becomes computationally expensive for large-scale scenarios. In contrast, PGO solves for camera poses (motion) from measured transformations between cameras, which keeps the optimization manageable. The proposed estimation algorithm combines both worlds. It obtains up-to-scale transformations between image pairs using two-view constraints, which are jointly scaled using trifocal constraints. A pose graph is generated from scaled two-view transformations and solved by PGO to obtain camera motion efficiently even for large image collections.
Obtained results can be used as input data to provide initial pose estimates for further 3d reconstruction purposes e.g. to build a sparse structure from feature correspondences in an SfM or SLAM framework with further refinement via BA. The pipeline also incorporates fixed extrinsic constraints from multi-camera setups as well as depth information provided by RGBD sensors. The entire camera motion estimation pipeline does not need to generate a sparse 3d structure of the captured environment and thus is called SCME (Structureless Camera Motion Estimation).:1 Introduction 1.1 Motivation 1.1.1 Increasing Interest of Image-Based 3D Reconstruction 1.1.2 Underground Environments as Challenging Scenario 1.1.3 Improved Mobile Camera Systems for Full Omnidirectional Imaging 1.2 Issues 1.2.1 Directional versus Omnidirectional Image Acquisition 1.2.2 Structure from Motion versus Visual Simultaneous Localization and Mapping 1.3 Contribution 1.4 Structure of this Work 2 Related Work 2.1 Visual Simultaneous Localization and Mapping 2.1.1 Visual Odometry 2.1.2 Pose Graph Optimization 2.2 Structure from Motion 2.2.1 Bundle Adjustment 2.2.2 Structureless Bundle Adjustment 2.3 Corresponding Issues 2.4 Proposed Reconstruction Pipeline 3 Cameras and Pixel-to-Sphere Mappings with P2S-Maps 3.1 Types 3.2 Models 3.2.1 Unified Camera Model 3.2.2 Polynomal Camera Model 3.2.3 Spherical Camera Model 3.3 P2S-Maps - Mapping onto Unit Sphere via Lookup Table 3.3.1 Lookup Table as Color Image 3.3.2 Lookup Interpolation 3.3.3 Depth Data Conversion 4 Calibration 4.1 Overview of Proposed Calibration Pipeline 4.2 Target Detection 4.3 Intrinsic Calibration 4.3.1 Selected Examples 4.4 Extrinsic Calibration 4.4.1 3D-2D Pose Estimation 4.4.2 2D-2D Pose Estimation 4.4.3 Pose Optimization 4.4.4 Uncertainty Estimation 4.4.5 PoseGraph Representation 4.4.6 Bundle Adjustment 4.4.7 Selected Examples 5 Full Omnidirectional Image Projections 5.1 Panoramic Image Stitching 5.2 World Map Projections 5.3 World Map 
Projection Generator for P2S-Maps 5.4 Conversion between Projections based on P2S-Maps 5.4.1 Proposed Workflow 5.4.2 Data Storage Format 5.4.3 Real World Example 6 Relations between Two Camera Spheres 6.1 Forward and Backward Projection 6.2 Triangulation 6.2.1 Linear Least Squares Method 6.2.2 Alternative Midpoint Method 6.3 Epipolar Geometry 6.4 Transformation Recovery from Essential Matrix 6.4.1 Cheirality 6.4.2 Standard Procedure 6.4.3 Simplified Procedure 6.4.4 Improved Procedure 6.5 Two-View Estimation 6.5.1 Evaluation Strategy 6.5.2 Error Metric 6.5.3 Evaluation of Estimation Algorithms 6.5.4 Concluding Remarks 6.6 Two-View Optimization 6.6.1 Epipolar-Based Error Distances 6.6.2 Projection-Based Error Distances 6.6.3 Comparison between Error Distances 6.7 Two-View Translation Scaling 6.7.1 Linear Least Squares Estimation 6.7.2 Non-Linear Least Squares Optimization 6.7.3 Comparison between Initial and Optimized Scaling Factor 6.8 Homography to Identify Degeneracies 6.8.1 Homography for Spherical Cameras 6.8.2 Homography Estimation 6.8.3 Homography Optimization 6.8.4 Homography and Pure Rotation 6.8.5 Homography in Epipolar Geometry 7 Relations between Three Camera Spheres 7.1 Three View Geometry 7.2 Crossing Epipolar Planes Geometry 7.3 Trifocal Geometry 7.4 Relation between Trifocal, Three-View and Crossing Epipolar Planes 7.5 Translation Ratio between Up-To-Scale Two-View Transformations 7.5.1 Structureless Determination Approaches 7.5.2 Structure-Based Determination Approaches 7.5.3 Comparison between Proposed Approaches 8 Pose Graphs 8.1 Optimization Principle 8.2 Solvers 8.2.1 Additional Graph Solvers 8.2.2 False Loop Closure Detection 8.3 Pose Graph Generation 8.3.1 Generation of Synthetic Pose Graph Data 8.3.2 Optimization of Synthetic Pose Graph Data 9 Structureless Camera Motion Estimation 9.1 SCME Pipeline 9.2 Determination of Two-View Translation Scale Factors 9.3 Integration of Depth Data 9.4 Integration of Extrinsic Camera Constraints 10 Camera 
Motion Estimation Results 10.1 Directional Camera Images 10.2 Omnidirectional Camera Images 11 Conclusion 11.1 Summary 11.2 Outlook and Future Work Appendices A.1 Additional Extrinsic Calibration Results A.2 Linear Least Squares Scaling A.3 Proof Rank Deficiency A.4 Alternative Derivation Midpoint Method A.5 Simplification of Depth Calculation A.6 Relation between Epipolar and Circumferential Constraint A.7 Covariance Estimation A.8 Uncertainty Estimation from Epipolar Geometry A.9 Two-View Scaling Factor Estimation: Uncertainty Estimation A.10 Two-View Scaling Factor Optimization: Uncertainty Estimation A.11 Depth from Adjoining Two-View Geometries A.12 Alternative Three-View Derivation A.12.1 Second Derivation Approach A.12.2 Third Derivation Approach A.13 Relation between Trifocal Geometry and Alternative Midpoint Method A.14 Additional Pose Graph Generation Examples A.15 Pose Graph Solver Settings A.16 Additional Pose Graph Optimization Examples Bibliography
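The P2S-map idea from the abstract above, a per-pixel lookup of unit-sphere positions, can be sketched for one simple world map projection. The example assumes an equirectangular image and a particular axis convention (z up, longitude measured in the x-y plane); the function name and these conventions are assumptions for illustration and are not taken from the thesis.

```python
import math


def equirectangular_p2s_map(width, height):
    """Return p2s[v][u] = (x, y, z), the unit vector for the centre of pixel (u, v)."""
    p2s = []
    for v in range(height):
        # Latitude runs from +pi/2 at the top row to -pi/2 at the bottom row.
        lat = math.pi / 2 - (v + 0.5) / height * math.pi
        row = []
        for u in range(width):
            # Longitude runs from -pi at the left edge to +pi at the right edge.
            lon = (u + 0.5) / width * 2 * math.pi - math.pi
            row.append((math.cos(lat) * math.cos(lon),
                        math.cos(lat) * math.sin(lon),
                        math.sin(lat)))
        p2s.append(row)
    return p2s


# Build a tiny lookup table and read the viewing direction of one pixel.
p2s = equirectangular_p2s_map(8, 4)
x, y, z = p2s[0][0]
print(x, y, z)
```

Once every pixel has a direction on the unit sphere, later geometry can work with these vectors alone, which is what makes the camera geometry independent of the projection model.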
