131 |
Sequential Motion Estimation and Refinement for Applications of Real-time Reconstruction from Stereo Vision. Stefanik, Kevin Vincent. 10 August 2011.
This paper presents a new approach to the feature-matching problem for 3D reconstruction that takes advantage of GPS and IMU data along with a pre-calibrated stereo camera system. The expectation is that pose estimates and calibration can be used to increase feature-matching speed and accuracy. Given pose estimates of the cameras and features extracted from the images, the algorithm first enumerates feature matches based on stereo projection constraints in 2D and then backprojects them to 3D. A grid search over potential camera poses is then proposed to match the 3D features and find the largest group of 3D feature matches between pairs of stereo frames; this approach provides pose accuracy to within the space that each grid region covers. Further refinement of the relative camera poses is performed with an iteratively re-weighted least squares (IRLS) method in order to reject outliers among the 3D matches. The algorithm is shown to run correctly in real time, with the majority of processing time taken by feature extraction and description, and to outperform standard open-source software for reconstruction from imagery. / Master of Science
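As a hedged illustration of the IRLS idea, the sketch below robustly estimates only a 3D translation between matched 3D feature sets, down-weighting outliers with Huber-style weights; the thesis refines full relative poses, and the function name and all values here are illustrative, not taken from it.

```python
import numpy as np

def irls_translation(src, dst, iters=20, delta=0.5):
    """Estimate a 3D translation t with dst ~ src + t, rejecting outliers
    via Huber-style iteratively re-weighted least squares."""
    t = np.zeros(3)
    for _ in range(iters):
        r = dst - (src + t)                        # residual vectors
        norms = np.linalg.norm(r, axis=1)
        # Huber weights: 1 inside the delta band, decaying outside it
        w = np.where(norms <= delta, 1.0, delta / np.maximum(norms, 1e-12))
        t = t + (w[:, None] * r).sum(0) / w.sum()  # weighted LS update
    return t

# matched 3D features related by a pure translation, plus one gross outlier
rng = np.random.default_rng(0)
src = rng.normal(size=(50, 3))
dst = src + np.array([1.0, -2.0, 0.5])
dst[0] += 100.0                                    # corrupted match
t_est = irls_translation(src, dst)
```

Despite the outlier, the recovered translation stays close to the true one because the corrupted match receives a near-zero weight.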
|
132 |
Online 3D Reconstruction and Ground Segmentation using Drone based Long Baseline Stereo Vision System. Kumar, Prashant. 16 November 2018.
This thesis presents online 3D reconstruction and ground segmentation using unmanned aerial vehicle (UAV) based stereo vision. For this purpose, a long-baseline stereo vision system has been designed and built. The system works as part of an air- and ground-based multi-robot autonomous terrain surveying project at the Unmanned Systems Lab (USL), Virginia Tech, intended to act as a first-responder robotic system in disaster situations. Areas covered by this thesis are the design of the long-baseline stereo vision system, a study of raw stereo vision output, techniques to filter outliers from that output, a 3D reconstruction method, and a study of running-time improvements obtained by controlling the density of point clouds. The presented work makes use of filtering methods and implementations in the Point Cloud Library (PCL) and of feature matching on the graphics processing unit (GPU) using OpenCV with CUDA. Besides 3D reconstruction, a key challenge in the project was speed, and several steps and ideas to achieve it are presented. The presented 3D reconstruction algorithm matches features in 2D images, converts keypoints to 3D using disparity images, estimates the rigid-body transformation between matched 3D keypoints, and fits the point clouds together. To correct and control orientation and localization errors, it fits re-projected UAV positions to GPS-recorded UAV positions using the iterative closest point (ICP) algorithm as a correction step. A new but computationally intensive process that uses superpixel clustering and plane fitting to raise disparity images to sub-pixel resolution is also presented. The results section reports the accuracy of the 3D reconstruction. The presented process is able to generate application-acceptable semi-dense 3D reconstruction and ground segmentation at 8-12 frames per second (fps).
In 3D reconstruction of an area of 25 × 40 m², with a UAV flight altitude of 23 m, the average obstacle localization error and average obstacle size/dimension error are found to be 17 cm and 3 cm, respectively. / MS / This thesis presents near real-time (online) visual reconstruction in three dimensions (3D) using a ground-facing camera system on an unmanned aerial vehicle, together with a method for separating the ground from obstacles on it. A two-camera (stereo vision) system, with the cameras positioned comparatively far apart at 60 cm, was designed, and an algorithm and software for the visual 3D reconstruction were developed. The system works as part of an air- and ground-based multi-robot autonomous terrain surveying project at the Unmanned Systems Lab, Virginia Tech, intended to act as a first-responder robotic system in disaster situations. The presented work makes use of the Point Cloud Library and of graphics-processing-unit functions in OpenCV with CUDA, both popular computer vision libraries. Besides 3D reconstruction, a key challenge in the project was speed, and several steps and ideas to achieve it are presented. The presented 3D reconstruction algorithm is based on feature matching, a popular way to mathematically identify unique pixels in an image. The algorithm also includes a correction step that controls orientation and localization errors using the iterative closest point algorithm. A new but computationally intensive process that improves the resolution of disparity images, an output of the developed stereo vision system, from single-pixel to sub-pixel accuracy is also presented. The results section reports the accuracy of the 3D reconstruction. The presented process generates application-acceptable 3D reconstruction and ground segmentation at 8-12 frames per second.
In 3D reconstruction of an area of 25 × 40 m², with a UAV flight altitude of 23 m, the average obstacle localization error and average obstacle size/dimension error are found to be 17 cm and 3 cm, respectively.
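The ICP correction step repeatedly solves for the rigid transform that best aligns re-projected UAV positions with GPS-recorded ones. Below is a minimal sketch of that inner alignment step only (the SVD-based Kabsch/Procrustes solution), shown with known correspondences and synthetic data rather than the full ICP loop; names and values are illustrative.

```python
import numpy as np

def best_fit_rigid(A, B):
    """Least-squares rigid transform (R, t) with B ~ R @ A_i + t, solved in
    closed form via SVD of the cross-covariance (the Kabsch step inside ICP)."""
    ca, cb = A.mean(0), B.mean(0)
    H = (A - ca).T @ (B - cb)                     # 3x3 cross-covariance
    U, _, Vt = np.linalg.svd(H)
    # reflection guard: force det(R) = +1 so R is a proper rotation
    D = np.diag([1.0, 1.0, np.sign(np.linalg.det(Vt.T @ U.T))])
    R = Vt.T @ D @ U.T
    t = cb - R @ ca
    return R, t

# synthetic drift: rotate and translate the GPS track, then recover the fix
rng = np.random.default_rng(1)
gps = rng.normal(size=(30, 3))
theta = 0.1
Rz = np.array([[np.cos(theta), -np.sin(theta), 0.0],
               [np.sin(theta),  np.cos(theta), 0.0],
               [0.0, 0.0, 1.0]])
reproj = gps @ Rz.T + np.array([0.3, -0.2, 0.1])  # drifted position estimates
R, t = best_fit_rigid(reproj, gps)
corrected = reproj @ R.T + t                      # drift removed
```

Full ICP would alternate this solve with nearest-neighbour correspondence search; with correspondences known, one solve suffices.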
|
133 |
Semi-supervised learning for joint visual odometry and depth estimation. Papadopoulos, Kyriakos. January 2024.
Autonomous driving has seen enormous interest and improvement in the last few years. Two important functions of autonomous driving are depth estimation and visual odometry. Depth estimation refers to determining the distance from the camera to each point in the scene it captures, while visual odometry refers to estimating ego motion from images recorded by the camera. The algorithm presented by Zhou et al. [1] is a completely unsupervised algorithm for depth and ego-motion estimation. This thesis sets out to reduce ambiguity and enhance the performance of that algorithm [1]. The purpose of the algorithm is to estimate the depth map of an image from a camera attached to the agent, together with the agent's ego motion; in this thesis, the agent is a vehicle. The algorithm cannot make predictions at true scale in either depth or ego motion; in other words, it suffers from scale ambiguity. Two extensions of the method were developed by changing the loss function of the algorithm and supervising the ego motion. Both show a remarkable improvement in performance and reduced ambiguity, using only ego-motion ground-truth data, which is significantly easier to obtain than depth ground-truth data.
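One way to see why ego-motion supervision removes the scale ambiguity: a translation loss against metric ground truth is minimized only when the predicted motion is at true scale. The toy below is an assumed sketch of that supervision term alone, not the thesis' actual loss (which also contains photometric terms).

```python
import numpy as np

def pose_supervision_loss(pred_t, gt_t):
    """L1 translation supervision: if the network's translation is only
    correct up to a scale s, this term is zero only at s = 1, anchoring
    the metric scale of both ego motion and (jointly trained) depth."""
    return np.abs(pred_t - gt_t).mean()

gt = np.array([0.0, 0.0, 1.2])     # metres travelled between frames
scales = [0.5, 1.0, 2.0]           # candidate global scale factors
losses = [pose_supervision_loss(s * gt, gt) for s in scales]
```

An unsupervised photometric loss is invariant to a global rescaling of depth and translation, so it cannot distinguish these candidates; the supervised term can.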
|
134 |
Evaluation of probabilistic representations for modeling and understanding shape based on synthetic and real sensory data / Utvärdering av probabilistiska representationer för modellering och förståelse av form baserat på syntetisk och verklig sensordata. Zarzar Gandler, Gabriela. January 2017.
The advancements in robotic perception in recent years have empowered robots to better execute tasks in various environments. The perception of objects in the robot workspace relies significantly on how sensory data is represented. In this context, 3D models of object surfaces have been studied as a means to provide useful insights into the shape of objects and ultimately enhance robotic perception. This involves several challenges, because sensory data generally presents artifacts such as noise and incompleteness. To tackle this problem, we employ Gaussian Process Implicit Surfaces (GPIS), a non-parametric probabilistic reconstruction of object surfaces from 3D data points. This thesis investigates different configurations of GPIS as a means to extract shape information. In our approach we interpret an object's surface as the level set of an underlying sparse Gaussian Process (GP) with a variational formulation. Results show that the variational formulation for the sparse GP enables a reliable approximation to the full GP solution. Experiments are performed on a synthetic and a real sensory data set. We evaluate results by assessing how close the reconstructed surfaces are to ground-truth correspondences, and how well objects from different categories are clustered based on the obtained representation. Finally we conclude that the proposed solution derives adequate surface representations to reason about object shape and to discriminate objects based on shape information.
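A minimal GPIS sketch, assuming a plain (non-sparse) GP with an RBF kernel rather than the variational sparse formulation evaluated in the thesis: the surface, here a unit circle in 2D, is recovered as the zero level set of a GP fitted to signed observations (0 on the surface, negative inside, positive outside). All values are illustrative.

```python
import numpy as np

def rbf(X, Y, ell=0.5):
    """Squared-exponential kernel between two point sets."""
    d2 = ((X[:, None, :] - Y[None, :, :]) ** 2).sum(-1)
    return np.exp(-0.5 * d2 / ell**2)

# training data: on-surface points (f=0), one interior (f=-1), exterior (f=+1)
ang = np.linspace(0, 2 * np.pi, 16, endpoint=False)
X = np.vstack([np.c_[np.cos(ang), np.sin(ang)],        # unit circle, f = 0
               [[0.0, 0.0]],                            # inside, f = -1
               2 * np.c_[np.cos(ang), np.sin(ang)]])    # outside, f = +1
y = np.r_[np.zeros(16), [-1.0], np.ones(16)]

K = rbf(X, X) + 1e-6 * np.eye(len(X))   # jitter for numerical stability
alpha = np.linalg.solve(K, y)           # GP weights: K^{-1} y

def implicit(p):
    """Posterior mean of the implicit function at query point(s) p;
    the reconstructed surface is its zero level set."""
    return rbf(np.atleast_2d(p), X) @ alpha
```

The posterior mean vanishes near the circle, is negative inside and positive outside; a sparse variational GP would replace `X` with a much smaller set of inducing points while approximating the same posterior.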
|
135 |
A Multi Sensor System for a Human Activities Space : Aspects of Planning and Quality Measurement. Chen, Jiandan. January 2008.
In our aging society, the design and implementation of a high-performance autonomous distributed vision information system for autonomous physical services becomes ever more important. In line with this development, the proposed Intelligent Vision Agent System (IVAS) is able to automatically detect and identify a target for a specific task by surveying a human activities space. The main subject of this thesis is the optimal configuration of a sensor system meant to capture the target objects and their environment to certain required specifications. The thesis thus discusses how a discrete sensor causes a depth spatial quantisation uncertainty, which contributes significantly to the 3D depth reconstruction accuracy. For a stereo sensor pair, the quantisation uncertainty is represented by the intervals between the iso-disparity surfaces. A mathematical geometry model is then proposed to analyse the iso-disparity surfaces and optimise the sensors' configurations according to the required constraints. The thesis also introduces a dithering algorithm which significantly reduces the depth reconstruction uncertainty; this algorithm ensures high depth reconstruction accuracy from a few images captured by low-resolution sensors. To ensure the visibility needed for surveillance, tracking, and 3D reconstruction, the thesis introduces constraints on the target space, the stereo pair characteristics, and the depth reconstruction accuracy. The target space, the space in which human activity takes place, is modelled as a tetrahedron, and a field of view in spherical coordinates is proposed. The minimum number of stereo pairs necessary to cover the entire target space and the arrangement of the stereo pairs' movement are optimised through integer linear programming. In order to better understand human behaviour and perception, the proposed adaptive measurement method makes use of a fuzzily defined variable (FDV).
The FDV approach enables the estimation of a quality index based on qualitative and quantitative factors. The suggested method uses a neural network whose learning function allows the integration of the human factor into a quantitative quality index. The thesis consists of two parts: Part I gives a brief overview of the applied theory and research methods used, and Part II contains the five papers included in the thesis.
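The depth quantisation uncertainty can be made concrete: for a rectified stereo pair with focal length f (pixels) and baseline b (metres), depth is z = f*b/d for integer disparity d, so the gap between consecutive iso-disparity surfaces widens roughly as z^2/(f*b). A sketch with assumed example values, not figures from the thesis:

```python
# assumed example parameters: focal length in pixels, baseline in metres
f, b = 800.0, 0.2

def depth(d):
    """Depth of the iso-disparity surface for integer disparity d."""
    return f * b / d

def quantisation_gap(d):
    """Width of the depth interval covered by disparity d, i.e. the
    spacing between adjacent iso-disparity surfaces."""
    return depth(d) - depth(d + 1)

# near surfaces (large d) are sampled finely, far surfaces (small d) coarsely;
# dithering the sensor pair effectively samples between these surfaces
gap_far, gap_near = quantisation_gap(10), quantisation_gap(80)
```

At d = 10 (z = 16 m) the interval is tens of centimetres, while at d = 80 (z = 2 m) it is a few centimetres, which is the asymmetry the iso-disparity analysis and the dithering algorithm above address.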
|
136 |
Vers la modélisation grand échelle d'environnements urbains à partir d'images / Towards large-scale urban environments modeling from images. Moslah, Oussama. 05 July 2011.
The main goal of this thesis is to develop innovative and practical tools for the reconstruction of buildings from images. The typical input to our work is a set of facade images, building footprints, and coarse 3D models reconstructed from aerial images. The main steps include the calibration of the photographs, the registration with the coarse 3D model, the recovery of depth and semantic information, and the refinement of the coarse 3D model. To achieve this goal, we use computer vision, pattern recognition and computer graphics techniques. The contributions of this approach are presented in two parts. In the first part, we focus on multiple-view reconstruction techniques with the aim of automatically recovering the depth information of facades from a set of uncalibrated photographs. First, we use structure-from-motion techniques to automatically calibrate the set of photographs. Then, we propose techniques for the registration of the sparse reconstruction to a coarse 3D model. Finally, we propose an accelerated multi-view stereo and voxel coloring framework using graphics hardware to produce a textured 3D mesh of a scene from a set of calibrated images. The second part is dedicated to single-view reconstruction, and its aim is to recover the semantic structure of a facade from an ortho-rectified image. The novelty of this approach is the use of a stochastic grammar describing an architectural style as a model for facade reconstruction. We combine bottom-up detection with top-down proposals to optimize the facade structure using the Metropolis-Hastings algorithm.
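A generic Metropolis-Hastings skeleton in the spirit of the facade optimisation, where `score` stands in for the log-posterior combining the grammar prior with bottom-up detector evidence. The toy problem (recovering a number of floors) and all names are illustrative assumptions, not the thesis' grammar machinery.

```python
import math
import random

def metropolis_hastings(score, propose, x0, iters=5000, rng=None):
    """Maximise `score` by Metropolis-Hastings with symmetric proposals:
    uphill moves are always accepted, downhill moves with probability
    exp(score(xn) - score(x)). Returns the best state visited."""
    rng = rng or random.Random(0)
    x, best = x0, x0
    for _ in range(iters):
        xn = propose(x, rng)
        # accept with probability min(1, exp(delta)); 1e-300 guards log(0)
        if math.log(rng.random() + 1e-300) < score(xn) - score(x):
            x = xn
            if score(x) > score(best):
                best = x
    return best

# toy posterior: detector evidence peaks at the true number of floors (7)
def score(n):
    return -2.0 * abs(n - 7)

def propose(n, rng):
    return max(1, n + rng.choice([-1, 1]))   # local move in structure space

best = metropolis_hastings(score, propose, x0=1)
```

In the real system the state would be a full grammar derivation (floor splits, window lattices) and proposals would add, remove, or perturb derivation rules.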
|
137 |
Reconstructing plant architecture from 3D laser scanner data / Acquisition et validation de modèles architecturaux virtuels de plantes. Preuksakarn, Chakkrit. 19 December 2012.
In the last decade, very realistic renderings of plant architectures have been produced in computer graphics applications. However, in the context of biology and agronomy, the acquisition of accurate models of real plants is still a tedious task and a major bottleneck for the construction of quantitative models of plant development. Recently, 3D laser scanners have made it possible to acquire 3D images in which each pixel has an associated depth corresponding to the distance between the scanner and the pinpointed surface of the object. Standard geometrical reconstructions fail on plant structures, as they usually contain a complex set of discontinuous or branching surfaces distributed in space with varying orientations. In this thesis, we present a method for reconstructing virtual models of plants from laser scans of real-world vegetation. Measuring plants with laser scanners produces data with different levels of precision: point sets are usually dense on the surface of the main branches, but only sparsely cover thin branches. The core of our method is to iteratively create the skeletal structure of the plant according to the local density of the point set. This is achieved by a method that locally adapts to the levels of precision of the data, combining a contraction phase with a local point-tracking algorithm. In addition, we present a quantitative evaluation procedure to compare our reconstructions against expert-reconstructed structures of real plants. For this, we first explore the use of an edit distance between tree graphs. Alternatively, we formalize the comparison as an assignment problem to find the best matching between the two structures and quantify their differences.
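The assignment-problem view of structure comparison can be sketched as follows, assuming each branch is summarised by a small feature vector; for the toy sizes here an exhaustive search over permutations replaces a proper assignment solver (and the feature choice is purely illustrative).

```python
import itertools
import numpy as np

def best_matching_cost(A, B):
    """Minimal total cost of a one-to-one matching between two equal-size
    sets of branch descriptors, with Euclidean distance as the pairwise
    cost. Brute force over permutations; real pipelines would use the
    Hungarian algorithm instead."""
    cost = np.linalg.norm(A[:, None, :] - B[None, :, :], axis=-1)
    n = len(A)
    return min(sum(cost[i, p[i]] for i in range(n))
               for p in itertools.permutations(range(n)))

# toy branch descriptors (e.g. length and insertion height)
A = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 2.0]])
B = A[::-1].copy()                    # same structure, branches reordered
C = B + np.array([0.5, 0.0])          # every branch perturbed by 0.5
```

An identical structure matches at zero cost regardless of branch ordering, while a perturbed one accumulates the per-branch differences, giving a scalar dissimilarity between reconstructions.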
|
138 |
Contrôle hydrodynamique de la formation des biofilms en milieu eaux usées / Hydrodynamic control of biofilm formation in wastewater systems. El Khatib, Rime. 17 November 2011.
Bacterial biofilms develop on any solid-liquid interface whenever conditions are appropriate. They correspond to microcolony assemblages embedded in an extracellular polymeric matrix. Among the factors controlling biofilm growth, hydrodynamics is a key parameter affecting both biofilm morphology and composition. In this thesis we investigate the influence of hydrodynamics, and more precisely the effect of the wall shear rate, on biofilm development. For this purpose, a Couette-Poiseuille reactor, allowing work under stable laminar flow at different flow velocities, was used. Biofilms grown from urban wastewater on coupon surfaces were observed with confocal laser scanning microscopy. A 3D model was built using the GOCAD software, allowing the determination of various structural characteristics of the biofilms. The results show the essential role of convective mass transport in biofilm formation: a zero wall shear rate inhibited bacterial deposition, and hence biofilm growth.
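The zero-wall-shear regime can be illustrated with the plane Couette-Poiseuille velocity profile u(y) = U*y/h + (G/(2*mu))*y*(h - y), whose shear rate at the fixed wall (y = 0) is U/h + G*h/(2*mu); choosing the pressure-gradient term G = -2*mu*U/h**2 nulls it. The parameter values below are illustrative assumptions, not figures from the thesis.

```python
# assumed example values: dynamic viscosity (Pa*s), gap height (m),
# moving-wall speed (m/s)
mu, h, U = 1.0e-3, 0.01, 0.05

def wall_shear_rate(G):
    """du/dy at the fixed wall y = 0 for the plane Couette-Poiseuille
    profile u(y) = U*y/h + (G/(2*mu))*y*(h - y)."""
    return U / h + G * h / (2 * mu)

# pressure-gradient term that exactly cancels the wall shear rate
G_null = -2 * mu * U / h**2
```

Pure Couette flow (G = 0) gives a strictly positive wall shear rate U/h, whereas the opposing Poiseuille component at G_null creates the zero-shear condition at the coupon wall that the study links to inhibited biofilm growth.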
|
139 |
Avancements dans l'estimation de pose et la reconstruction 3D de scènes à 2 et 3 vues / Advances on Pose Estimation and 3D Reconstruction of 2 and 3-View Scenes. Fernandez Julia, Laura. 13 December 2018.
The study of cameras and images has been a prominent subject since the beginning of computer vision, with pose estimation and 3D reconstruction among its main focuses. The goal of this thesis is to tackle and study some specific problems and methods of the structure-from-motion pipeline in order to provide improvements in accuracy, broad studies to comprehend the advantages and disadvantages of state-of-the-art models, and useful implementations made available to the public. More specifically, we center our attention on stereo pairs and triplets of images and discuss some of the methods and models able to provide pose estimation and 3D reconstruction of the scene. First, we address the depth estimation task for stereo pairs using block matching. This approach implicitly assumes that all pixels in the patch have the same depth, producing the common artifact known as the "foreground-fattening effect". In order to find a more appropriate support, Yoon and Kweon introduced the use of weights based on color similarity and spatial distance, analogous to those used in the bilateral filter. We present the theory of this method and the implementation we have developed, with some improvements, discuss some variants of the method, and analyze its parameters and performance. Secondly, we consider the addition of a third view and study the trifocal tensor, which describes the geometric constraints linking the three views. We explore the advantages offered by this operator in the pose-estimation task for a triplet of cameras, as opposed to computing the relative poses pair by pair using the fundamental matrix. In addition, we present a study and implementation of several parameterizations of the tensor. We show that the initial improvement in accuracy from the trifocal tensor is not enough to have a remarkable impact on the pose estimation after bundle adjustment, and that using the fundamental matrix with image triplets remains relevant. Finally, we propose using a projection model other than the pinhole camera for the pose estimation of perspective cameras. We present a method based on the matrix factorization due to Tomasi and Kanade that relies on the orthographic projection. This method can be used in configurations where other methods fail, in particular when using cameras with long-focal-length lenses. The performance of our implementation of this method is compared to that of the perspective-based methods; we consider that the accuracy achieved and its robustness make it worth considering in any SfM procedure.
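The orthographic factorisation at the heart of the Tomasi-Kanade method can be sketched in a few lines: under orthographic projection the centred 2F x N matrix of image tracks has rank 3 and splits into a motion part (camera rows) and a shape part (3D points) by truncated SVD. The sketch below uses synthetic data and omits the metric-upgrade step that enforces orthonormal camera rows.

```python
import numpy as np

# synthetic scene: 40 centred 3D points observed by 5 orthographic cameras
rng = np.random.default_rng(3)
S = rng.normal(size=(3, 40))        # shape matrix (3D points as columns)
M = rng.normal(size=(2 * 5, 3))     # motion matrix (two rows per camera)
W = M @ S                           # 10 x 40 centred measurement matrix

# rank-3 factorisation: W = (U3 * s3) @ Vt3 recovers motion and shape
# up to an invertible 3x3 ambiguity (fixed later by the metric upgrade)
U, s, Vt = np.linalg.svd(W, full_matrices=False)
motion_hat = U[:, :3] * s[:3]       # estimated motion, 10 x 3
shape_hat = Vt[:3]                  # estimated shape, 3 x 40
rank3 = motion_hat @ shape_hat      # best rank-3 approximation of W
```

With noisy tracks the singular values beyond the third would be small rather than zero, and the truncation acts as a least-squares denoiser; this degeneracy-free behaviour under near-orthographic viewing (long focal lengths) is what makes the method attractive where perspective methods fail.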
|
140 |
Towards scalable, multi-view urban modeling using structure priors / Vers une modélisation urbaine 3D extensible intégrant des a priori de structure géométrique. Bourki, Amine. 21 December 2017.
In this thesis, we address the problem of 3D reconstruction from a sequence of calibrated street-level photographs with a simultaneous focus on scalability and the use of structure priors in multi-view stereo (MVS). While both aspects have been studied broadly, existing scalable MVS approaches do not handle well the simple yet ubiquitous structural regularities of man-made environments. On the other hand, structure-aware 3D reconstruction methods are slow, scale poorly with the size of the input sequences, and may even require additional restrictive information. The goal of this thesis is to reconcile scalability and structure awareness within common MVS grounds using soft, generic priors which encourage: (i) piecewise planarity, (ii) alignment of object boundaries with image gradients, (iii) alignment with vanishing directions (VDs), and (iv) object co-planarity. To do so, we present the novel "Patchwork Stereo" framework, which integrates photometric stereo from a handful of wide-baseline views with a sparse 3D point cloud, combining robust 3D plane extraction and top-down image partitioning in a unified 2D-3D analysis cast as a principled Markov Random Field energy minimization. We evaluate our contributions quantitatively and qualitatively on challenging urban datasets and illustrate results which are at least on par with state-of-the-art methods in terms of geometric structure, but achieved several orders of magnitude faster, paving the way for photo-realistic city-scale modeling.
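As a hedged sketch of one building block, robust 3D plane extraction from a point cloud can be done with RANSAC; the full pipeline couples such plane hypotheses with top-down image partitioning in an MRF, which this toy omits. Data and parameters below are illustrative assumptions.

```python
import numpy as np

def ransac_plane(P, iters=200, tol=0.02, rng=None):
    """RANSAC plane fit: repeatedly fit a plane n.x + d = 0 to 3 random
    points and keep the hypothesis with the most inliers (points within
    distance tol). Returns (n, d, inlier_mask)."""
    rng = rng or np.random.default_rng(0)
    best = (None, None, np.zeros(len(P), dtype=bool))
    for _ in range(iters):
        i = rng.choice(len(P), 3, replace=False)
        a, b, c = P[i]
        n = np.cross(b - a, c - a)
        if np.linalg.norm(n) < 1e-12:
            continue                          # degenerate (collinear) sample
        n = n / np.linalg.norm(n)
        d = -n @ a
        inl = np.abs(P @ n + d) < tol
        if inl.sum() > best[2].sum():
            best = (n, d, inl)
    return best

# facade-like cloud: a vertical plane x = 1 plus scattered outliers
rng = np.random.default_rng(2)
plane = np.c_[np.ones(200), rng.uniform(0, 5, 200), rng.uniform(0, 10, 200)]
noise = rng.uniform(-2, 4, size=(40, 3))
n, d, inliers = ransac_plane(np.vstack([plane, noise]))
```

The recovered normal is aligned with the x axis and the mask isolates the facade points; a piecewise-planarity prior, as in the framework above, would then encourage pixels to adopt one of a small set of such extracted planes.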
|