151 |
3D Reconstruction of Underwater Scenes from Uncalibrated Video Sequences
Kirli, Mustafa Yavuz 01 August 2008
The aim of this thesis is to reconstruct a 3D representation of underwater scenes from uncalibrated video sequences. Underwater visualization is important for underwater Remotely Operated Vehicles, and the underwater environment is complex because of inhomogeneous light absorption and light scattering. These factors make underwater 3D reconstruction more challenging.
The reconstruction consists of the following stages: image enhancement, feature detection and matching, fundamental matrix estimation, auto-calibration, recovery of extrinsic parameters, rectification, stereo matching, and triangulation.
For image enhancement, a pre-processing filter is used to remove the effects of water and to enhance the images. Two feature extraction methods are examined: (1) Difference of Gaussians with the SIFT feature descriptor, and (2) the Harris corner detector with the grey levels around the feature point. Matching is performed by finding similarities between SIFT features and by correlating grey levels, respectively. The results show that SIFT performs better than Harris with grey-level information. The RANSAC method with the normalized 8-point algorithm is used to estimate the fundamental matrix and to reject outliers. Because of the difficulty of calibrating cameras underwater, an auto-calibration process is examined. Rectification is also performed, since it makes epipolar lines coincide with image scan lines, which helps stereo matching algorithms. The Graph-Cut stereo matching algorithm is used to compute the corresponding pixel of each pixel in the stereo image pair. In the last stage, triangulation is used to compute 3D points from the corresponding pixel pairs.
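The fundamental matrix stage named above can be illustrated with a minimal NumPy sketch of the normalized 8-point algorithm. This is not the thesis code: the function names `normalize_points` and `eight_point` are hypothetical, the RANSAC loop that would call the estimator on random subsets is omitted, and the convention assumed here is that the matrix F satisfies x2ᵀ F x1 = 0 for matched points x1 and x2.

```python
import numpy as np

def normalize_points(pts):
    """Hartley normalization: move the centroid to the origin and scale
    so that the mean distance from the origin is sqrt(2)."""
    pts = np.asarray(pts, dtype=float)
    centroid = pts.mean(axis=0)
    mean_dist = np.linalg.norm(pts - centroid, axis=1).mean()
    s = np.sqrt(2.0) / mean_dist
    T = np.array([[s, 0.0, -s * centroid[0]],
                  [0.0, s, -s * centroid[1]],
                  [0.0, 0.0, 1.0]])
    homog = np.column_stack([pts, np.ones(len(pts))])
    return (T @ homog.T).T, T

def eight_point(x1, x2):
    """Estimate F (x2^T F x1 = 0) from at least 8 correspondences.
    x1 and x2 are (N, 2) arrays of matched pixel coordinates."""
    n1, T1 = normalize_points(x1)
    n2, T2 = normalize_points(x2)
    # Each row of A encodes one epipolar constraint on the 9 entries of F
    # (row-major order), using homogeneous normalized points.
    A = np.column_stack([n2[:, [0]] * n1, n2[:, [1]] * n1, n1])
    _, _, Vt = np.linalg.svd(A)
    F = Vt[-1].reshape(3, 3)
    # Enforce the rank-2 constraint by zeroing the smallest singular value.
    U, S, Vt = np.linalg.svd(F)
    S[2] = 0.0
    F = U @ np.diag(S) @ Vt
    # Undo the normalization and fix the arbitrary scale.
    F = T2.T @ F @ T1
    return F / np.linalg.norm(F)
```

In a RANSAC setting, an estimator like this would typically be run on random 8-point samples, keeping the F with the most correspondences whose epipolar residual falls under a chosen threshold.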
|
152 |
Unlocking the urban photographic record through 4D scene modeling
Schindler, Grant 09 July 2010
Vast collections of historical photographs are being digitally archived and placed online, providing an objective record of the last two centuries that remains largely untapped. We propose that time-varying 3D models can pull together and index large collections of images while also serving as a tool of historical discovery, revealing new information about the locations, dates, and contents of historical images. In particular, our goal is to use computer vision techniques to tie together a large set of historical photographs of a given city into a consistent 4D model of the city: a 3D model with time as an additional dimension.
To extract 4D city models from historical images, we must perform inference about the position of cameras and scene structure in both space and time. Traditional structure from motion techniques can be used to deal with the spatial problem, while here we focus on the problem of inferring temporal information: a date for each image and a time interval for which each structural element in the scene persists.
We first formulate this task as a constraint satisfaction problem based on the visibility of structural elements in each image, resulting in a temporal ordering of images. Next, we present methods to incorporate real date information into the temporal inference solution. Finally, we present a general probabilistic framework for estimating all temporal variables in structure from motion problems, including an unknown date for each camera and an unknown time interval for each structural element. Given a collection of images with mostly unknown or uncertain dates, we can use this framework to automatically recover the dates of all images by reasoning probabilistically about the visibility and existence of objects in the scene. We present results for image collections consisting of hundreds of historical images of cities taken over decades of time, including Manhattan and downtown Atlanta.
|
153 |
Single View Reconstruction for Human Face and Motion with Priors
Wang, Xianwang 01 January 2010
Single view reconstruction is fundamentally an under-constrained problem. We aim to develop new approaches to model human face and motion with model priors that restrict the space of possible solutions. First, we develop a novel approach to recover the 3D shape from a single view image under challenging conditions, such as large variations in illumination and pose. The problem is addressed by employing the techniques of non-linear manifold embedding and alignment. Specifically, the local image models for each patch of facial images and the local surface models for each patch of 3D shape are learned using a non-linear dimensionality reduction technique, and the correspondences between these local models are then learned by a manifold alignment method. Local models successfully remove the dependence on large training databases for human face modeling. By combining the local shapes, the global shape of a face can be reconstructed directly from a single linear system of equations via least squares.
Unfortunately, this learning-based approach cannot be successfully applied to the problem of human motion modeling due to the internal and external variations in single view video-based marker-less motion capture. Therefore, we introduce a new model-based approach for capturing human motion using a stream of depth images from a single depth sensor. While a depth sensor provides metric 3D information, using a single sensor, instead of a camera array, results in a view-dependent and incomplete measurement of object motion. We develop a novel two-stage template fitting algorithm that is invariant to subject size and view-point variations, and robust to occlusions. Starting from a known pose, our algorithm first estimates a body configuration through temporal registration, which is used to search the template motion database for a best match. The best-matching body configuration and its corresponding surface mesh model are deformed to fit the input depth map, filling in the part that is occluded in the input and compensating for differences in pose and body size between the input image and the template. Our approach does not require any markers, user interaction, or appearance-based tracking.
Experiments show that our approaches can achieve good modeling results for human face and motion, and are capable of dealing with a variety of challenges in single view reconstruction, e.g., occlusion.
|
154 |
A novel 3D recovery method by dynamic (de)focused projection
Lertrusdachakul, Intuon 30 November 2011
This paper presents a novel 3D recovery method based on structured light. This method unifies depth from focus (DFF) and depth from defocus (DFD) techniques with the use of a dynamic (de)focused projection. With this approach, the image acquisition system is specifically constructed to keep a whole object sharp in all of the captured images. Therefore, only the projected patterns experience different defocused deformations according to the object's depths. When the projected patterns are out of focus, their Point Spread Function (PSF) is assumed to follow a Gaussian distribution. The final depth is computed by the analysis of the relationship between the sets of PSFs obtained from different blurs and the variation of the object's depths. Our new depth estimation can be employed as a stand-alone strategy. It has no problem with occlusion and correspondence issues. Moreover, it handles textureless and partially reflective surfaces. The experimental results on real objects demonstrate the effective performance of our approach, providing reliable depth estimation and competitive time consumption. It uses fewer input images than DFF, and unlike DFD, it ensures that the PSF is locally unique.
|
155 |
Three-dimensional reconstruction of an object from photographs (using Matlab)
Φάκα, Σοφία 21 March 2011
The purpose of this thesis is the three-dimensional reconstruction of an object or a space from at least two photographs. The topic belongs to the field of Computer Vision, which has developed greatly in recent years because of the many applications in which knowledge of the three-dimensional structure of an object or a space is necessary. The rapid evolution of computers has also contributed to this development, making it possible to display complex three-dimensional scenes accurately and with high quality in real time, through appropriate algorithms.
The three-dimensional reconstruction of an object or a space from photographs or video is an interesting and impressive subject with many applications and very encouraging results. This is essentially what motivated me to work in the field of Computer Vision and to choose the topic of this thesis. The applications developed in this thesis provide visually pleasing results and adapt flexibly to various conditions of photography or video capture. Importantly, a laboratory environment is not necessarily required to acquire the data, that is, the images. The results are good even with pictures taken by a simple handheld camera, without a tripod. It is enough to ensure a small camera movement between shots.
This thesis presents and examines thoroughly all the topics related to the three-dimensional visualization of objects. First, paragraphs 2.1 to 3.2 analyze the theory of the two main methods, "Structure and Motion" and "Stereo Vision". Then, paragraphs 3.3 and 3.4 develop the methodology followed by the Structure and Motion applications, while that of Stereo Vision is developed in paragraph 3.5. The Structure and Motion method covers two cases: the uncalibrated case and the calibrated case. In the calibrated case, camera calibration is performed first, so the camera parameters are known in advance. The algorithms are implemented in the appendix, using the Matlab numerical computing environment. Finally, chapter four gives some examples of reconstructions that demonstrate the effectiveness of the implemented algorithms.
Both the theory and the algorithms presented in this thesis fully cover the knowledge necessary to implement the three-dimensional representation. Information was gathered from two methods, Structure and Motion and Stereo Vision, which, combined, give the best and most complete results. The primary objective of this thesis is to highlight the possibilities offered by these methods. In addition, the implementation of the two methods, and hence of the algorithms, is a good basis for further development and future research in this field. In recent years, research in Computer Vision has produced very satisfactory results, so stronger algorithms, improvements, and many applications in the developing sectors of electronics and beyond are expected in the future.
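As an illustration of the calibrated case described above, where the camera parameters are known in advance, the following is a minimal Python sketch of linear (DLT) triangulation of one 3D point from two views. It is not the thesis's Matlab implementation; the function name `triangulate_dlt` and the 3x4 projection matrices `P1` and `P2` are assumptions made for the example.

```python
import numpy as np

def triangulate_dlt(P1, P2, x1, x2):
    """Linear triangulation of one 3D point.
    P1, P2: 3x4 camera projection matrices (assumed known).
    x1, x2: matching pixel coordinates (u, v) in each image."""
    # Each view contributes two linear equations in the homogeneous point X:
    # u * (p3 . X) - (p1 . X) = 0 and v * (p3 . X) - (p2 . X) = 0.
    A = np.array([
        x1[0] * P1[2] - P1[0],
        x1[1] * P1[2] - P1[1],
        x2[0] * P2[2] - P2[0],
        x2[1] * P2[2] - P2[1],
    ])
    # The solution is the right singular vector of the smallest singular value.
    _, _, Vt = np.linalg.svd(A)
    X = Vt[-1]
    return X[:3] / X[3]
```

With noisy matches the linear solution is usually only a starting point; a refinement step that minimizes reprojection error is a common follow-up, though whether the thesis does so is not stated in the abstract.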
|
156 |
Three-dimensional reconstruction of a space from digital photographs
Γκιννής, Μιχάλης 07 June 2013
The purpose of this work is to present the stages of the most general passive method for three-dimensional reconstruction of static scenes, structure from motion. Besides the use of stereo image pairs to create dense depth maps, we examine the case of parallel images, which are the best solution when the main component of the camera motion is parallel to its optical axis. We also present an original geometric image rectification method, both for generating stereo pairs and for the case of parallel images. At each stage of the process, we describe the methods considered most representative of their class.
|
157 |
3D field ion microscopy and atom probe tomography techniques for the atomic scale characterisation of radiation damage in tungsten
Dagan, Michal January 2016
In this work, new reconstruction and analysis methods were developed for 3D field ion microscopy (FIM) data, motivated by the goal of atomic-scale characterisation of radiation damage for fusion applications. A comparative FIM/atom probe tomography (APT) study of radiation damage in self-implanted tungsten revealed the advantages of FIM in atomistic crystallographic characterisation: it was able to identify dislocations, large vacancy clusters, and single vacancies. While the latter are beyond the detection capabilities of APT, larger damage features were observed indirectly in APT data via trajectory aberrations and solute segregation. An automated 3D FIM reconstruction approach was developed to provide reliable, atomistic, 3D insights into the atomic arrangements and vacancy distribution in ion-implanted tungsten. The new method was used for the automated 'atom-by-atom' reconstruction of thousands of tungsten atoms, yielding highly accurate reconstructions of atomically resolved poles, and was also applied to larger microstructural features such as carbides and a grain boundary, extending across larger portions of the sample. Additional tools were developed to overcome reconstruction challenges arising from the presence of crystal defects and the intrinsic distortion of FIM data. These were employed for the automated 3D mapping of vacancies in ion-implanted tungsten, analysing their distribution in a volume extending 50 nm into the depth of the sample. The new FIM reconstruction also opened the door to more advanced analyses of FIM data. It was applied to preliminary studies of the distortion of the reconstructed planes, which was found to depend on crystallographic orientation, with an increased variance in atomic positions measured in the radial direction from the centre of the poles. Additional analyses followed the subtle displacements in atomic coordinates on consecutive FIM images and found them to be affected by the evaporation of atoms from the same plane.
The displacements were found to increase in size as the distance to the evaporated atom decreased, and are likely to be the result of a convolution of image gas effects, surface atom relaxation, and charge redistribution. These measurements show potential to probe the dynamic nature of the FIM experiment and possibly resolve contributions from the different processes affecting the final image. Finally, APT characterisation was performed on bulk samples and pre-sharpened needles to determine the effect of sample geometry on the resulting implantation profiles, and the extent to which pre-sharpened needles could be employed in radiation damage studies. While the ion depth profiles in needles did not match SRIM simulations well, the damage profiles exhibited closer agreement. Further, the concentration of implanted ions in bulk samples was found to be significantly higher than in the corresponding needle-implanted samples, with excessive loss found for the light-ion implantation.
|
158 |
Fusion of sonar and stereo images for underwater archeology / Fusion de données sonar et stéréoscopiques : application à l’archéologie sous-marine
Onmek, Yadpiroon 19 December 2017
The objective of this work is the 3D reconstruction of archaeological objects in an underwater environment. A fusion technique is presented to obtain a 3D map from optical and acoustic systems. The work is divided into two main parts. First, we create a 3D reconstruction of underwater scenes from the optical system, using stereo cameras. Second, we obtain 3D information about the underwater environment from the acoustic system, using a multibeam sonar. We then merge the two types of map, optical and acoustic, which is made more difficult by the different resolutions of the two sensors.
The first part focuses on the optical system, which carries out the 3D reconstruction using the stereoscopic device. The underwater videos and image sequences for this work were acquired by divers in environments of increasing complexity: a pool, a lake, and the sea. First, a stereo camera records a video of a calibration chessboard; the intrinsic parameters are estimated for each camera, and then the extrinsic parameters are estimated to determine the rotation matrix and translation vector between the two cameras of the stereo pair.
The aim of this part is to create a 3D reconstruction from multiple images. In each image pair, features of interest are selected and matched. An additional outlier removal step is performed based on the RANSAC method. Triangulation of the inlier features from the 2D image space into a sparse 3D point cloud is done using a pinhole camera model and Euclidean distance estimation. Then the texture and rendering of the 3D stereo model are processed. The temporal sequence of images is processed into a set of local 3D reconstructions, with the coordinate transformations between temporally adjacent local reconstructions estimated using a Structure from Motion (SfM) method.
The second part consists of fusing the 3D model obtained previously with the acoustic map. To align the two 3D models (optical and acoustic), we first compute an approximate registration by manually selecting a few points on each cloud. To increase the accuracy of this registration, we use the Iterative Closest Point (ICP) algorithm. This work produced a multimodal 3D underwater map built from a global 3D reconstruction model and a global acoustic map.
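The manual initialization step can be sketched as a least-squares rigid alignment of the hand-picked point pairs (the Kabsch method), whose result would then seed ICP. This is an illustrative assumption, not the thesis's implementation: the function name `rigid_align` is hypothetical, and the sketch assumes both clouds share the same metric scale, which may not hold for real sonar and stereo data.

```python
import numpy as np

def rigid_align(src, dst):
    """Least-squares rotation R and translation t such that
    R @ src[i] + t is as close as possible to dst[i].
    src, dst: (N, 3) arrays of corresponding points, N >= 3."""
    src = np.asarray(src, dtype=float)
    dst = np.asarray(dst, dtype=float)
    c_src, c_dst = src.mean(axis=0), dst.mean(axis=0)
    # Cross-covariance of the centred point sets.
    H = (src - c_src).T @ (dst - c_dst)
    U, _, Vt = np.linalg.svd(H)
    # Guard against a reflection: force det(R) = +1.
    d = np.sign(np.linalg.det(Vt.T @ U.T))
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    t = c_dst - R @ c_src
    return R, t
```

An ICP loop would typically repeat this same solve, replacing the manual pairs with nearest-neighbour pairs recomputed at each iteration.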
|
159 |
Tomographie par rayons X haute résolution : application à l'intégration 3D pour la microélectronique / High resolution X-ray tomography : application to 3D Integration for microelectronics
Laloum, David 29 September 2015
In this thesis, a non-destructive 3D characterization technique that is still little used in the microelectronics field has been developed: X-ray tomography hosted in a scanning electron microscope. This computed tomography (CT) system has been used for the high-resolution analysis of metallic interconnections such as copper pillars and through-silicon vias (TSVs). These components are widely used in 3D integration to make vertical stacks of interconnected chips. The most significant contributions of this thesis are: (1) The enhancement of the analytical capabilities of the instrument. Many studies, both simulations and experiments, were performed in order to determine and improve the 2D and 3D resolutions of this imaging system. It was shown that the 2D resolution of this instrument can reach 60 nanometers. The quality of the projections and reconstructions was also improved through the implementation of iterative reconstruction algorithms and various projection alignment methods. (2) The reduction of the scanning time by a factor of 3 through the implementation of constrained reconstruction techniques, such as the reconstruction method based on total variation minimization. (3) The application of effective correction algorithms for removing reconstruction artefacts due to the polychromaticity of the X-ray beam. (4) The application of all these reconstruction methods and algorithms to real cases encountered by technologists.
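To make the total variation term concrete, here is a minimal Python sketch of the quantity that such constrained reconstructions penalize. It uses the anisotropic form (sum of absolute differences between neighbouring pixels), which is an assumption; the abstract does not say which variant the thesis uses, and the function name `total_variation` is hypothetical.

```python
import numpy as np

def total_variation(img):
    """Anisotropic total variation of a 2D image: the sum of absolute
    differences between horizontally and vertically adjacent pixels."""
    img = np.asarray(img, dtype=float)
    horizontal = np.abs(np.diff(img, axis=1)).sum()
    vertical = np.abs(np.diff(img, axis=0)).sum()
    return horizontal + vertical

# A piecewise-constant image (sharp edge, flat regions) has low TV,
# which is why the penalty suits objects such as metal interconnects.
step = np.zeros((4, 4))
step[:, 2:] = 1.0
print(total_variation(step))  # 4.0: one unit jump in each of the 4 rows
```

A TV-constrained reconstruction would minimize a data-fidelity term plus a weighted copy of this penalty, which is what can allow fewer projections, and so shorter scans, for piecewise-constant samples.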
|
160 |
Room layout estimation on mobile devices
Angladon, Vincent 27 April 2018
Room layout generation is the problem of generating a drawing or a digital model of an existing room from a set of measurements such as laser data or images. The generation of floor plans can find application in the building industry to assess the quality and the correctness of an ongoing construction w.r.t. the initial model, or to quickly sketch the renovation of an apartment. The real estate industry can rely on automatic generation of floor plans to ease the process of checking the livable surface and to propose virtual visits to prospective customers. As for the general public, the room layout can be integrated into mixed reality games to provide a more immersive experience, or used in other related augmented reality applications such as room redecoration. The goal of this industrial thesis (CIFRE) is to investigate and take advantage of state-of-the-art mobile devices in order to automate the process of generating room layouts. Modern mobile devices usually come with a wide range of sensors, such as an inertial measurement unit (IMU), RGB cameras and, more recently, depth cameras. Moreover, touchscreens offer a natural and simple way to interact with the user, thus favoring the development of interactive applications in which the user can be part of the processing loop. This work aims at exploiting the richness of such devices to address the room layout generation problem. The thesis has three major contributions. We first show how the classic problem of detecting vanishing points in an image can benefit from a prior given by the IMU sensor. We propose a simple and effective algorithm for detecting vanishing points relying on the gravity vector estimated by the IMU. A new public dataset containing images and the relevant IMU data is introduced to help assess vanishing point algorithms and foster further studies in the field.
As a second contribution, we explored the state of the art of real-time localization and map optimization algorithms for RGB-D sensors. Real-time localization is a fundamental task to enable augmented reality applications, and thus it is a critical component when designing interactive applications. We propose an evaluation of existing algorithms designed for the common desktop set-up, with a view to employing them on a mobile device. For each considered method, we assess the accuracy of the localization as well as the computational performance when ported to a mobile device. Finally, we present a proof-of-concept application able to generate the room layout relying on a Project Tango tablet equipped with an RGB-D sensor. In particular, we propose an algorithm that incrementally processes and fuses the 3D data provided by the sensor in order to obtain the layout of the room. We show how our algorithm can rely on user interactions in order to correct the generated 3D model during the acquisition process.
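The IMU prior mentioned in the first contribution can be illustrated with a short Python sketch: a 3D direction d expressed in the camera frame projects to the vanishing point K d, so a gravity estimate directly predicts where vertical lines converge. This is a textbook relation, not the thesis's detection algorithm; the function name `vertical_vanishing_point`, the intrinsic matrix `K`, and the assumption that gravity is already expressed in the camera frame are all introduced for the example.

```python
import numpy as np

def vertical_vanishing_point(K, gravity_cam):
    """Predict the image location of the vertical vanishing point.
    K: 3x3 camera intrinsic matrix (assumed known).
    gravity_cam: gravity direction in the camera frame (assumed given by the IMU)."""
    v = K @ np.asarray(gravity_cam, dtype=float)
    if abs(v[2]) < 1e-12:
        # Gravity is parallel to the image plane: the vanishing point is at
        # infinity and vertical lines stay parallel in the image.
        return None
    return v[:2] / v[2]

K = np.array([[500.0, 0.0, 320.0],
              [0.0, 500.0, 240.0],
              [0.0, 0.0, 1.0]])
print(vertical_vanishing_point(K, [0.0, 1.0, 1.0]))  # [320. 740.]
```

Once the vertical vanishing point is fixed this way, the remaining horizontal vanishing points are constrained to the horizon line orthogonal to gravity, which is the kind of search-space reduction an IMU prior offers.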
|