71 |
3D mapping with iPhone / 3D-kartering med iPhone
Lundqvist, Tobias January 2011
Today, 3D models of cities are created from aerial images captured with a camera rig. The images, together with sensor data from the flights, are stored for further processing when building the 3D models. However, there is market demand for a more mobile solution of satisfactory quality. If the camera position can be calculated for each image, an existing algorithm is available for creating 3D models. This master's thesis project investigates whether the iPhone 4 offers image and sensor data of sufficient quality for creating 3D models. Calculations of movements and rotations from sensor data form the foundation of the image processing and are intended to refine the camera position estimates. The 3D models are built from image processing alone, since the sensor data cannot be used because of its poor accuracy. As a result, the scale of the 3D models is unknown, and a measurement of the real object is needed to make scaling possible. Compared with a test algorithm already available in SBD's system that calculates 3D models from images only, the quality of the 3D models in this project is almost the same or, in some respects, even better when judged by eye.
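The scaling step mentioned at the end of the abstract can be illustrated with a minimal sketch. It assumes that two points in the reconstructed model correspond to a real-world distance measured by hand; the function names and the sample coordinates are hypothetical and are not taken from the thesis.

```python
import math

def scale_factor(model_a, model_b, real_distance):
    """Ratio between a measured real-world distance and the same distance in the model."""
    model_distance = math.dist(model_a, model_b)
    if model_distance == 0:
        raise ValueError("reference points coincide in the model")
    return real_distance / model_distance

def apply_scale(points, s):
    """Multiply every coordinate by the scale factor s."""
    return [tuple(s * c for c in p) for p in points]

# Hypothetical model in arbitrary units; the first two points are 5 units apart.
model_points = [(0.0, 0.0, 0.0), (0.0, 3.0, 4.0), (1.0, 1.0, 1.0)]
# Suppose the same two points were measured as 10 m apart on the real object.
s = scale_factor(model_points[0], model_points[1], real_distance=10.0)
scaled = apply_scale(model_points, s)
```

A single measurement fixes only the overall scale; it does not correct any distortion within the model.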
|
72 |
3d Face Reconstruction Using Stereo Images And Structured Light
Ozturk, Oguz Ahmet 01 December 2007
Nowadays, 3D modelling of objects from multiple images has gained wide recognition and is used in various fields. Recently, considerable progress has been made in identifying people using 3D face models, which are usually reconstructed from multiple face images. In this thesis, a system comprising stereo cameras and structured light is built for 3D modelling. The system outputs the 3D shape of the face together with the texture information registered to this shape. Although the system is developed for face reconstruction, it is not specific to faces: using the same methodology, 3D reconstruction of any object can be achieved.
|
73 |
Multi-View Reconstruction and Camera Recovery using a Real or Virtual Reference Plane
Rother, Carsten January 2003
Reconstructing a 3-dimensional scene from a set of 2-dimensional images is a fundamental problem in computer vision. A system capable of performing this task can be used in many applications in robotics, architecture, archaeology, biometrics, human computer interaction and the movie and entertainment industry.
Most existing reconstruction approaches exploit one source of information to tackle the problem. This is the motion of the camera: the 2D images are taken from different viewpoints. We exploit an additional information source, the reference plane, which makes it possible to reconstruct difficult scenes where other methods fail. A real scene plane may serve as the reference plane. Furthermore, there are many alternative techniques to obtain virtual reference planes. For instance, orthogonal directions in the scene provide a virtual reference plane, the plane at infinity, or images taken with a parallel projection camera. A collection of known and novel reference plane scenarios is presented in this thesis.
The main contribution of the thesis is a novel multi-view reconstruction approach using a reference plane. The technique is applicable to three different feature types: points, lines and planes. The novelty of our approach is that all cameras and all features (off the reference plane) are reconstructed simultaneously from a single linear system of image measurements. It is based on the novel observation that cameras and features have a linear relationship if a reference plane is known. In the absence of a reference plane, this relationship is non-linear. Thus many previous methods must reconstruct features and cameras sequentially. Another class of methods, popular in the literature, is factorization, but, in contrast to our approach, this has the serious practical drawback that all features are required to be visible in all views. Extensive experiments show that our approach is superior to all previously suggested reference plane and non-reference plane methods for difficult reference plane scenarios.
Furthermore, the thesis studies scenes which do not have a unique reconstruction, so-called critical configurations. It is proven that in the presence of a reference plane the set of critical configurations is small.
Finally, the thesis introduces a complete, automatic multi-view reconstruction system based on the reference plane approach. The input data is a set of images and the output a 3D point reconstruction together with the corresponding cameras.
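The linear relationship between cameras and features described above can be sketched for point features as follows. This is an illustrative derivation under one stated assumption, namely that the reference plane is the plane at infinity, so that the infinite homography $H_i$ of every view is known. The symbols $H_i$, $Q_i$ (camera centre), $X_j$ (3D point) and $x_{ij}$ (image point) are notation chosen here, not necessarily that of the thesis.

```latex
% Assumption: H_i is known for every view i (reference plane at infinity).
% Unknowns: camera centres Q_i and 3D points X_j.
\begin{align}
P_i &= \begin{bmatrix} H_i & -H_i Q_i \end{bmatrix}, \\
\lambda_{ij}\, x_{ij} &= H_i \,(X_j - Q_i), \\
[x_{ij}]_\times \, H_i \,(X_j - Q_i) &= 0 .
\end{align}
```

The cross product removes the unknown scale $\lambda_{ij}$, and the last equation is linear and homogeneous in $X_j$ and $Q_i$. Stacking it over every pair $(i, j)$ for which point $j$ is visible in view $i$ gives one system $A u = 0$ in all unknowns, so points need not be visible in all views. If $H_i$ were unknown, it would multiply the unknown $X_j$ and the relationship would no longer be linear.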
|
74 |
Towards Intelligent Telerobotics: Visualization and Control of Remote Robot
Fu, Bo 01 January 2015
Human-machine cooperative robotics, or co-robotics, has been recognized as the next generation of robotics. In contrast to current systems that use limited-reasoning strategies or address problems in narrow contexts, new co-robot systems will be characterized by their flexibility, resourcefulness, varied modeling or reasoning approaches, and use of real-world data in real time, demonstrating a level of intelligence and adaptability seen in humans and animals. My research focuses on two sub-fields of co-robotics: teleoperation and telepresence. We first explore teleoperation using mixed reality techniques. I propose a new type of display, the hybrid-reality display (HRD) system, which uses a commodity projection device to project captured video frames onto a 3D replica of the actual target surface. It provides a direct alignment between the frame of reference of the human subject and that of the displayed image. The advantage of this approach is that users need no wearable device, which minimizes intrusiveness and accommodates the users' eyes during focusing. The field of view is also significantly increased. From a user-centered design standpoint, the HRD is motivated by teleoperation accidents and incidents and by user research in areas such as military reconnaissance. Teleoperation in these environments is compromised by the keyhole effect, which results from the limited field of view. The technical contribution of the proposed HRD system is the multi-system calibration, which mainly involves a motion sensor, a projector, cameras and a robotic arm. Given the purpose of the system, the calibration error must be kept within the millimeter level. The follow-up research on the HRD focuses on high-accuracy 3D reconstruction of the replica using commodity devices, for better alignment of the video frames. Conventional 3D scanners either lack depth resolution or are very expensive. 
We propose a 3D sensing system based on structured light scanning, with accuracy within 1 millimeter, that is robust to global illumination and surface reflection. An extensive user study demonstrates the performance of the proposed algorithm. To compensate for the lack of synchronization between the local station and the remote station caused by latency in data sensing and communication, a one-step-ahead predictive control algorithm is presented. The latency between human control and robot movement can be formulated as a system of linear equations with a smoothing coefficient ranging from 0 to 1. The predictive control algorithm can then be formulated as the optimization of a cost function.
We then explore telepresence. Many hardware designs have been developed that allow a camera to be placed optically directly behind the screen. The purpose of such setups is to enable two-way video teleconferencing that maintains eye contact. However, the image from the see-through camera usually exhibits a number of imaging artifacts, such as low signal-to-noise ratio, incorrect color balance, and loss of detail. We therefore develop a novel image enhancement framework that uses an auxiliary color+depth camera mounted on the side of the screen. By fusing the information from both cameras, we are able to significantly improve the quality of the see-through image. Experimental results demonstrate that our fusion method compares favorably against traditional image enhancement/warping methods that use only a single image.
|
75 |
Top-Down Bayesian Modeling and Inference for Indoor Scenes
Del Pero, Luca January 2013
People can understand the content of an image without effort. We can easily identify the objects in it and figure out where they are in the 3D world. Automating these abilities is critical for many applications, such as robotics, autonomous driving and surveillance. Unfortunately, despite recent advancements, fully automated vision systems for image understanding do not exist. In this work, we present progress restricted to the domain of images of indoor scenes, such as bedrooms and kitchens. These environments typically have the "Manhattan" property that most surfaces are parallel to three principal ones. Further, the 3D geometry of a room and the objects within it can be approximated with simple geometric primitives, such as 3D blocks. Our goal is to reconstruct the 3D geometry of an indoor environment while also understanding its semantic meaning, by identifying the objects in the scene, such as beds and couches. We separately model the 3D geometry, the camera, and an image likelihood, to provide a generative statistical model for image data. Our representation captures the rich structure of an indoor scene by explicitly modeling the contextual relationships among its elements, such as the typical size of objects and their arrangement in the room, and simple physical constraints, such as the requirement that 3D objects do not intersect. This ensures that the predicted image interpretation is globally coherent, both geometrically and semantically, which allows tackling the ambiguities caused by projecting a 3D scene onto an image, such as occlusions and foreshortening. We fit this model to images using MCMC sampling. Our inference method combines bottom-up evidence from the data and top-down knowledge from the 3D world in order to explore the vast output space efficiently. Comprehensive evaluation confirms our intuition that global inference of the entire scene is more effective than estimating its individual elements independently. 
Further, our experiments show that our approach is competitive with state-of-the-art methods and often exceeds their results.
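As a rough illustration of fitting a generative model with MCMC sampling, the sketch below runs a plain random-walk Metropolis-Hastings sampler over a single scene parameter (the width of one block). Everything here is a toy assumption made for illustration: the Gaussian likelihood, the uniform size prior, the step size and the data are hypothetical, and the thesis's model and proposal moves are far richer.

```python
import math
import random

def log_posterior(width, observations, sigma=0.1, lo=0.5, hi=5.0):
    """Unnormalized log posterior of one scene parameter."""
    if not lo <= width <= hi:
        return -math.inf  # top-down prior: only plausible sizes are allowed
    # Bottom-up evidence: Gaussian noise on each measurement of the width.
    return -sum((obs - width) ** 2 for obs in observations) / (2 * sigma ** 2)

def metropolis_hastings(observations, n_steps=5000, step=0.2, seed=0):
    """Random-walk Metropolis-Hastings over the width parameter."""
    rng = random.Random(seed)
    current = 1.0
    current_lp = log_posterior(current, observations)
    samples = []
    for _ in range(n_steps):
        proposal = current + rng.gauss(0.0, step)
        proposal_lp = log_posterior(proposal, observations)
        # Accept with probability min(1, p(proposal) / p(current)).
        accept_prob = math.exp(min(0.0, proposal_lp - current_lp))
        if rng.random() < accept_prob:
            current, current_lp = proposal, proposal_lp
        samples.append(current)
    return samples

samples = metropolis_hastings([2.0, 2.1, 1.9])
burned_in = samples[1000:]  # discard the early samples before averaging
posterior_mean = sum(burned_in) / len(burned_in)
```

The prior term is where top-down knowledge such as typical object sizes would enter; the likelihood term plays the role of bottom-up image evidence.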
|
76 |
3D geometrijos atstatymas panaudojant Kinect jutiklį / 3D geometry reconstruction using Kinect sensor
Udovenko, Nikita 23 July 2012
The purpose of this thesis is to investigate the possibilities of selective 3D geometry reconstruction using the Kinect sensor's combined image-depth camera. A mathematical reconstruction model is provided, together with the coefficients needed to parametrize it and estimates of the expected error; a procedure for extracting the relevant scene data by filtering out the background from the depth image is proposed; and depth image noise and the methods for removing it are studied. The reconstructed geometry is expressed in metric units, and each 3D point also retains its color data. The resulting application is implemented in C++ using the OpenCV library. It performs 3D geometry reconstruction according to the theoretical model, removes the background and noise from the depth image, and can save the resulting 3D geometry to a PLY file readable by 3D modeling applications. Structure: introduction, 3 chapters, conclusions, references. The thesis consists of 61 pages of text excluding appendices, 43 figures, 4 tables, and 22 bibliographical entries.
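The pipeline outlined in the abstract (depth image to metric 3D points with color, background removal, PLY export) can be sketched as below. This is not the thesis's C++/OpenCV implementation: it assumes a standard pinhole back-projection, uses a simple depth threshold as a stand-in for the background filter, and all intrinsic parameters (`fx`, `fy`, `cx`, `cy`) and sample values are hypothetical.

```python
def back_project(u, v, depth_m, fx, fy, cx, cy):
    """Pixel (u, v) with metric depth -> 3D point in the camera frame (pinhole model)."""
    x = (u - cx) * depth_m / fx
    y = (v - cy) * depth_m / fy
    return (x, y, depth_m)

def depth_to_points(depth, color, fx, fy, cx, cy, max_depth_m=None):
    """Convert a depth image (rows of metres) and a color image to colored 3D points."""
    points = []
    for v, row in enumerate(depth):
        for u, z in enumerate(row):
            if z <= 0:
                continue  # no depth reading at this pixel
            if max_depth_m is not None and z > max_depth_m:
                continue  # crude background removal by depth threshold
            points.append(back_project(u, v, z, fx, fy, cx, cy) + tuple(color[v][u]))
    return points

def to_ply(points):
    """Serialize (x, y, z, r, g, b) tuples as an ASCII PLY document."""
    header = [
        "ply",
        "format ascii 1.0",
        f"element vertex {len(points)}",
        "property float x",
        "property float y",
        "property float z",
        "property uchar red",
        "property uchar green",
        "property uchar blue",
        "end_header",
    ]
    body = [f"{x:.4f} {y:.4f} {z:.4f} {r} {g} {b}" for x, y, z, r, g, b in points]
    return "\n".join(header + body) + "\n"

# Tiny 2x2 example with made-up intrinsics; real values come from calibration.
depth = [[1.0, 0.0], [2.0, 5.0]]
color = [[(255, 0, 0), (0, 0, 0)], [(0, 255, 0), (0, 0, 255)]]
points = depth_to_points(depth, color, fx=2.0, fy=2.0, cx=0.0, cy=0.0, max_depth_m=3.0)
ply_text = to_ply(points)
```

Writing `ply_text` to a file with a `.ply` extension gives a point cloud that common 3D modeling tools can open.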
|
77 |
Single View Modeling and View Synthesis
Liao, Miao 01 January 2011
This thesis develops new algorithms to produce 3D content from a single camera. Today, amateurs can use hand-held camcorders to capture and display the 3D world in 2D, using mature technologies. However, there is always a strong desire to record and re-explore the 3D world in 3D. To achieve this goal, current approaches usually make use of a camera array, which suffers from tedious setup and calibration processes, as well as lack of portability, limiting its application to lab experiments.
In this thesis, I try to produce 3D content using a single camera, making it as simple as shooting pictures. This requires a new front-end capturing device rather than a regular camcorder, as well as more sophisticated algorithms. First, in order to capture highly detailed object surfaces, I designed and developed a depth camera based on a novel technique called light fall-off stereo (LFS). The LFS depth camera outputs color+depth image sequences at 30 fps, which is necessary for capturing dynamic scenes. Based on the output color+depth images, I developed a new approach that builds 3D models of dynamic and deformable objects. While the camera can only capture part of an object at any instant, the partial surfaces are assembled into a complete 3D model by a novel warping algorithm.
Inspired by the success of single-view 3D modeling, I extended my exploration to 2D-to-3D video conversion that does not use a depth camera. I developed a semi-automatic system that converts monocular videos into stereoscopic videos via view synthesis. It combines motion analysis with user interaction, aiming to transfer as much of the depth-inferring work as possible from the user to the computer. I developed two new methods that analyze the optical flow in order to provide additional qualitative depth constraints. The automatically extracted depth information is presented in the user interface to assist with the user's labeling work.
In this thesis, I developed new algorithms to produce 3D content from a single camera. If depth maps are provided, my algorithms can build high-fidelity 3D models of dynamic and deformable objects. Otherwise, they can turn video clips into stereoscopic video.
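The abstract does not spell out how light fall-off stereo recovers depth. The sketch below is based on an assumption: that the technique exploits the inverse-square fall-off of a point light, comparing the brightness of the same pixel when the light is at two distances that differ by a known offset. The function name and the numbers are hypothetical.

```python
import math

def lfs_depth(i_near, i_far, delta):
    """Distance from the nearer light position to the surface point.

    Assumes I is proportional to 1 / r**2, so
    i_near / i_far = ((r + delta) / r) ** 2, where delta is the known
    distance the light was moved away between the two exposures.
    """
    if i_far <= 0 or i_near <= i_far:
        raise ValueError("expected i_near > i_far > 0")
    return delta / (math.sqrt(i_near / i_far) - 1.0)

# A surface point 2 units from the light, then 3 units after moving the light by 1.
r = lfs_depth(i_near=1.0 / 2.0 ** 2, i_far=1.0 / 3.0 ** 2, delta=1.0)
```

Because only the intensity ratio is used, the unknown surface albedo cancels under this model, which is what makes a per-pixel estimate possible.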
|
78 |
MONOCULAR POSE ESTIMATION AND SHAPE RECONSTRUCTION OF QUASI-ARTICULATED OBJECTS WITH CONSUMER DEPTH CAMERA
Ye, Mao 01 January 2014
Quasi-articulated objects, such as human beings, are among the most commonly seen objects in our daily lives. Extensive research has been dedicated to 3D shape reconstruction and motion analysis for this type of object for decades. A major motivation is their wide range of applications, such as entertainment, surveillance and health care. Most existing studies rely on one or more regular video cameras. In recent years, commodity depth sensors have become increasingly widely available. The geometric measurements delivered by depth sensors provide valuable information for these tasks. In this dissertation, we propose three algorithms for monocular pose estimation and shape reconstruction of quasi-articulated objects using a single commodity depth sensor. These three algorithms achieve shape reconstruction with increasing levels of granularity and personalization. We then further develop a method for highly detailed shape reconstruction based on our pose estimation techniques.
Our first algorithm takes advantage of a motion database acquired with an active marker-based motion capture system. This method combines pose detection through nearest neighbor search with pose refinement via non-rigid point cloud registration. It is capable of accommodating different body sizes and achieves more than twice the accuracy of a previous state-of-the-art method on a publicly available dataset.
The above algorithm performs frame-by-frame estimation and is therefore less prone to tracking failure. Nonetheless, it does not guarantee temporal consistency of either the skeletal structure or the shape, which could be problematic for some applications. To address this problem, we develop a real-time model-based approach for quasi-articulated pose and 3D shape estimation based on the Iterative Closest Point (ICP) principle, with several novel constraints that are critical for the monocular scenario. In this algorithm, we further propose a novel method for automatic body size estimation that enables it to accommodate different subjects.
Because of its local search nature, the ICP-based method can become trapped in local minima for some complex and fast motions. To address this issue, we explore the potential of using a statistical model for soft point correspondence association. To this end, we propose a unified framework based on a Gaussian Mixture Model for joint pose and shape estimation of quasi-articulated objects. This method achieves state-of-the-art performance on various publicly available datasets.
Based on our pose estimation techniques, we then develop a novel framework that achieves highly detailed shape reconstruction by only requiring the user to move naturally in front of a single depth sensor. Our experiments demonstrate reconstructed shapes with rich geometric details for various subjects wearing different apparel.
Last but not least, we explore the applicability of our method to two real-world applications. First, we combine our ICP-based method with cloth simulation techniques for virtual try-on. Our system delivers the first promising 3D-based virtual clothing system. Second, we explore the possibility of extending our pose estimation algorithms to help physical therapists identify patients’ movement dysfunctions that are related to injuries. Our preliminary experiments demonstrate promising results in comparison with a gold-standard active marker-based commercial system. Throughout the dissertation, we develop various state-of-the-art algorithms for pose estimation and shape reconstruction of quasi-articulated objects by leveraging the geometric information from depth sensors. We also demonstrate their great potential for different real-world applications.
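To make the Iterative Closest Point principle referred to in this abstract concrete, here is a minimal sketch of generic rigid point-to-point ICP in 2D. It is a textbook simplification chosen for illustration, not the dissertation's articulated, model-based method, and the point sets are hypothetical. Each iteration matches every source point to its nearest target point and then solves for the least-squares rotation and translation.

```python
import math

def nearest(p, targets):
    """Closest target point to p (brute-force hard correspondence)."""
    return min(targets, key=lambda q: (q[0] - p[0]) ** 2 + (q[1] - p[1]) ** 2)

def best_rigid_2d(src, dst):
    """Least-squares rotation angle and translation mapping paired src onto dst."""
    n = len(src)
    sx, sy = sum(p[0] for p in src) / n, sum(p[1] for p in src) / n
    dx, dy = sum(q[0] for q in dst) / n, sum(q[1] for q in dst) / n
    s_cos = s_sin = 0.0
    for (px, py), (qx, qy) in zip(src, dst):
        ax, ay, bx, by = px - sx, py - sy, qx - dx, qy - dy
        s_cos += ax * bx + ay * by
        s_sin += ax * by - ay * bx
    theta = math.atan2(s_sin, s_cos)
    c, s = math.cos(theta), math.sin(theta)
    return theta, dx - (c * sx - s * sy), dy - (s * sx + c * sy)

def transform(points, theta, tx, ty):
    """Rotate by theta about the origin, then translate by (tx, ty)."""
    c, s = math.cos(theta), math.sin(theta)
    return [(c * x - s * y + tx, s * x + c * y + ty) for x, y in points]

def icp(src, dst, iterations=20):
    """Alternate nearest-neighbour matching and rigid alignment."""
    current = list(src)
    for _ in range(iterations):
        matches = [nearest(p, dst) for p in current]
        theta, tx, ty = best_rigid_2d(current, matches)
        current = transform(current, theta, tx, ty)
    return current

# An L-shaped target and a slightly rotated, shifted copy of it.
target = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (3.0, 0.0), (0.0, 1.0), (0.0, 2.0)]
source = transform(target, 0.1, 0.1, -0.05)
aligned = icp(source, target)
```

The nearest-neighbour step is the local search mentioned in the abstract: with a large initial misalignment the matches are wrong and the iteration can settle in a local minimum, which is the motivation given for soft correspondences.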
|
79 |
Motion Correction Structured Light using Pattern Interleaving Technique
Cavaturu, Raja Kalyan Ram 01 January 2008
Phase Measuring Profilometry (PMP) is the most robust scanning technique for static 3D data acquisition. To make the technique robust to target objects that move during the scan interval, a novel algorithm called ‘Pattern Interleaving’ is used to obtain a high-density single-scan image. This makes PMP insensitive to ‘z’ motion and prevents the motion banding that is predominant in 3D reconstruction when the object moves during the scan.
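For context, the sketch below shows the standard N-step phase-shifting computation that PMP builds on, and why motion during the scan is a problem. It does not implement Pattern Interleaving. It assumes sinusoidal patterns shifted by 2πn/N; the offset `a`, modulation `b` and the phase values are hypothetical.

```python
import math

def pattern_intensity(phase, n, n_patterns, a=0.5, b=0.4):
    """Intensity seen at a pixel under the n-th phase-shifted sinusoidal pattern."""
    return a + b * math.cos(phase - 2 * math.pi * n / n_patterns)

def recover_phase(intensities):
    """Wrapped phase from N >= 3 equally shifted patterns."""
    count = len(intensities)
    s = sum(i * math.sin(2 * math.pi * n / count) for n, i in enumerate(intensities))
    c = sum(i * math.cos(2 * math.pi * n / count) for n, i in enumerate(intensities))
    return math.atan2(s, c)

N = 4
true_phase = 1.0

# Static object: the phase is the same under every pattern.
static = [pattern_intensity(true_phase, n, N) for n in range(N)]
static_estimate = recover_phase(static)

# Object moving in z: the phase drifts a little between successive patterns,
# so the estimate no longer matches the phase at the start of the scan.
moving = [pattern_intensity(true_phase + 0.05 * n, n, N) for n in range(N)]
moving_estimate = recover_phase(moving)
```

The recovered phase encodes depth, so a phase error that varies across the surface shows up as ripples in the reconstruction, which is the banding the abstract refers to.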
|
80 |
Robust Self-Calibration and Fundamental Matrix Estimation in 3D Computer Vision
Rastgar, Houman 30 September 2013
The recent advances in the field of computer vision have brought many laboratory algorithms into the realm of industry. However, one problem that remains open in the field of 3D vision is the problem of noise. The challenging problem of 3D structure recovery from images is highly sensitive to input data contaminated by errors that do not conform to ideal assumptions. Tackling the problem of extreme data, or outliers, has led to many robust methods that can handle moderate levels of outliers and still provide accurate outputs. However, the problem remains open, especially for higher noise levels, and so the goal of this thesis is to address robustness with respect to two central problems in 3D computer vision. The two problems are closely related and are presented together within a Structure from Motion (SfM) context. The first is the problem of robustly estimating the fundamental matrix from images whose correspondences contain high outlier levels. Although this area has been studied extensively, two algorithms are proposed that significantly speed up the computation of the fundamental matrix and achieve accurate results in scenarios containing more than 50% outliers. The presented algorithms rely on ideas from robust statistics to develop guided sampling techniques that use information inferred from residual analysis. The second problem addressed in this thesis is the robust estimation of camera intrinsic parameters from fundamental matrices, or self-calibration. Self-calibration algorithms are notoriously unreliable in general cases, and it is shown that the existing methods are highly sensitive to noise. In spite of this, robustness in self-calibration has received little attention in the literature. Experimental results show that it is essential for a real-world self-calibration algorithm to be robust. 
To introduce robustness to the existing methods, three robust algorithms are proposed that use existing constraints for self-calibration from the fundamental matrix. The resulting algorithms are less affected by noise than existing algorithms based on these constraints. This is an important milestone, since self-calibration offers many possibilities by providing estimates of camera parameters without requiring access to the image acquisition device. The proposed algorithms rely on perturbation theory, guided sampling methods and a robust root-finding method for systems of higher-order polynomials. By adding robustness to self-calibration, it is hoped that the technique comes one step closer to being a practical method of camera calibration rather than merely a theoretical possibility.
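To see why high outlier levels make fundamental matrix estimation expensive, and why guided sampling is attractive, the sketch below computes the number of random minimal samples that a plain, uniformly sampling RANSAC scheme needs. This is the standard textbook bound, not one of the thesis's algorithms; the function name and the chosen confidence level are illustrative.

```python
import math

def ransac_trials(inlier_ratio, sample_size, confidence=0.99):
    """Trials needed so that, with the given confidence, at least one
    uniformly drawn minimal sample contains only inliers."""
    if not 0.0 < inlier_ratio <= 1.0:
        raise ValueError("inlier_ratio must be in (0, 1]")
    p_all_inliers = inlier_ratio ** sample_size
    if p_all_inliers >= 1.0:
        return 1
    return math.ceil(math.log(1.0 - confidence) / math.log(1.0 - p_all_inliers))

# 50% outliers: eight-point samples versus seven-point samples.
trials_8pt = ransac_trials(0.5, 8)
trials_7pt = ransac_trials(0.5, 7)
# 70% outliers with eight-point samples.
trials_8pt_hard = ransac_trials(0.3, 8)
```

The count grows steeply as the inlier ratio falls, so any sampling strategy that draws likely inliers more often than uniform sampling does can cut the number of trials substantially.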
|