1 |
Tomographic inversion of traveltime data in reflection seismology
Williamson, P. R. January 1986 (has links)
No description available.
|
2 |
Learning Unsupervised Depth Estimation, from Stereo to Monocular Images
Pilzer, Andrea 22 June 2020 (has links)
In order to interact with the real world, humans need to perform several tasks such as object detection, pose estimation, motion estimation and distance estimation. These tasks are all part of scene understanding and are fundamental tasks of computer vision. Depth estimation has received unprecedented attention from the research community in recent years due to the growing interest in its practical applications (e.g., robotics, autonomous driving) and the performance improvements achieved with deep learning. In fact, the applications have expanded from more traditional areas such as robotics to new fields such as autonomous driving, augmented reality devices and smartphone applications. This is due to several factors. First, with the increased availability of training data, bigger and bigger datasets were collected. Second, deep learning frameworks running on graphics cards dramatically increased data processing capabilities, allowing higher-precision deep convolutional networks (ConvNets) to be trained. Third, researchers applied unsupervised optimization objectives to ConvNets, overcoming the hurdle of collecting expensive ground truth and fully exploiting the abundance of images available in datasets.
This thesis presents several proposals for unsupervised depth estimation and analyses their benefits: (i) learning from resynthesized data, (ii) adversarial learning, (iii) coupling generator and discriminator losses for collaborative training, and (iv) self-improvement of the learned model. For the first two points, we developed a binocular stereo unsupervised depth estimation model that uses reconstructed data as an additional self-constraint during training. In addition, adversarial learning improves the quality of the reconstructions, further increasing the performance of the model. The third point is inspired by scene understanding as a structured task: a generator and a discriminator joining their efforts in a structured way improve the quality of the estimations. This may sound counterintuitive when cast in the general framework of adversarial learning, but our experiments demonstrate the effectiveness of the proposed approach. Finally, self-improvement is inspired by estimation refinement, a widespread practice in dense reconstruction tasks such as depth estimation. We devise a monocular unsupervised depth estimation approach that measures the reconstruction errors in an unsupervised way and uses them to refine the depth predictions. Furthermore, we apply knowledge distillation to improve the student ConvNet with the knowledge of the teacher ConvNet that has access to the errors.
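The reconstruction-based self-constraint mentioned in this abstract is the standard photometric loss used in unsupervised stereo depth learning. The sketch below is a minimal, hypothetical illustration of that idea rather than the thesis's code: the right image is warped into the left view with a predicted left disparity via differentiable bilinear sampling, and the photometric difference to the real left image serves as the training signal.

```python
import torch
import torch.nn.functional as F

def warp_right_to_left(right, disp_left):
    """Warp the right image into the left view using the predicted left
    disparity (in pixels), via differentiable bilinear sampling."""
    b, _, h, w = right.shape
    ys, xs = torch.meshgrid(torch.linspace(-1, 1, h), torch.linspace(-1, 1, w), indexing="ij")
    grid = torch.stack((xs, ys), dim=-1).unsqueeze(0).expand(b, -1, -1, -1).to(right.device).clone()
    # A left pixel at x corresponds to the right pixel at x - d, so shift the
    # sampling x-coordinates by the disparity (converted to normalized units).
    grid[..., 0] = grid[..., 0] - 2.0 * disp_left.squeeze(1) / (w - 1)
    return F.grid_sample(right, grid, align_corners=True)

def photometric_loss(left, right, disp_left):
    """L1 reconstruction loss between the left image and the warped right image."""
    return (left - warp_right_to_left(right, disp_left)).abs().mean()

# Toy usage with random tensors standing in for a rectified stereo pair.
left, right = torch.rand(2, 3, 64, 128), torch.rand(2, 3, 64, 128)
disp = torch.rand(2, 1, 64, 128) * 10  # hypothetical disparity prediction, in pixels
print(photometric_loss(left, right, disp).item())
```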
|
3 |
Obstacle detection using stereo vision for unmanned ground vehicles
Olsson, Martin January 2009 (has links)
No description available.
|
4 |
Temporally consistent semantic segmentation in videos
Raza, Syed H. 08 June 2015 (has links)
The objective of this thesis research is to develop algorithms for temporally consistent semantic segmentation in videos. Though many different forms of semantic segmentation exist, this research focuses on the problem of temporally consistent holistic scene understanding in outdoor videos. Holistic scene understanding requires an understanding of many individual aspects of the scene, including 3D layout, objects present, occlusion boundaries, and depth. Such a description of a dynamic scene would be useful for many robotic applications, including object reasoning, 3D perception, video analysis, video coding, segmentation, navigation and activity recognition.
Scene understanding has been studied with great success for still images. However, scene understanding in videos requires additional approaches to account for temporal variation, exploit dynamic information, and leverage causality. As a first step, image-based scene understanding methods can be applied directly to individual video frames to generate a description of the scene. However, these methods do not exploit temporal information across neighboring frames and, lacking temporal consistency, can produce labels that change abruptly between frames, which hurts performance.
The objective of this study is to develop temporally consistent scene-description algorithms by processing videos efficiently, exploiting causality and data redundancy, and catering for scene dynamics. Specifically, we achieve our research objectives by (1) extracting geometric context from videos to give the broad 3D structure of the scene with all objects present, (2) detecting occlusion boundaries in videos caused by depth discontinuities, and (3) estimating depth in videos by combining monocular and motion features with semantic features and occlusion boundaries.
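One simple way to reduce the label flicker described above is to propagate the previous frame's predictions along dense optical flow and blend them with the current frame's predictions. The sketch below illustrates that generic idea only; it is not the method developed in this thesis, and the function name, flow estimator and blending weight are assumptions.

```python
import cv2
import numpy as np

def temporally_smoothed_probs(prev_probs, curr_probs, prev_gray, curr_gray, alpha=0.6):
    """Blend the current frame's per-pixel class probabilities with the previous
    frame's probabilities warped along dense optical flow, reducing label flicker.

    prev_probs, curr_probs: (H, W, C) float32 per-class probability maps
    prev_gray, curr_gray:   (H, W)    uint8 grayscale frames
    alpha:                  weight given to the current frame's prediction
    """
    h, w = curr_gray.shape
    # Flow from the current frame to the previous frame, so that for every
    # current pixel we know where to sample the previous frame's prediction.
    flow = cv2.calcOpticalFlowFarneback(curr_gray, prev_gray, None,
                                        0.5, 3, 15, 3, 5, 1.2, 0)
    xs, ys = np.meshgrid(np.arange(w), np.arange(h))
    map_x = (xs + flow[..., 0]).astype(np.float32)
    map_y = (ys + flow[..., 1]).astype(np.float32)
    warped_prev = np.stack(
        [cv2.remap(np.ascontiguousarray(prev_probs[..., c]), map_x, map_y, cv2.INTER_LINEAR)
         for c in range(prev_probs.shape[-1])], axis=-1)
    return alpha * curr_probs + (1.0 - alpha) * warped_prev
```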
|
5 |
Scene depth estimation from a camera mounted on a moving car
Καπρινιώτης, Αχιλλέας 10 June 2014 (has links)
This master's thesis analyzes the depth estimation of a rigid scene from a camera mounted on a moving car. The first chapter gives an introduction to the field of Computer Vision and provides some examples of its applications. The second chapter describes basic principles of projective geometry, which serve as the mathematical background for the following chapters. The third chapter covers the theoretical model of a camera, its parameters and the distortions that appear in this model. The fourth chapter deals with the camera calibration procedure, along with its implementation. Chapter five presents general categories of stereoscopic algorithms, along with suitable similarity measures. Chapter six covers the Harris corner detector and applies it both to corner detection and to matching the two images. Chapter seven analyzes the theory of the SIFT algorithm and gives an example of feature detection and matching. Chapter eight highlights the basic principles of epipolar geometry and stresses the importance of image rectification. Chapter nine presents the overall procedure that was followed, along with the description and implementation of the depth estimation methods that were used.
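The two-view workflow outlined in this abstract (feature detection and matching, epipolar geometry, rectification, disparity) can be sketched with OpenCV. The snippet below is only an illustrative assumption of that standard pipeline, not the thesis's implementation; file names and parameters are placeholders, and it uses uncalibrated rectification whereas the thesis performs an explicit camera calibration.

```python
import cv2
import numpy as np

# Two views taken from the moving camera (paths are placeholders).
img1 = cv2.imread("frame_t0.png", cv2.IMREAD_GRAYSCALE)
img2 = cv2.imread("frame_t1.png", cv2.IMREAD_GRAYSCALE)

# 1. Detect and describe SIFT features in both frames.
sift = cv2.SIFT_create()
kp1, des1 = sift.detectAndCompute(img1, None)
kp2, des2 = sift.detectAndCompute(img2, None)

# 2. Match descriptors and keep good matches via Lowe's ratio test.
matches = cv2.BFMatcher().knnMatch(des1, des2, k=2)
good = [m for m, n in matches if m.distance < 0.75 * n.distance]
pts1 = np.float32([kp1[m.queryIdx].pt for m in good])
pts2 = np.float32([kp2[m.trainIdx].pt for m in good])

# 3. Estimate the epipolar geometry (fundamental matrix) with RANSAC.
F, mask = cv2.findFundamentalMat(pts1, pts2, cv2.FM_RANSAC)
inl1, inl2 = pts1[mask.ravel() == 1], pts2[mask.ravel() == 1]

# 4. Rectify the two views so that epipolar lines become horizontal.
h, w = img1.shape
_, H1, H2 = cv2.stereoRectifyUncalibrated(inl1, inl2, F, (w, h))
rect1 = cv2.warpPerspective(img1, H1, (w, h))
rect2 = cv2.warpPerspective(img2, H2, (w, h))

# 5. Compute a dense disparity map; depth is inversely proportional to disparity.
stereo = cv2.StereoSGBM_create(minDisparity=0, numDisparities=64, blockSize=7)
disparity = stereo.compute(rect1, rect2).astype(np.float32) / 16.0
```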
|
6 |
Domain-Independent Moving Object Depth Estimation using Monocular Camera
Nassir, Cesar January 2018 (has links)
Today, automotive companies across the world strive to create vehicles with fully autonomous capabilities. There are many benefits to developing autonomous vehicles, such as reduced traffic congestion, increased safety and reduced pollution. To achieve that goal there are many challenges ahead, one of them being visual perception. Being able to estimate depth from a 2D image has been shown to be a key component for 3D recognition, reconstruction and segmentation. Estimating depth in an image from a monocular camera is an ill-posed problem, since the mapping from colour intensity to depth value is ambiguous. Depth estimation from stereo images has come far compared to monocular depth estimation and was initially what depth estimation relied on. However, being able to exploit monocular cues is necessary in scenarios where stereo depth estimation is not possible. We present a novel CNN, BiNet, inspired by ENet, to tackle depth estimation of moving objects using only a monocular camera in real time. It performs better than ENet on the Cityscapes dataset while adding only a small overhead in complexity.
|
7 |
Self-supervised monocular image depth learning and confidence estimation
Chen, L., Tang, W., Wan, Tao Ruan, John, N.W. 17 June 2020 (has links)
We present a novel self-supervised framework for monocular image depth learning and confidence estimation. Our framework reduces the amount of ground truth annotation data required for training Convolutional Neural Networks (CNNs), which is often a challenging problem for the fast deployment of CNNs in many computer vision tasks. Our DepthNet adopts a novel, fully differentiable patch-based cost function based on the Zero-Mean Normalized Cross-Correlation (ZNCC), taking multi-scale patches as the matching and learning strategy. This approach greatly increases the accuracy and robustness of the depth learning. The proposed patch-based cost function naturally provides a 0-to-1 confidence, which is then used to self-supervise the training of a parallel network for confidence map learning and estimation, exploiting the fact that ZNCC is a normalized measure of similarity that can be approximated as the confidence of the depth estimation. The corresponding confidence map learning and estimation therefore operate in a self-supervised manner, in a network parallel to the DepthNet. Evaluation on the KITTI depth prediction evaluation dataset and the Make3D dataset shows that our method outperforms state-of-the-art results.
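As a concrete illustration of the ZNCC similarity that underpins the cost function and confidence described above, here is a small NumPy sketch (an illustrative assumption, not the authors' code). ZNCC is invariant to gain and offset changes between patches, and its [-1, 1] value can be rescaled to a 0-to-1 confidence.

```python
import numpy as np

def zncc(patch_a, patch_b, eps=1e-8):
    """Zero-mean normalized cross-correlation between two equally sized patches.
    Returns a value in [-1, 1]; values close to 1 indicate high similarity."""
    a = patch_a.astype(np.float64) - patch_a.mean()
    b = patch_b.astype(np.float64) - patch_b.mean()
    denom = np.sqrt((a ** 2).sum() * (b ** 2).sum()) + eps
    return float((a * b).sum() / denom)

# Toy example: a patch compared against a gain/offset-shifted copy of itself.
patch = np.random.rand(11, 11)
shifted = patch * 1.5 + 0.2              # photometric change, same structure
print(zncc(patch, shifted))              # close to 1: ZNCC ignores gain and offset
print((zncc(patch, shifted) + 1) / 2)    # simple rescaling to a 0-to-1 confidence (assumption)
```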
|
8 |
Horizontal to vertical spectral ratio of seismic ambient noise: Estimating the depth of a mine tailing.
Hellerud, Niels January 2024 (has links)
As the world moves towards greener technology and energy resources, the need for rare earth elements (REEs) has increased rapidly. A potential secondary source of REEs is mine tailings, and one technique for estimating the thickness of a tailing is the horizontal-to-vertical spectral ratio (HVSR) method. In this project, the depth of a mine tailing along a profile in Blötberget was estimated using this method. The HVSR method is a non-invasive, environmentally friendly seismic method that utilizes the Earth's ambient noise. It uses three-component seismic sensors, which measure ground motion in three directions. The acquired data were processed in the Geopsy software, where parameters such as filtering and window selection were set to obtain the most satisfactory results. The Geopsy software provides the user with HVSRs for the selected windows. This ratio forms a curve in the frequency domain from which a fundamental resonant frequency can be derived. In the case of a strong velocity contrast, the fundamental frequency is identified as the sharp, lowest-frequency peak in the data, and it must fulfil certain criteria to be considered reliable. When the fundamental resonant frequencies could be determined reliably, they were converted into tailing thickness with a simple formula in Excel, using the shear-wave velocity of the overlying layer and the fundamental frequency. The elevation at each sensor location and the estimated thickness above the contrasting interface were used to produce a 2-D depth profile of the mine tailing. This profile was compared to radiomagnetotelluric measurements. Although the measurement locations did not coincide, reasonable results were obtained.
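The conversion from fundamental resonant frequency to layer thickness mentioned above is commonly done with the quarter-wavelength relation h = Vs / (4 f0). The short sketch below applies that relation to hypothetical values; the actual shear-wave velocities and frequencies used in the thesis are not reproduced here.

```python
def tailing_thickness(vs_m_per_s: float, f0_hz: float) -> float:
    """Quarter-wavelength estimate of layer thickness from the HVSR
    fundamental resonant frequency: h = Vs / (4 * f0)."""
    return vs_m_per_s / (4.0 * f0_hz)

# Hypothetical example: shear-wave velocity of 200 m/s and a 2.5 Hz HVSR peak.
print(tailing_thickness(200.0, 2.5))  # -> 20.0 metres of material above the contrast
```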
|
9 |
Using Texture Features To Perform Depth Estimation
Kotha, Bhavi Bharat 22 January 2018 (has links)
There is a great need in real-world applications for estimating depth through electronic means without human intervention. There are many methods in the field that help in autonomously obtaining depth measurements, some of which use LiDAR, radar, etc. One of the most researched topics in the field of depth measurement is computer vision, which applies techniques to 2D images to achieve the desired result. Out of the many 3D vision techniques used, stereovision is a field where a lot of research is being done to solve this kind of problem. Human vision plays an important part in the inspiration behind the research performed in this field.
Stereovision gives a very high spatial resolution of depth estimates, which is used for obstacle avoidance, path planning, object recognition, etc. Stereovision makes use of two images in an image pair; these images are taken with two cameras from different views, and the two images are processed together to obtain depth information.
Processing stereo images has been one of the most intensively pursued research topics in computer vision. Many factors affect the performance of this approach, such as computational efficiency, depth discontinuities, lighting changes, correspondence and correlation, and electronic noise.
An algorithm is proposed which uses texture features obtained using Laws' energy masks and a multi-block approach to perform correspondence matching between the images of a wide-baseline stereo pair. This is followed by forming disparity maps to obtain the relative depth of pixels in the image. A comparison is also made between this approach and current state-of-the-art algorithms. A robust method to score and rank stereo algorithms is also proposed, providing a simple way for researchers to rank algorithms according to their application needs. / Master of Science / There is a great need in real-world applications for estimating depth through electronic means without human intervention. There are many methods in the field that help in autonomously obtaining depth measurements, some of which use LiDAR, radar, etc. One of the most researched topics in the field of depth measurement is computer vision, which applies techniques to 2D images to achieve the desired result. Out of the many 3D vision techniques used, stereovision is a field where a lot of research is being done to solve this kind of problem. Human vision plays an important part in the inspiration behind the research performed in this field. A large variety of algorithms are being developed to measure the depth of, ideally, each and every point in the pictured scene, giving a very high spatial resolution compared to other methods.
Real-world needs for depth estimation and the benefits provided by stereo vision are the main driving force behind this approach. Stereovision gives a very high spatial resolution, which is used for obstacle avoidance, path planning, object recognition, etc. It makes use of image pairs taken from two cameras with different perspectives (a translational change in view) to estimate depth; the two images are processed together to obtain depth information. The software tool developed here is a new approach to performing correspondence matching to find depth using stereo vision concepts.
The software tool developed in this work is written in MATLAB. The tool's efficiency was evaluated using standard techniques, which are described in detail. The evaluation was also performed by running the software tool on images collected with a pair of stereo cameras, with a tape measure used to record the depth of an object by hand. A scoring method is also proposed to rank algorithms in the field of stereo vision.
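To make the texture-feature idea in this abstract concrete, the following sketch (an illustrative Python/NumPy assumption, not the thesis's MATLAB tool) builds the classic 5x5 Laws energy masks from 1-D kernels and computes a texture-energy descriptor for a grayscale patch; such descriptors can then be compared across a stereo pair during correspondence matching.

```python
import numpy as np
from scipy.ndimage import convolve

# 1-D Laws kernels: Level, Edge, Spot, Ripple.
KERNELS = {
    "L5": np.array([1, 4, 6, 4, 1], dtype=float),
    "E5": np.array([-1, -2, 0, 2, 1], dtype=float),
    "S5": np.array([-1, 0, 2, 0, -1], dtype=float),
    "R5": np.array([1, -4, 6, -4, 1], dtype=float),
}

def laws_energy_features(patch):
    """Return a vector of Laws texture-energy responses for a grayscale patch.
    Each 5x5 mask is the outer product of two 1-D kernels (e.g. L5E5); the
    feature is the mean absolute filter response over the patch."""
    feats = []
    for ka in KERNELS.values():
        for kb in KERNELS.values():
            mask = np.outer(ka, kb)                     # 5x5 Laws mask
            response = convolve(patch, mask, mode="reflect")
            feats.append(np.abs(response).mean())
    return np.array(feats)

# Toy usage: compare texture descriptors of two hypothetical patches.
left_patch = np.random.rand(32, 32)
right_patch = np.random.rand(32, 32)
d = np.linalg.norm(laws_energy_features(left_patch) - laws_energy_features(right_patch))
print(d)  # smaller distances indicate more similar texture
```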
|
10 |
Single image scene-depth estimation based on self-supervised deep learning : For perception in autonomous heavy duty vehicles
Piven, Yegor January 2021 (has links)
Depth information is a vital component for perceiving the 3D structure of a vehicle's surroundings in the autonomous scenario. The ubiquity and relatively low cost of camera equipment make image-based depth estimation very attractive compared to employing specialised sensors. Classical image-based depth estimation approaches typically rely on multi-view geometry, requiring alignment and calibration between multiple image sources, which is both cumbersome and error-prone. In contrast, single images lack both temporal information and multi-view correspondences. Furthermore, depth information is lost in the projection from the 3D world to a 2D image during image formation, making the single-image depth estimation problem ill-posed.
In recent years, deep learning-based approaches have been widely proposed for single image depth estimation. The problem is typically tackled in a supervised manner, requiring access to image data with pixel-wise depth information. Acquiring large amounts of such data that is both varied and accurate is a laborious and costly task. As an alternative, a number of self-supervised approaches show that it is possible to train models for single image depth estimation using synchronized stereo image pairs or sequences of monocular images instead of depth-labeled data. This thesis investigates the self-supervised approach that utilizes sequences of monocular images, by training and evaluating one of the state-of-the-art methods on both the popular public KITTI dataset and the data of the host company, Scania. A number of extensions are implemented for the chosen method, namely the addition of weak supervision with velocity data, the employment of geometry consistency constraints and the incorporation of a self-attention mechanism.
The resulting models showed good depth estimation performance for major components of the scene, such as nearby roads and buildings, but struggled at longer ranges and with dynamic objects and thin structures. Minor qualitative and quantitative improvements were observed with the introduction of the geometry consistency loss and mask, as well as the self-attention mechanism. Qualitative improvements included the models' enhanced ability to identify clearer object boundaries and better distinguish objects from their background. The geometry consistency loss also proved informative in low-texture regions of the image and resolved artifacting behaviour that was observed when training models on Scania's data. Supervision of the predicted translations using velocity data proved effective at enforcing the metric scale of the depth network's predictions. However, a risk of overfitting to such supervision was observed when training on Scania's data. To resolve this issue, a velocity-supervised fine-tuning procedure is proposed as an alternative to velocity-supervised training from scratch, eliminating the observed overfitting while still enabling the model to learn the metric scale. The proposed fine-tuning procedure was effective even when training models on the KITTI dataset, where no overfitting was observed, suggesting its general applicability.
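The velocity supervision discussed above amounts to constraining the norm of the predicted inter-frame camera translation to match the distance implied by the measured vehicle speed, which ties otherwise scale-ambiguous predictions to metric units. The sketch below is a hedged illustration of that idea; the function name and exact loss form are assumptions rather than the thesis's implementation.

```python
import torch

def velocity_supervision_loss(pred_translation, speed_m_per_s, dt_s):
    """Penalize the mismatch between the norm of the predicted camera
    translation between two frames and the distance travelled according
    to the measured vehicle speed, enforcing metric scale on the pose
    (and, through joint training, the depth) predictions.

    pred_translation: (B, 3) predicted translation vectors
    speed_m_per_s:    (B,)   measured vehicle speed
    dt_s:             (B,)   time between the two frames
    """
    pred_distance = pred_translation.norm(dim=1)
    target_distance = speed_m_per_s * dt_s
    return (pred_distance - target_distance).abs().mean()

# Hypothetical batch: predicted translations vs. 15 m/s over 0.1 s (1.5 m travelled).
t_pred = torch.tensor([[0.1, 0.0, 1.4], [0.0, 0.05, 1.6]])
speed = torch.tensor([15.0, 15.0])
dt = torch.tensor([0.1, 0.1])
print(velocity_supervision_loss(t_pred, speed, dt))
```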
|