11

Applied statistical modeling of three-dimensional natural scene data

Su, Che-Chun 27 June 2014 (has links)
Natural scene statistics (NSS) have played an increasingly important role both in our understanding of the function and evolution of the human vision system and in the development of modern image processing applications. Because depth/range, i.e., egocentric distance, is arguably the most important quantity a visual system must compute (from an evolutionary perspective), the joint statistics between natural image and depth/range information are of particular interest. However, while there exist regular and reliable statistical models of two-dimensional (2D) natural images, little work has been done on statistical modeling of natural luminance/chrominance and depth/disparity, and of their mutual relationships. One major reason is the dearth of high-quality three-dimensional (3D) image and depth/range databases. To facilitate research progress on 3D natural scene statistics, this dissertation first presents a high-quality database of color images and accurately co-registered depth/range maps, acquired using an advanced laser range scanner paired with a high-end digital single-lens reflex camera. Using this high-resolution, high-quality database, the dissertation performs reliable and robust statistical modeling of natural image and depth/disparity information, including new bivariate and spatially oriented correlation models. In particular, these new statistical models capture higher-order dependencies embedded in spatially adjacent bandpass responses projected from natural environments, which have not yet been well understood or explored in the literature. To demonstrate the efficacy and effectiveness of the advanced NSS models, the dissertation addresses two challenging yet very important problems: depth estimation from monocular images and no-reference stereoscopic/3D (S3D) image quality assessment. A Bayesian depth estimation framework is proposed that accounts for the canonical depth/range patterns in natural scenes and forms priors and likelihoods using both univariate and bivariate NSS features. The no-reference S3D image quality index proposed in this dissertation exploits new bivariate and correlation NSS features to quantify different types of stereoscopic distortions. Experimental results show that the proposed framework and index achieve superior performance to state-of-the-art algorithms in both disciplines.
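
As a rough illustration of the bandpass statistics involved, the sketch below (Python; a minimal sketch, not code from the dissertation) computes divisively normalized luminance coefficients, a standard preprocessing step in NSS models, and the empirical correlation between horizontally adjacent coefficients, which is the kind of spatial dependency the bivariate models capture. The window scale and stabilizing constant are assumptions.

    import numpy as np
    from scipy.ndimage import gaussian_filter

    def mscn(image, sigma=7/6, c=1.0):
        # Mean-subtracted, contrast-normalized (MSCN) coefficients:
        # a common divisive-normalization step in NSS models.
        image = image.astype(float)
        mu = gaussian_filter(image, sigma)
        var = gaussian_filter(image * image, sigma) - mu * mu
        return (image - mu) / (np.sqrt(np.clip(var, 0.0, None)) + c)

    def adjacent_correlation(coeffs, shift=1):
        # Empirical correlation of horizontally adjacent coefficients,
        # the kind of higher-order spatial dependency modeled above.
        a = coeffs[:, :-shift].ravel()
        b = coeffs[:, shift:].ravel()
        return np.corrcoef(a, b)[0, 1]
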
12

Computational Imaging For Miniature Cameras

Salahieh, Basel January 2015 (has links)
Miniature cameras play a key role in numerous imaging applications, ranging from endoscopy and metrology inspection devices to smartphones and head-mounted acquisition systems. However, due to physical constraints, the imaging conditions, and the low quality of small optics, their imaging capabilities are limited in terms of the delivered resolution, the acquired depth of field, and the captured dynamic range. Computational imaging jointly addresses the imaging system and the reconstruction algorithms to bypass the traditional limits of optical systems and deliver better restorations for various applications. The scene is encoded into a set of efficient measurements which can then be computationally decoded to output a richer estimate of the scene than the raw images captured by conventional imagers. In this dissertation, three task-based computational imaging techniques are developed to make low-quality miniature cameras capable of delivering realistic high-resolution reconstructions, providing full-focus imaging, and acquiring depth information for high dynamic range objects. For the superresolution task, a non-regularized direct superresolution algorithm is developed to achieve realistic restorations without being penalized by improper assumptions (e.g., optimizers, priors, and regularizers) made in the inverse problem. An adaptive frequency-based filtering scheme is introduced to upper-bound the reconstruction errors while still producing finer details than previous methods under realistic imaging conditions. For the full-focus imaging task, a computational depth-based deconvolution technique is proposed to bring a scene captured by an ordinary fixed-focus camera into full focus, based on a depth-variant point spread function prior. Ringing artifacts are suppressed on three levels: block tiling to eliminate boundary artifacts, adaptive reference maps to reduce ringing initiated by sharp edges, and block-wise deconvolution or depth-based masking to suppress artifacts initiated by neighboring depth-transition surfaces. Finally, for the depth acquisition task, a multi-polarization fringe projection imaging technique is introduced to eliminate saturated points and enhance fringe contrast by selecting the properly polarized channel measurements. The developed technique can easily be extended to include measurements captured under different exposure times to obtain more accurate shape rendering for very high dynamic range objects.
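
To make the depth-based deconvolution idea concrete, here is a toy Python sketch (an assumption-laden illustration, not the dissertation's algorithm): each depth layer is deconvolved with its own PSF and the results are composited by depth-based masking. The helper psf_for_depth and the SNR constant are hypothetical placeholders, and the ringing-suppression stages (block tiling, adaptive reference maps) are omitted.

    import numpy as np

    def wiener_deconvolve(blurred, psf, snr=0.01):
        # Frequency-domain Wiener deconvolution for one depth layer.
        # The PSF is assumed registered to the image origin
        # (apply np.fft.ifftshift first for a centered PSF).
        H = np.fft.fft2(psf, s=blurred.shape)
        G = np.fft.fft2(blurred)
        F = np.conj(H) / (np.abs(H) ** 2 + snr) * G
        return np.real(np.fft.ifft2(F))

    def depth_based_deconvolution(image, depth_map, psf_for_depth, depth_levels):
        # Deconvolve per depth layer, then composite via masking.
        # psf_for_depth(d) is a hypothetical callable returning the
        # depth-variant PSF prior for quantized depth level d.
        result = np.zeros_like(image, dtype=float)
        for d in depth_levels:
            mask = depth_map == d
            result[mask] = wiener_deconvolve(image, psf_for_depth(d))[mask]
        return result
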
13

Scene Depth Estimation Based on Odometry and Image Data

Zborovský, Peter January 2018 (has links)
In this work, we propose a depth estimation system based on an image sequence and odometry information. The key idea is that depth estimation is decoupled from pose estimation. Such an approach results in a multipurpose system applicable to different robot platforms and to different depth-estimation-related problems. Our implementation uses various filtering techniques, operates in real time, and provides appropriate results. Although the system was aimed at and tested on a drone platform, it can be used on any other type of autonomous vehicle that provides odometry information and video output.
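
The decoupling rests on a simple triangulation relation: once odometry supplies the translation between two frames, a tracked feature's pixel shift yields its depth. Below is a minimal Python sketch under the assumption of purely lateral motion and a rectified pinhole model; the thesis's filtering stages are omitted.

    def depth_from_odometry(focal_px, baseline_m, pixel_shift):
        # Depth of a tracked point for a laterally translating camera:
        # odometry supplies the baseline, feature tracking the shift.
        # Assumes pure sideways motion between the two frames.
        if pixel_shift <= 0:
            return float('inf')  # no parallax: point at infinity
        return focal_px * baseline_m / pixel_shift
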
14

Specialised global methods for binocular and trinocular stereo matching

Horna Carranza, Luis Alberto January 2017 (has links)
The problem of estimating depth from two or more images is a fundamental problem in computer vision, commonly referred to as stereo matching. The applications of stereo matching range from 3D reconstruction to autonomous robot navigation. Stereo matching is particularly attractive for real-life applications because of its simplicity and low cost, especially compared to costly laser range finders/scanners, as in the case of 3D reconstruction. However, stereo matching has its own unique problems, such as convergence issues in the optimisation methods and the difficulty of finding accurate matches due to changes in lighting conditions, occluded areas, noisy images, etc. It is precisely because of these challenges that stereo matching continues to be a very active field of research. In this thesis we develop a binocular stereo matching algorithm that works with rectified images (i.e. scan lines in the two images are aligned) to find a real-valued displacement (i.e. disparity) that best matches two pixels. To accomplish this, our research has developed techniques to efficiently explore a 3D space and compare potential matches, and an inference algorithm to assign the optimal disparity to each pixel in the image. The proposed approach is also extended to the trinocular case. In particular, the trinocular extension deals with a binocular set of images captured at the same time and a third image displaced in time. This approach is referred to as t+1 trinocular stereo matching, and it poses the challenge of recovering camera motion, which is addressed by a novel technique we call baseline recovery. We have extensively validated our binocular and trinocular algorithms using the well-known KITTI and Middlebury data sets. The performance of our algorithms is consistent across different data sets, and they rank among the top performers on the KITTI and Middlebury benchmarks.
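
For contrast with the global methods developed in the thesis, a minimal local baseline is sketched below (Python, illustrative only): SAD block matching on rectified scanlines, with a parabolic fit over neighboring costs to produce the real-valued disparity the text mentions. The window size and disparity range are assumptions.

    import numpy as np

    def block_match(left, right, max_disp=64, win=5):
        # SAD block matching on rectified grayscale images with
        # parabolic sub-pixel refinement; a generic local baseline,
        # not the thesis's global inference algorithm.
        left, right = left.astype(float), right.astype(float)
        h, w = left.shape
        r = win // 2
        disp = np.zeros((h, w))
        for y in range(r, h - r):
            for x in range(r + max_disp, w - r):
                patch = left[y - r:y + r + 1, x - r:x + r + 1]
                costs = np.array([
                    np.abs(patch - right[y - r:y + r + 1,
                                         x - d - r:x - d + r + 1]).sum()
                    for d in range(max_disp)])
                d0 = float(costs.argmin())
                i = int(d0)
                if 0 < i < max_disp - 1:
                    # parabola through the three costs around the minimum
                    ca, cb, cc = costs[i - 1], costs[i], costs[i + 1]
                    denom = ca - 2 * cb + cc
                    if denom != 0:
                        d0 = i + 0.5 * (ca - cc) / denom
                disp[y, x] = d0
        return disp
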
15

Correspondence-based pairwise depth estimation with parallel acceleration

Bartosch, Nadine January 2018 (has links)
This report covers the implementation and evaluation of a stereo vision correspondence-based depth estimation algorithm on a GPU. The results and feedback are used for a multi-view camera system in combination with Jetson TK1 devices for parallelized image processing; the aim of this system is to estimate the depth of the scenery in front of it. The performance of the algorithm plays the key role. Alongside the implementation, the objective of this study is to investigate the advantages of parallel acceleration, inter alia the differences from execution on a CPU, which are significant for all the functions; the overheads particular to a GPU application, such as memory transfer between the CPU and the GPU; and the challenges of real-time and concurrent execution. The study has been conducted with the aid of CUDA on three NVIDIA GPUs with different characteristics, and with knowledge gained through an extensive literature study of different depth estimation algorithms, stereo vision and correspondence, and CUDA in general. Using the full set of components of the algorithm and expecting (near) real-time execution is utopian in this setup and implementation; the slowing factors include, inter alia, the semi-global matching. Investigating alternatives shows that disparity maps of a certain accuracy can also be achieved by local methods, such as the Hamming distance alone combined with a filter that refines the results. Furthermore, it is demonstrated that the kernel launch configuration and the usage of GPU memory types such as shared memory are crucial for GPU implementations and have an impact on the performance of the algorithm. Concurrency alone proves to be a more complicated task, especially in the desired form of realization. For future work and refinement of the algorithm, it is therefore recommended to invest more time in further optimization possibilities with regard to shared memory and in integrating the algorithm into the actual pipeline.
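
The local alternative mentioned above, the Hamming distance over census-transformed images, is easy to sketch. The CPU reference below (Python/NumPy) mirrors what a CUDA kernel would compute per pixel; the window size and border handling are assumptions, and the report's GPU implementation details are not reproduced here.

    import numpy as np

    def census(img, win=5):
        # Census transform: each pixel becomes a bit string recording
        # comparisons of its neighbourhood against the centre pixel.
        # A 5x5 window yields 24 bits, which fits in uint64.
        h, w = img.shape
        r = win // 2
        out = np.zeros((h, w), dtype=np.uint64)
        for dy in range(-r, r + 1):
            for dx in range(-r, r + 1):
                if dy == 0 and dx == 0:
                    continue
                shifted = np.roll(np.roll(img, dy, 0), dx, 1)  # wraps borders
                out = (out << np.uint64(1)) | (shifted < img).astype(np.uint64)
        return out

    def hamming_cost(census_left, census_right, d):
        # Per-pixel Hamming distance for disparity hypothesis d,
        # i.e. the matching cost between the two census images.
        xored = census_left ^ np.roll(census_right, d, axis=1)
        bits = np.unpackbits(xored.view(np.uint8).reshape(*xored.shape, 8),
                             axis=-1)
        return bits.sum(axis=-1)
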
16

Obstacle detection using stereo vision for unmanned ground vehicles

Olsson, Martin January 2009 (has links)
In recent years, the market for automated surveillance and the use of unmanned ground vehicles (UGVs) has increased considerably. In order for unmanned vehicles to operate autonomously, high-level artificial intelligence algorithms need to be developed and accompanied by some way for the robots to perceive and interpret the environment. The purpose of this work is to investigate methods for real-time obstacle detection using stereo vision and to implement these on an existing UGV platform. To reach real-time processing speeds, the algorithms presented in this work are designed for parallel processing architectures and implemented using programmable graphics hardware. The reader is introduced to the basics of stereo vision and given an overview of the most common real-time stereo algorithms in the literature, along with possible applications. A novel wide-baseline real-time depth estimation algorithm is presented. The depth estimation is used together with a simple obstacle detection algorithm to produce an occupancy map of the environment, allowing for evasion of obstacles and path planning. In addition, a complete system design for autonomous navigation in multi-UGV systems is proposed.
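
As a rough illustration of how a depth map turns into an occupancy map, the toy Python sketch below projects each pixel with valid disparity into a top-down grid and marks cells with enough votes as obstacles. All parameters (grid resolution, range, vote threshold) are invented placeholders; the ground-plane handling and evasion logic of the actual system are omitted.

    import numpy as np

    def occupancy_map(disparity, focal_px, baseline_m,
                      max_range=10.0, cell_m=0.1, grid_size=200,
                      min_votes=50):
        # Project stereo depth into a top-down occupancy grid:
        # every sufficiently close pixel votes for its ground cell.
        grid = np.zeros((grid_size, grid_size), dtype=np.int32)
        h, w = disparity.shape
        for y in range(h):
            for x in range(w):
                d = disparity[y, x]
                if d <= 0:
                    continue
                z = focal_px * baseline_m / d        # forward distance (m)
                if z >= max_range:
                    continue
                lat = (x - w / 2.0) * z / focal_px   # lateral offset (m)
                gi = int(z / cell_m)
                gj = int(lat / cell_m) + grid_size // 2
                if 0 <= gi < grid_size and 0 <= gj < grid_size:
                    grid[gi, gj] += 1
        return grid >= min_votes                     # boolean obstacle map
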
17

Depth Estimation Using Adaptive Bins via Global Attention at High Resolution

Bhat, Shariq 21 April 2021 (has links)
We address the problem of estimating a high-quality dense depth map from a single RGB input image. We start from a baseline encoder-decoder convolutional neural network architecture and pose the question of how the global processing of information can help improve overall depth estimation. To this end, we propose a transformer-based architecture block that divides the depth range into bins whose center values are estimated adaptively per image. The final depth values are estimated as linear combinations of the bin centers. We call our new building block AdaBins. Our results show a decisive improvement over the state of the art on several popular depth datasets across all metrics. We also validate the effectiveness of the proposed block with an ablation study.
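
The "linear combinations of the bin centers" step admits a compact post-processing sketch. Assuming the network outputs normalized bin widths per image and per-pixel bin probabilities (the tensor shapes and depth range below are assumptions for illustration), depth follows in a few lines of Python:

    import numpy as np

    def depth_from_adaptive_bins(bin_widths, probs, d_min=1e-3, d_max=10.0):
        # bin_widths: (B,) normalized widths predicted per image;
        # probs: (H, W, B) per-pixel probabilities over the bins.
        # depth = sum_i p_i * c_i with adaptively placed centers c_i.
        widths = (d_max - d_min) * bin_widths / bin_widths.sum()
        edges = d_min + np.concatenate([[0.0], np.cumsum(widths)])
        centers = 0.5 * (edges[:-1] + edges[1:])   # (B,) bin centers
        return probs @ centers                     # (H, W) depth map
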
18

Semantic Segmentation For Free Drive-able Space Estimation

Gallagher, Eric 02 October 2020 (has links)
Autonomous vehicles need precise information about the drive-able space in order to navigate safely. In recent years, deep learning and semantic segmentation have attracted intense research; it is a rapidly evolving field that continues to provide excellent results, and research has shown that deep learning is emerging as a powerful tool in many applications. The aim of this study is to develop a deep learning system to estimate the free drive-able space. Building on state-of-the-art deep learning techniques, semantic segmentation is used to replace the need for highly accurate maps that are expensive to license. Free drive-able space is defined as the drive-able space on the correct side of the road that can be reached without a collision with another road user or pedestrian. A state-of-the-art deep network is trained with a custom dataset in order to learn complex driving decisions. Motivated by good results, further deep learning techniques are applied to measure distance from monocular images. The findings demonstrate the power of deep learning techniques in complex driving decisions. The results also indicate the economic and technical feasibility of semantic segmentation over expensive high-definition maps.
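
A crude sketch of how a segmentation map might be reduced to a free drive-able space mask is shown below (Python; the class ids and the per-column heuristic are assumptions, not the thesis's method): road pixels in each image column are kept only below the lowest obstacle, i.e. between the camera and the first thing it could collide with.

    import numpy as np

    def free_driveable_mask(seg, road_id=0, obstacle_ids=(1, 2)):
        # seg: (H, W) array of per-pixel class ids from a semantic
        # segmentation network. Image rows increase downwards, so
        # pixels below an obstacle in a column are nearer the camera.
        h, w = seg.shape
        free = np.zeros((h, w), dtype=bool)
        obstacle = np.isin(seg, obstacle_ids)
        for x in range(w):
            blocked = np.where(obstacle[:, x])[0]
            limit = blocked.max() if blocked.size else -1
            free[limit + 1:, x] = seg[limit + 1:, x] == road_id
        return free
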
19

Monocular Depth Estimation: Datasets, Methods, and Applications

Bauer, Zuria 15 September 2021 (has links)
The World Health Organization (WHO) stated in February 2021, at the Seventy-third World Health Assembly, that globally at least 2.2 billion people have a near or distance vision impairment. It also noted the severe impact vision impairment has on the quality of life of the individual suffering from this condition, how it affects their social well-being and economic independence in society, and how it can become an additional burden for people in their immediate surroundings. In order to minimize the cost and intrusiveness of the applications and maximize the autonomy of the individual's life, the natural solution is to use systems that rely on computer vision algorithms. Systems that improve the quality of life of the visually impaired need to solve different problems, such as localization, path recognition, obstacle detection, environment description, navigation, etc. Each of these topics involves an additional set of problems that must be solved to address it. For example, for the task of object detection, depth prediction is needed to know the distance to the object, path recognition to know whether the user is on the road or on a pedestrian path, an alarm system to provide notifications of danger, and trajectory prediction for approaching obstacles; and those are only the main key points. Taking a closer look at all of these topics, they have one key component in common: depth estimation/prediction. All of these topics need a correct estimation of the depth in the scenario. In this thesis, our main focus is on addressing depth estimation in indoor and outdoor environments. Traditional depth estimation methods, like structure from motion and stereo matching, are built on feature correspondences from multiple viewpoints. Despite the effectiveness of these approaches, they need a specific type of data to perform properly. Since our main goal is to provide systems with minimal cost and intrusiveness that are also easy to handle, we decided to infer depth from single images: monocular depth estimation. Estimating the depth of a scene from a single image is a simple task for humans, but it is notoriously difficult for computational models to achieve high accuracy with low resource requirements. Monocular depth estimation is this very task of estimating depth from a single RGB image. Since only one image is needed, this approach is used in applications such as autonomous driving, scene understanding, or 3D modeling, where other types of information are not available. This thesis presents contributions towards solving this task using deep learning as the main tool. The four main contributions of this thesis are: first, we carry out an extensive review of the state of the art in monocular depth estimation; second, we introduce a novel large-scale, high-resolution outdoor stereo dataset able to provide enough image information to solve various common computer vision problems; third, we present a set of architectures able to predict monocular depth effectively; and, last, we propose two real-life applications of those architectures, addressing the topic of enhancing perception for the visually impaired using low-cost wearable sensors.
20

Depth Estimation from Structured Light Fields

Li, Yan 03 July 2020 (has links) (PDF)
Light fields have become popular as a new geometric representation of 3D scenes, composed of multiple views and offering large potential to improve depth perception in the scenes. Light fields can be captured by different camera sensors, where different acquisitions give rise to different representations: mainly a line of camera views (3D light field representation) or a grid of camera views (4D light field representation). When the capture positions are uniformly distributed, the outputs are structured light fields. This thesis focuses on depth estimation from structured light fields. The light field representations (or setups) differ not only in terms of 3D versus 4D, but also in the density, or baseline, of the camera views. Rather than aiming only to reconstruct high-quality depths from dense (narrow-baseline) light fields, we pursue a more general objective: reconstructing depths from a wide range of light field setups. Hence, a series of depth estimation methods for light fields, including traditional and deep learning-based methods, is presented in this thesis. Extra effort is made to achieve high performance in terms of depth accuracy and computational efficiency. Specifically: 1) a robust traditional framework is put forward for estimating depth in sparse (wide-baseline) light fields, combining cost calculation, window-based filtering, and optimization; 2) this framework is extended with new or alternative components to 4D light fields, and the new framework is independent of the number of views and/or the baseline of the 4D light field when predicting depth; 3) two new deep learning-based methods are proposed for narrow-baseline light fields, where features are learned from the Epipolar-Plane-Image and the light field images, one of the methods being designed as a lightweight model for more practical goals; 4) to address the dataset deficiency, a large-scale and diverse synthetic wide-baseline dataset with labeled data is created, and a new lightweight deep model is proposed for 4D light fields with a wide baseline; this model also works on 4D light fields with a narrow baseline if trained on narrow-baseline datasets. Evaluations are made on public light field datasets. Experimental results show that the proposed depth estimation methods achieve high-quality depths across a wide range of light field setups, and some even outperform state-of-the-art methods.
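
The geometric cue behind the Epipolar-Plane-Image (EPI) features mentioned in point 3 is that a scene point traces a line across the stacked views whose slope corresponds to its disparity. A minimal structure-tensor estimate of that slope is sketched below (Python, purely illustrative); the smoothing scale and sign conventions are assumptions and depend on the view ordering.

    import numpy as np
    from scipy.ndimage import gaussian_filter, sobel

    def epi_disparity(epi, sigma=1.5):
        # epi: 2D epipolar-plane image, views along axis 0 and the
        # spatial dimension along axis 1. The dominant local line
        # orientation (via the structure tensor) gives the disparity.
        dv = sobel(epi.astype(float), axis=0)  # derivative across views
        dx = sobel(epi.astype(float), axis=1)  # spatial derivative
        Jvv = gaussian_filter(dv * dv, sigma)
        Jxx = gaussian_filter(dx * dx, sigma)
        Jvx = gaussian_filter(dv * dx, sigma)
        phi = 0.5 * np.arctan2(2.0 * Jvx, Jxx - Jvv)  # line orientation
        return np.tan(phi)  # slope in pixels per view step ~ disparity
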
