• Refine Query
  • Source
  • Publication year
  • to
  • Language
  • 1
  • Tagged with
  • 3
  • 3
  • 3
  • 2
  • 2
  • 2
  • 1
  • 1
  • 1
  • 1
  • 1
  • 1
  • 1
  • 1
  • 1
  • About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
1

Design of a Real-time Image-based Distance Sensing System by Stereo Vision on FPGA

2012 August 1900 (has links)
A stereo vision system is a robust method to sense the distance information in a scene. This research explores the stereo vision system from the fundamentals of stereo vision and the computer stereo vision algorithm to the final implementation of the system on a FPGA chip. In a stereo vision system, images are captured by a pair of stereo image sensors. The distance information can be derived from the disparities between the stereo image pair, based on the theory of binocular geometry. With the increasing focus on 3D vision, stereo vision is becoming a hot topic in the areas of computer games, robot vision and medical applications. Particularly, most stereo vision systems are expected to be used in real-time applications. In this thesis, several stereo correspondence algorithms that determine the disparities between stereo image pair are examined. The algorithms can be categorized into global stereo algorithms and local stereo algorithms depending on the optimization techniques. The global algorithms examined are the Dynamic Time Warp (DTW) algorithm and the DTW with quantization algorithm, while the local algorithms examined are the window based Sum of Squared Differences (SSD), Sum of Absolute Differences (SAD) and Census transform correlation algorithms. With analysis among them, the window based SAD correlation algorithm is proposed for implementation on a FPGA platform. The proposed algorithm is implemented onto an Altera DE2 board featuring an Altera Cyclone II 2C35 FPGA. The implemented module of the algorithm is simulated using ModelSim-Altera to verify the correctness of its functionality. Along with a pair of stere image sensors and a LCD monitor, a stereo vision system is built. The entire system realizes a real-time video frame rate of 16.83 frames per second with an image resolution of 640 by 480 and produces disparity maps in which the objects are clearly distinguished by their relative distance information.
2

Specialised global methods for binocular and trinocular stereo matching

Horna Carranza, Luis Alberto January 2017 (has links)
The problem of estimating depth from two or more images is a fundamental problem in computer vision, which is commonly referred as to stereo matching. The applications of stereo matching range from 3D reconstruction to autonomous robot navigation. Stereo matching is particularly attractive for applications in real life because of its simplicity and low cost, especially compared to costly laser range finders/scanners, such as for the case of 3D reconstruction. However, stereo matching has its very unique problems like convergence issues in the optimisation methods, and challenges to find matches accurately due to changes in lighting conditions, occluded areas, noisy images, etc. It is precisely because of these challenges that stereo matching continues to be a very active field of research. In this thesis we develop a binocular stereo matching algorithm that works with rectified images (i.e. scan lines in two images are aligned) to find a real valued displacement (i.e. disparity) that best matches two pixels. To accomplish this our research has developed techniques to efficiently explore a 3D space, compare potential matches, and an inference algorithm to assign the optimal disparity to each pixel in the image. The proposed approach is also extended to the trinocular case. In particular, the trinocular extension deals with a binocular set of images captured at the same time and a third image displaced in time. This approach is referred as to t +1 trinocular stereo matching, and poses the challenge of recovering camera motion, which is addressed by a novel technique we call baseline recovery. We have extensively validated our binocular and trinocular algorithms using the well known KITTI and Middlebury data sets. The performance of our algorithms is consistent across different data sets, and its performance is among the top performers in the KITTI and Middlebury datasets.
3

Semantic Labeling of Large Geographic Areas Using Multi-Date and Multi-View Satellite Images and Noisy OpenStreetMap Labels

Bharath Kumar Comandur Jagannathan Raghunathan (9187466) 31 July 2020 (has links)
<div>This dissertation addresses the problem of how to design a convolutional neural network (CNN) for giving semantic labels to the points on the ground given the satellite image coverage over the area and, for the ground truth, given the noisy labels in OpenStreetMap (OSM). This problem is made challenging by the fact that -- (1) Most of the images are likely to have been recorded from off-nadir viewpoints for the area of interest on the ground; (2) The user-supplied labels in OSM are frequently inaccurate and, not uncommonly, entirely missing; and (3) The size of the area covered on the ground must be large enough to possess any engineering utility. As this dissertation demonstrates, solving this problem requires that we first construct a DSM (Digital Surface Model) from a stereo fusion of the available images, and subsequently use the DSM to map the individual pixels in the satellite images to points on the ground. That creates an association between the pixels in the images and the noisy labels in OSM. The CNN-based solution we present yields a 4-8% improvement in the per-class segmentation IoU (Intersection over Union) scores compared to the traditional approaches that use the views independently of one another. The system we present is end-to-end automated, which facilitates comparing the classifiers trained directly on true orthophotos vis-`a-vis first training them on the off-nadir images and subsequently translating the predicted labels to geographical coordinates. This work also presents, for arguably the first time, an in-depth discussion of large-area image alignment and DSM construction using tens of true multi-date and multi-view WorldView-3 satellite images on a distributed OpenStack cloud computing platform.</div>

Page generated in 0.1207 seconds