Global ETD Search

1	A Series of Improved and Novel Methods in Computer Vision Estimation
Adams, James J, 07 December 2023
In this thesis, findings in three areas of computer vision estimation are presented. First, an improvement to the Kanade-Lucas-Tomasi (KLT) feature tracking algorithm is presented in which gyroscope data is incorporated to compensate for camera rotation. This improved algorithm is then compared with the original algorithm and shown to be more effective at tracking features in the presence of large rotational motion. Next, a deep neural network approach to depth estimation is presented. Equations are derived relating camera and feature motion to depth. The information necessary for depth estimation is given as inputs to a deep neural network, which is trained to predict depth across an entire scene. This deep neural network approach is shown to be effective at predicting the general structure of a scene. Finally, a method of passively estimating the position and velocity of constant velocity targets using only bearing and time-to-collision measurements is presented. This method is paired with a path planner to avoid tracked targets. Results are given to show the effectiveness of the method at avoiding collision while maneuvering as little as possible.
Keywords: feature tracking, depth prediction, DNN, bearing-only tracking. Subject: Engineering
2	Monocular Depth Estimation Using Deep Convolutional Neural Networks
Larsson, Susanna, January 2019
Stereo cameras have long been deployed in visual Simultaneous Localization And Mapping (SLAM) systems to gain 3D information. Although stereo cameras perform well, their main disadvantage is the complex and expensive hardware setup they require, which limits the use of the system. A monocular camera is a simpler and cheaper alternative; however, monocular images lack the important depth information. Recent works have shown that having access to depth maps in a monocular SLAM system is beneficial, since they can be used to improve the 3D reconstruction. This work proposes a deep neural network that predicts dense high-resolution depth maps from monocular RGB images by casting the problem as a supervised regression task. The network architecture follows an encoder-decoder structure in which multi-scale information is captured and skip-connections are used to recover details. The network is trained and evaluated on the KITTI dataset, achieving results comparable to state-of-the-art methods. With further development, this network shows good potential to be incorporated in a monocular SLAM system to improve the 3D reconstruction.
Keywords: depth estimation, depth maps, monocular SLAM, mono-SLAM, pixelwise depth prediction, encoder-decoder network. Subject: Signal Processing
3	Monocular Depth Prediction in Deep Neural Networks
Tang, Guanqian, January 2019
As artificial neural networks (ANNs) have developed, they have been introduced into more and more computer vision tasks. Convolutional neural networks (CNNs) are widely used in object detection, object tracking, and semantic segmentation, achieving large performance improvements over traditional algorithms. As a classical topic in computer vision, applying deep CNNs to depth recovery from monocular images is a popular area of exploration, since single-view images are more common than stereo image pairs and video. However, due to the lack of motion and geometry information, monocular depth estimation is much more difficult. This thesis investigates depth prediction from single images by exploiting state-of-the-art deep CNN models. Two neural networks are studied: the first uses the idea of a global and local network, and the other adopts a deeper fully convolutional network with a pre-trained backbone CNN (ResNet or DenseNet). We compare the performance of the two networks, and the results show that the deeper convolutional neural network with the pre-trained backbone achieves better performance. The pre-trained model can significantly accelerate the training process. We also find that the amount of training data is essential for CNN-based monocular depth prediction. / The development of artificial neural networks (ANNs) has led to their use in a number of computer vision techniques to improve performance. Convolutional neural networks (CNNs) are often used in object detection, object tracking, and semantic segmentation, and perform better than earlier algorithms. Using CNNs for single-image depth prediction has become popular because single images are more common than stereo images and videos. Because motion and geometric information are missing, it is much harder to determine depth in an image than in a video. The aim of the master's thesis is to implement a new algorithm for depth prediction, specifically for images, using CNN models. Two different neural networks were analyzed: the first uses a local and a global network, and the second consists of an advanced convolutional neural network that uses a pre-trained backbone CNN (ResNet or DenseNet). Our analyses show that the advanced convolutional neural network with a pre-trained backbone CNN performs better and that it sped up the training process considerably. We also found that the amount of training data was decisive for CNN-based monocular depth prediction.
Subject: Electrical Engineering and Electronics
