11

Online Monocular SLAM : Rittums

Persson, Mikael January 2014 (has links)
A classic Computer Vision task is the estimation of a 3D map from a collection of images. This thesis explores the online simultaneous estimation of camera poses and map points, often called Visual Simultaneous Localisation and Mapping (VSLAM). Autonomous cars are likely to make use of visual information in the near future, since driving is a vision-dominated process. For example, VSLAM could be used to estimate the position of the car in relation to objects of interest, such as the road, other cars and pedestrians. Aimed at the creation of a real-time, robust, loop-closing, single-camera SLAM system, the properties of several state-of-the-art VSLAM systems and related techniques are studied. The system goals cover several important, if difficult, problems, which makes a solution widely applicable. This thesis makes two contributions: a rigorous qualitative analysis of VSLAM methods, and a system designed accordingly. A novel tracking-by-matching scheme is proposed which, unlike the trackers used by many similar systems, deals better with forward camera motion. The system estimates general motion with loop closure in real time, and is compared to a state-of-the-art monocular VSLAM algorithm and found to be similar in speed and performance.
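
As a concrete illustration of tracking by matching between two frames, the sketch below estimates relative camera motion from feature matches using OpenCV. This is an illustrative sketch, not the thesis implementation; the intrinsic matrix K, the ORB feature count and the RANSAC thresholds are all assumptions.

```python
import cv2
import numpy as np

def relative_pose(img0, img1, K):
    """Estimate relative camera motion between two grayscale frames."""
    orb = cv2.ORB_create(2000)
    kp0, des0 = orb.detectAndCompute(img0, None)
    kp1, des1 = orb.detectAndCompute(img1, None)
    matcher = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
    matches = matcher.match(des0, des1)
    pts0 = np.float32([kp0[m.queryIdx].pt for m in matches])
    pts1 = np.float32([kp1[m.trainIdx].pt for m in matches])
    # RANSAC on the essential matrix rejects outlier matches. Matching
    # searches globally rather than following small local displacements,
    # which helps with the large image motion caused by forward translation.
    E, inliers = cv2.findEssentialMat(pts0, pts1, K, cv2.RANSAC, 0.999, 1.0)
    _, R, t, _ = cv2.recoverPose(E, pts0, pts1, K, mask=inliers)
    return R, t  # rotation and unit-norm translation direction
```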
12

Depth Map Prediction from Monocular Images by Weightless Neural Networks

PERRONI FILHO, H. 26 February 2010 (has links)
A central problem in Computer Vision is depth estimation, i.e., deriving, from one or more images of a scene, a depth map that gives the distance between the observer and each point of the captured surfaces. It is no surprise, then, that stereo correspondence, the approach traditionally applied to this problem, is one of the most intensively investigated topics in the field. Correspondence systems estimate depth from binocular features of a stereo pair, specifically the positional difference of each point between the two images of the pair. Beyond this purely geometric information, images contain a range of monocular cues, such as texture variations and gradients, focus variations, and color and reflection patterns, that can be exploited to derive depth estimates. Doing so, however, requires accumulating a certain amount of prior knowledge, since there is an intrinsic ambiguity between image features and depth variations. Through his research on machine learning systems based on Markov Random Fields (MRFs), Ashutosh Saxena demonstrated that depth maps can be estimated with high accuracy from static monocular images. His approach, however, lacks biological plausibility, since there is no known theoretical correspondence between MRFs and the neural networks of the human brain. Motivated by previous successes in applying Weightless Neural Networks (WNNs) to computer vision problems, this work investigates the effectiveness of applying WNNs to the problem of estimating depth maps. We thereby hope to improve on Saxena's MRF-based system, and to develop an architecture better suited to evaluating hypotheses about visual information processing in the human cortex.
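
A minimal sketch of the weightless-neural-network building block this line of work relies on may help: a WiSARD-style discriminator maps random tuples of input bits to RAM nodes; training writes to the addressed locations, and recall counts how many RAMs recognise the input. The tuple size and input mapping below are illustrative assumptions, not the thesis configuration.

```python
import random

class WisardDiscriminator:
    def __init__(self, input_bits, tuple_size, seed=0):
        rng = random.Random(seed)
        order = list(range(input_bits))
        rng.shuffle(order)                        # random input-to-RAM mapping
        self.tuples = [order[i:i + tuple_size]
                       for i in range(0, input_bits, tuple_size)]
        self.rams = [set() for _ in self.tuples]  # sparse RAM contents

    def _addresses(self, bits):
        for t, ram in zip(self.tuples, self.rams):
            yield tuple(bits[i] for i in t), ram

    def train(self, bits):
        for addr, ram in self._addresses(bits):
            ram.add(addr)                         # write a 1 at this address

    def score(self, bits):
        return sum(addr in ram for addr, ram in self._addresses(bits))
```

Under this scheme, one discriminator could be trained per quantized depth level, with each image patch assigned to the level whose discriminator scores highest — one plausible arrangement, not necessarily the one used in the thesis.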
13

Gait Alterations Negotiating A Raised Surface Induced by Monocular Blur

Vale, Anna, Buckley, John, Elliott, David B. 01 December 2008 (has links)
Falls in the elderly are a major cause of serious injury and mortality. Impaired or absent stereopsis may be a significant risk factor for falls or hip fracture, although data from epidemiological studies are not consistent. Previous laboratory-based studies do suggest, however, that stereoacuity is an important factor in adaptive gait. The present study investigates how acute impairment of stereopsis, through monocular blur of differing levels ranging from 0.50 diopters (D) to a monovision correction, affected gait when negotiating a raised surface in elderly subjects. Eleven elderly subjects (73.3 ± 3.6 years) walked up to and negotiated a raised surface under nine visual conditions: binocular vision, one eye occluded, and 0.50 D, 1.00 D and monovision correction (mean 2.50 ± 0.20 D) blur, with blur or occlusion over either the dominant or non-dominant eye. Analysis focused on foot positioning and toe clearance parameters. There was no effect of ocular dominance on any parameter. Monocular blur impaired stereopsis (p < 0.01), with more minor effects on high- and low-contrast acuity. Vertical and horizontal lead-limb toe clearance both increased under all levels of monocular blur, including the lowest level of 0.50 D (p < 0.03), and monovision correction led to toe clearance levels similar to those found with occlusion of one eye. The findings demonstrate that even small amounts of monocular blur can lead to a change in gait when negotiating a raised surface, suggesting that acute monocular blur affects the ability to accurately judge the height of a step in the travel path. Further work is required to investigate whether similar adaptations are used by patients with chronic monocular blur.
14

Understanding the neural basis of amblyopia.

Barrett, Brendan T., Bradley, A., McGraw, Paul V. January 2004 (has links)
Amblyopia is the condition in which reduced visual function exists despite full optical correction and an absence of observable ocular pathology. Investigation of the underlying neurology of this condition began in earnest around 40 years ago with the pioneering studies conducted by Hubel and Wiesel. Their early work on the impact of monocular deprivation and strabismus initiated what is now a rapidly developing field of cortical plasticity research. Although the monocular deprivation paradigm originated by Hubel and Wiesel remains a key experimental manipulation in studies of cortical plasticity, somewhat ironically, the neurology underlying the human conditions of strabismus and amblyopia that motivated this early work remains elusive. In this review, the authors combine contemporary research on plasticity and development with data from human and animal investigations of amblyopic populations to assess what is known and to reexamine some of the key assumptions about human amblyopia.
15

Learning Unsupervised Depth Estimation, from Stereo to Monocular Images

Pilzer, Andrea 22 June 2020 (has links)
In order to interact with the real world, humans need to perform several tasks such as object detection, pose estimation, motion estimation and distance estimation. These tasks are all part of scene understanding and are fundamental tasks of computer vision. Depth estimation has received unprecedented attention from the research community in recent years due to the growing interest in its practical applications (e.g., robotics and autonomous driving) and the performance improvements achieved with deep learning. In fact, the applications have expanded from the more traditional tasks such as robotics to new fields such as autonomous driving, augmented reality devices and smartphone applications. This is due to several factors. First, with the increased availability of training data, bigger and bigger datasets were collected. Second, deep learning frameworks running on graphics cards exponentially increased data processing capabilities, allowing higher-precision deep convolutional networks (ConvNets) to be trained. Third, researchers applied unsupervised optimization objectives to ConvNets, overcoming the hurdle of collecting expensive ground truth and fully exploiting the abundance of images available in datasets. This thesis presents several proposals for unsupervised depth estimation and their benefits: (i) learning from resynthesized data, (ii) adversarial learning, (iii) coupling generator and discriminator losses for collaborative training, and (iv) self-improvement ability of the learned model. For the first two points, we developed a binocular stereo unsupervised depth estimation model that uses reconstructed data as an additional self-constraint during training. In addition, adversarial learning improves the quality of the reconstructions, further increasing the performance of the model. The third point is inspired by scene understanding as a structured task: a generator and a discriminator joining their efforts in a structured way improve the quality of the estimations. This may sound counterintuitive when cast in the general framework of adversarial learning, but our experiments demonstrate the effectiveness of the proposed approach. Finally, self-improvement is inspired by estimation refinement, a widespread practice in dense reconstruction tasks like depth estimation. We devise a monocular unsupervised depth estimation approach which measures the reconstruction errors in an unsupervised way to produce a refinement of the depth predictions. Furthermore, we apply knowledge distillation to improve the student ConvNet with the knowledge of a teacher ConvNet that has access to the errors.
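
The first of these proposals rests on view reconstruction as a supervisory signal. The sketch below shows the core of that idea for a rectified stereo pair — warp the right image into the left view with the predicted disparity and penalise the photometric error — as a simplified PyTorch illustration (single scale, plain L1 penalty), not the author's exact loss.

```python
import torch
import torch.nn.functional as F

def photometric_loss(left, right, disparity):
    """left, right: (B,3,H,W) rectified images; disparity: (B,1,H,W), pixels."""
    b, _, h, w = left.shape
    # Base sampling grid in normalized [-1, 1] coordinates.
    ys, xs = torch.meshgrid(
        torch.linspace(-1, 1, h, device=left.device),
        torch.linspace(-1, 1, w, device=left.device), indexing="ij")
    grid = torch.stack((xs, ys), dim=-1).unsqueeze(0).expand(b, -1, -1, -1).clone()
    # Shift x-coordinates by the disparity, converted to [-1, 1] units.
    grid[..., 0] = grid[..., 0] - 2.0 * disparity.squeeze(1) / w
    warped = F.grid_sample(right, grid, align_corners=True)
    # Plain L1 reconstruction error; real systems add SSIM and
    # disparity-smoothness terms, and possibly adversarial losses as above.
    return (left - warped).abs().mean()
```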
16

Autonomous perception and understanding of the drivable area

Moreyra, Marcelo Leandro 27 June 2013 (has links)
The design and development of intelligent vehicles has been an active research area for more than three decades, showing tremendous progress in recent years. Long-term projects promoted by government initiatives, in conjunction with research groups from the automotive industry and academia, have allowed autonomous vehicles in some parts of the world to demonstrate that they can successfully drive through urban scenarios. For a vehicle of this kind to interact safely with other human-driven cars, it must be able to perceive its environment accurately, identifying the other traffic participants and detecting the drivable areas. Currently, the most mature projects are based on sensing modalities still too expensive for a product of this type to reach the general population. Given that vision is the primary navigation sense humans use to drive a vehicle, it is somewhat surprising that cameras are not yet central to current automatic environment-perception systems, especially considering their low cost and low power requirements. Road detection is one problem where vision has had an important impact. Knowledge about the road's appearance and geometric shape is usually used to propose a model that is then fitted to features extracted from the image. Modern statistical filtering techniques are used to track the model over time, increasing robustness to noise and erroneous measurements while reducing the computational cost of estimating its parameters. These approaches have achieved solutions highly robust to weather changes and drastic illumination variations in the image. However, these systems fail when the road's shape changes in a way that invalidates the chosen model. Detecting such changes automatically requires new strategies with greater power of abstraction that allow a deeper understanding of the scene. Given the great robustness of the human visual system and its efficient use of processing resources, it is of primary interest to learn how people solve this problem. To this end, this thesis studies and analyzes people's visual attention patterns when they recognize different types of road topologies, such as intersections, splits and junctions. Throughout the chapters, the fundamentals needed to understand the topic are introduced, and experimental results supporting the hypotheses are presented. The evidence found lays the groundwork for new algorithms for the automatic detection of road topology.
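
The pipeline described — fit a road model to image features, then track it with statistical filtering — can be illustrated with a linear Kalman filter over an assumed three-parameter lane model (lateral offset, heading, curvature). The state layout and noise levels below are illustrative assumptions, not taken from the thesis.

```python
import numpy as np

class RoadModelTracker:
    def __init__(self):
        self.x = np.zeros(3)             # state: [offset, heading, curvature]
        self.P = np.eye(3)               # state covariance
        self.Q = np.eye(3) * 1e-3        # process noise: slowly varying road
        self.R = np.eye(3) * 1e-1        # measurement noise of the image fit

    def update(self, z):
        """z: model parameters fitted to the current image's road features."""
        P_pred = self.P + self.Q                     # predict (identity dynamics)
        K = P_pred @ np.linalg.inv(P_pred + self.R)  # Kalman gain
        self.x = self.x + K @ (z - self.x)           # correct with measurement
        self.P = (np.eye(3) - K) @ P_pred
        return self.x
```

When the fitted parameters jump in a way the filter's innovation cannot explain, the model assumption itself is likely broken — exactly the topology-change situation the thesis argues requires higher-level scene understanding.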
17

Panodepth – Panoramic Monocular Depth Perception Model and Framework

Wong, Adley K 01 December 2022 (has links) (PDF)
Depth perception has become a heavily researched area as companies and researchers strive towards the development of self-driving cars. Self-driving cars rely on perceiving the surrounding area, which depends heavily on technology capable of providing the system with depth perception. In this paper, we explore developing a single-camera (monocular) depth prediction model that is trained on panoramic depth images. Our model makes novel use of transfer learning with efficient encoder models, pre-training on a larger dataset of flat depth images, and optimization of the model for use on a Jetson Nano. Additionally, we present a training and optimization framework to make developing and testing new monocular depth perception models easier and faster. While the model failed to achieve a high frame rate, the framework and models developed are a promising starting place for future work.
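
As a rough sketch of the transfer-learning pattern the abstract describes, the model below reuses a pretrained efficient encoder and attaches a small decoder that regresses a dense depth map. torchvision's MobileNetV3-Small stands in for the efficient encoder; the decoder layout is an assumption, not the Panodepth architecture.

```python
import torch.nn as nn
from torchvision.models import mobilenet_v3_small, MobileNet_V3_Small_Weights

class DepthNet(nn.Module):
    def __init__(self):
        super().__init__()
        backbone = mobilenet_v3_small(weights=MobileNet_V3_Small_Weights.DEFAULT)
        self.encoder = backbone.features      # pretrained, stride-32 features
        self.decoder = nn.Sequential(         # small learned upsampler
            nn.Conv2d(576, 128, 3, padding=1),
            nn.ReLU(inplace=True),
            nn.Upsample(scale_factor=8, mode="bilinear", align_corners=False),
            nn.Conv2d(128, 1, 3, padding=1),
            nn.Softplus(),                    # keep predicted depth positive
        )

    def forward(self, x):
        # Predicts depth at 1/4 of the input resolution, a common trade-off
        # when targeting embedded hardware such as a Jetson Nano.
        return self.decoder(self.encoder(x))
```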
18

Monocular vision based localization and mapping

Jama, Michal January 1900 (has links)
In this dissertation, two applications related to vision-based localization and mapping are considered: (1) improving satellite-based location estimates from a navigation system by using on-board camera images, and (2) deriving position information from a video stream and using it to aid the autopilot of an unmanned aerial vehicle (UAV). In the first part of the dissertation, a method is presented for analyzing a minimization process called bundle adjustment (BA), used in stereo-imagery-based 3D terrain reconstruction to refine estimates of camera poses (positions and orientations). In particular, imagery obtained with pushbroom cameras is of interest. This work proposes a method to identify cases in which BA does not work as intended, i.e., cases in which the pose estimates returned by BA are no more accurate than estimates provided by a satellite navigation system, due to the existence of degrees of freedom (DOF) in the BA. Use of inaccurate pose estimates causes warping and scaling effects in the reconstructed terrain and prevents the terrain from being used in scientific analysis. The main contributions of this part include: 1) formulation of a method for detecting DOF in the BA; and 2) identification of DOF in two camera geometries commonly used to obtain stereo imagery. This part also presents results demonstrating that avoiding the DOF can give significant accuracy gains in aerial imagery. The second part of the dissertation proposes a vision-based system for UAV navigation: a monocular vision-based simultaneous localization and mapping (SLAM) system which measures the position and orientation of the camera and builds a map of the environment using the video stream from a single camera. This differs from common SLAM solutions, which use sensors that measure depth, like LIDAR, stereoscopic cameras or depth cameras. The SLAM solution was built by significantly modifying and extending a recent open-source SLAM solution that is fundamentally different from traditional approaches to the SLAM problem. The modifications provide the position measurements necessary for the navigation solution on a UAV while simultaneously building the map, all while maintaining control of the UAV. The main contributions of this part include: 1) extension of the map-building algorithm so it can be used realistically while controlling a UAV and simultaneously building the map; 2) improved performance of the SLAM algorithm at lower camera frame rates; and 3) the first known demonstration of a monocular SLAM algorithm successfully controlling a UAV while simultaneously building the map. This work demonstrates that a fully autonomous UAV that uses monocular vision for navigation is feasible and can be effective in Global Positioning System denied environments.
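
The DOF issue described can be illustrated numerically: gauge freedoms in bundle adjustment show up as (near-)zero singular values of the BA Jacobian, so counting singular values far below the largest reveals directions the cost function cannot constrain. This is a generic illustration of the idea, not the detection method formulated in the dissertation.

```python
import numpy as np

def count_degrees_of_freedom(J, rel_tol=1e-8):
    """J: (num_residuals, num_parameters) bundle-adjustment Jacobian."""
    s = np.linalg.svd(J, compute_uv=False)   # singular values, descending
    # Singular values below rel_tol times the largest count as numerically
    # zero: each one is an unconstrained direction (a DOF) in the problem.
    dof = int(np.sum(s < rel_tol * s[0]))
    return dof, s
```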
19

Autonomous exploration using sparse monocular SLAM

Pittol, Diego January 2018 (has links)
In recent years we have seen the dawn of a large number of applications that use autonomous robots. For a robot to be considered truly autonomous, it must be able to learn about the environment in which it operates. SLAM (Simultaneous Localization and Mapping) methods build a map of the environment the robot traverses while simultaneously estimating the robot's trajectory. However, to obtain a complete map of the environment autonomously, the robot must be guided through the whole environment, which is addressed by the exploration problem. Cameras are inexpensive sensors that can be used to build 3D maps, but exploration on maps generated by monocular SLAM methods, i.e., methods that extract information from a single camera, is still an open problem, since such methods produce sparse or semi-dense maps that are ill-suited for navigation and exploration. This situation calls for exploration methods capable of dealing with the limitations of cameras and with the lack of information in maps generated by monocular SLAM. We propose an exploration strategy that uses local volumetric maps, generated from lines of sight, allowing the robot to navigate safely. In these local maps, goals are defined that lead the robot to explore the environment while avoiding obstacles. The proposed approach aims to answer the fundamental question in exploration: "Where to go?". In addition, it seeks to determine correctly when the environment is sufficiently explored and exploration should stop. The approach is evaluated through experiments in a simple single-room environment and in an environment composed of several rooms.
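
One common way to answer "Where to go?" on volumetric or occupancy maps, offered here purely as an illustration of the concept rather than the thesis' own goal-selection strategy, is frontier detection: free cells bordering unknown space are candidate exploration goals, and an empty frontier set signals that exploration can stop. The cell labels below are assumptions.

```python
import numpy as np

FREE, OCCUPIED, UNKNOWN = 0, 1, 2

def find_frontiers(grid):
    """grid: 2D array of FREE/OCCUPIED/UNKNOWN cells."""
    frontiers = []
    rows, cols = grid.shape
    for r in range(rows):
        for c in range(cols):
            if grid[r, c] != FREE:
                continue
            neighbours = grid[max(r - 1, 0):r + 2, max(c - 1, 0):c + 2]
            if (neighbours == UNKNOWN).any():   # free cell touching unknown
                frontiers.append((r, c))
    return frontiers  # empty list => environment sufficiently explored
```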
20

Monocular Vision and Image Correlation to Accomplish Autonomous Localization

Schlachtman, Matthew Paul 01 June 2010 (has links)
For autonomous navigation, robots and vehicles must have accurate estimates of their current state (i.e. location and orientation) within an inertial coordinate frame. If a map is given a priori, the process of determining this state is known as localization. When operating in the outdoors, localization is often assumed to be a solved problem when GPS measurements are available. However, in urban canyons and other areas where GPS accuracy is decreased, additional techniques with other sensors and filtering are required. This thesis aims to provide one such technique based on monocular vision. First, the system requires a map be generated, which consists of a set of geo-referenced video images. This map is generated offline before autonomous navigation is required. When an autonomous vehicle is later deployed, it will be equipped with an on-board camera. As the vehicle moves and obtains images, it will be able to compare its current images with images from the pre-generated map. To conduct this comparison, a method known as image correlation, developed at Johns Hopkins University by Rob Thompson, Daniel Gianola and Christopher Eberl, is used. The output from this comparison is used within a particle filter to provide an estimate of vehicle location. Experimentation demonstrates the particle filter's ability to successfully localize the vehicle within a small map that consists of a short section of road. Notably, no initial assumption of vehicle location within this map is required.
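
A minimal sketch of the localization loop described — propagate particles with odometry, weight them by image correlation against the geo-referenced map, resample — is given below for a one-dimensional pose along the mapped road section. The `correlate` function and `map_images.nearest` helper are hypothetical stand-ins for the image-correlation method and map lookup cited in the abstract; they are assumptions, not the thesis' API.

```python
import numpy as np

def particle_filter_step(particles, weights, motion, current_image,
                         map_images, correlate, motion_noise=0.5):
    """particles: (N,) positions along the road; weights: (N,) summing to 1."""
    # Predict: propagate each particle by the odometry estimate plus noise.
    particles = particles + motion + np.random.normal(
        0.0, motion_noise, size=particles.shape)
    # Update: weight each particle by the image-correlation score between
    # the live camera image and the map image nearest that particle.
    for i, p in enumerate(particles):
        weights[i] *= correlate(current_image, map_images.nearest(p))
    weights = weights / weights.sum()
    # Resample: draw particles in proportion to their weights, then reset
    # to uniform weights, the standard bootstrap-filter step.
    idx = np.random.choice(len(particles), size=len(particles), p=weights)
    return particles[idx], np.full(len(particles), 1.0 / len(particles))
```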
