1 |
Bearing-only SLAM methods
Munguía Alcalá, Rodrigo Francisco 19 October 2009
Simultaneous Localization and Mapping (SLAM) is perhaps the most fundamental problem to solve in robotics in order to build truly autonomous mobile robots. SLAM concerns how a mobile robot can operate in an a priori unknown environment, using only its onboard sensors, while building a map of its surroundings that it simultaneously uses to track its own position.
The robot's sensors have a large impact on the algorithms used for SLAM. Early approaches focused on range sensors such as sonar rings or lasers. However, range sensors have several disadvantages in SLAM: data association (correspondence) is difficult, they are expensive, they are generally limited to 2D maps, and they carry a high computational cost because of the large number of features they produce.
These issues have pushed recent work towards cameras as the primary sensing modality. Cameras have become increasingly attractive to the robotics research community because they yield a great deal of information, which allows reliable data association, and because they are well suited to embedded systems: they are light, cheap and power-saving. Using vision, a robot can localize itself using common objects as landmarks.
On the other hand, unlike range sensors (such as sonar or laser), which provide both range and angular information, a camera is a projective sensor that measures only the bearing of image features. Depth (range) therefore cannot be obtained from a single frame. This has motivated a new family of SLAM methods, the Bearing-Only SLAM methods, which rely mainly on special feature-initialization techniques to enable the use of bearing sensors (such as cameras) in SLAM systems.
This thesis studies the Bearing-Only SLAM problem: it gives an extensive overview of the subject, points out the principal open challenges, and presents new methods and algorithms that address different sub-problems of Bearing-Only SLAM. These sub-problems must be solved in order to build systems capable of operating in extremely diverse and complex environments.
The research described in this dissertation is divided into three parts:
3DOF Bearing-Only SLAM: The initialization of new features is perhaps the most important sub-problem in Bearing-Only SLAM. This part of the thesis introduces a novel method called Delayed Inverse Depth Features Initialization, for a 3DOF context in which odometry is available. The method uses an inverse depth parameterization, in which the initial depth and uncertainty of each feature are dynamically estimated prior to adding the feature as a new landmark in the stochastic map. A sound-based SLAM system called SSLAM is also presented, which uses sound sources as map features. The main contributions of SSLAM are demonstrating the viability of including the sense of hearing in SLAM and showing that it is feasible to use alternative bearing sensors in Bearing-Only SLAM.
Data association methods for camera-based SLAM: Data association is possibly one of the hardest problems in robotics and also one of the most important sub-problems to solve in SLAM. It consists of determining whether sensor measurements taken at different points in time correspond to the same physical object in the world. This part of the thesis proposes different methods for addressing data association in the context of vision-based SLAM.
6DOF Monocular SLAM: 6DOF monocular SLAM possibly represents the hardest variant of SLAM, since a low-cost hand-held camera is the only sensory input to the system. This part of the thesis extends the 2DOF Bearing-Only SLAM algorithm for use in a monocular SLAM context. A novel framework called Distributed Monocular SLAM is also proposed, addressing the problem of building and maintaining a global, consistent map of large environments in real time. The key idea is to divide the whole estimation into two concurrent estimation processes. First, a state-of-the-art monocular SLAM method (called Virtual Sensor) is modified to act as a complex virtual sensor that emulates typical sensors, such as a laser for range measurement and encoders for dead reckoning. Then a classic SLAM method (called Global SLAM) is plugged in to build and maintain the final map.
Several references, graphics, comparisons, simulations and experiments with real sensor data are presented in order to demonstrate the performance of the proposed methods.
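As an illustration of the inverse depth parameterization mentioned in this abstract, the following is a minimal sketch for a planar (3DOF) robot. It is not the thesis's Delayed Inverse Depth Features Initialization method; it only shows a generic inverse-depth feature and the bearing a sensor would be expected to measure. The function names and the feature layout (x0, y0, theta, rho) are assumptions made for this example.

```python
import math


def inverse_depth_to_point(x0, y0, theta, rho):
    """Convert a planar inverse-depth feature to a Cartesian point.

    (x0, y0): sensor position when the feature was first observed (assumed layout)
    theta:    bearing of the feature in the world frame at that moment
    rho:      inverse depth (1 / distance) along that bearing, must be non-zero
    """
    depth = 1.0 / rho
    return (x0 + depth * math.cos(theta), y0 + depth * math.sin(theta))


def predicted_bearing(pose, point):
    """Bearing to `point` as seen from a robot pose (x, y, heading)."""
    xr, yr, phi = pose
    px, py = point
    angle = math.atan2(py - yr, px - xr) - phi
    # Wrap the result to [-pi, pi] so bearings can be compared directly.
    return math.atan2(math.sin(angle), math.cos(angle))


# A feature first seen from the origin, straight ahead, with inverse depth 0.5.
landmark = inverse_depth_to_point(0.0, 0.0, 0.0, 0.5)
# The same landmark observed later from a different pose.
bearing = predicted_bearing((2.0, -2.0, 0.0), landmark)
```

Storing the inverse of the depth rather than the depth itself is a common choice in bearing-only estimation because very distant features map to values near zero, which keeps the initial uncertainty bounded.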
|
2 |
Computer vision techniques for early detection of skin cancer
Quintana Plana, Josep 14 June 2012
This thesis investigates new computer vision techniques for the early detection of skin cancer. The first part of the work presents a novel methodology for correcting color reproduction in dermatological images when different cameras and/or dermoscopes are used. Next, the problem of automatic full-body mapping is addressed by proposing a mosaicing method based on an off-the-shelf compact digital camera and a set of markers. This method extends the possibilities of total body photography by taking the low-resolution images of a whole-body exploration and automatically combining them into a high-resolution photomosaic. The third contribution is the development of a full-body scanner for acquiring cutaneous images. On the one hand, the scanner shortens the lengthy examinations required in dermoscopy explorations; on the other hand, it increases the resolution of total body photography systems.
|
3 |
Efficient 3D scene modeling and mosaicing
Nicosevici, Tudor 18 December 2009
Scene modeling plays a key role in applications ranging from visual mapping to augmented reality. This thesis presents an end-to-end solution for automatically creating accurate, textured 3D models, with contributions at different levels. First, we discuss a method developed within the framework of sequential Structure from Motion, in which a 3D model of the environment is maintained and updated as visual information becomes available. The technique is more accurate and robust than state-of-the-art 3D modeling approaches. We also develop an efficient online loop-closure detection algorithm, which reduces drift and uncertainty in mapping and navigation. Inspired by visual bag-of-words, the technique is entirely sequential and automatic. Lastly, motivated by the need to map large areas, we propose a 3D model simplification algorithm oriented towards online applications. We discuss the efficiency of the proposals and compare them with state-of-the-art approaches, using a series of challenging datasets from both underwater and outdoor scenarios.
|
4 |
Assisted visual servoing by means of structured light
Pagès Marco, Jordi 25 November 2005
This thesis deals with the combination of visual servoing and structured light. Classic visual servoing assumes that visual features can be easily extracted from the images. As a result, uniform or non-textured objects, or objects for which feature extraction is too complex or too time-consuming, cannot be taken into account. This thesis proposes the use of structured light patterns to provide suitable visual features independently of the object's appearance. First, a comprehensive survey of coded structured light patterns is presented. Then a new coded pattern that improves on the existing ones is proposed. The remainder of the thesis is devoted to positioning an eye-in-hand robot with respect to different objects by using features provided by the projected light patterns. Two configurations are studied: in the first, an off-board video projector is used; in the second, a structured light emitter is mounted on the robot together with the camera. The techniques proposed in the thesis are supported by theoretical analysis and validated by experimental results.
|
5 |
One-shot pattern projection for dense and accurate 3D reconstruction in structured light
Fernández Navarro, Sergio 22 June 2012
This thesis focuses on the problem of 3D acquisition using coded structured light (CSL). In CSL, a projected pattern imposes an artificial texture on the object surface, increasing the number of correspondences with the captured image. The 3D information is then obtained by triangulation. CSL techniques for moving scenarios are an active area of research. The thesis first presents a review of the main CSL approaches. It then studies in depth the two most widely used frequency-based techniques and proposes a new algorithm for automatic selection of the window width in the Windowed Fourier Transform (WFT). Building on this analysis, a new technique for one-shot dense acquisition is implemented, which is able to work in moving scenarios. The technique is based on adaptive WFT combined with DeBruijn coding. The results show that the proposed method achieves accuracy comparable to other DeBruijn techniques while obtaining dense reconstruction. Finally, the thesis addresses the problem of registering structured light (SL) reconstructions.
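To make the DeBruijn coding mentioned in this abstract more concrete, here is a minimal sketch that generates a De Bruijn sequence. It uses the standard recursive Lyndon-word construction and is not taken from the thesis; interpreting the symbols as stripe colors of a projected pattern is an assumption made for illustration, as is the function name `de_bruijn`.

```python
def de_bruijn(k, n):
    """Return a De Bruijn sequence B(k, n) as a list of symbols 0..k-1.

    Read cyclically, every possible window of n consecutive symbols
    appears exactly once, so a window identifies its own position.
    """
    a = [0] * (k * n)
    sequence = []

    def db(t, p):
        if t > n:
            # A Lyndon word whose length divides n contributes to the sequence.
            if n % p == 0:
                sequence.extend(a[1:p + 1])
        else:
            a[t] = a[t - p]
            db(t + 1, p)
            for j in range(a[t - p] + 1, k):
                a[t] = j
                db(t + 1, t)

    db(1, 1)
    return sequence


# Example: 3 hypothetical stripe colors, windows of 2 stripes -> 9 stripes.
stripes = de_bruijn(3, 2)
```

In a stripe pattern built this way, observing any n neighboring stripes is in principle enough to locate them in the projected pattern, which is what makes a single projection (one shot) sufficient for establishing correspondences.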
|
6 |
An approach to coded structured light to obtain three dimensional information
Salvi, Joaquim 16 February 1998
The human visual ability to perceive depth looks like a puzzle. We perceive three-dimensional spatial information quickly and efficiently by using the binocular stereopsis of our eyes and, more importantly, the knowledge of the most common objects that we have acquired through experience. Modelling the behaviour of our brain is still fiction, which is why the huge problem of 3D perception and subsequent interpretation is split into a sequence of easier problems. A great deal of research in robot vision is devoted to obtaining 3D information about the surrounding scene. Most of this research is based on modelling human stereopsis by using two cameras as if they were two eyes. This method is known as stereo vision; it has been widely studied in the past, is being studied at present, and will surely be the subject of much future work, which makes it one of the most interesting topics in computer vision.
The stereo vision principle is based on obtaining the three-dimensional position of an object point from the positions of its projections in both camera image planes. However, before 3D information can be inferred, the mathematical models of both cameras have to be known. This step is known as camera calibration and is described at length in the thesis. Perhaps the most important problem in stereo vision is determining the pairs of homologous points in the two images, known as the correspondence problem. It is also one of the most difficult problems to solve and is currently being investigated by many researchers. Epipolar geometry allows the correspondence problem to be reduced, and an approach to epipolar geometry is described in the thesis. Nevertheless, it does not solve the problem completely, as many other considerations have to be taken into account. For example, some points have no correspondence because of a surface occlusion or simply because they project outside the field of view of one camera.
The thesis focuses on structured light, which is one of the techniques most frequently used to reduce the problems related to stereo vision. Structured light is based on the relationship between a projected light pattern, its projection on the scene, and an image sensor. The deformation between the pattern projected onto the scene and the pattern captured by the camera makes it possible to obtain three-dimensional information about the illuminated scene. This technique has been widely used in applications such as 3D object reconstruction, robot navigation and quality control. Although the projection of regular patterns solves the problem of points without a match, it does not solve the problem of multiple matching, which forces the use of computationally expensive algorithms to search for the correct matches.
In recent years, another structured light technique has grown in importance. It is based on coding the light projected onto the scene so that the code can be used to obtain a unique match. Each token of light is imaged by the camera, and its label must be read (that is, the pattern must be decoded) in order to solve the correspondence problem. The advantages and disadvantages of stereo vision compared with structured light are discussed, together with a survey of coded structured light. The work carried out in this thesis has led to a new coded structured light pattern that solves the correspondence problem uniquely and robustly: uniquely, because each token of light is coded by a different word, which removes the problem of multiple matching; robustly, because the pattern is coded using the position of each token of light with respect to both coordinate axes. Algorithms and experimental results are included in the thesis. The reader can see examples of 3D measurement of static objects and of the more complicated measurement of moving objects. The technique can be used in both cases because the pattern is coded in a single projection shot, so it can be used in several robot vision applications.
Our interest is focused on the mathematical study of the camera and pattern projector models. We are also interested in how these models can be obtained by calibration and how they can be used to obtain three-dimensional information from two corresponding points. Furthermore, we have studied structured light and coded structured light, and we have presented a new coded structured light pattern. However, the thesis starts from the assumption that the corresponding points can be well segmented from the captured image. Computer vision is a huge problem, and a great deal of work is being done at all levels of human vision modelling: a) image acquisition; b) image enhancement, filtering and processing; c) image segmentation, which involves thresholding, thinning, contour detection, texture and colour analysis, and so on. The interest of this thesis starts at the next step, usually known as depth perception or 3D measurement.
|
7 |
Catadioptric stereo based on structured light projection
Radu, Orghidan 24 July 2006
Visual perception is enhanced when a large field of view is available. This thesis focuses on the visual perception of depth by means of omnidirectional cameras. In computer vision, 3D sensing is generally obtained by means of stereo configurations, with the drawback of computationally costly feature matching between images. The solution offered in this dissertation uses structured light projection to solve the matching problem. First, a survey of omnidirectional vision systems was carried out. Then the sensor design was addressed: several stereo configurations were evaluated and the configuration of the proposed sensor was chosen. An accurate model is obtained through a careful study of both components of the sensor. Because the model parameters are difficult to measure directly, a set of calibration methods was developed. The results obtained are encouraging and show that the sensor can be used in depth perception applications such as scene modeling, pipe inspection and robot navigation.
|
8 |
Hand-held 3D-scanner for large surface registration
Matabosch Geronès, Carles 26 June 2007
The goal of this thesis is to study the different techniques used to register 3D acquisitions. The study identifies the main drawbacks of existing techniques, presents a new classification, and provides solutions to some of the shortcomings detected, especially in real-time 3D registration. A hand-held 3D sensor has been designed to acquire these views without any restriction on motion, and global minimization techniques have been studied to reduce the effects of error propagation.
|
9 |
Síntesis Audiovisual Realista Personalizable
Melenchón Maldonado, Javier 13 July 2007
A single shared framework is presented for realistic, personalizable audiovisual synthesis and analysis of audiovisual sequences of talking heads and visual sequences of sign language in a domestic environment. The former has fully synchronized animation driven by a text or speech source; the latter uses finger spelling. The framework's personalization capabilities make it easy for non-expert users to create audiovisual sequences. Possible applications range from realistic virtual avatars for natural interaction or video games to very low bandwidth videoconferencing and visual telephony for the hard of hearing, including help for speech therapists. Long sequences can be processed with very few resources, especially in terms of storage, thanks to a new procedure for computing the singular value decomposition incrementally while updating the mean information. This procedure is complemented by three others: decremental, split and composition procedures.
|