121 |
Visual odometry: comparing a stereo and a multi-camera approach / Odometria visual: comparando métodos estéreo e multi-câmera
Ana Rita Pereira, 25 July 2017
The purpose of this master's project is to implement, analyze and compare visual odometry approaches that support the localization task in autonomous vehicles. The stereo visual odometry algorithm Libviso2 is compared with a proposed omnidirectional multi-camera approach. The proposed method performs monocular visual odometry on each camera individually and then selects the best estimate through a voting scheme involving all cameras. Because the vision system is omnidirectional, the part of the surroundings richest in features can always be used in the relative pose estimation. Experiments are carried out using Bumblebee XB3 and Ladybug 2 cameras fixed on the roof of a vehicle. The voting process of the proposed omnidirectional multi-camera method leads to some improvements relative to the individual monocular estimates. However, stereo visual odometry provides considerably more accurate results.
122 |
Signal- och bildbehandling på moderna grafikprocessorer (Signal and image processing on modern graphics processors)
Pettersson, Erik, January 2005
The modern graphics processing unit (GPU) is extremely powerful, with a potential performance many times higher than that of a modern microprocessor. As GPUs have become increasingly programmable, it has become possible to use them for computation-intensive applications outside their normal area of use. This work investigates the possibilities and limitations of general-purpose programming on GPUs. The work concentrates mainly on signal and image processing applications, although many of the principles apply to other areas as well. A framework for image processing on GPUs is implemented, and a few image analysis algorithms are implemented and evaluated, among them stereo vision and optical flow. The results show that some applications can gain a substantial speedup when implemented correctly on a GPU compared with a microprocessor, while others can be inefficient or extremely hard to implement.
123 |
Navigability estimation for autonomous vehicles using machine learning / Estimação de navegabilidade para veículos autônomos usando aprendizado de máquina
Caio César Teodoro Mendes, 08 June 2017
Autonomous navigation in outdoor, unstructured environments is one of the major challenges in the robotics field. One of its applications, intelligent autonomous vehicles, has the potential to decrease the number of accidents on roads and highways, increase the efficiency of traffic in major cities and contribute to the mobility of the disabled and elderly. For a robot or vehicle to navigate safely, accurate detection of navigable areas is essential. In this work, we address the task of visual road detection where, given an image, the objective is to classify each of its pixels as road or non-road. Instead of trying to manually derive an analytical solution for the task, we have used machine learning (ML) to learn it from a set of manually created samples. We have applied both traditional (shallow) and deep ML models to the task. Our main contribution regarding traditional ML models is an efficient and versatile way to aggregate spatially distant features, effectively providing a spatial context to such models. As for deep learning models, we have proposed a new neural network architecture focused on processing time and a new neural network layer, called the semi-global layer, which efficiently provides a global context for the model. All of the proposed methodology has been evaluated on the Karlsruhe Institute of Technology (KIT) road detection benchmark, achieving competitive results in all cases.
124 |
A Prototype For An Interactive And Dynamic Image-Based Relief Rendering System / En prototyp för ett interaktivt och dynamiskt bildbaserat relief renderingssystem
Bakos, Niklas, January 2002
As part of research into generating arbitrary and unique virtual views of a real-world scene, this master's thesis develops a prototype of an interactive relief texture mapping system that processes video using dynamic image-based rendering. The process of deriving depth from recorded video using binocular stereopsis is presented, together with how the depth information is adjusted so that the orientation of the original scene can be manipulated. Once the scene depth is known, the recorded organic and dynamic objects can be viewed from viewpoints not available in the original video.
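The abstract does not say how depth is computed, so the following is only a minimal sketch of the standard relation for a rectified stereo pair under a pinhole camera model, where depth equals focal length times baseline divided by disparity. The function names and all parameters (`focal_px`, `baseline_m`, `cx`, `cy`) are assumptions introduced for illustration and are not taken from the thesis.

```python
def depth_from_disparity(disparity_px, focal_px, baseline_m):
    """Depth in metres of a point seen in a rectified stereo pair (pinhole model)."""
    if disparity_px <= 0:
        raise ValueError("disparity must be positive for a finite depth")
    return focal_px * baseline_m / disparity_px


def backproject(u, v, disparity_px, focal_px, baseline_m, cx, cy):
    """Recover the 3D point (X, Y, Z) observed at pixel (u, v) of the left image.

    (cx, cy) is the assumed principal point of the left camera, in pixels.
    """
    z = depth_from_disparity(disparity_px, focal_px, baseline_m)
    x = (u - cx) * z / focal_px
    y = (v - cy) * z / focal_px
    return (x, y, z)
```

Once every pixel has a depth value, the scene can in principle be re-rendered from a different viewpoint, which is the kind of manipulation the abstract describes.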
125 |
Detecting and Tracking Players in Football Using Stereo Vision
Borg, Johan, January 2007
The objective of this thesis is to investigate whether stereo vision can be used to find and track the players and the ball during a football game. The thesis shows that it is possible to detect all players who are not too heavily occluded by another player. Situations in which a player is occluded by another player are handled by tracking the players from frame to frame. The ball is also detected in most frames by looking for ball-like features. As with the players, the ball is tracked from frame to frame, so that when the ball is occluded its position is estimated by the tracker.
126 |
Polarization stereoscopic imaging prototype / Prototype d'imagerie polarimétrique stéréoscopique
Iqbal, Mohammad, 02 November 2011
The polarization of light, a well-understood physical phenomenon, was introduced into the field of imaging only about ten years ago. Like the human eye, imaging sensors are not, by construction, sensitive to the polarization of light. This particularly interesting property can be obtained only by adding optical components to conventional cameras. The purpose of this thesis is to develop a system that is both stereoscopic and sensitive to the state of polarization. Many insects in nature, such as bees, are able to orient themselves in space and to extract relevant information from polarization. The prototype developed should make it possible to reconstruct points of interest in three dimensions while associating with each point a set of parameters describing the state of polarization. The proposed system consists of two cameras, each equipped with two liquid crystal components that yield two images with different polarization orientations. Each acquisition therefore produces four images, two per camera. The main obstacle addressed here is recovering polarization information from two different cameras. After an initial geometric and photometric calibration step, matching the points of interest is made difficult by the optical components placed in front of the lenses. A detailed study of different matching methods made it possible to select the method least sensitive to polarization effects. Once the points are matched, the polarization parameters of each point are computed from the four values taken from the four acquired images. Results on real scenes show the feasibility and value of such a system for robotic applications.
127 |
Semantic segmentation of terrain and road terrain for advanced driver assistance systems
Gheorghe, I. V., January 2015
Modern automobiles, particularly those with off-road lineage, possess subsystems that can be configured to better negotiate certain terrain types. Different terrain classes amount to different adherence (or surface grip) and compressibility properties that impact vehicle manoeuvrability and should therefore incur a tailored throttle response, suspension stiffness and so on. This thesis explores prospective terrain recognition for an anticipating terrain response driver assistance system. Recognition of terrain and road terrain is cast as a semantic segmentation task whereby forward driving images or point clouds are pre-segmented into atomic units and subsequently classified. Terrain classes are typically of amorphous spatial extent containing homogeneous or granularly repetitive patterns. For this reason, colour and texture appearance is the saliency of choice for monocular vision. In this work, colour, texture and surface saliency of atomic units are obtained with a bag-of-features approach. Five terrain classes are considered, namely grass, dirt, gravel, shrubs and tarmac. Since colour can be ambiguous among terrain classes such as dirt and gravel, several texture flavours are explored with scalar and structured output learning in a bid to devise an appropriate visual terrain saliency and predictor combination. Texture variants are obtained using local binary patterns (LBP), filter responses (or textons) and dense key-point descriptors with DAISY. Learning algorithms tested include support vector machine (SVM), random forest (RF) and logistic regression (LR) as scalar predictors, while a conditional random field (CRF) is used for structured output learning. The latter encourages smooth labelling by incorporating the prior knowledge that neighbouring segments with similar saliency are likely segments of the same class. Once a suitable texture representation is devised, the attention is shifted from monocular vision to stereo vision. Surface saliency from reconstructed point clouds can be used to enhance terrain recognition. Previous superpixels span corresponding supervoxels in real-world coordinates, and two surface saliency variants are proposed and tested with all predictors: one using the height coordinates of point clouds and the other using fast point feature histograms (FPFH). Upon realisation that road recognition and terrain recognition can be assumed to be equivalent problems in urban environments, the most accurate models, consisting of CRFs, are augmented with compositional high order pattern potentials (CHOPP). This leads to models that are able to strike a good balance between smooth local labelling and global road shape. For urban environments the label set is restricted to road and non-road (or equivalently tarmac and non-tarmac). Experiments are conducted using a proprietary terrain dataset and a public road evaluation dataset.
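To make the texture cue concrete, here is a minimal sketch of the basic 3x3 local binary pattern and a per-segment histogram of its codes. It assumes a grayscale image stored as a list of rows, and it is not the thesis's implementation: the LBP variant, neighbourhood and bag-of-features encoding used there are not specified in the abstract, and the function names are hypothetical.

```python
def lbp_code(image, row, col):
    """8-neighbour LBP code of the pixel at (row, col); image is a list of rows."""
    center = image[row][col]
    # Neighbours visited clockwise, starting at the top-left pixel.
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    code = 0
    for bit, (dr, dc) in enumerate(offsets):
        # A neighbour at least as bright as the centre sets its bit.
        if image[row + dr][col + dc] >= center:
            code |= 1 << bit
    return code


def lbp_histogram(image):
    """Normalised 256-bin histogram of LBP codes over all interior pixels."""
    hist = [0] * 256
    for r in range(1, len(image) - 1):
        for c in range(1, len(image[0]) - 1):
            hist[lbp_code(image, r, c)] += 1
    total = sum(hist)
    return [h / total for h in hist] if total else hist
```

A histogram like this, computed over the pixels of one superpixel, could serve as the texture feature vector passed to a classifier such as an SVM or random forest.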
128 |
Segmentação e reconhecimento de gestos em tempo real com câmeras e aceleração gráfica / Real-time segmentation and gesture recognition with cameras and graphical acceleration
Daniel Oliveira Dantas, 15 March 2010
The aim of this work is to recognize gestures in real time using only cameras, without markers, special clothes or any other kind of sensor. The capture environment is simple to set up, requiring just two cameras and a computer. The background must be static and must contrast with the user. The absence of markers or special clothes makes locating the user's limbs more difficult. The motivation of this thesis is to create a virtual reality environment for goalkeeper training that makes it possible to correct errors in movement, positioning and choice of defence method, but the technique can be applied to any activity that involves gestures or body movements. Gesture recognition starts with background subtraction, which detects the image region containing the user. Within this foreground region, the most prominent regions are located as candidates for body extremities, that is, hands, feet and head. Each extremity found receives a label indicating the body part it is likely to represent, and a vector with the coordinates of the extremities is generated. To classify the user's pose, this vector is compared to key poses and the best match is selected. The final step is temporal classification, that is, recognition of the gesture. The developed technique is robust, working well even when the system is trained on one user and applied to another user's data.
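The pose classification step, in which the vector of extremity coordinates is compared to key poses and the best match is selected, can be sketched as a nearest-neighbour lookup. The Euclidean distance, the pose names and the coordinate values below are assumptions chosen for illustration; the abstract does not state which similarity measure or key poses the thesis uses.

```python
import math


def pose_distance(a, b):
    """Euclidean distance between two equal-length coordinate vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def classify_pose(extremities, key_poses):
    """Return the name of the key pose closest to the observed vector."""
    return min(key_poses,
               key=lambda name: pose_distance(extremities, key_poses[name]))


# Hypothetical key poses: head, left hand, right hand as flattened (x, y) pairs.
KEY_POSES = {
    "arms_down": [0.0, 1.0, -0.3, 0.0, 0.3, 0.0],
    "arms_up": [0.0, 1.0, -0.3, 1.5, 0.3, 1.5],
}
```

For example, `classify_pose([0.0, 1.0, -0.25, 1.4, 0.35, 1.45], KEY_POSES)` selects `"arms_up"`, because that key pose lies closest to the observed coordinates. The resulting sequence of pose labels over time would then feed the temporal classification step.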
129 |
Components of Embodied Visual Object Recognition: Object Perception and Learning on a Robotic Platform
Wallenberg, Marcus, January 2013
Object recognition is a skill we as humans often take for granted. Due to our formidable object learning, recognition and generalisation skills, it is sometimes hard to see the multitude of obstacles that need to be overcome in order to replicate this skill in an artificial system. Object recognition is also one of the classical areas of computer vision, and many ways of approaching the problem have been proposed. Recently, visually capable robots and autonomous vehicles have increased the focus on embodied recognition systems and active visual search. These applications demand that systems can learn and adapt to their surroundings, and arrive at decisions in a reasonable amount of time, while maintaining high object recognition performance. Active visual search also means that mechanisms for attention and gaze control are integral to the object recognition procedure. This thesis describes work done on the components necessary for creating an embodied recognition system, specifically in the areas of decision uncertainty estimation, object segmentation from multiple cues, adaptation of stereo vision to a specific platform and setting, and the implementation of the system itself. Contributions include the evaluation of methods and measures for predicting the potential uncertainty reduction that can be obtained from additional views of an object, allowing for adaptive target observations. Also, in order to separate a specific object from other parts of a scene, it is often necessary to combine multiple cues such as colour and depth in order to obtain satisfactory results. Therefore, a method for combining these using channel coding has been evaluated. Finally, in order to make use of three-dimensional spatial structure in recognition, a novel stereo vision algorithm extension along with a framework for automatic stereo tuning have also been investigated. 
All of these components have been tested and evaluated on a purpose-built embodied recognition platform known as Eddie the Embodied. / Embodied Visual Object Recognition
130 |
Object Tracking and Interception System: Mobile Object Catching Robot using Static Stereo Vision / Objektspårning och uppfångningssystem
Calminder, Simon; Källström Chittum, Matthew, January 2018
The aim of this project is to examine the feasibility and reliability of using a low-cost computer vision system to track and intercept a thrown object. A stereo vision system tracks the object using color recognition and then guides a mobile wheeled robot towards an interception point in order to capture it. Two trajectory prediction models are compared. One fits a second-degree polynomial to the collected positional measurements of the object; the other accounts for air resistance and uses the Forward Euler method to construct the object's flight path numerically. To guide the robot accurately, both its position, which is determined with the camera module, and its angular position must be measured. Two methods of measuring the angular position are presented and their respective reliability is measured: a calibrated magnetometer and pure computer vision. A functional object tracking and interception system that was able to intercept the thrown object was constructed with both the polynomial-fitting trajectory prediction model and the one based on the Forward Euler method, although the reliability of the numerical method is considerably more sensitive to poor measurements. The magnetometer and pure computer vision are both viable methods of determining the angular position of the robot, each with an error of less than 1.5°.
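The two trajectory prediction models can be sketched as follows. This is an illustration under stated assumptions, not the authors' code: it works in a single vertical plane, uses a quadratic drag term with a hypothetical coefficient `drag`, and takes the landing height as zero. All function names, the step size `dt` and the constant `G` are introduced here for the example.

```python
import math

G = 9.81  # gravitational acceleration in m/s^2


def fit_quadratic(ts, zs):
    """Least-squares fit of z(t) = a*t**2 + b*t + c via the normal equations.

    Needs at least three distinct sample times.
    """
    # Power sums S[k] = sum(t**k) and moments M[k] = sum(z * t**k).
    S = [sum(t ** k for t in ts) for k in range(5)]
    M = [sum(z * t ** k for t, z in zip(ts, zs)) for k in range(3)]
    # Augmented 3x3 system for the unknowns (a, b, c).
    rows = [
        [S[4], S[3], S[2], M[2]],
        [S[3], S[2], S[1], M[1]],
        [S[2], S[1], S[0], M[0]],
    ]
    # Gauss-Jordan elimination with partial pivoting.
    for i in range(3):
        pivot = max(range(i, 3), key=lambda r: abs(rows[r][i]))
        rows[i], rows[pivot] = rows[pivot], rows[i]
        for r in range(3):
            if r != i:
                factor = rows[r][i] / rows[i][i]
                rows[r] = [x - factor * y for x, y in zip(rows[r], rows[i])]
    return tuple(rows[i][3] / rows[i][i] for i in range(3))


def landing_time(a, b, c):
    """Later root of a*t**2 + b*t + c = 0, when the fitted height reaches zero."""
    disc = math.sqrt(b * b - 4 * a * c)
    return max((-b + disc) / (2 * a), (-b - disc) / (2 * a))


def euler_landing(x, z, vx, vz, drag=0.0, dt=0.001):
    """Forward Euler integration with quadratic drag until the height reaches zero.

    Returns (time, horizontal position) at the first step where z <= 0.
    """
    t = 0.0
    while z > 0.0:
        speed = math.hypot(vx, vz)
        ax = -drag * speed * vx
        az = -G - drag * speed * vz
        # Forward Euler: advance position with the current velocity,
        # then velocity with the current acceleration.
        x, z = x + vx * dt, z + vz * dt
        vx, vz = vx + ax * dt, vz + az * dt
        t += dt
    return t, x
```

The polynomial model needs only measured positions, whereas the Euler model needs an initial velocity estimated from those measurements. That dependence may be one reason the abstract reports the numerical method as more sensitive to poor measurements.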