11

Active Stereo Vision: Depth Perception For Navigation, Environmental Map Formation And Object Recognition

Ulusoy, Ilkay 01 September 2003
Stereo-vision-based navigation and mapping is used in very few mobile robotic applications because processing stereo images is hard and time consuming. Despite these problems, stereo vision remains one of the most important ways for a mobile robot to perceive the world, because imaging provides far more information than most other sensors. Real robotic applications are complicated because, besides determining how the robot should behave to complete the task at hand, controlling the robot's internal parameters imposes a high computational load. It is therefore preferable to develop a strategy in a simulated world and then apply it to a real robot. In this study, we describe an algorithm for object recognition and cognitive map formation using stereo image data in a 3D virtual world in which 3D objects and a robot with an active stereo imaging system are simulated. The stereo imaging system is simulated so that the properties of the human visual system are parameterized. Only the stereo images obtained from this world are supplied to the virtual robot. By applying our disparity algorithm, a depth map for the current stereo view is extracted. Using the depth information for the current view, a cognitive map of the environment is updated gradually while the virtual agent explores the environment. The agent explores intelligently, using the current view and the environmental map built so far. If a new object is observed during exploration, the robot moves around it, obtains stereo images from different directions, and extracts a 3D model of the object. Using the available set of possible objects, it then recognizes the object.
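To make the disparity-to-depth step concrete, here is a minimal brute-force block-matching sketch, assuming a rectified grayscale stereo pair and the pinhole relation depth = f·B/d; it stands in for, and is not, the thesis's disparity algorithm:

```python
# Minimal stereo block matching (illustrative, not the thesis's algorithm):
# for each pixel, find the horizontal shift minimizing the SAD cost of a
# small window, then convert disparity to depth via depth = f * B / d.
import numpy as np

def disparity_map(left, right, max_disp=32, win=5):
    """Brute-force SAD block matching on rectified grayscale float arrays."""
    h, w = left.shape
    half = win // 2
    disp = np.zeros((h, w), dtype=np.float32)
    for y in range(half, h - half):
        for x in range(half + max_disp, w - half):
            patch = left[y - half:y + half + 1, x - half:x + half + 1]
            costs = [np.abs(patch - right[y - half:y + half + 1,
                                          x - d - half:x - d + half + 1]).sum()
                     for d in range(max_disp)]
            disp[y, x] = int(np.argmin(costs))
    return disp

def depth_from_disparity(disp, focal_px, baseline_m):
    """Pinhole stereo: depth is inversely proportional to disparity."""
    return np.where(disp > 0, focal_px * baseline_m / disp, np.inf)
```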
12

Control of reconfigurability and navigation of a wheel-legged robot based on active vision

Brooks, Douglas Antwonne 31 July 2008
The ability of robotic units to navigate various terrains is critical to the advancement of robotic operation in real-world environments. Next-generation robots will need to adapt to their environment in order to accomplish tasks that are too hazardous, too time consuming, or physically impossible for human beings. Such tasks may include accurate and rapid exploration of other planets or of potentially dangerous areas on Earth. This research investigates a navigation control methodology for a wheel-legged robot based on active vision. The method presented is designed to control the reconfigurability of the robot (i.e., the use of its wheels and legs) depending on the obstacle or terrain, based on perception. Surface estimation for robot reconfigurability is implemented using a region-growing method together with a characterization and traversability assessment generated from camera data. A mathematical approach that directs the necessary navigation behavior is then implemented to control robot mobility. The hybrid wheel-legged rover possesses a four- or six-legged walking system as well as a four-wheeled mobility system.
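The surface estimation above relies on region growing; a minimal sketch of that general technique follows, assuming 4-connectivity and a running-mean homogeneity criterion (illustrative choices, not details taken from the thesis):

```python
# Region growing from a seed pixel: absorb 4-connected neighbours whose
# value stays within a tolerance of the running region mean.
import numpy as np
from collections import deque

def region_grow(img, seed, tol=10.0):
    """Return a boolean mask of the region grown from seed=(row, col)."""
    h, w = img.shape
    mask = np.zeros((h, w), dtype=bool)
    mask[seed] = True
    mean, count = float(img[seed]), 1
    queue = deque([seed])
    while queue:
        y, x = queue.popleft()
        for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
            if 0 <= ny < h and 0 <= nx < w and not mask[ny, nx]:
                if abs(float(img[ny, nx]) - mean) <= tol:
                    mask[ny, nx] = True
                    count += 1
                    mean += (float(img[ny, nx]) - mean) / count  # running mean
                    queue.append((ny, nx))
    return mask
```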
13

Visual Attention in Active Vision Systems: Attending, Classifying and Manipulating Objects

Rasolzadeh, Babak January 2011
This thesis has presented a computational model for the combination of bottom-up and top-down attentional mechanisms, and demonstrated its use in a variety of machine and robotic vision applications. We have observed that an attentional mechanism is imperative in any active vision system, machine as well as biological: it not only reduces the amount of information that needs to be processed further (say, for recognition or action), but, by processing only the attended image regions, also makes such tasks more robust to large amounts of clutter and noise in the visual field. Using feature channels such as color, orientation, texture, depth and symmetry as input, the presented model is able, with a pre-trained artificial neural network, to modulate a saliency map for a particular top-down goal, e.g. visual search for a target object. More specifically, it dynamically combines the unmodulated bottom-up saliency with the modulated top-down saliency by means of a biologically and psychophysically motivated temporal differential equation. This way the system is, for instance, able to detect important bottom-up cues even while in visual search mode (top-down) for a particular object. All the computational steps for yielding the final attentional map, which ranks regions in images according to their importance for the system, are shown to be biologically plausible. It has also been demonstrated that the presented attentional model facilitates tasks other than visual search. For instance, using the covert attentional peaks that the model returns, we can improve scene understanding and segmentation through clustering or scattering of the 2D/3D components of the scene, depending on the configuration of these attentional peaks and their relations to other attributes of the scene. More specifically, this is performed by means of entropy optimization of the scene under varying cluster configurations, i.e. different groupings of the various components of the scene. Qualitative experiments demonstrated the use of this attentional model on a humanoid robotic platform, controlling the overt attention of the robot in real time by specifying the saccadic movements of the robot head. These experiments also exposed another highly important aspect of the model: its temporal variability, as opposed to many other attentional (saliency) models that deal exclusively with static images. Here the dynamic aspects of the attentional mechanism proved to allow for a temporally varying trade-off between top-down and bottom-up influences, depending on changes in the robot's environment. The thesis has also laid out systematic, quantitative, large-scale experiments on the actual benefits and uses of this kind of attentional model. To this end a simulated 2D environment was implemented, in which the system could not "see" the entire environment and needed to perform overt shifts of attention (simulated saccades) in order to perform a visual search task for a pre-defined target object. This allowed the core attentional model of the system to be substituted simply and rapidly with comparable computational models designed by other researchers. Nine such contending models were tested and compared with the presented model in a quantitative manner. Given certain assumptions, these experiments showed that the attentional model presented in this work outperforms the other models in simple visual search tasks.
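As a rough illustration of the bottom-up/top-down combination, a schematic Euler-step sketch follows; the weighting law used here (a relative-peakiness ratio) and the time constant are assumptions for illustration, not the biologically motivated equation from the thesis:

```python
# One Euler step of dA/dt = (-A + k*S_td + (1-k)*S_bu) / tau, where the
# weight k grows when the top-down map is more "decided" (peakier).
import numpy as np

def update_attention(A, S_bu, S_td, dt=0.1, tau=1.0):
    peak = lambda S: S.max() / (S.mean() + 1e-9)   # crude peakiness measure
    k = peak(S_td) / (peak(S_td) + peak(S_bu))
    return A + dt * (-A + k * S_td + (1.0 - k) * S_bu) / tau

# Usage: iterate per frame; the attended location is the argmax of A.
A = np.zeros((60, 80))
for _ in range(50):
    S_bu = np.random.rand(60, 80)        # bottom-up conspicuity (stand-in)
    S_td = np.random.rand(60, 80) ** 4   # top-down map, sharpened toward a target
    A = update_attention(A, S_bu, S_td)
attended = np.unravel_index(A.argmax(), A.shape)
```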
14

Decisions, Predictions, and Learning in the visual sense

Ehinger, Benedikt V. 16 November 2018
We experience the world through our senses, but we can only make sense of the incoming information because it is weighted and interpreted against the perceptual experience we gather throughout our lives. In this thesis I present several approaches we used to investigate how prior experience is learned and how it is utilized for prediction-based computations in decision making. Teaching participants new categories is a good example of how new information is used to learn about, and to understand, the world. In the first study I present, we taught participants new visual categories using a reinforcement learning paradigm. We recorded their brain activity before, during, and after prolonged learning over 24 sessions. This allowed us to show that initial learning of categories occurs relatively late during processing, in prefrontal areas. After extended learning, categorization occurs early during processing and is likely to occur in temporal structures. One possible computational mechanism for expressing prior information is the prediction of future input. In this thesis, I make use of a prominent theory of brain function, predictive coding. We performed two studies. In the first, we showed that the brain's expectations can surpass the reliability of incoming information: in a perceptual decision-making task, a percept based on fill-in from the physiological blind spot is judged as more reliable than an identical percept from veridical input. In the second study, we showed that expectations carry over between eye movements: we measured brain activity while peripheral predictions were violated across eye movements, and found two sets of prediction errors, early and late during processing. By changing the reliability of the stimulus using the blind spots, we additionally confirmed an important theoretical idea: the strength of a prediction violation is modulated by the reliability of the prediction. Eye movements are useful for understanding the interaction between the brain's current information state and its expectations of future information. In a series of experiments we modulated the amount of information the visual system is allowed to extract before a new eye movement is made. We developed a new paradigm that allows for experimental control of eye-movement trajectories as well as fixation durations. We show that interrupting the extraction of information influences the planning of new eye movements. In addition, we show that eye-movement planning time follows Hick's law, a logarithmic increase of saccadic reaction time with an increasing number of possible targets (sketched after this abstract). Most of the studies presented here tried to identify causal effects in human behavior or brain computations. Direct interventions in the system, such as brain stimulation or lesions, are often needed for such causal statements. Unfortunately, few methods are available to directly control the neurons of the brain, and even fewer to control the encoded expectations. Recent developments around the new optogenetic agent Melanopsin allow for direct activation and silencing of neuronal cells. In cooperation with researchers from the field of optogenetics, we developed a generative Bayesian model of Melanopsin that makes it possible to integrate physiological data over multiple experiments, include prior knowledge of biophysical constraints, and identify differences between proteins.
After discussing these projects, I will take a meta-perspective on my field and end this dissertation with a discussion of, and outlook on, open science and statistical developments in cognitive science.
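The Hick's-law relation referenced above has a compact form; a toy illustration in Python (the coefficients a and b are arbitrary placeholders, not values fitted in the thesis):

```python
# Hick's law: reaction time grows logarithmically with the number of
# equally likely alternatives. Coefficients are illustrative placeholders.
import math

def saccadic_rt(n_targets, a=0.2, b=0.15):
    """RT (s) = a + b * log2(n + 1); the +1 accounts for the no-go option."""
    return a + b * math.log2(n_targets + 1)

print([round(saccadic_rt(n), 3) for n in (1, 2, 4, 8)])  # slow logarithmic rise
```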
15

Monocular and Binocular Visual Tracking

Salama, Gouda Ismail Mohamed 06 January 2000
Visual tracking is one of the most important applications of computer vision. Several tracking systems have been developed which either focus mainly on the tracking of targets moving on a plane, or attempt to reduce the 3-dimensional tracking problem to the tracking of a set of characteristic points of the target. These approaches are seriously handicapped in complex visual situations, particularly those involving significant perspective, textures, repeating patterns, or occlusion. This dissertation describes a new approach to visual tracking for monocular and binocular image sequences, and for both passive and active cameras. The method combines Kalman-type prediction with steepest-descent search for correspondences, using 2-dimensional affine mappings between images. This approach differs significantly from many recent tracking systems, which emphasize the recovery of 3-dimensional motion and/or structure of objects in the scene. We argue that 2-dimensional area-based matching is sufficient in many situations of interest, and we present experimental results with real image sequences to illustrate the efficacy of this approach. Matching between two images is a simple one-to-one mapping if there is no occlusion; in the presence of occlusion, incorrect matches are inevitable, and few approaches have been developed to address this issue. This dissertation considers the effect of occlusion on tracking a moving object for both monocular and binocular image sequences. The visual tracking system described here attempts to detect occlusion from the residual error computed by the matching method. If the residual matching error exceeds a user-defined threshold, the tracked object may be occluded by another object. When occlusion is detected, tracking continues with the predicted locations based on Kalman filtering, which serves as a predictor of the target position until it reemerges from the occlusion. Although the method uses constant-image-velocity Kalman filtering, it has been shown to function reasonably well in non-constant-velocity situations. Experimental results show that tracking can be maintained during periods of substantial occlusion. The area-based approach to image matching often involves correlation-based comparisons between images, and this requires the specification of a size for the correlation windows. Accordingly, a new approach based on moment invariants was developed to select window size adaptively. This approach is based on sudden increases or decreases in the first Maitra moment invariant, and a robust regression model is applied to smooth this invariant and make the method robust against noise. This dissertation also considers the effect of spatial quantization on several moment invariants. Of particular interest are the affine moment invariants, which have emerged in recent years as a useful tool for image reconstruction, image registration, and recognition of deformed objects. Traditional analysis assumes moments and moment invariants for images defined in the continuous domain. Quantization of the image plane is necessary, because otherwise the image cannot be processed digitally. Image acquisition by a digital system imposes spatial and intensity quantization that, in turn, introduce errors into moment and invariant computations. This dissertation derives expressions for quantization-induced error in several important cases.
Although it considers spatial quantization only, this represents an important extension of work by other researchers. A mathematical theory for visual tracking of a moving object is also presented. This approach can track a moving object in an image sequence whether the camera is passive or actively controlled. The algorithm used here is computationally cheap and suitable for real-time implementation. We implemented the proposed method on an active vision system, and carried out experiments on monocular and binocular tracking of various kinds of objects in different environments. These experiments demonstrated very good performance on real images in fairly complicated situations.
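The occlusion-handling loop described above (predict with a constant-velocity Kalman filter, coast on the prediction when the matching residual exceeds a threshold) can be sketched compactly; the matrices and threshold below are illustrative assumptions, not the dissertation's parameters:

```python
# Constant-velocity Kalman tracker with residual-based occlusion handling.
import numpy as np

class OcclusionAwareTracker:
    def __init__(self, x0, y0, dt=1.0, occlusion_threshold=50.0):
        self.x = np.array([x0, y0, 0.0, 0.0])          # state: [x, y, vx, vy]
        self.P = np.eye(4) * 10.0                      # state covariance
        self.F = np.array([[1, 0, dt, 0], [0, 1, 0, dt],
                           [0, 0, 1, 0], [0, 0, 0, 1]], dtype=float)
        self.H = np.array([[1, 0, 0, 0], [0, 1, 0, 0]], dtype=float)
        self.Q = np.eye(4) * 0.01                      # process noise (assumed)
        self.R = np.eye(2) * 1.0                       # measurement noise (assumed)
        self.threshold = occlusion_threshold

    def step(self, measured_pos, matching_residual):
        # Predict with the constant-velocity model.
        self.x = self.F @ self.x
        self.P = self.F @ self.P @ self.F.T + self.Q
        # Large matching error -> assume occlusion: coast on the prediction.
        if matching_residual > self.threshold:
            return self.x[:2]
        # Otherwise, correct with the measurement.
        y = np.asarray(measured_pos) - self.H @ self.x
        S = self.H @ self.P @ self.H.T + self.R
        K = self.P @ self.H.T @ np.linalg.inv(S)
        self.x = self.x + K @ y
        self.P = (np.eye(4) - K @ self.H) @ self.P
        return self.x[:2]
```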
16

Agent-Based Architecture for Multirobot Cooperative Tasks: Design and Applications

Nebot Roglá, Patricio 11 January 2008
This thesis focuses on the development of a system in which a team of heterogeneous mobile robots can cooperate to perform a wide range of tasks. For a group of heterogeneous robots to cooperate, the most important component to develop is an architecture that supports the cooperation. This architecture is developed by embedding agents and interfacing agent code with native low-level code. It also addresses resource sharing among the whole group of robots; that is, the robots can borrow capabilities from each other. In order to validate this architecture, several cooperative applications have been implemented. The first is an application in which a group of robots must cooperate in order to navigate safely through an unknown environment. One camera-equipped robot calculates optical flow values from its images and, from these, the "time to contact" values. This information is shared among the team so that any robot can navigate without colliding with obstacles. The second cooperative application enables the team of heterogeneous robots to create a formation and navigate while maintaining it. This application consists of two stages. The first is the creation of the formation, in which the camera-equipped robot detects where the other robots are in the environment and indicates to each its initial position in the formation. In the second stage the robots must navigate through the environment following the path that the laser-equipped robot indicates. Because of the robots' odometry errors, the camera of one robot is used so that robots that lose their correct position in the formation can re-align themselves. Finally, to facilitate access to the robots of the team and to the information their accessories provide, a teleoperation system for the team has been implemented. This system can be used for teaching robotics or to facilitate programming and debugging in research.
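The "time to contact" quantity above has a standard closed form: for a camera translating toward a fronto-parallel surface, the optical flow field expands with divergence 2/τ, so τ ≈ 2/div(flow). A hedged sketch, leaving out flow estimation and the inter-robot sharing:

```python
# Time-to-contact from the divergence of a dense optical flow field
# (pixel displacements per frame). Simplified illustration only.
import numpy as np

def time_to_contact(flow_u, flow_v, dt=1.0):
    """Estimate tau (seconds) from per-frame flow; dt is the frame period."""
    divergence = np.mean(np.gradient(flow_u, axis=1) +
                         np.gradient(flow_v, axis=0))
    if divergence <= 1e-9:
        return np.inf          # no expansion: nothing is being approached
    return 2.0 * dt / divergence   # tau = 2/div for a fronto-parallel surface
```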
17

Utveckling av ett active vision system för demonstration av EDSDK++ i tillämpningar inom datorseende / Development of an active vision system for demonstrating EDSDK++ in computer vision applications

Kargén, Rolf January 2014
Computer vision is a rapidly growing, interdisciplinary field whose applications are taking an increasingly prominent role in today's society. With an increased interest in computer vision there is also an increasing need to be able to control cameras connected to computer vision systems. At the division of computer vision at Linköping University, the framework EDSDK++ has been developed to remotely control digital cameras made by Canon Inc. The framework is very comprehensive and contains a large number of features and configuration options; the system is therefore still largely untested. This thesis aims to develop a demonstrator for EDSDK++ in the form of a simple active vision system, which uses real-time face detection to control a camera tilt, and a camera mounted on the tilt, to follow, zoom in on, and focus on a face or a group of faces. One requirement was that the OpenCV library be used for face detection and that EDSDK++ be used to control the camera. Moreover, an API to control the camera tilt was to be developed. During development, different methods for face detection were investigated. In order to improve performance, multiple face detectors, parallelized with multithreading, were used to scan the image from different angles. Both experimental and theoretical approaches were taken to determine the parameters needed to control the camera and camera tilt. The project resulted in a fully functional demonstrator that fulfilled all requirements.
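A minimal sketch of the multi-angle, multithreaded face detection described above, using the stock OpenCV Haar cascade; the angle set and thread pooling are illustrative assumptions, not the thesis configuration:

```python
# Several detectors scan rotated copies of the frame in parallel threads.
from concurrent.futures import ThreadPoolExecutor
import cv2

cascade = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_frontalface_default.xml")

def detect_at_angle(gray, angle):
    h, w = gray.shape
    M = cv2.getRotationMatrix2D((w / 2, h / 2), angle, 1.0)
    rotated = cv2.warpAffine(gray, M, (w, h))
    faces = cascade.detectMultiScale(rotated, scaleFactor=1.1, minNeighbors=5)
    return angle, faces   # boxes are in the rotated frame's coordinates;
                          # mapping them back to the original frame is omitted

def detect_faces_parallel(frame, angles=(-30, 0, 30)):
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    with ThreadPoolExecutor(max_workers=len(angles)) as pool:
        return list(pool.map(lambda a: detect_at_angle(gray, a), angles))
```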
18

Stratégies de vision active pour la reconnaissance d'objets / Active vision strategies for object recognition

Defretin, Joseph 23 November 2011
This PhD thesis, conducted in cooperation with ONERA, focuses on active 3D object recognition by an autonomous agent equipped with an observation camera. Whereas in passive recognition the acquisition modalities of observations are fixed and may generate ambiguities, active recognition exploits the possibility of controlling these modalities online, in a sequential inference process, in order to remove the ambiguities. The aim of this work is to design, in a statistical learning framework, planning strategies for information acquisition while achieving a realistic implementation of active recognition. The first part of the work is dedicated to learning to plan. Two realistic constraints are taken into account: on the one hand, planning with imperfect object models may generate further ambiguities; on the other, the learning budget (in time and energy) is expensive and therefore limited. The second part focuses on exploiting the observations acquired during recognition as fully as possible. The possibility of active multi-scale recognition is investigated, to allow an interpretation as early as possible in the sequential acquisition process. The observations are also used to robustly estimate the pose of the object, to ensure consistency between the planned modalities and those actually reached by the visual agent.
19

Navigace mobilních robotů / Navigation of mobile robots

Rozman, Jaroslav January 2011
Mobile robotics has recently been a widely discussed topic, owing to developments in computer technology that allow us to create better and more sophisticated robots. The goal of this effort is to create robots able to move autonomously in a chosen environment. To achieve this, the robot must build a map of its environment in which motion planning can take place. Nowadays, probabilistic methods based on the SLAM algorithm are considered the standard for mapping. This PhD thesis proposes motion planning for a robot with a stereo camera mounted on a pan-and-tilt unit. The motion planning is designed around algorithms that look for significant features in the pair of images; using triangulation, a map or model is created. The contributions of this work fall into three parts. The first describes how the free area in which the robot plans its motion is marked. The second describes motion planning within this free area; it takes the properties of the SLAM algorithm into account and plans the exploration so as to create the most precise map possible. The third describes the motion of the pan-and-tilt unit, which exploits the fact that the robot can observe places in directions other than the one in which it moves. This allows a much larger space to be observed without losing information about the precision of the movements.
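The triangulation step behind the map building above has a simple closed form for a rectified stereo pair; the intrinsics and baseline below are illustrative placeholders, not values from the thesis:

```python
# Triangulate a feature matched between rectified stereo images.
def triangulate(xl, yl, xr, f=700.0, cx=320.0, cy=240.0, B=0.12):
    """Return the 3D point (X, Y, Z) in metres for a match at (xl, yl)
    in the left image and column xr in the right image."""
    d = xl - xr                   # disparity in pixels
    if d <= 0:
        raise ValueError("non-positive disparity: point at or beyond infinity")
    Z = f * B / d                 # depth from the pinhole stereo relation
    X = (xl - cx) * Z / f
    Y = (yl - cy) * Z / f
    return X, Y, Z
```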
20

Development and Evaluation of New Methods for Automating Experiments with C. Elegans Based on Active Vision

Puchalt Rodríguez, Joan Carles 10 March 2022
This thesis focuses on the development of new automated techniques for inspecting Caenorhabditis elegans nematodes (C. elegans) in standard Petri dishes, for the analysis of their behavior. C. elegans is a nematode about 1 mm long, with which different experiments can be carried out to analyze the effects of drugs, compounds, or genetic alterations on its longevity, physical health, or cognition. The main methodological field of the present work for the analysis of these effects is computer vision, and with it the complete development of an active vision system: an intelligent lighting system, an optimal capture system, and image processing for the detection and classification of nematodes. The secondary fields in this research are control and robotization. C. elegans are light-sensitive animals, and therefore the first method is in the field of intelligent lighting, with which it is possible to regulate the intensity and wavelengths of the light the nematodes receive. The next method is the processing for the detection and classification of movement from the images obtained under this controlled lighting. Having a controlled environment is essential: worms are very sensitive to environmental conditions, which can alter their biological activity and, with it, the results. The third method is therefore the integration of these techniques into a new device that automates lifespan assays and validates the automatic results by comparing them with manual ones. The movement of the animal is key to performing statistical inferences that can reveal trends in its behavior, so automated stimulation that provokes a mobility response is the fourth method. Finally, increasing image resolution shows greater detail, improving processing and feature extraction. The fifth method is a multi-view robot that makes it possible to take images at different resolutions, maintaining global tracking of the worms while simultaneously taking more detailed framed images of the target worm.
Puchalt Rodríguez, JC. (2022). Development and Evaluation of New Methods for Automating Experiments with C. Elegans Based on Active Vision [Doctoral thesis]. Universitat Politècnica de València. https://doi.org/10.4995/Thesis/10251/181359
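One simple way to realize the movement detection step described above is frame differencing; a hedged sketch (the threshold and morphology settings are assumptions, not the methods developed in the thesis):

```python
# Detect moving regions between two grayscale frames by absolute
# differencing, thresholding, and contour extraction.
import cv2
import numpy as np

def moving_regions(prev_gray, curr_gray, thresh=15, min_area=20):
    """Return bounding boxes of regions that moved between two frames."""
    diff = cv2.absdiff(prev_gray, curr_gray)
    _, mask = cv2.threshold(diff, thresh, 255, cv2.THRESH_BINARY)
    # Remove single-pixel noise before extracting contours.
    mask = cv2.morphologyEx(mask, cv2.MORPH_OPEN, np.ones((3, 3), np.uint8))
    contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL,
                                   cv2.CHAIN_APPROX_SIMPLE)
    return [cv2.boundingRect(c) for c in contours
            if cv2.contourArea(c) >= min_area]
```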
