About

The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
161

Entwicklung optischer Feldmessverfahren zur Charakterisierung mikrofluidischer Mischungsvorgänge / Development of optical 2d measuring methods for characterisation of microfluidic mixing processes

Roetmann, Karsten 28 March 2008 (has links)
No description available.
162

Obstacle detection using a monocular camera

Goroshin, Rostislav 19 May 2008 (has links)
The objective of this thesis is to develop a general obstacle segmentation algorithm for use on board a ground-based unmanned vehicle (GUV). The algorithm processes video data captured by a single monocular camera mounted on the GUV. We make the assumption that the GUV moves on a locally planar surface representing the ground plane. We start by deriving the equations of the expected motion field (observed by the camera) induced by the motion of the robot on the ground plane. Given an initial view of a presumably static scene, this motion field is used to generate a predicted view of the same scene after a known camera displacement. This predicted image is compared to the actual image taken at the new camera location by means of an optical flow calculation. Because the planar assumption is used to generate the predicted image, portions of the image that mismatch the prediction correspond to salient feature points on objects lying above or below the ground plane; we consider these objects obstacles for the GUV. We assume that these salient feature points (called "seed pixels") capture the color statistics of the obstacle and use them to initialize a Bayesian region-growing routine to generate a full obstacle segmentation. Alignment of the seed pixels with the obstacle is not guaranteed due to the aperture problem; nevertheless, successful segmentations were obtained for natural scenes. The algorithm was tested offline using video captured by a camera mounted on a GUV.
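The prediction-and-compare step described above can be sketched as follows. This is a hedged toy version: the ground-plane motion field is reduced to a whole-pixel shift (the thesis derives the full planar motion-field equations), and the threshold and scene are illustrative assumptions, not the author's values.

```python
import numpy as np

def obstacle_seeds(prev, curr, ground_shift, thresh=0.5):
    # Warp the first frame by the motion the ground plane would induce
    # (reduced here to a whole-pixel shift instead of the full planar
    # motion-field equations) and flag pixels that mismatch the prediction.
    dy, dx = ground_shift
    predicted = np.roll(prev, (dy, dx), axis=(0, 1))
    residual = np.abs(curr.astype(float) - predicted.astype(float))
    return residual > thresh

# Toy scene: textured ground shifting 2 px right, plus an obstacle patch
# that moves 4 px and therefore violates the planar prediction.
rng = np.random.default_rng(0)
ground = 0.3 * rng.random((20, 20))
prev = ground.copy()
prev[5:8, 5:8] = 1.0                        # obstacle in frame 1
curr = np.roll(ground, (0, 2), axis=(0, 1))
curr[5:8, 9:12] = 1.0                       # obstacle in frame 2
seeds = obstacle_seeds(prev, curr, (0, 2))
```

In the full algorithm these seed pixels would then initialize the Bayesian region-growing step.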
163

Super-resolution image processing with application to face recognition

Lin, Frank Chi-Hao January 2008 (has links)
Subject identification from surveillance imagery has become an important task for forensic investigation. Good quality images of the subjects are essential for the surveillance footage to be useful. However, surveillance videos are of low resolution due to data storage requirements. In addition, subjects typically occupy a small portion of a camera's field of view. Faces, which are of primary interest, occupy an even smaller array of pixels. For reliable face recognition from surveillance video, there is a need to generate higher resolution images of the subject's face from low-resolution video. Super-resolution image reconstruction is a signal-processing-based approach that aims to reconstruct a high-resolution image by combining a number of low-resolution images. The low-resolution images that differ by a sub-pixel shift contain complementary information as they are different "snapshots" of the same scene. Once geometrically registered onto a common high-resolution grid, they can be merged into a single image with higher resolution. As super-resolution is a computationally intensive process, traditional reconstruction-based super-resolution methods simplify the problem by restricting the correspondence between low-resolution frames to global motion such as translational and affine transformations. Surveillance footage, however, consists of independently moving non-rigid objects such as faces. Applying global registration methods results in registration errors that lead to artefacts which adversely affect recognition. The human face also presents additional problems such as self-occlusion and reflectance variation that even local registration methods find difficult to model. In this dissertation, a robust optical flow-based super-resolution technique was proposed to overcome these difficulties.
Real surveillance footage and the Terrascope database were used to compare the reconstruction quality of the proposed method against interpolation and existing super-resolution algorithms. Results show that the proposed robust optical flow-based method consistently produced more accurate reconstructions. This dissertation also outlines a systematic investigation of how super-resolution affects automatic face recognition algorithms, with an emphasis on comparing reconstruction- and learning-based super-resolution approaches. While reconstruction-based super-resolution approaches like the proposed method attempt to recover the aliased high-frequency information, learning-based methods synthesise it instead. Learning-based methods are able to synthesise plausible high-frequency detail at high magnification ratios, but the appearance of the face may change to the extent that the person no longer looks like him/herself. Although super-resolution has been applied to facial imagery, very little has been reported on measuring the performance changes obtained from super-resolved images. Intuitively, super-resolution improves image fidelity, and hence should improve the ability to distinguish between faces and consequently automatic face recognition accuracy. This is the first study to comprehensively investigate the effect of super-resolution on face recognition. Since super-resolution is a computationally intensive process, it is important to understand the benefits in relation to the trade-off in computations. A framework for testing face recognition algorithms with multi-resolution images was proposed, using the XM2VTS database as a sample implementation. Results show that super-resolution offers a small improvement over bilinear interpolation in recognition performance in the absence of noise, and that it is more beneficial when the input images are noisy, since noise is attenuated during the frame-fusion process.
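As a rough illustration of the reconstruction-based idea (not the thesis's robust optical-flow method), the following sketch fuses low-resolution frames with known sub-pixel shifts onto a common high-resolution grid. The exact-tiling setup is an illustrative assumption; real footage would require the registration step the dissertation focuses on.

```python
import numpy as np

def shift_and_add(frames, shifts, factor):
    # Naive reconstruction-based super-resolution: place each low-res
    # frame's pixels on the high-res grid at its known integer offset,
    # then average overlapping contributions.
    h, w = frames[0].shape
    hr = np.zeros((h * factor, w * factor))
    weight = np.zeros_like(hr)
    for frame, (dy, dx) in zip(frames, shifts):
        hr[dy::factor, dx::factor] += frame
        weight[dy::factor, dx::factor] += 1.0
    return hr / np.maximum(weight, 1)

# Toy example: four shifted low-res samplings that exactly tile one
# high-res scene, so the fusion recovers it perfectly.
rng = np.random.default_rng(1)
truth = rng.random((8, 8))
shifts = [(0, 0), (0, 1), (1, 0), (1, 1)]
frames = [truth[dy::2, dx::2] for dy, dx in shifts]
recon = shift_and_add(frames, shifts, factor=2)
```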
164

Le mouvement en action : estimation du flot optique et localisation d'actions dans les vidéos / Motion in action : optical flow estimation and action localization in videos

Weinzaepfel, Philippe 23 September 2016 (has links)
With the recent overwhelming growth of digital video content, automatic video understanding has become an increasingly important issue. This thesis introduces several contributions on two automatic video understanding tasks: optical flow estimation and human action localization. Optical flow estimation consists in computing the displacement of every pixel in a video and faces several challenges, including large non-rigid displacements, occlusions and motion boundaries. We first introduce an optical flow approach based on a variational model that incorporates a new matching method. The proposed matching algorithm is built upon a hierarchical multi-layer correlational architecture and effectively handles non-rigid deformations and repetitive textures. It improves the flow estimation in the presence of significant appearance changes and large displacements. We also introduce a novel scheme for estimating optical flow based on a sparse-to-dense interpolation of matches while respecting edges. This method leverages an edge-aware geodesic distance tailored to respect motion boundaries and to handle occlusions. Furthermore, we propose a learning-based approach for detecting motion boundaries. Motion boundary patterns are predicted at the patch level using structured random forests. We experimentally show that our approach outperforms the flow-gradient baseline on both synthetic data and real-world videos, including an introduced dataset of consumer videos. Human action localization consists in recognizing the actions that occur in a video, such as 'drinking' or 'phoning', as well as their temporal and spatial extent. We first propose a novel approach based on deep convolutional neural networks. The method extracts class-specific tubes, leveraging recent advances in detection and tracking. Tube description is enhanced by spatio-temporal local features. Temporal detection is performed using a sliding-window scheme inside each tube. Our approach outperforms the state of the art on 
challenging action localization benchmarks. Second, we introduce a weakly-supervised action localization method, i.e., one that does not require bounding-box annotation. Action proposals are computed by extracting tubes around the humans. This is performed using a human detector robust to unusual poses and occlusions, learned on a human pose benchmark. A high recall is reached with only a few human tubes, allowing Multiple Instance Learning to be applied effectively. Furthermore, we introduce a new dataset for human action localization. It overcomes the limitations of existing benchmarks, such as the diversity and the duration of the videos. Our weakly-supervised approach obtains results close to fully-supervised ones while significantly reducing the required amount of annotation.
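The sparse-to-dense interpolation idea from the abstract above can be sketched as below. Note this toy version assigns each pixel the flow of its nearest match under a plain Euclidean distance, where the thesis's method uses an edge-aware geodesic distance, so everything here is a simplified assumption.

```python
import numpy as np

def densify(matches, shape):
    # Sparse-to-dense flow interpolation: each pixel takes the flow of its
    # nearest match. The edge-aware geodesic distance of the real method is
    # replaced by plain Euclidean distance in this sketch.
    ys, xs = np.mgrid[:shape[0], :shape[1]]
    pix = np.stack([ys.ravel(), xs.ravel()], axis=1).astype(float)
    pts = np.array([(y, x) for y, x, _, _ in matches], dtype=float)
    flows = np.array([(u, v) for _, _, u, v in matches], dtype=float)
    d = np.linalg.norm(pix[:, None, :] - pts[None, :, :], axis=2)
    nearest = d.argmin(axis=1)
    return flows[nearest].reshape(shape[0], shape[1], 2)

# Two sparse matches: left half of the frame moves right, right half left.
matches = [(4, 1, 0.0, 1.0), (4, 8, 0.0, -1.0)]
dense = densify(matches, (9, 10))
```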
165

Tensor baseado em fluxo óptico para descrição global de movimento em vídeos / Optical-flow-based tensor for global motion description in videos

Mota, Virgínia Fernandes 28 February 2011 (has links)
CAPES - Coordenação de Aperfeiçoamento de Pessoal de Nível Superior / Motion is one of the main characteristics that convey the semantic information of videos. One technique for motion estimation is the extraction of optical flow: a two-dimensional representation of the apparent velocities across a sequence of adjacent frames, i.e., the 2D projection of the 3D motion as seen by the camera. In this work, a global video descriptor based on the orientation tensor is proposed. The descriptor is composed of the coefficients of Legendre polynomials calculated for each video frame. The coefficients are found through the projection of the optical flow onto the Legendre polynomials, yielding a polynomial representation of the motion. The resulting tensor descriptor is evaluated by classifying the KTH video database with an SVM (support vector machine) classifier. Results show that the precision of our approach exceeds that of the global descriptors found in the literature.
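The projection of a flow field onto Legendre polynomials can be sketched as follows. The grid, degree and least-squares formulation are illustrative choices, not necessarily those of the dissertation.

```python
import numpy as np
from numpy.polynomial import legendre

def legendre_coeffs(flow_component, degree):
    # Project one optical-flow component (an HxW field) onto a separable
    # Legendre basis over [-1,1]x[-1,1] by least squares, giving a compact
    # polynomial description of the frame's motion.
    h, w = flow_component.shape
    y = np.linspace(-1, 1, h)
    x = np.linspace(-1, 1, w)
    basis = []
    for i in range(degree + 1):          # degree in y
        for j in range(degree + 1):      # degree in x
            py = legendre.legval(y, [0] * i + [1])   # P_i(y)
            px = legendre.legval(x, [0] * j + [1])   # P_j(x)
            basis.append(np.outer(py, px).ravel())
    A = np.stack(basis, axis=1)
    coeffs, *_ = np.linalg.lstsq(A, flow_component.ravel(), rcond=None)
    return coeffs

# A purely horizontal-gradient flow u(x, y) = x is captured exactly by the
# coefficient of the P0(y) * P1(x) basis term.
xs = np.linspace(-1, 1, 16)
flow_u = np.tile(xs, (12, 1))
c = legendre_coeffs(flow_u, degree=2)
```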
166

Matting of Natural Image Sequences using Bayesian Statistics

Karlsson, Fredrik January 2004 (has links)
The problem of separating a non-rectangular foreground image from a background image is a classical problem in image processing and analysis, known as matting or keying. A common example is a film frame where an actor is extracted from the background to later be placed on a different background. Compositing such objects against a new background is one of the most common operations in the creation of visual effects. When the original background is of non-constant color, the matting becomes an underdetermined problem for which a unique solution cannot be found. This thesis describes a framework for computing mattes from images with backgrounds of non-constant color, using Bayesian statistics. Foreground and background color distributions are modeled as oriented Gaussians, and optimal color and opacity values are determined using a maximum a posteriori approach. Together with information from optical flow algorithms, the framework produces mattes for image sequences without needing user input for each frame. The approach used in this thesis differs from previous research in a few areas. The optimal order of processing is determined in a different way, and the sampling of color values is changed to work more efficiently on high-resolution images. Finally, a gradient-guided local smoothness constraint can optionally be used to improve results in cases where the normal technique performs poorly.
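A minimal sketch of the compositing model at the heart of matting, C = aF + (1-a)B: with the foreground and background distributions reduced to their means (the thesis models them as oriented Gaussians and solves a full maximum a posteriori problem), alpha has the closed form below. The colours and names are illustrative assumptions.

```python
import numpy as np

def solve_alpha(c, fg_mean, bg_mean):
    # With F and B fixed at their means, C = a*F + (1-a)*B gives the
    # least-squares alpha as the projection of C - B onto F - B,
    # clamped to the valid opacity range [0, 1].
    d = fg_mean - bg_mean
    a = float(np.dot(c - bg_mean, d) / np.dot(d, d))
    return min(max(a, 0.0), 1.0)

fg = np.array([0.9, 0.1, 0.1])   # reddish foreground mean
bg = np.array([0.1, 0.1, 0.9])   # bluish background mean
mixed = 0.25 * fg + 0.75 * bg    # pixel that is 25% foreground
alpha = solve_alpha(mixed, fg, bg)
```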
167

Facial Gestures for Infotainment Systems

Tantai, Along, Chen, Da January 2014 (has links)
The long-term purpose of this project is to reduce the attention demand on drivers when using infotainment systems in a car setting. With the development of the car industry, a contradiction between safety requirements and entertainment demands in cars has arisen. Speech-recognition-based controls meet their bottleneck in the presence of background audio (such as engine noise, other passengers' speech and/or the infotainment system itself). In this thesis we propose a new method to control the infotainment system using computer vision technology. The project uses algorithms for object detection, optical flow (estimated motion) and feature analysis to build a communication channel between human and machine. By tracking the driver's head and measuring the optical flow over the lip region, the state of the driver's mouth can be inferred. The efficiency and accuracy of the system are analyzed. The contribution of this thesis is a method of using facial gestures, focusing especially on lip movement, to communicate with the system. This method offers the possibility of a new mode of interaction between human and machine.
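A hedged sketch of the lip-region measurement described above: given a dense flow field, the mean flow magnitude inside the mouth region of interest serves as an activity indicator. The ROI and threshold are invented for illustration, not the system's calibrated values.

```python
import numpy as np

def mouth_activity(flow, roi, thresh=0.5):
    # Mean optical-flow magnitude inside the lip region of interest;
    # a value above the threshold is taken as mouth movement.
    y0, y1, x0, x1 = roi
    mag = np.linalg.norm(flow[y0:y1, x0:x1], axis=2)
    level = float(mag.mean())
    return level, level > thresh

# Toy flow field: the lip region moves horizontally, the rest is static.
flow = np.zeros((24, 24, 2))
flow[16:20, 8:16, 1] = 1.0
level, moving = mouth_activity(flow, (16, 20, 8, 16))
```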
168

Approximate Nearest Neighbour Field Computation and Applications

Avinash Ramakanth, S January 2014 (has links) (PDF)
Approximate Nearest-Neighbour Field (ANNF) maps between two related images are commonly used by the computer vision and graphics communities for image editing, completion, retargeting and denoising. In this work we generalize ANNF computation to unrelated image pairs. For accurate ANNF map computation we propose Feature Match, in which low-dimensional features approximate image patches, combined with global colour adaptation. Unlike existing approaches, the proposed algorithm does not assume any relation between the image pair and thus generalises ANNF maps to arbitrary image pairs. This generalization enables the ANNF approach to handle a wider range of vision applications more efficiently. The following is a brief description of the applications developed using the proposed Feature Match framework. The first application addresses the problem of detecting the optic disk in retinal images. The combination of ANNF maps and salient properties of optic disks leads to an efficient optic disk detector that does not require tedious training or parameter tuning. The proposed approach is evaluated on several publicly available datasets, achieving an average detection accuracy of 99% with a computation time of 0.2 s per image. The second application aims to super-resolve a given synthetic image using a single source image as dictionary, avoiding the expensive training involved in conventional approaches. In the third application, we make use of ANNF maps to accurately propagate labels across video frames for segmenting video objects. The proposed approach outperforms the state of the art on the widely used SegTrack benchmark dataset. In the fourth application, ANNF maps obtained between two consecutive frames of video are refined to estimate sub-pixel-accurate optical flow, a critical step in many vision applications. Finally, a summary of the framework's other possible applications, such as image encryption and scene segmentation, is provided.
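For orientation, a brute-force nearest-neighbour field over raw patches is sketched below; Feature Match replaces this exact but slow search with low-dimensional features and global colour adaptation, so this is only the baseline it approximates, on invented toy images.

```python
import numpy as np

def annf(src, dst, p=3):
    # For every pxp patch of src, find the dst patch with the smallest
    # sum-of-squared-differences, and return its (y, x) position.
    def patches(img):
        h, w = img.shape
        out = []
        for y in range(h - p + 1):
            for x in range(w - p + 1):
                out.append(img[y:y + p, x:x + p].ravel())
        return np.array(out), h - p + 1, w - p + 1

    sp, sh, sw = patches(src)
    dp, dh, dw = patches(dst)
    d2 = ((sp[:, None, :] - dp[None, :, :]) ** 2).sum(axis=2)
    best = d2.argmin(axis=1)
    return np.stack([best // dw, best % dw], axis=1).reshape(sh, sw, 2)

# Toy pair: src is a crop of dst, so every patch has an exact match
# offset by the crop origin (2, 1).
rng = np.random.default_rng(2)
dst = rng.random((8, 8))
src = dst[2:7, 1:6].copy()
field = annf(src, dst)
```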
169

Compression vidéo basée sur l'exploitation d'un décodeur intelligent / Video compression based on smart decoder

Vo Nguyen, Dang Khoa 18 December 2015 (has links)
This Ph.D. thesis studies the novel concept of Smart Decoder (SDec), in which the decoder is given the ability to simulate the encoder and to conduct the R-D competition just as the encoder does. The proposed technique aims to reduce the signaling of competing coding modes and parameters. 
The general SDec coding scheme and several practical applications are proposed, followed by a longer-term approach exploiting machine learning in video coding. The SDec coding scheme exploits a complex decoder able to reproduce the choice of the encoder based on causal reference blocks, thus eliminating the need to signal coding modes and associated parameters. Several practical applications of the general SDec scheme are tested, using different coding modes during the competition on the reference blocks. Although the choice of SDec reference blocks is still simple and limited, interesting gains are observed. The longer-term research presents an innovative method that makes further use of the processing capacity of the decoder. Machine learning techniques are exploited in video coding with the purpose of reducing the signaling overhead. Practical applications are given, using a classifier based on support vector machines to predict the coding mode of a block. The block classification uses causal descriptors consisting of different types of histograms. Significant bit-rate savings are obtained, confirming the potential of the approach.
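The mode-prediction idea can be sketched as follows. The thesis trains an SVM on causal histogram descriptors; this dependency-free stand-in uses a nearest-centroid classifier and invented toy descriptors, so both the classifier and the data are assumptions for illustration only.

```python
import numpy as np

def train_centroids(hists, labels):
    # Nearest-centroid stand-in for the thesis's SVM: one mean histogram
    # descriptor per coding mode.
    return {m: np.mean([h for h, l in zip(hists, labels) if l == m], axis=0)
            for m in set(labels)}

def predict_mode(centroids, hist):
    # Predict the coding mode whose centroid is closest to the block's
    # causal histogram descriptor.
    return min(centroids, key=lambda m: np.linalg.norm(hist - centroids[m]))

# Toy causal descriptors (e.g. gradient-orientation histograms of
# neighbouring decoded blocks) for two hypothetical coding modes.
intra = [np.array([0.8, 0.1, 0.1]), np.array([0.7, 0.2, 0.1])]
inter = [np.array([0.1, 0.1, 0.8]), np.array([0.2, 0.1, 0.7])]
cent = train_centroids(intra + inter, ["intra"] * 2 + ["inter"] * 2)
mode = predict_mode(cent, np.array([0.75, 0.15, 0.1]))
```

Because both encoder and decoder can compute the same causal descriptors, the prediction needs no signaling, which is the point of the SDec scheme.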
170

Évaluation de la corrélation inter-substitut pour le suivi de tumeurs pulmonaires indirect / Evaluation of inter-surrogate correlation for indirect lung tumour tracking

Ahumada, Daniel F. 08 1900 (has links)
The main objective of this thesis is to prepare the clinical implementation of the Clarity ultrasound system for indirect lung tumour tracking using a surrogate. The system is currently used for motion management during prostate treatments and requires adaptation. Our hypothesis is that an internal surrogate is better correlated with the tumour's position than an external one. The sub-objectives are: 1) test different setups for image acquisition on patients; 2) explore the performance of motion detection algorithms as well as image quality metrics on US and dynamic MRI images; 3) evaluate the correlation between surrogates and a lung structure to determine which performs best. For acquisitions on healthy volunteers, the ultrasound probe is fixed to the treatment couch using a mechanical arm. It was shown that insufficient pressure on the patient's skin results in a loss of signal due to the curvilinear shape of the probe. A decrease in the mean image intensity and its standard deviation confirms a loss of signal at high respiratory amplitudes, explained by a loss of contact between the probe and the skin despite the probe being fixed. We tested three motion detection algorithms on dynamic MRI images: normalized cross-correlation (NCC), root mean square error (RMS) and optical flow. The NCC algorithm, currently used for prostate cases, was the most robust of the three for tracking the internal surrogate (a structure in the liver) for 5/9 volunteers (< 0.050). In specific cases the optical flow method performed better, indicating an interest in adapting the algorithm for lung cases. 
Finally, it was shown on the MRI images that an internal surrogate in the liver is more efficient for indirect lung tumour tracking than a skin marker placed on the abdomen for the majority of volunteers (8/9); external markers give a greater prediction error. The positioning of the external marker also matters: the abdominal marker correlates better than the thoracic one for all volunteers (9/9), illustrating the importance of external marker placement for lung tumour tracking.
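The NCC tracking criterion mentioned above can be sketched with an exhaustive template search; the frame, template and search strategy here are illustrative, not the Clarity system's implementation.

```python
import numpy as np

def ncc_track(frame, template):
    # Exhaustive normalized cross-correlation search: return the top-left
    # corner where the (mean-removed) template correlates best with the
    # frame, the criterion used for surrogate tracking.
    th, tw = template.shape
    t = template - template.mean()
    tn = np.linalg.norm(t)
    best, best_pos = -2.0, (0, 0)
    for y in range(frame.shape[0] - th + 1):
        for x in range(frame.shape[1] - tw + 1):
            w = frame[y:y + th, x:x + tw]
            wz = w - w.mean()
            denom = np.linalg.norm(wz) * tn
            score = (wz * t).sum() / denom if denom > 0 else -1.0
            if score > best:
                best, best_pos = score, (y, x)
    return best_pos

# Toy frame with the template cut out of it: the tracker should recover
# the cut position exactly.
rng = np.random.default_rng(3)
frame = rng.random((15, 15))
template = frame[6:10, 4:8].copy()
pos = ncc_track(frame, template)
```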
