About

The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
161

Entwicklung optischer Feldmessverfahren zur Charakterisierung mikrofluidischer Mischungsvorgänge / Development of optical 2d measuring methods for characterisation of microfluidic mixing processes

Roetmann, Karsten 28 March 2008 (has links)
No description available.
162

Obstacle detection using a monocular camera

Goroshin, Rostislav 19 May 2008 (has links)
The objective of this thesis is to develop a general obstacle segmentation algorithm for use on board a ground-based unmanned vehicle (GUV). The algorithm processes video data captured by a single monocular camera mounted on the GUV. We make the assumption that the GUV moves on a locally planar surface representing the ground plane. We start by deriving the equations of the expected motion field (observed by the camera) induced by the motion of the robot on the ground plane. Given an initial view of a presumably static scene, this motion field is used to generate a predicted view of the same scene after a known camera displacement. This predicted image is compared to the actual image taken at the new camera location by means of an optical flow calculation. Because the planar assumption is used to generate the predicted image, portions of the image that mismatch the prediction correspond to salient feature points on objects lying above or below the ground plane; we consider these objects obstacles for the GUV. We assume that these salient feature points (called "seed pixels") capture the color statistics of the obstacle and use them to initialize a Bayesian region-growing routine to generate a full obstacle segmentation. Alignment of the seed pixels with the obstacle is not guaranteed due to the aperture problem; nevertheless, successful segmentations were obtained for natural scenes. The algorithm was tested offline using video captured by a camera mounted on a GUV.
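The prediction-and-compare step described above can be sketched as follows. This is a hedged toy version: the ground-plane motion field is reduced to a whole-pixel shift (the thesis derives the full planar motion-field equations), and the threshold and scene are illustrative assumptions, not the author's values.

```python
import numpy as np

def obstacle_seeds(prev, curr, ground_shift, thresh=0.5):
    # Warp the first frame by the motion the ground plane would induce
    # (reduced here to a whole-pixel shift instead of the full planar
    # motion-field equations) and flag pixels that mismatch the prediction.
    dy, dx = ground_shift
    predicted = np.roll(prev, (dy, dx), axis=(0, 1))
    residual = np.abs(curr.astype(float) - predicted.astype(float))
    return residual > thresh

# Toy scene: textured ground shifting 2 px right, plus an obstacle patch
# that moves 4 px and therefore violates the planar prediction.
rng = np.random.default_rng(0)
ground = 0.3 * rng.random((20, 20))
prev = ground.copy()
prev[5:8, 5:8] = 1.0                        # obstacle in frame 1
curr = np.roll(ground, (0, 2), axis=(0, 1))
curr[5:8, 9:12] = 1.0                       # obstacle in frame 2
seeds = obstacle_seeds(prev, curr, (0, 2))
```

In the full algorithm these seed pixels would then initialize the Bayesian region-growing step.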
163

Super-resolution image processing with application to face recognition

Lin, Frank Chi-Hao January 2008 (has links)
Subject identification from surveillance imagery has become an important task for forensic investigation. Good quality images of the subjects are essential for the surveillance footage to be useful. However, surveillance videos are of low resolution due to data storage requirements. In addition, subjects typically occupy a small portion of a camera's field of view. Faces, which are of primary interest, occupy an even smaller array of pixels. For reliable face recognition from surveillance video, there is a need to generate higher resolution images of the subject's face from low-resolution video. Super-resolution image reconstruction is a signal-processing-based approach that aims to reconstruct a high-resolution image by combining a number of low-resolution images. The low-resolution images that differ by a sub-pixel shift contain complementary information as they are different "snapshots" of the same scene. Once geometrically registered onto a common high-resolution grid, they can be merged into a single image with higher resolution. As super-resolution is a computationally intensive process, traditional reconstruction-based super-resolution methods simplify the problem by restricting the correspondence between low-resolution frames to global motion such as translational and affine transformations. Surveillance footage, however, consists of independently moving non-rigid objects such as faces. Applying global registration methods results in registration errors that lead to artefacts which adversely affect recognition. The human face also presents additional problems such as self-occlusion and reflectance variation that even local registration methods find difficult to model. In this dissertation, a robust optical flow-based super-resolution technique was proposed to overcome these difficulties.
Real surveillance footage and the Terrascope database were used to compare the reconstruction quality of the proposed method against interpolation and existing super-resolution algorithms. Results show that the proposed robust optical flow-based method consistently produced more accurate reconstructions. This dissertation also outlines a systematic investigation of how super-resolution affects automatic face recognition algorithms, with an emphasis on comparing reconstruction- and learning-based super-resolution approaches. While reconstruction-based super-resolution approaches like the proposed method attempt to recover the aliased high-frequency information, learning-based methods synthesise it instead. Learning-based methods are able to synthesise plausible high-frequency detail at high magnification ratios, but the appearance of the face may change to the extent that the person no longer looks like him/herself. Although super-resolution has been applied to facial imagery, very little has been reported on measuring the performance changes obtained from super-resolved images. Intuitively, super-resolution improves image fidelity, and hence should improve the ability to distinguish between faces and consequently automatic face recognition accuracy. This is the first study to comprehensively investigate the effect of super-resolution on face recognition. Since super-resolution is a computationally intensive process, it is important to understand the benefits in relation to the trade-off in computations. A framework for testing face recognition algorithms with multi-resolution images was proposed, using the XM2VTS database as a sample implementation. Results show that super-resolution offers a small improvement over bilinear interpolation in recognition performance in the absence of noise, and that it is more beneficial when the input images are noisy, since noise is attenuated during the frame-fusion process.
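As a rough illustration of the reconstruction-based idea (not the thesis's robust optical-flow method), the following sketch fuses low-resolution frames with known sub-pixel shifts onto a common high-resolution grid. The exact-tiling setup is an illustrative assumption; real footage would require the registration step the dissertation focuses on.

```python
import numpy as np

def shift_and_add(frames, shifts, factor):
    # Naive reconstruction-based super-resolution: place each low-res
    # frame's pixels on the high-res grid at its known integer offset,
    # then average overlapping contributions.
    h, w = frames[0].shape
    hr = np.zeros((h * factor, w * factor))
    weight = np.zeros_like(hr)
    for frame, (dy, dx) in zip(frames, shifts):
        hr[dy::factor, dx::factor] += frame
        weight[dy::factor, dx::factor] += 1.0
    return hr / np.maximum(weight, 1)

# Toy example: four shifted low-res samplings that exactly tile one
# high-res scene, so the fusion recovers it perfectly.
rng = np.random.default_rng(1)
truth = rng.random((8, 8))
shifts = [(0, 0), (0, 1), (1, 0), (1, 1)]
frames = [truth[dy::2, dx::2] for dy, dx in shifts]
recon = shift_and_add(frames, shifts, factor=2)
```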
164

Le mouvement en action : estimation du flot optique et localisation d'actions dans les vidéos / Motion in action : optical flow estimation and action localization in videos

Weinzaepfel, Philippe 23 September 2016 (has links)
With the recent overwhelming growth of digital video content, automatic video understanding has become an increasingly important issue. This thesis introduces several contributions on two automatic video understanding tasks: optical flow estimation and human action localization. Optical flow estimation consists in computing the displacement of every pixel in a video and faces several challenges, including large non-rigid displacements, occlusions and motion boundaries. We first introduce an optical flow approach based on a variational model that incorporates a new matching method. The proposed matching algorithm is built upon a hierarchical multi-layer correlational architecture and effectively handles non-rigid deformations and repetitive textures. It improves the flow estimation in the presence of significant appearance changes and large displacements. We also introduce a novel scheme for estimating optical flow based on a sparse-to-dense interpolation of matches while respecting edges. This method leverages an edge-aware geodesic distance tailored to respect motion boundaries and to handle occlusions. Furthermore, we propose a learning-based approach for detecting motion boundaries. Motion boundary patterns are predicted at the patch level using structured random forests. We experimentally show that our approach outperforms the flow-gradient baseline on both synthetic data and real-world videos, including an introduced dataset of consumer videos. Human action localization consists in recognizing the actions that occur in a video, such as 'drinking' or 'phoning', as well as their temporal and spatial extent. We first propose a novel approach based on deep convolutional neural networks. The method extracts class-specific tubes, leveraging recent advances in detection and tracking. Tube description is enhanced by spatio-temporal local features. Temporal detection is performed using a sliding-window scheme inside each tube. Our approach outperforms the state of the art on 
challenging action localization benchmarks. Second, we introduce a weakly-supervised action localization method, i.e., one that does not require bounding-box annotation. Action proposals are computed by extracting tubes around the humans. This is performed using a human detector robust to unusual poses and occlusions, learned on a human pose benchmark. A high recall is reached with only a few human tubes, allowing Multiple Instance Learning to be applied effectively. Furthermore, we introduce a new dataset for human action localization. It overcomes the limitations of existing benchmarks, such as the diversity and the duration of the videos. Our weakly-supervised approach obtains results close to fully-supervised ones while significantly reducing the required amount of annotation.
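The sparse-to-dense interpolation idea from the abstract above can be sketched as below. Note this toy version assigns each pixel the flow of its nearest match under a plain Euclidean distance, where the thesis's method uses an edge-aware geodesic distance, so everything here is a simplified assumption.

```python
import numpy as np

def densify(matches, shape):
    # Sparse-to-dense flow interpolation: each pixel takes the flow of its
    # nearest match. The edge-aware geodesic distance of the real method is
    # replaced by plain Euclidean distance in this sketch.
    ys, xs = np.mgrid[:shape[0], :shape[1]]
    pix = np.stack([ys.ravel(), xs.ravel()], axis=1).astype(float)
    pts = np.array([(y, x) for y, x, _, _ in matches], dtype=float)
    flows = np.array([(u, v) for _, _, u, v in matches], dtype=float)
    d = np.linalg.norm(pix[:, None, :] - pts[None, :, :], axis=2)
    nearest = d.argmin(axis=1)
    return flows[nearest].reshape(shape[0], shape[1], 2)

# Two sparse matches: left half of the frame moves right, right half left.
matches = [(4, 1, 0.0, 1.0), (4, 8, 0.0, -1.0)]
dense = densify(matches, (9, 10))
```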
165

Tensor baseado em fluxo óptico para descrição global de movimento em vídeos / Optical-flow-based tensor for global motion description in videos

Mota, Virgínia Fernandes 28 February 2011 (has links)
CAPES - Coordenação de Aperfeiçoamento de Pessoal de Nível Superior / Motion is one of the main characteristics that convey the semantic information of videos. One technique for motion estimation is the extraction of optical flow: a two-dimensional representation of the apparent velocities across a sequence of adjacent frames, i.e., the 2D projection of the 3D motion as seen by the camera. In this work, a global video descriptor based on the orientation tensor is proposed. The descriptor is composed of the coefficients of Legendre polynomials calculated for each video frame. The coefficients are found through the projection of the optical flow onto the Legendre polynomials, yielding a polynomial representation of the motion. The resulting tensor descriptor is evaluated by classifying the KTH video database with an SVM (support vector machine) classifier. Results show that the precision of our approach exceeds that of the global descriptors found in the literature.
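The projection of a flow field onto Legendre polynomials can be sketched as follows. The grid, degree and least-squares formulation are illustrative choices, not necessarily those of the dissertation.

```python
import numpy as np
from numpy.polynomial import legendre

def legendre_coeffs(flow_component, degree):
    # Project one optical-flow component (an HxW field) onto a separable
    # Legendre basis over [-1,1]x[-1,1] by least squares, giving a compact
    # polynomial description of the frame's motion.
    h, w = flow_component.shape
    y = np.linspace(-1, 1, h)
    x = np.linspace(-1, 1, w)
    basis = []
    for i in range(degree + 1):          # degree in y
        for j in range(degree + 1):      # degree in x
            py = legendre.legval(y, [0] * i + [1])   # P_i(y)
            px = legendre.legval(x, [0] * j + [1])   # P_j(x)
            basis.append(np.outer(py, px).ravel())
    A = np.stack(basis, axis=1)
    coeffs, *_ = np.linalg.lstsq(A, flow_component.ravel(), rcond=None)
    return coeffs

# A purely horizontal-gradient flow u(x, y) = x is captured exactly by the
# coefficient of the P0(y) * P1(x) basis term.
xs = np.linspace(-1, 1, 16)
flow_u = np.tile(xs, (12, 1))
c = legendre_coeffs(flow_u, degree=2)
```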
166

Matting of Natural Image Sequences using Bayesian Statistics

Karlsson, Fredrik January 2004 (has links)
The problem of separating a non-rectangular foreground image from a background image is a classical problem in image processing and analysis, known as matting or keying. A common example is a film frame where an actor is extracted from the background to later be placed on a different background. Compositing such objects against a new background is one of the most common operations in the creation of visual effects. When the original background is of non-constant color, the matting becomes an underdetermined problem for which a unique solution cannot be found. This thesis describes a framework for computing mattes from images with backgrounds of non-constant color, using Bayesian statistics. Foreground and background color distributions are modeled as oriented Gaussians, and optimal color and opacity values are determined using a maximum a posteriori approach. Together with information from optical flow algorithms, the framework produces mattes for image sequences without needing user input for each frame. The approach used in this thesis differs from previous research in a few areas. The optimal order of processing is determined in a different way, and the sampling of color values is changed to work more efficiently on high-resolution images. Finally, a gradient-guided local smoothness constraint can optionally be used to improve results in cases where the normal technique performs poorly.
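A minimal sketch of the compositing model at the heart of matting, C = aF + (1-a)B: with the foreground and background distributions reduced to their means (the thesis models them as oriented Gaussians and solves a full maximum a posteriori problem), alpha has the closed form below. The colours and names are illustrative assumptions.

```python
import numpy as np

def solve_alpha(c, fg_mean, bg_mean):
    # With F and B fixed at their means, C = a*F + (1-a)*B gives the
    # least-squares alpha as the projection of C - B onto F - B,
    # clamped to the valid opacity range [0, 1].
    d = fg_mean - bg_mean
    a = float(np.dot(c - bg_mean, d) / np.dot(d, d))
    return min(max(a, 0.0), 1.0)

fg = np.array([0.9, 0.1, 0.1])   # reddish foreground mean
bg = np.array([0.1, 0.1, 0.9])   # bluish background mean
mixed = 0.25 * fg + 0.75 * bg    # pixel that is 25% foreground
alpha = solve_alpha(mixed, fg, bg)
```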
167

Facial Gestures for Infotainment Systems

Tantai, Along, Chen, Da January 2014 (has links)
The long-term purpose of this project is to reduce the attention demand on drivers when using infotainment systems in a car setting. With the development of the car industry, a contradiction between safety requirements and entertainment demands in cars has arisen. Speech-recognition-based controls meet their bottleneck in the presence of background audio (such as engine noise, other passengers' speech and/or the infotainment system itself). In this thesis we propose a new method to control the infotainment system using computer vision technology. The project uses algorithms for object detection, optical flow (estimated motion) and feature analysis to build a communication channel between human and machine. By tracking the driver's head and measuring the optical flow over the lip region, the state of the driver's mouth can be inferred. The efficiency and accuracy of the system are analyzed. The contribution of this thesis is a method of using facial gestures, focusing especially on lip movement, to communicate with the system. This method offers the possibility of a new mode of interaction between human and machine.
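A hedged sketch of the lip-region measurement described above: given a dense flow field, the mean flow magnitude inside the mouth region of interest serves as an activity indicator. The ROI and threshold are invented for illustration, not the system's calibrated values.

```python
import numpy as np

def mouth_activity(flow, roi, thresh=0.5):
    # Mean optical-flow magnitude inside the lip region of interest;
    # a value above the threshold is taken as mouth movement.
    y0, y1, x0, x1 = roi
    mag = np.linalg.norm(flow[y0:y1, x0:x1], axis=2)
    level = float(mag.mean())
    return level, level > thresh

# Toy flow field: the lip region moves horizontally, the rest is static.
flow = np.zeros((24, 24, 2))
flow[16:20, 8:16, 1] = 1.0
level, moving = mouth_activity(flow, (16, 20, 8, 16))
```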
168

Approximate Nearest Neighbour Field Computation and Applications

Avinash Ramakanth, S January 2014 (has links) (PDF)
Approximate Nearest-Neighbour Field (ANNF) maps between two related images are commonly used by the computer vision and graphics communities for image editing, completion, retargeting and denoising. In this work we generalize ANNF computation to unrelated image pairs. For accurate ANNF map computation we propose Feature Match, in which low-dimensional features approximate image patches, combined with global colour adaptation. Unlike existing approaches, the proposed algorithm does not assume any relation between the image pair and thus generalises ANNF maps to arbitrary image pairs. This generalization enables the ANNF approach to handle a wider range of vision applications more efficiently. The following is a brief description of the applications developed using the proposed Feature Match framework. The first application addresses the problem of detecting the optic disk in retinal images. The combination of ANNF maps and salient properties of optic disks leads to an efficient optic disk detector that does not require tedious training or parameter tuning. The proposed approach is evaluated on several publicly available datasets, achieving an average detection accuracy of 99% with a computation time of 0.2 s per image. The second application aims to super-resolve a given synthetic image using a single source image as dictionary, avoiding the expensive training involved in conventional approaches. In the third application, we make use of ANNF maps to accurately propagate labels across video frames for segmenting video objects. The proposed approach outperforms the state of the art on the widely used SegTrack benchmark dataset. In the fourth application, ANNF maps obtained between two consecutive frames of video are refined to estimate sub-pixel-accurate optical flow, a critical step in many vision applications. Finally, a summary of the framework's other possible applications, such as image encryption and scene segmentation, is provided.
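For orientation, a brute-force nearest-neighbour field over raw patches is sketched below; Feature Match replaces this exact but slow search with low-dimensional features and global colour adaptation, so this is only the baseline it approximates, on invented toy images.

```python
import numpy as np

def annf(src, dst, p=3):
    # For every pxp patch of src, find the dst patch with the smallest
    # sum-of-squared-differences, and return its (y, x) position.
    def patches(img):
        h, w = img.shape
        out = []
        for y in range(h - p + 1):
            for x in range(w - p + 1):
                out.append(img[y:y + p, x:x + p].ravel())
        return np.array(out), h - p + 1, w - p + 1

    sp, sh, sw = patches(src)
    dp, dh, dw = patches(dst)
    d2 = ((sp[:, None, :] - dp[None, :, :]) ** 2).sum(axis=2)
    best = d2.argmin(axis=1)
    return np.stack([best // dw, best % dw], axis=1).reshape(sh, sw, 2)

# Toy pair: src is a crop of dst, so every patch has an exact match
# offset by the crop origin (2, 1).
rng = np.random.default_rng(2)
dst = rng.random((8, 8))
src = dst[2:7, 1:6].copy()
field = annf(src, dst)
```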
169

Compression vidéo basée sur l'exploitation d'un décodeur intelligent / Video compression based on smart decoder

Vo Nguyen, Dang Khoa 18 December 2015 (has links)
This Ph.D. thesis studies the novel concept of Smart Decoder (SDec), in which the decoder is given the ability to simulate the encoder and to conduct the R-D competition just as the encoder does. The proposed technique aims to reduce the signaling of competing coding modes and parameters. 
The general SDec coding scheme and several practical applications are proposed, followed by a longer-term approach exploiting machine learning in video coding. The SDec coding scheme exploits a complex decoder able to reproduce the choice of the encoder based on causal reference blocks, thus eliminating the need to signal coding modes and associated parameters. Several practical applications of the general SDec scheme are tested, using different coding modes during the competition on the reference blocks. Although the choice of SDec reference blocks is still simple and limited, interesting gains are observed. The longer-term research presents an innovative method that makes further use of the processing capacity of the decoder. Machine learning techniques are exploited in video coding with the purpose of reducing the signaling overhead. Practical applications are given, using a classifier based on support vector machines to predict the coding mode of a block. The block classification uses causal descriptors consisting of different types of histograms. Significant bit-rate savings are obtained, confirming the potential of the approach.
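The mode-prediction idea can be sketched as follows. The thesis trains an SVM on causal histogram descriptors; this dependency-free stand-in uses a nearest-centroid classifier and invented toy descriptors, so both the classifier and the data are assumptions for illustration only.

```python
import numpy as np

def train_centroids(hists, labels):
    # Nearest-centroid stand-in for the thesis's SVM: one mean histogram
    # descriptor per coding mode.
    return {m: np.mean([h for h, l in zip(hists, labels) if l == m], axis=0)
            for m in set(labels)}

def predict_mode(centroids, hist):
    # Predict the coding mode whose centroid is closest to the block's
    # causal histogram descriptor.
    return min(centroids, key=lambda m: np.linalg.norm(hist - centroids[m]))

# Toy causal descriptors (e.g. gradient-orientation histograms of
# neighbouring decoded blocks) for two hypothetical coding modes.
intra = [np.array([0.8, 0.1, 0.1]), np.array([0.7, 0.2, 0.1])]
inter = [np.array([0.1, 0.1, 0.8]), np.array([0.2, 0.1, 0.7])]
cent = train_centroids(intra + inter, ["intra"] * 2 + ["inter"] * 2)
mode = predict_mode(cent, np.array([0.75, 0.15, 0.1]))
```

Because both encoder and decoder can compute the same causal descriptors, the prediction needs no signaling, which is the point of the SDec scheme.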
170

Évaluation de la corrélation inter-substitut pour le suivi de tumeurs pulmonaires indirect / Evaluation of inter-surrogate correlation for indirect lung tumour tracking

Ahumada, Daniel F. 08 1900 (has links)
The main objective of this thesis is to prepare the clinical implementation of the Clarity ultrasound system for indirect lung tumour tracking using a surrogate. The system is currently used for motion management during prostate treatments and requires adaptation. Our hypothesis is that an internal surrogate is better correlated with the tumour's position than an external one. The sub-objectives are: 1) test different setups for image acquisition on patients; 2) explore the performance of motion detection algorithms as well as image quality metrics on US and dynamic MRI images; 3) evaluate the correlation between surrogates and a lung structure to determine which performs best. For acquisitions on healthy volunteers, the ultrasound probe is fixed to the treatment couch using a mechanical arm. It was shown that insufficient pressure on the patient's skin results in a loss of signal due to the curvilinear shape of the probe. A decrease in the mean image intensity and its standard deviation confirms a loss of signal at high respiratory amplitudes, explained by a loss of contact between the probe and the skin despite the probe being fixed. We tested three motion detection algorithms on dynamic MRI images: normalized cross-correlation (NCC), root mean square error (RMS) and optical flow. The NCC algorithm, currently used for prostate cases, was the most robust of the three for tracking the internal surrogate (a structure in the liver) for 5/9 volunteers (< 0.050). In specific cases the optical flow method performed better, indicating an interest in adapting the algorithm for lung cases. 
Finally, it was shown on the MRI images that an internal surrogate in the liver is more efficient for indirect lung tumour tracking than a skin marker placed on the abdomen for the majority of volunteers (8/9); external markers give a greater prediction error. The positioning of the external marker also matters: the abdominal marker correlates better than the thoracic one for all volunteers (9/9), illustrating the importance of external marker placement for lung tumour tracking.
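The NCC tracking criterion mentioned above can be sketched with an exhaustive template search; the frame, template and search strategy here are illustrative, not the Clarity system's implementation.

```python
import numpy as np

def ncc_track(frame, template):
    # Exhaustive normalized cross-correlation search: return the top-left
    # corner where the (mean-removed) template correlates best with the
    # frame, the criterion used for surrogate tracking.
    th, tw = template.shape
    t = template - template.mean()
    tn = np.linalg.norm(t)
    best, best_pos = -2.0, (0, 0)
    for y in range(frame.shape[0] - th + 1):
        for x in range(frame.shape[1] - tw + 1):
            w = frame[y:y + th, x:x + tw]
            wz = w - w.mean()
            denom = np.linalg.norm(wz) * tn
            score = (wz * t).sum() / denom if denom > 0 else -1.0
            if score > best:
                best, best_pos = score, (y, x)
    return best_pos

# Toy frame with the template cut out of it: the tracker should recover
# the cut position exactly.
rng = np.random.default_rng(3)
frame = rng.random((15, 15))
template = frame[6:10, 4:8].copy()
pos = ncc_track(frame, template)
```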
