  • About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
111

A Comparative Study On Pose Estimation Algorithms Using Visual Data

Cetinkaya, Guven 01 February 2012 (has links) (PDF)
Computing the position and orientation of an object with respect to a camera from its images is called the pose estimation problem. Pose estimation is one of the major problems in computer vision, robotics and photogrammetry. Object tracking, object recognition and self-localization of robots are typical applications of pose estimation. Determining the pose of an object from its projections requires a 3D model of the object in its own reference system, the camera parameters and a 2D image of the object. Most pose estimation algorithms require the correspondences between the 3D model points of the object and the 2D image points. In this study, four well-known pose estimation algorithms that require the 2D-3D correspondences to be known a priori (namely Orthogonal Iterations, POSIT, DLT and Efficient PnP) are compared. Moreover, two other well-known algorithms that solve the correspondence and pose problems simultaneously (Soft-POSIT and Blind-PnP) are also compared within the scope of this thesis. In the first step of the simulations, synthetic data is generated using a realistic motion scenario and the algorithms are compared using this data. In the next step, real images captured by a calibrated camera of an object with a known 3D model are used. The simulation results indicate that the POSIT algorithm performs best among the algorithms requiring point correspondences. Another result obtained from the experiments is that the Soft-POSIT algorithm can be considered to perform better than the Blind-PnP algorithm.
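The ingredients named in this abstract (a 3D model, camera parameters, a 2D image and a candidate pose) can be tied together with a small sketch. It is not taken from the thesis and implements none of the compared algorithms; it only shows the pinhole projection and the reprojection error that is commonly used to judge a pose estimate. The function names, the single focal-length camera model and the numeric values are illustrative assumptions.

```python
import math

def rotation_z(angle_rad):
    """3x3 rotation about the camera's optical (z) axis."""
    c, s = math.cos(angle_rad), math.sin(angle_rad)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]

def project(points_3d, R, t, focal):
    """Project model points with an ideal pinhole camera (assumed model)."""
    image_points = []
    for X in points_3d:
        # Object frame -> camera frame: Xc = R X + t
        xc, yc, zc = (sum(R[i][j] * X[j] for j in range(3)) + t[i] for i in range(3))
        # Perspective division, scaled by the focal length in pixels
        image_points.append((focal * xc / zc, focal * yc / zc))
    return image_points

def reprojection_rmse(points_3d, observed_2d, R, t, focal):
    """Root-mean-square pixel distance between projected and observed points."""
    projected = project(points_3d, R, t, focal)
    squared = [(u - uo) ** 2 + (v - vo) ** 2
               for (u, v), (uo, vo) in zip(projected, observed_2d)]
    return math.sqrt(sum(squared) / len(squared))

# Hypothetical model points and ground-truth pose
model = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0)]
true_R, true_t, focal = rotation_z(0.3), [0.1, -0.2, 5.0], 800.0
observed = project(model, true_R, true_t, focal)

print(reprojection_rmse(model, observed, true_R, true_t, focal))
print(reprojection_rmse(model, observed, rotation_z(0.35), true_t, focal))
```

A correct pose reproduces the observed points, while a perturbed pose yields a non-zero pixel error. A comparison like the one described above could report such an error for each algorithm's estimate.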
112

Statistical methods for 2D image segmentation and 3D pose estimation

Sandhu, Romeil Singh 26 October 2010 (has links)
The field of computer vision focuses on developing techniques to exploit and extract information from underlying data that may represent images or other multidimensional data. In particular, two well-studied problems in computer vision are the fundamental tasks of 2D image segmentation and 3D pose estimation from a 2D scene. In this thesis, we first introduce two novel methodologies that attempt to solve 2D image segmentation and 3D pose estimation independently. Then, by leveraging the advantages of certain techniques from each problem, we couple both tasks in a variational and non-rigid manner through a single energy functional. Thus, the three theoretical components and contributions of this thesis are as follows: Firstly, a new distribution metric for 2D image segmentation is introduced. This is employed within the geometric active contour (GAC) framework. Secondly, a novel particle filtering approach is proposed for the problem of estimating the pose of two point sets that differ by a rigid body transformation. Thirdly, the two techniques of image segmentation and pose estimation are coupled in a single energy functional for a class of 3D rigid objects. After laying the groundwork and presenting these contributions, we then turn to their applicability to real-world problems such as visual tracking. In particular, we present an example where we develop a novel tracking scheme for 3-D Laser RADAR imagery. However, we should mention that the proposed contributions are solutions for general imaging problems and can therefore be applied to medical imaging problems such as extracting the prostate from MRI imagery.
113

Face Detection and Pose Estimation using Triplet Invariants / Ansiktsdetektering med hjälp av triplet-invarianter

Isaksson, Marcus January 2002 (has links)
Face detection and pose estimation are two widely studied problems, mainly because of their use as subcomponents in important applications, e.g. face recognition. In this thesis I investigate a new approach to the general problem of object detection and pose estimation and apply it to faces. Face detection can be considered a special case of this general problem, but is complicated by the fact that faces are non-rigid objects. The basis of the new approach is the use of scale and orientation invariant feature structures - feature triplets - extracted from the image, as well as a biologically inspired associative structure which maps from feature triplets to desired responses (position, pose, etc.). The feature triplets are constructed from curvature features in the image and coded in a way to represent distances between major facial features (eyes, nose and mouth). The final system has been evaluated on different sets of face images.
114

Channel-Coded Feature Maps for Computer Vision and Machine Learning

Jonsson, Erik January 2008 (has links)
This thesis is about channel-coded feature maps applied in view-based object recognition, tracking, and machine learning. A channel-coded feature map is a soft histogram of joint spatial pixel positions and image feature values. Typical useful features include local orientation and color. Using these features, each channel measures the co-occurrence of a certain orientation and color at a certain position in an image or image patch. Channel-coded feature maps can be seen as a generalization of the SIFT descriptor with the options of including more features and replacing the linear interpolation between bins by a more general basis function.
The general idea of channel coding originates from a model of how information might be represented in the human brain. For example, different neurons tend to be sensitive to different orientations of local structures in the visual input. The sensitivity profiles tend to be smooth such that one neuron is maximally activated by a certain orientation, with a gradually decaying activity as the input is rotated.
This thesis extends previous work on using channel-coding ideas within computer vision and machine learning. By differentiating the channel-coded feature maps with respect to transformations of the underlying image, a method for image registration and tracking is constructed. By using piecewise polynomial basis functions, the channel coding can be computed more efficiently, and a general encoding method for N-dimensional feature spaces is presented.
Furthermore, I argue for using channel-coded feature maps in view-based pose estimation, where a continuous pose parameter is estimated from a query image given a number of training views with known pose. The optimization of position, rotation and scale of the object in the image plane is then included in the optimization problem, leading to a simultaneous tracking and pose estimation algorithm. Apart from objects and poses, the thesis examines the use of channel coding in connection with Bayesian networks. The goal here is to avoid the hard discretizations usually required when Markov random fields are used on intrinsically continuous signals like depth for stereo vision or color values in image restoration.
Channel coding has previously been used to design machine learning algorithms that are robust to outliers, ambiguities, and discontinuities in the training data. This is obtained by finding a linear mapping between channel-coded input and output values. This thesis extends this method with an incremental version and identifies and analyzes a key feature of the method: it is able to handle a learning situation where the correspondence structure between the input and output space is not completely known. In contrast to a traditional supervised learning setting, the training examples are groups of unordered input-output points, where the correspondence structure within each group is unknown. This behavior is studied theoretically and the effect of outliers and convergence properties are analyzed.
All presented methods have been evaluated experimentally. The work has been conducted within the cognitive systems research project COSPAL funded by EC FP6, and much of the content has been put to use in the final COSPAL demonstrator system.
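The soft-histogram idea described in this abstract can be illustrated in one dimension. The sketch below is not the thesis code. It assumes a quadratic B-spline as the piecewise polynomial basis function and integer channel centres, and it encodes a single scalar feature rather than joint position and feature values. All names are hypothetical.

```python
def bspline2(x):
    """Quadratic B-spline kernel: piecewise polynomial, non-zero for |x| < 1.5."""
    a = abs(x)
    if a <= 0.5:
        return 0.75 - a * a
    if a < 1.5:
        return 0.5 * (1.5 - a) ** 2
    return 0.0

def channel_encode(value, num_channels):
    """Soft-assign a scalar to channels centred at 0, 1, ..., num_channels - 1."""
    return [bspline2(value - k) for k in range(num_channels)]

def soft_histogram(values, num_channels):
    """Accumulate the channel vectors of many samples into one soft histogram."""
    hist = [0.0] * num_channels
    for v in values:
        for k, weight in enumerate(channel_encode(v, num_channels)):
            hist[k] += weight
    return hist

# A value between two centres activates neighbouring channels smoothly
print(channel_encode(2.3, 6))
print(soft_histogram([1.0, 2.3, 3.9], 6))
```

Unlike a hard histogram, a small change in the input value changes the channel weights smoothly. That smoothness is what makes it possible to differentiate the representation with respect to the input, as the abstract describes.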
115

Pose Estimation and Calibration Algorithms for Vision and Inertial Sensors

Hol, Jeroen Diederik January 2008 (has links)
This thesis deals with estimating position and orientation in real-time, using measurements from vision and inertial sensors. A system has been developed to solve this problem in unprepared environments, assuming that a map or scene model is available. Compared to ‘camera-only’ systems, the combination of the complementary sensors yields an accurate and robust system which can handle periods with uninformative or no vision data and reduces the need for high frequency vision updates.
The system achieves real-time pose estimation by fusing vision and inertial sensors using the framework of nonlinear state estimation for which state space models have been developed. The performance of the system has been evaluated using an augmented reality application where the output from the system is used to superimpose virtual graphics on the live video stream. Furthermore, experiments have been performed where an industrial robot providing ground truth data is used to move the sensor unit. In both cases the system performed well.
Calibration of the relative position and orientation of the camera and the inertial sensor turns out to be essential for proper operation of the system. A new and easy-to-use algorithm for estimating these has been developed using a gray-box system identification approach. Experimental results show that the algorithm works well in practice.
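A heavily simplified sketch shows why fusing the two sensors helps during periods without vision data. It assumes a scalar, linear Kalman filter over a one-dimensional position, whereas the thesis works with nonlinear state space models for full position and orientation. The function names, noise values and inputs are illustrative assumptions.

```python
def predict(x, p, displacement, q):
    """Inertial step: move the estimate and let the uncertainty grow by q."""
    return x + displacement, p + q

def update(x, p, z, r):
    """Vision step: blend in measurement z, which has noise variance r."""
    gain = p / (p + r)
    return x + gain * (z - x), (1.0 - gain) * p

def fuse(x0, p0, steps, q, r):
    """Run the filter; each step is (inertial displacement, vision fix or None)."""
    x, p = x0, p0
    history = []
    for displacement, z in steps:
        x, p = predict(x, p, displacement, q)
        if z is not None:  # vision may be missing or uninformative
            x, p = update(x, p, z, r)
        history.append((x, p))
    return history

# Two steps without vision in the middle of the sequence
steps = [(1.0, 1.1), (1.0, None), (1.0, None), (1.0, 4.0)]
for estimate, variance in fuse(0.0, 1.0, steps, q=0.1, r=0.5):
    print(round(estimate, 3), round(variance, 3))
```

While vision is unavailable, the estimate is carried forward by the inertial prediction and its variance grows. The next vision measurement pulls the variance back down.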
116

Système de réalité augmentée basé sur l'observation de structures planes : conception et évaluation / Augmented reality system based on the observation of planar structures: design and evaluation

Vigueras-Gomez, Flavio 29 January 2007 (has links) (PDF)
The goal of Augmented Reality (AR) is to integrate virtual objects into images of a real scene. AR applications require the augmented scene to be continuously updated according to the camera's movements in the scene. It is therefore essential to be able to compute the camera parameters at every instant in order to obtain a coherent composition. However, the computed parameters are often affected by statistical fluctuations, which degrades the visual impression of the augmented scene. The camera stabilization problem was considered by Kanatani and Matsunaga, who classify camera displacements using a number of motion models. We proposed to extend their work to multi-planar environments and to test different model selection criteria, which showed that criteria involving information on the covariance of the computed parameters improve the accuracy and robustness of the computed viewpoints.
Ideally, an AR system should work in an environment without any need to prepare the scene. In this thesis, we consider the problems of computing the viewpoint and the intrinsic camera parameters in multi-planar environments. Such structures are very common both indoors and outdoors, so the range of application of our methods is fairly broad.
Our experimental evaluations show that the strategies proposed here improve the accuracy and stability of the computed camera parameters and, consequently, the quality of the augmented sequences.
117

Palm Programmierung unter Linux / Palm programming under Linux

Jahre, Daniel 12 March 2002 (has links)
PDAs from Palm Inc. and its licensees are popular for managing addresses and appointments. That does not exhaust their potential, however. Anyone who wants to develop applications for Palm PDAs is not necessarily dependent on a Windows-based development environment. Under Linux there are compilers, resource editors and emulators for PalmOS. In my talk I will present and demonstrate these tools and show an example program.
118

Visual surveillance: dynamic behavior analysis at multiple levels

Breitenstein, Michael D. January 2009 (has links)
Also: Zürich, Techn. Hochsch., dissertation, 2009
119

Security with visual understanding: Kinect human recognition capabilities applied in a home security system

Fluckiger, S Joseph 08 August 2012 (has links)
Vision is the most celebrated human sense. Eighty percent of the information humans receive is obtained through vision. Machines capable of capturing images are now ubiquitous, but until recently, they have been unable to recognize objects in the images they capture. In effect, machines have been blind. This paper explores the revolutionary new capability of a camera to recognize whether a human is present in an image and take detailed measurements of the person’s dimensions. It explains how the hardware and software of the camera work to provide this remarkable capability in just 200 milliseconds per image. To demonstrate these capabilities, a home security application called Security with Visual Understanding (SVU) has been built. SVU is a hardware/software solution that detects a human and then performs biometric authentication by comparing the dimensions of the seen person against a database of known people. If the person is unrecognized, an alarm is sounded, and a picture of the intruder is sent via SMS text message to the homeowner. Analysis is performed to measure the tolerance of the SVU algorithm for differentiating between two people based on their body dimensions.
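The authentication step described above, comparing measured body dimensions against a database of known people, can be sketched as a nearest-neighbour lookup with a tolerance. This is an illustration only, not the SVU algorithm. The dimension names, the Euclidean distance and the tolerance value are assumptions.

```python
import math

def match_person(measured, database, tolerance):
    """Return the closest known person's name, or None if nobody is within tolerance."""
    best_name, best_dist = None, float("inf")
    for name, dims in database.items():
        # Euclidean distance over the stored body dimensions (metres)
        dist = math.sqrt(sum((measured[key] - dims[key]) ** 2 for key in dims))
        if dist < best_dist:
            best_name, best_dist = name, dist
    return best_name if best_dist <= tolerance else None

# Hypothetical database of known residents
known_people = {
    "alice": {"height": 1.65, "shoulder_width": 0.38, "arm_length": 0.60},
    "bob": {"height": 1.82, "shoulder_width": 0.45, "arm_length": 0.68},
}

seen = {"height": 1.66, "shoulder_width": 0.39, "arm_length": 0.61}
print(match_person(seen, known_people, tolerance=0.05))
```

The tolerance controls the trade-off the abstract alludes to. If it is too loose, two people with similar dimensions become indistinguishable. If it is too tight, measurement noise causes known people to be rejected.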
120

Steps towards the object semantic hierarchy

Xu, Changhai, 1977- 17 November 2011 (has links)
An intelligent robot must be able to perceive and reason robustly about its world in terms of objects, among other foundational concepts. The robot can draw on rich data for object perception from continuous sensory input, in contrast to the usual formulation that focuses on objects in isolated still images. Additionally, the robot needs multiple object representations to deal with different tasks and/or different classes of objects. We propose the Object Semantic Hierarchy (OSH), which consists of multiple representations with different ontologies. The OSH factors the problems of object perception so that intermediate states of knowledge about an object have natural representations, with relatively easy transitions from less structured to more structured representations. Each layer in the hierarchy builds an explanation of the sensory input stream, in terms of a stochastic model consisting of a deterministic model and an unexplained "noise" term. Each layer is constructed by identifying new invariants from the previous layer. In the final model, the scene is explained in terms of constant background and object models, and low-dimensional dynamic poses of the observer and objects. The OSH contains two types of layers: the Object Layers and the Model Layers. The Object Layers describe how the static background and each foreground object are individuated, and the Model Layers describe how the model for the static background or each foreground object evolves from less structured to more structured representations. 
Each object or background model contains the following layers: (1) 2D object in 2D space (2D2D): a set of constant 2D object views, and the time-variant 2D object poses, (2) 2D object in 3D space (2D3D): a collection of constant 2D components, with their individual time-variant 3D poses, and (3) 3D object in 3D space (3D3D): the same collection of constant 2D components but with invariant relations among their 3D poses, and the time-variant 3D pose of the object as a whole. In building 2D2D object models, a fundamental problem is to segment out foreground objects in the pixel-level sensory input from the background environment, where motion information is an important cue to perform the segmentation. Traditional approaches for moving object segmentation usually appeal to motion analysis on pure image information without exploiting the robot's motor signals. We observe, however, that the background motion (from the robot's egocentric view) has stronger correlation to the robot's motor signals than the motion of foreground objects. Based on this observation, we propose a novel approach to segmenting moving objects by learning homography and fundamental matrices from motor signals. In building 2D3D and 3D3D object models, estimating camera motion parameters plays a key role. We propose a novel method for camera motion estimation that takes advantage of both planar features and point features and fuses constraints from both homography and essential matrices in a single probabilistic framework. Using planar features greatly improves estimation accuracy over using point features only, and with the help of point features, the solution ambiguity from a planar feature is resolved. Compared to the two classic approaches that apply the constraint of either homography or essential matrix, the proposed method gives more accurate estimation results and avoids the drawbacks of the two approaches.
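One of the two constraints mentioned above, the essential matrix constraint, can be checked numerically. The sketch below is not the probabilistic fusion method of the thesis. It only builds E = [t]x R for an assumed camera motion X2 = R X1 + t and evaluates the epipolar residual x2^T E x1 for normalized image points. All names and numeric values are illustrative.

```python
import math

def rotation_y(angle_rad):
    """3x3 rotation about the y axis."""
    c, s = math.cos(angle_rad), math.sin(angle_rad)
    return [[c, 0.0, s], [0.0, 1.0, 0.0], [-s, 0.0, c]]

def skew(t):
    """Cross-product matrix [t]x, so that skew(t) applied to v equals t x v."""
    return [[0.0, -t[2], t[1]], [t[2], 0.0, -t[0]], [-t[1], t[0], 0.0]]

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)] for i in range(3)]

def matvec(A, v):
    return [sum(A[i][j] * v[j] for j in range(3)) for i in range(3)]

def essential_matrix(R, t):
    """E = [t]x R for the motion X2 = R X1 + t between two camera frames."""
    return matmul(skew(t), R)

def normalize(X):
    """Normalized image coordinates (x/z, y/z, 1) of a 3D point."""
    return [X[0] / X[2], X[1] / X[2], 1.0]

def epipolar_residual(E, x1, x2):
    """x2^T E x1; zero for a noise-free correspondence of a static point."""
    return sum(a * b for a, b in zip(x2, matvec(E, x1)))

# Hypothetical camera motion and a static scene point seen in both frames
R, t = rotation_y(0.1), [0.5, 0.0, 0.1]
X1 = [0.3, -0.2, 4.0]
X2 = [a + b for a, b in zip(matvec(R, X1), t)]

E = essential_matrix(R, t)
print(epipolar_residual(E, normalize(X1), normalize(X2)))
```

A point that moves independently of the camera would generally violate this constraint. That is the intuition behind using such matrices to separate foreground objects from the static background.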
