31

3D Vision Geometry for Rolling Shutter Cameras / Géométrie pour la vision 3D avec des caméras Rolling Shutter

Lao, Yizhen 16 May 2019 (has links)
Many modern CMOS cameras are equipped with Rolling Shutter (RS) sensors. These low-cost, low-power cameras can reach very high acquisition rates. In this acquisition mode, the rows of pixels are exposed sequentially from the top to the bottom of the image. Consequently, images captured while the camera and/or the scene is moving exhibit distortions that make classical algorithms at best less accurate, at worst unusable because of singularities or degenerate configurations. The goal of this thesis is to revisit the geometry of 3D vision with RS cameras, proposing solutions for each sub-task of the Structure-from-Motion (SfM) pipeline. Chapter II presents a new RS correction method based on straight lines. Unlike existing methods, which are iterative and rely on the so-called Manhattan World (MW) assumption, our solution is linear and imposes no constraint on the orientation of the 3D lines. Moreover, the method is embedded in a RANSAC-like scheme that separates curves which are projections of straight segments from those corresponding to genuinely curved 3D structures, making the correction more robust and fully automated. Chapter III revisits bundle adjustment (BA). We propose a new algorithm based on a reprojection error in which the row index of the projected points varies during the optimization, preserving geometric consistency, unlike existing methods which use a fixed index (the one measured in the image). We show that this removes the degeneracy that occurs when the scan directions of the images are too similar (a very common situation with cameras mounted on a vehicle, for example).
Chapter IV extends the concept of homography to RS images, showing that the point-to-point relation between two images of a coplanar point cloud can be expressed by 3 to 7 matrices of size 3x3, depending on the motion model used. We propose a linear method for computing these matrices, which are then used to solve two classical computer vision problems in the RS case, namely relative motion estimation and mosaicing. Chapter V addresses pose computation and multi-view reconstruction by drawing an analogy with methods for deformable surfaces such as SfT (Structure-from-Template) and NRSfM (Non-Rigid Structure-from-Motion). We show that an RS image of a rigid scene in motion can be interpreted as a Global Shutter (GS) image of a virtually deformed surface (deformed by the RS effect). The proposed solution for estimating the pose and the 3D structure of the scene therefore proceeds in two steps: first, the virtual deformations are computed with SfT or NRSfM under a classical GS model (relaxing the RS model); then these deformations are reinterpreted as the result of the motion during acquisition (reintroducing the RS model). The proposed approach thus has better convergence properties than existing ones. / Many modern CMOS cameras are equipped with Rolling Shutter (RS) sensors, which are low-cost, low-consumption and fast. In this acquisition mode, the pixel rows are exposed sequentially from the top to the bottom of the image. Therefore, images captured by moving RS cameras exhibit distortions (e.g. wobble and skew) which make classical algorithms at best less precise, at worst unusable due to singularities or degeneracies.
The goal of this thesis is to propose a general framework for modelling and solving structure from motion (SfM) with RS cameras. Our approach consists in addressing each sub-task of the SfM pipeline (namely image correction, absolute and relative pose estimation, and bundle adjustment) and proposing improvements. The first part of this manuscript presents a novel RS correction method which uses line features. Unlike existing methods, which use iterative solutions and make the Manhattan World (MW) assumption, our method R4C computes the camera instantaneous motion linearly using few image features. Besides, the method is integrated into a RANSAC-like framework which enables us to detect curves that correspond to actual 3D straight lines and to reject outlier curves, making image correction more robust and fully automated. The second part revisits Bundle Adjustment (BA) for RS images. It deals with a limitation of existing RS bundle adjustment methods in the case of close read-out directions among RS views, which is a common configuration in many real-life applications. In contrast, we propose a novel camera-based RS projection algorithm and incorporate it into RSBA to calculate reprojection errors. We found that this new algorithm makes SfM survive the degenerate configuration mentioned above. The third part proposes a new RS homography matrix based on point correspondences from an RS pair. Linear solvers for the computation of this matrix are also presented; specifically, a practical solver using 13 point correspondences is proposed. In addition, we present two essential applications in computer vision that use the RS homography: plane-based RS relative pose estimation and RS image stitching. The last part of this thesis studies the absolute camera pose problem (PnP) and SfM, handling RS effects by drawing analogies with non-rigid vision, namely Shape-from-Template (SfT) and Non-rigid SfM (NRSfM) respectively.
Unlike all existing methods, which perform 3D-2D registration after augmenting the Global Shutter (GS) projection model with velocity parameters under various kinematic models, we propose to use local differential constraints. The proposed methods outperform the state-of-the-art and handle configurations that are critical for existing methods.
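The camera-based RS projection idea above — each image row is exposed at its own instant, so the row index of a projection must be solved jointly with the camera motion — can be sketched as follows. This is a minimal illustration, not the thesis's algorithm: it assumes a pinhole model, a constant translational velocity during readout, and hypothetical parameter names (`v`, `line_delay`).

```python
import numpy as np

def project_gs(X, K):
    """Global-shutter pinhole projection of a 3D point X (camera frame)."""
    x = K @ X
    return x[:2] / x[2]

def project_rs(X, K, v, line_delay, n_iter=10):
    """Rolling-shutter projection: the camera translates with velocity v,
    and row y is exposed at time t = y * line_delay, so the row index of
    the projection depends on the projection itself.  Solve the fixed
    point y = row(project(X - v * y * line_delay)) by simple iteration."""
    y = project_gs(X, K)[1]          # initial guess: global-shutter row
    for _ in range(n_iter):
        y = project_gs(X - v * (y * line_delay), K)[1]
    return project_gs(X - v * (y * line_delay), K)

K = np.array([[500.0, 0.0, 320.0],
              [0.0, 500.0, 240.0],
              [0.0, 0.0, 1.0]])
X = np.array([0.2, 0.3, 2.0])        # a 3D point in the camera frame
v = np.array([0.0, 5e-4, 0.0])       # camera motion per row-time (assumed)
gs = project_gs(X, K)
rs = project_rs(X, K, v, line_delay=1.0)
```

The fixed-point iteration converges quickly here because, over one frame, the projected row depends only weakly on the motion; keeping this row index consistent during optimization is what the corrected reprojection error above requires.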
32

Lipreading across multiple views

Lucey, Patrick Joseph January 2007 (has links)
Visual information from a speaker's mouth region is known to improve automatic speech recognition (ASR) robustness, especially in the presence of acoustic noise. Currently, the vast majority of audio-visual ASR (AVASR) studies assume frontal images of the speaker's face, which is a rather restrictive human-computer interaction (HCI) scenario. The lack of research into AVASR across multiple views has been dictated by the lack of large corpora that contain varying pose/viewpoint speech data. Recently, research has concentrated on recognising human behaviours within "meeting" or "lecture" type scenarios via "smart-rooms". This has resulted in the collection of audio-visual speech data which allows the recognition of visual speech from both frontal and non-frontal views. Using this data, the main focus of this thesis was to investigate and develop various methods, within the confines of a lipreading system, which can recognise visual speech across multiple views. This research constitutes the first published work within the field which looks at this particular aspect of AVASR. The task of recognising visual speech from non-frontal views (i.e. profile) is in principle very similar to that of frontal views, requiring the lipreading system to initially locate and track the mouth region and subsequently extract visual features. However, this task is far more complicated than the frontal case, because the facial features required to locate and track the mouth lie in a much more limited spatial plane. Nevertheless, accurate mouth region tracking can be achieved by employing techniques similar to frontal facial feature localisation. Once the mouth region has been extracted, the same visual feature extraction process as in the frontal view can take place. A novel contribution of this thesis is to quantify the degradation in lipreading performance between the frontal and profile views.
In addition to this, novel patch-based analysis of the various views is conducted, and as a result a novel multi-stream patch-based representation is formulated. Having a lipreading system which can recognise visual speech from both frontal and profile views is a novel contribution to the field of AVASR. However, given both the frontal and profile viewpoints, this begs the question: is there any benefit in having the additional viewpoint? Another major contribution of this thesis is an exploration of a novel multi-view lipreading system. This system shows that there does exist complementary information in the additional viewpoint (possibly that of lip protrusion), with superior performance achieved in the multi-view system compared to the frontal-only system. Even though a multi-view lipreading system which can recognise visual speech from both frontal and profile views is very beneficial, it can hardly be considered realistic, as each particular viewpoint is dedicated to a single pose (i.e. frontal or profile). In an effort to make the lipreading system more realistic, a unified system based on a single camera was developed which enables a lipreading system to recognise visual speech from both frontal and profile poses. This is called pose-invariant lipreading. Pose-invariant lipreading can be performed on either stationary or continuous tasks. Methods which effectively normalise the various poses into a single pose were investigated for the stationary scenario and, in another contribution of this thesis, an algorithm based on regularised linear regression was employed to project all the visual speech features into a uniform pose. This particular method is shown to be beneficial when the lipreading system is biased towards the dominant pose (i.e. frontal). The final contribution of this thesis is the formulation of a continuous pose-invariant lipreading system which contains a pose-estimator at the start of the visual front-end.
This system highlights the complexity of developing such a system, as introducing more flexibility within the lipreading system invariably means the introduction of more error. All the work contained in this thesis presents novel and innovative contributions to the field of AVASR, and hopefully this will aid in the future deployment of an AVASR system in realistic scenarios.
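The pose-normalisation step described above — projecting visual speech features into a uniform pose with regularised linear regression — can be sketched with ridge regression on synthetic features. The data, dimensions, and regularisation weight below are illustrative assumptions, not values from the thesis.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical paired data: visual-speech feature vectors from the profile
# view (X) and the corresponding frontal-view features (Y).
n, d = 200, 20
X = rng.normal(size=(n, d))
A_true = rng.normal(size=(d, d))            # unknown pose mapping (synthetic)
Y = X @ A_true + 0.01 * rng.normal(size=(n, d))

# Ridge (regularised linear) regression: A = (X^T X + lam I)^{-1} X^T Y
lam = 1e-3
A = np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ Y)

# Project new profile-view features into the frontal pose
X_new = rng.normal(size=(5, d))
Y_hat = X_new @ A
```

The regulariser keeps the mapping well conditioned when the training set is small relative to the feature dimension, which is the usual situation for the minority (profile) pose.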
33

3D POSE ESTIMATION IN THE CONTEXT OF GRIP POSITION FOR PHRI

Norman, Jacob January 2021 (has links)
For human-robot interaction with the intent to grip a human arm, it is necessary that the ideal gripping location can be identified. In this work, the gripping location is situated on the arm, so it can be extracted using the positions of the wrist and elbow joints. To achieve this, human pose estimation is proposed, as there exist robust methods that work both inside and outside of lab environments. One such example is OpenPose, which, thanks to the COCO and MPII datasets, has recorded impressive results in a variety of different scenarios in real time. However, most of the images in these datasets are taken from a camera mounted at chest height, of people who, in the majority of the images, are oriented upright. This presents the potential problem that prone humans, the primary focus of this project, cannot be detected, especially if seen from an angle that makes the human appear upside down in the camera frame. To remedy this, two different approaches were tested, both aimed at creating a rotation-invariant 2D pose estimation method. The first method rotates the COCO training data in an attempt to create a model that can find humans regardless of their orientation in the image. The second approach adds a RotationNet as a preprocessing step to correctly orient the images so that OpenPose can be used to estimate the 2D pose, before rotating the resulting skeletons back.
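The first approach above, rotating the COCO training data, requires rotating the keypoint annotations together with each image. A minimal sketch of the keypoint part (the image itself would be rotated with the same transform; the frame size and joint coordinates are illustrative):

```python
import numpy as np

def rotate_keypoints(kps, angle_deg, w, h):
    """Rotate 2D keypoints (N x 2 pixel coordinates) about the image
    centre, matching a rotation applied to the image itself."""
    a = np.deg2rad(angle_deg)
    R = np.array([[np.cos(a), -np.sin(a)],
                  [np.sin(a),  np.cos(a)]])
    centre = np.array([w / 2.0, h / 2.0])
    return (kps - centre) @ R.T + centre

# e.g. wrist and elbow annotations in a 640x480 frame (illustrative)
kps = np.array([[320.0, 100.0], [320.0, 200.0]])
rotated = rotate_keypoints(kps, 180.0, 640, 480)
```

Applying this over a range of angles yields training pairs in which people appear in arbitrary orientations, including upside down.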
34

SIMULATED AND EXPERIMENTAL KINEMATIC CALIBRATION OF A 4 DEGREES OF FREEDOM PARALLEL MANIPULATOR

Horne, Andrew 07 January 2013 (has links)
This thesis discusses the kinematic calibration of the constraining linkage of a four-degrees-of-freedom parallel manipulator. The manipulator has hybrid actuation of joints and wires; however, the wires are not considered in this calibration. Two of the passive joints of the manipulator contain sensing, so the calibration of the constraining linkage can be considered. Four kinematic models are developed for the manipulator. For each of these models, an independent set of model parameters is identified through an analysis of the augmented identification Jacobian matrix. Three different methods for formulating the augmented identification Jacobian matrix are explored. For the calibration, an optical tracking system is used to track the end-effector of the manipulator. The procedure to collect the calibration data is explained and the sources of error are considered. To further analyse the sources of error, simulated input data is created, and the calibrations using the experimental data and the simulated data are compared. In an attempt to improve the calibration, the selection of measured poses to be used for calibration is explored. Several different pose selection criteria have been proposed in the literature, and five are evaluated in this work. The pose selection criteria were applied to the experimental manipulator and also to a simulated two-degrees-of-freedom manipulator. It is found that the pose selection criteria have a large impact when few poses are used; however, the best results occur when a large number of poses is used for the calibration. An experimental calibration is carried out for the manipulator. Using the joint encoders and the kinematic model, the expected pose of the end-effector is calculated. The actual pose is measured using a vision tracking system, and the difference between the actual and expected poses is minimized by adjusting the model parameters using a nonlinear optimization method.
/ Thesis (Master, Mechanical and Materials Engineering) -- Queen's University, 2013-01-06 22:46:05.076
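The final step described above — adjusting model parameters so the expected end-effector pose matches the measured one via nonlinear optimization — can be sketched on a toy planar two-link arm. This is an illustration of the idea only (Gauss-Newton with a numeric Jacobian on noiseless synthetic measurements), not the manipulator or models from the thesis:

```python
import numpy as np

def fk(lengths, thetas):
    """Forward kinematics of a toy planar 2R arm (lengths = link lengths)."""
    l1, l2 = lengths
    t1, t2 = thetas[:, 0], thetas[:, 1]
    x = l1 * np.cos(t1) + l2 * np.cos(t1 + t2)
    y = l1 * np.sin(t1) + l2 * np.sin(t1 + t2)
    return np.stack([x, y], axis=1)

rng = np.random.default_rng(1)
true_lengths = np.array([1.00, 0.80])
thetas = rng.uniform(-np.pi, np.pi, size=(30, 2))   # joint-encoder readings
measured = fk(true_lengths, thetas)                 # "tracked" end-effector

params = np.array([0.90, 0.95])                     # nominal (wrong) model
eps = 1e-6
for _ in range(20):                                 # Gauss-Newton iterations
    r = (fk(params, thetas) - measured).ravel()     # pose error to minimize
    J = np.empty((r.size, params.size))
    for j in range(params.size):                    # numeric Jacobian
        p = params.copy()
        p[j] += eps
        J[:, j] = ((fk(p, thetas) - measured).ravel() - r) / eps
    params = params - np.linalg.solve(J.T @ J, J.T @ r)
```

The identifiability analysis in the thesis corresponds to checking which columns of this Jacobian are independent before attempting the optimization.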
35

Concept Design and Testing of a GPS-less System for Autonomous Shovel-Truck Spotting

OWENS, BRETT 29 January 2013 (has links)
Haul truck drivers frequently have difficulties spotting beside shovels, typically due to a combination of reduced visibility and poor mining conditions. Based on first-hand data collected from the Goldstrike Open Pit, it was learned that, on average, 9% of all spotting actions required corrective movements to facilitate loading. This thesis investigates an automated solution to haul truck spotting that does not rely on the satellite global positioning system (GPS), since GPS can perform unreliably. This thesis proposes that if spotting were automated, a significant decrease in cycle times could result. Using conventional algorithms and techniques from the field of mobile robotics, vehicle pose estimation and control algorithms were designed to enable autonomous shovel-truck spotting. The developed algorithms were verified by both simulation and field testing with real hardware. Tests were performed in analog conditions on an automation-ready Kubota RTV 900 utility truck. When initiated from a representative pose, the RTV successfully spotted to the desired location (within 1 m) in 95% of the conducted trials. The results demonstrate that the proposed approach is a strong candidate for an auto-spot system. / Thesis (Master, Mining Engineering) -- Queen's University, 2013-01-28 09:49:20.584
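The control side of such an auto-spot system can be illustrated with a textbook go-to-goal controller for a unicycle-model vehicle; the gains, goal, and step counts below are illustrative assumptions, not the controller developed in the thesis:

```python
import numpy as np

def step(pose, goal, dt=0.1, k_v=0.5, k_w=1.5):
    """One step of a textbook go-to-goal controller for a unicycle
    model; pose = (x, y, heading)."""
    x, y, th = pose
    dx, dy = goal[0] - x, goal[1] - y
    rho = np.hypot(dx, dy)                  # distance to the spot point
    alpha = np.arctan2(dy, dx) - th         # bearing error
    alpha = (alpha + np.pi) % (2.0 * np.pi) - np.pi
    v = k_v * rho * np.cos(alpha)           # slow down as the spot nears
    w = k_w * alpha
    return np.array([x + v * np.cos(th) * dt,
                     y + v * np.sin(th) * dt,
                     th + w * dt])

pose = np.array([0.0, 0.0, 0.0])            # truck start (x, y, heading)
goal = (5.0, 3.0)                           # desired spot location
for _ in range(600):
    pose = step(pose, goal)
final_err = float(np.hypot(goal[0] - pose[0], goal[1] - pose[1]))
```

A real system would feed this loop with the GPS-less pose estimate rather than ground truth, and would add the heading-at-goal constraint needed to spot alongside a shovel.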
36

Mots visuels pour le calcul de pose / Visual words for pose computation

Bhat, Srikrishna 22 January 2013 (has links)
We address the problem of matching points across images in order to compute the camera pose with the Perspective-n-Point (PnP) algorithm. We compute the 3D map, i.e. the 3D coordinates and visual characteristics of some points in the environment, through an offline learning procedure using a set of training images. Given a new image, we apply PnP to the 2D image coordinates of 3D points detected using the 3D map. During the learning phase, we cluster the SIFT descriptors extracted from the training images to obtain, for some of the 3D points in the environment, collections of their 2D positions in these images. SFM (Structure from Motion) is then applied to obtain the coordinates of the corresponding 3D points. During the test phase, the SIFT descriptors associated with the 2D projections of a 3D point of the map are used to recognize that 3D point in a given image. The framework is similar to the visual words used in various fields of computer vision: during learning, visual words are formed through clustering, and during testing, 3D points are identified through visual word recognition. We experiment with different clustering methods (k-means and mean-shift) and propose a new visual-word formation scheme for the learning phase. We use different matching rules, including some standard supervised classification methods, to perform visual word recognition during the test phase. We evaluate these different strategies in both stages. To ensure robustness to pose variations between training and test images, we explore different ways of integrating SIFT descriptors extracted from synthetic views generated from the training images. We also propose an exact acceleration strategy for the mean-shift algorithm.
/ We address the problem of establishing point correspondences in images for computing camera pose through the Perspective-n-Point (PnP) algorithm. We compute the 3D map, i.e. the 3D coordinates and visual characteristics of some of the points in the environment, through an offline training stage using a set of training images. Given a new test image, we apply PnP using the 2D coordinates of 3D points in the image, detected by using the 3D map. During the training stage we cluster the SIFT descriptors extracted from the training images to obtain 2D-tracks of some of the 3D points in the environment. Each 2D-track consists of a set of 2D image coordinates of a single 3D point in different training images. SfM (Structure from Motion) is performed on these 2D-tracks to obtain the coordinates of the corresponding 3D points. During the test stage, the SIFT descriptors associated with the 2D-track of a 3D point are used to recognize the 3D point in a given image. The overall process is similar to the visual word framework used in different fields of computer vision. During training, visual word formation is performed through clustering, and during testing 3D points are identified through visual word recognition. We experiment with different clustering schemes (k-means and mean-shift) and propose a novel scheme for visual word formation in the training stage. We use different matching rules, including some of the popular supervised pattern classification methods, to perform visual word recognition during the test stage. We evaluate these various matching strategies in both stages.
In order to achieve robustness against pose variation between training and test images, we explore different ways of incorporating SIFT descriptors extracted from synthetic views generated from the training images. We also propose an exact acceleration strategy for mean-shift computation.
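The visual-word pipeline described above — cluster training descriptors into words, then recognise a test descriptor by its nearest word — can be sketched with a plain k-means on synthetic stand-ins for SIFT descriptors. The dimensions, noise levels, and deterministic initialisation are illustrative choices (real SIFT descriptors are 128-D):

```python
import numpy as np

def kmeans(X, centres, n_iter=50):
    """Plain k-means (Lloyd's algorithm) from the given initial centres."""
    centres = centres.copy()
    for _ in range(n_iter):
        d = np.linalg.norm(X[:, None, :] - centres[None, :, :], axis=2)
        labels = d.argmin(axis=1)
        for j in range(len(centres)):
            if np.any(labels == j):
                centres[j] = X[labels == j].mean(axis=0)
    return centres

rng = np.random.default_rng(3)
# Stand-ins for descriptors of 3 distinct 3D points, each observed with
# noise in 40 training images.
true_points = rng.normal(size=(3, 16)) * 5.0
train = np.vstack([p + 0.1 * rng.normal(size=(40, 16)) for p in true_points])
# Naive but deterministic init: one seed descriptor per point's track.
words = kmeans(train, train[[0, 40, 80]])

# Recognition: a descriptor from a test image votes for its nearest word.
test_desc = true_points[1] + 0.1 * rng.normal(size=16)
word_id = int(np.linalg.norm(words - test_desc, axis=1).argmin())
```

In the full system, each recognised word carries the 3D coordinates recovered by SfM, and the resulting 2D-3D matches are fed to PnP.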
37

A Ladar-Based Pose Estimation Algorithm for Determining Relative Motion of a Spacecraft for Autonomous Rendezvous and Dock

Fenton, Ronald Christopher 01 May 2008 (has links)
Future autonomous space missions will require autonomous rendezvous and docking operations. The servicing spacecraft must be able to determine the relative 6-degree-of-freedom (6 DOF) motion between the vehicle and the target spacecraft. One method to determine the relative 6 DOF position and attitude is 3D ladar imaging. Ladar sensor systems can capture close-proximity range images of the target spacecraft, producing 3D point cloud data sets. These sequentially collected point-cloud data sets are then registered with one another using a point-correspondence-less variant of the Iterative Closest Point (ICP) algorithm to determine the relative 6 DOF displacements. Simulation experiments were performed and indicated that the mean-squared error (MSE), angular error, mean, and standard deviations of the position and orientation estimates did not vary as a function of position and attitude, and met most minimum angular and translational error requirements for rendezvous and docking. Furthermore, the computational times required by this algorithm were comparable to previously reported point-to-point and point-to-plane ICP variants for single iterations once initialization had been performed.
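The core of ICP-style registration is the least-squares rigid alignment of two paired point sets (the Kabsch/SVD step that ICP repeats after each correspondence search). A sketch on noiseless synthetic "clouds" with a known relative motion — an illustration of that inner step, not the correspondence-less variant used in the thesis:

```python
import numpy as np

def rigid_align(P, Q):
    """Least-squares rigid transform (R, t) such that Q ~ P @ R.T + t,
    for paired (N, 3) point sets (the Kabsch/SVD alignment step)."""
    cP, cQ = P.mean(axis=0), Q.mean(axis=0)
    H = (P - cP).T @ (Q - cQ)
    U, _, Vt = np.linalg.svd(H)
    D = np.diag([1.0, 1.0, np.sign(np.linalg.det(Vt.T @ U.T))])
    R = Vt.T @ D @ U.T                      # D guards against a reflection
    t = cQ - R @ cP
    return R, t

rng = np.random.default_rng(2)
P = rng.normal(size=(100, 3))               # "previous" ladar point cloud
angle = np.deg2rad(10.0)                    # ground-truth relative motion
R_true = np.array([[np.cos(angle), -np.sin(angle), 0.0],
                   [np.sin(angle),  np.cos(angle), 0.0],
                   [0.0, 0.0, 1.0]])
t_true = np.array([0.1, -0.2, 0.05])
Q = P @ R_true.T + t_true                   # "current" cloud, same points

R_est, t_est = rigid_align(P, Q)
```

With real scans the point pairings are unknown, so ICP alternates between a correspondence (or correspondence-free) association step and this closed-form alignment until the 6 DOF estimate converges.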
38

Belief driven autonomous manipulator pose selection for less controlled environments

Webb, Stephen Scott, Mechanical & Manufacturing Engineering, Faculty of Engineering, UNSW January 2008 (has links)
This thesis presents a new approach for selecting a manipulator arm configuration (a pose) in an environment where the positions of the work items cannot be fully controlled. The approach utilizes a belief formed from a priori knowledge, observations and predictive models to select manipulator poses and motions. Standard methods for manipulator control provide a fully specified Cartesian pose as the input to a robot controller which is assumed to act as an ideal Cartesian motion device. While this approach simplifies the controller and makes it more portable, it is not well suited to less-controlled environments where the work item position or orientation may not be completely observable and where a measure of the accuracy of the available observations is required. The proposed approach suggests selecting a manipulator configuration using two types of rating function. When uncertainty is high, configurations are rated by combining a belief, represented by a probability density function, and a value function in a decision-theoretic manner, enabling selection of the sensor's motion based on its probabilistic contribution to information gain. When uncertainty is low, the mean or mode of the environment state probability density function is utilized in task-specific linear or angular distance constraints to map a configuration to a cost. The contribution of this thesis is in providing two formulations that allow joint configurations to be found using non-linear optimization algorithms. The first formulation shows how task-specific linear and angular distance constraints are combined in a cost function to enable a satisfying pose to be selected. The second formulation is based on the probabilistic belief of the predicted environment state.
This belief is formed by utilizing a Bayesian estimation framework to combine the a priori knowledge with the output of sensor data processing, a likelihood function over the state space, thereby handling the uncertainty associated with sensing in a less controlled environment. Forward models are used to transform the belief to a predicted state which is utilized in motion selection to provide the benefits of a feedforward control strategy. Extensive numerical analysis of the proposed approach shows that using the fed-forward belief improves tracking performance by up to 19%. It is also shown that motion selection based on the dynamically maintained belief reduces time to target detection by up to 50% compared to two other control approaches. These and other results show how the proposed approach is effectively able to utilize an uncertain environment state belief to select manipulator arm configurations.
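The belief machinery described above can be illustrated with a one-dimensional toy: a grid belief over a work-item position, Bayesian updates from noisy observations, and the switch to acting on the belief's mode once uncertainty (entropy) is low. The sensor model and all numbers are illustrative assumptions, not the thesis's formulation:

```python
import numpy as np

grid = np.linspace(0.0, 1.0, 101)        # candidate work-item positions (m)
belief = np.ones_like(grid) / grid.size  # uniform prior: high uncertainty

def update(belief, grid, z, sigma=0.05):
    """Bayes update of the belief with a noisy position observation z,
    using a Gaussian sensor model p(z | position)."""
    post = belief * np.exp(-0.5 * ((z - grid) / sigma) ** 2)
    return post / post.sum()

def entropy(b):
    nz = b[b > 0]
    return float(-(nz * np.log(nz)).sum())

h0 = entropy(belief)
belief = update(belief, grid, z=0.62)    # first observation
belief = update(belief, grid, z=0.60)    # second observation
h1 = entropy(belief)                     # uncertainty has dropped...
mode = grid[np.argmax(belief)]           # ...so act on the belief's mode
```

In the thesis's terms, the high-entropy regime would drive sensor motions by expected information gain, while the low-entropy regime plugs the mode (or mean) into the distance-constraint cost function.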
39

Reconstruction de Surfaces Réfléchissantes à partir d'Images / Reconstruction of Reflective Surfaces from Images

Bonfort, Thomas 20 February 2006 (has links) (PDF)
The reconstruction of specular surfaces from images is a relatively unexplored field, owing to the uncommon nature of such objects and the complexity involved compared to matte surfaces. This is because the apparent texture of such surfaces depends on the viewpoint or, put differently, because the light path between a point of interest and a pixel is not a straight line. Consequently, most reconstruction algorithms ignore specular contributions, whereas we show that the constraints they provide yield accurate geometric information on position and orientation, without the continuity or smoothness constraints usually required. This thesis presents two methods for obtaining the position and orientation of points on a perfectly specular surface from the reflection of known surrounding points. The first extends space-carving approaches and obtains voxels of a specular object using a geometric consistency measure rather than a photometric one, which is meaningless in this case. The second proceeds by triangulation, assuming a fixed camera observing the reflection of at least 2 known points per surface point to be reconstructed. Finally, we propose methods to obtain the pose of calibration objects that are outside the field of view of a camera, through the reflection of specular objects. The first assumes that the object is seen through the reflection of 3 unknown planar mirrors, and additionally recovers the pose of these mirrors. The second presents a geometric constraint that theoretically allows the pose of such an object, placed at two different locations, to be obtained when it is seen through the reflection of an arbitrary specular surface.
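The geometric constraint underlying the triangulation method — at a specular point, the law of reflection forces the surface normal to bisect the direction to the camera and the direction to the reflected known point — can be sketched directly. The scene layout below is an illustrative assumption:

```python
import numpy as np

def normal_from_reflection(c, m, p):
    """Surface normal at specular point m: by the law of reflection it
    bisects the unit directions from m to the camera centre c and from
    m to the known reflected point p."""
    d_cam = (c - m) / np.linalg.norm(c - m)
    d_pt = (p - m) / np.linalg.norm(p - m)
    n = d_cam + d_pt
    return n / np.linalg.norm(n)

c = np.array([-1.0, 0.0, 0.0])   # camera centre (illustrative layout)
p = np.array([1.0, 0.0, 0.0])    # known point seen via the mirror
m = np.array([0.0, 0.0, 1.0])    # specular point on the plane z = 1
n = normal_from_reflection(c, m, p)
```

With at least two known points reflected at the same pixel, the same constraint also fixes the depth of the specular point along the viewing ray, which is the basis of the triangulation method above.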
40

Robust Real-Time Estimation of Region Displacements in Video Sequences

Skoglund, Johan January 2007 (has links)
The possibility to use real-time computer vision in video sequences gives a system many opportunities to interact with the environment. Possible forms of interaction are, e.g., augmented reality as in the MATRIS project, where the purpose is to add new objects into the video sequence, or surveillance, where the purpose is to find abnormal events.
The increase in the speed of computers in recent years has simplified this process, and it is now possible to use at least some of the more advanced computer vision algorithms that are available. Computational speed is, however, still a problem: an efficient real-time system requires efficient code and methods. This thesis deals with both problems; one part is about efficient implementations using single instruction multiple data (SIMD) instructions, and one part is about robust tracking.
An efficient real-time system requires efficient implementations of the computer vision methods used. Efficient implementations require knowledge about the CPU and the possibilities it offers. In this thesis, one such technique, SIMD, is explained. SIMD is useful when the same operation is applied to multiple data, which is usually the case in computer vision, where the same operation is executed on each pixel.
Following the position of a feature or object in a video sequence is called tracking. Tracking can be used for a number of applications; the application in this thesis is pose estimation. One way to do tracking is to cut out a small region around the feature, creating a patch, and to find the position of this patch in the other frames. To find the position, a measure of the difference between the patch and the image at a given position is used. This thesis thoroughly investigates the sum of absolute differences (SAD) error measure. The investigation involves different ways to improve the robustness and to decrease the average error. A method to estimate the average error, the covariance of the position error, is proposed; an estimate of the average error is needed when different measurements are combined.
Finally, a system for camera pose estimation is presented. The computer vision part of this system is based on the results in this thesis. The presentation also contains a discussion of the results of this system. / Report code: LIU-TEK-LIC-2007:5. The report code in the thesis is incorrect.
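The SAD error measure investigated above can be sketched as a brute-force patch search; the image size, patch location, and noise level are illustrative:

```python
import numpy as np

def sad_match(image, patch):
    """Exhaustive search for the patch position minimising the sum of
    absolute differences (SAD) with the image."""
    H, W = image.shape
    h, w = patch.shape
    best, best_pos = np.inf, (0, 0)
    for r in range(H - h + 1):
        for col in range(W - w + 1):
            sad = np.abs(image[r:r + h, col:col + w] - patch).sum()
            if sad < best:
                best, best_pos = sad, (r, col)
    return best_pos, best

rng = np.random.default_rng(4)
image = rng.uniform(0.0, 255.0, size=(60, 80))     # one video frame
patch = image[22:30, 45:55].copy()                 # region around a feature
noisy = patch + rng.normal(0.0, 1.0, size=patch.shape)
pos, err = sad_match(image, noisy)
```

The attraction of SAD, and the reason it pairs well with SIMD, is that the inner loop is a stream of identical subtract/absolute/accumulate operations over pixels.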
