1 |
Vers l’immersion mobile en réalité augmentée : une approche basée sur le suivi robuste de cibles naturelles et sur l’interaction 3D / Toward mobile immersion in augmented reality: an approach based on robust natural feature tracking and 3D interaction
Bellarbi, Abdelkader, 26 April 2017
Pose estimation and 3D interaction are the foundations of any augmented reality (AR) system. This thesis addresses both problems, with the aim of an AR system that supports mobile immersion and natural interaction. It first presents a state of the art covering the approaches, techniques, and technologies for pose estimation and 3D interaction in AR, and reviews the work carried out to date.
Our contributions fall into two areas: vision and 3D interaction. We propose MOBIL, a new binary detector and descriptor that performs binary comparisons of the geometric moments of an image patch. We then propose two improvements to this descriptor, MOBIL_2B and POLAR_MOBIL, to enhance its robustness. We also combine the descriptor with the PTAM (Parallel Tracking and Mapping) approach to estimate the user's pose, and thus keep virtual objects registered, during mobile immersion, for both the selection/manipulation task and the navigation task.
We further propose a 3D interaction technique for AR called “Zoom-In”, which facilitates the selection and manipulation of distant virtual objects. The technique zooms the captured image together with the virtual objects registered on it and computes the 3D transformation relative to the selected object. Virtual objects are brought within arm's reach of the user while remaining registered to the scene, and the user pose is re-estimated with the proposed descriptor. The thesis ends with a general conclusion that summarizes the essentials of this work and outlines perspectives and future work.
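The idea of a binary descriptor built from comparisons of geometric moments can be illustrated with a small sketch. This is not the MOBIL implementation: the 2x2 sub-region layout, the choice of moment orders, the all-pairs comparison pattern, and the function names `raw_moment`, `moment_descriptor`, and `hamming` are all assumptions made for illustration.

```python
def raw_moment(patch, p, q, x0, y0, x1, y1):
    """Geometric moment m_pq of one sub-region, using coordinates local to it."""
    return sum(((x - x0) ** p) * ((y - y0) ** q) * patch[y][x]
               for y in range(y0, y1) for x in range(x0, x1))


def moment_descriptor(patch, grid=2, orders=((0, 0), (1, 0), (0, 1))):
    """Bit list: for each moment order, compare every pair of sub-regions.

    Assumes a square patch whose side is a multiple of `grid`.
    """
    step = len(patch) // grid
    cells = [(cx * step, cy * step, (cx + 1) * step, (cy + 1) * step)
             for cy in range(grid) for cx in range(grid)]
    bits = []
    for p, q in orders:
        values = [raw_moment(patch, p, q, *cell) for cell in cells]
        for i in range(len(values)):
            for j in range(i + 1, len(values)):
                # One binary test per pair of sub-regions.
                bits.append(1 if values[i] > values[j] else 0)
    return bits


def hamming(a, b):
    """Number of differing bits; the usual distance for binary descriptors."""
    return sum(x != y for x, y in zip(a, b))


patch = [[1, 2, 9, 9],
         [3, 4, 9, 9],
         [0, 0, 5, 6],
         [0, 0, 7, 8]]
descriptor = moment_descriptor(patch)
```

Because every bit only records which of two equally sized sub-regions has the larger moment, a uniform gain or offset applied to the whole patch leaves the bits of this sketch unchanged.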
|
2 |
Zhiwen_Dissertation.pdf
Zhiwen Cao (15347242), 29 April 2023
In this work, we present a novel approach to the mathematical representation of facial pose, followed by the design of a neural network (NN) capable of leveraging these representations to solve the task of facial pose estimation. Our core contribution lies in the development of advanced mathematical representations for face orientation, which include: 1) a three-column-vector-based representation, 2) an Anisotropic Spherical Gaussian (ASG)-based Label Distribution Learning (LDL) representation, and 3) an SO(3) Hopf coordinate-based LDL representation. These representations provide continuous and unique descriptions of the facial orientation and avoid the gimbal lock issue of Euler angles and the antipodal issue of quaternions. Building upon these mathematical representations, we designed neural network architectures specifically to utilize them. Key components of our NN design include 1) an orthogonal loss function for column-vector-based representations, which encourages the orthogonality of the predicted vectors, and 2) dynamic distribution parameter learning for ASG- and SO(3)-based LDL representations, which allows the NN to adjust the contributions of adjacent labels adaptively. Our proposed mathematical representations of rotations, combined with our NN architectures, provide a powerful framework for robust and accurate facial pose estimation.
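A loss that encourages three predicted column vectors to form an orthonormal set can be sketched as follows. The abstract does not give the exact formulation, so the penalty used here (squared deviation of the Gram matrix from the identity) and the name `orthogonality_loss` are assumptions.

```python
def dot(u, v):
    """Dot product of two 3-vectors."""
    return sum(a * b for a, b in zip(u, v))


def orthogonality_loss(v1, v2, v3):
    """Sum of squared deviations of the Gram matrix V^T V from the identity.

    The loss is zero exactly when the three vectors are orthonormal,
    i.e. when they can serve as the columns of an orthogonal matrix.
    """
    vectors = (v1, v2, v3)
    loss = 0.0
    for i in range(3):
        for j in range(3):
            target = 1.0 if i == j else 0.0
            loss += (dot(vectors[i], vectors[j]) - target) ** 2
    return loss


# Columns of the identity rotation: no penalty.
aligned = orthogonality_loss((1, 0, 0), (0, 1, 0), (0, 0, 1))
# Two identical columns: the off-diagonal Gram entries are penalized.
degenerate = orthogonality_loss((1, 0, 0), (1, 0, 0), (0, 0, 1))
```

In a training setup a term like this would typically be added to the main regression loss with a weighting factor; that weighting is likewise not specified in the abstract.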
|
3 |
On Finding the Location of an Underwater Mobile Robot Using Optimization Techniques
Tunuguntla, Sai S., 12 August 1998
This research aims to solve an engineering design problem encountered in the field of robotics using mathematical programming techniques. The problem addressed is an indispensable part of designing the operation of Ursula, an underwater mobile robot, and involves finding its location as it moves along the circumference of a nuclear reactor vessel. The study was conducted to aid a laser-based global positioning system in making this determination.
The physical nature of this problem enables it to be conceptualized as a position and orientation determination problem. Ursula tests the weldments in the reactor vessel, and its position and orientation need to be found continuously in real time. The kinematic errors in the setup and the use of a laser-based positioning system distinguish this from traditional position and orientation determination problems. The aim of this research effort is to construct a suitable representative mathematical model for this problem, and to design and compare various solution methodologies that are computationally competitive, numerically stable, and accurate. / Master of Science
|
4 |
Performance-Guided Character Bind Pose for Deformations
Pena, Benito, 2011 May 1900
Current production methods for creating a motion system for a deformable digital character model involve providing an underlying joint structure based off of a T-Pose, A-Pose or another arbitrary bind pose of the character. A bind pose is required to establish the skeleton-to-geometry spatial relationship that will be used as a mathematical reference to determine geometry deformations in animated poses. Using a set of human motion capture performances as input animation, the impact of the standard T-Pose and A-Pose on the stretching and compression of human character model geometry is compared relative to novel mean poses derived from each performance. Results demonstrate that using an averaged joint position of each specific performance as the bind pose for the performance reduces the overall deformation of the model. Appropriate applications of the mean pose as a bind pose could impact the resources required to repair deformation artifacts in animated deformable digital characters.
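A minimal sketch of deriving a mean pose from a performance: average each joint's position over all frames, then compare how far the animation strays from that pose versus a fixed bind pose. The thesis measures stretching and compression of the model geometry; the squared joint displacement used here is only a stand-in, and the data layout and function names are assumptions.

```python
def mean_pose(frames):
    """Per-joint average position over all frames of a performance.

    `frames` is a list of frames; each frame is a list of (x, y, z) joints.
    """
    count = len(frames)
    joints = len(frames[0])
    return [tuple(sum(frame[j][k] for frame in frames) / count for k in range(3))
            for j in range(joints)]


def total_displacement(frames, bind):
    """Sum of squared joint displacements of every frame from the bind pose."""
    return sum((frame[j][k] - bind[j][k]) ** 2
               for frame in frames
               for j in range(len(bind))
               for k in range(3))


# Toy two-joint performance: the second joint swings from +x to +y.
frames = [[(0, 0, 0), (1, 0, 0)],
          [(0, 0, 0), (0, 1, 0)]]
fixed_bind = [(0, 0, 0), (1, 0, 0)]   # stand-in for a fixed pose such as a T-Pose
averaged_bind = mean_pose(frames)
```

For this displacement measure the per-joint mean is the minimizer by construction, which matches the intuition behind the reported result without reproducing its geometry-based evaluation.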
|
5 |
Détection et estimation de pose d'instances d'objet rigide pour la manipulation robotisée / Detection and pose estimation of instances of a rigid object for robotic bin-picking
Brégier, Romain, 11 June 2018
Detecting objects in a scene and estimating their pose is an essential prerequisite for automating many tasks, whether the goal is automatic scene understanding, an augmented reality experience, or a robot interacting with its environment. In this thesis, we address this topic through the industrial bin-picking scenario, in which instances of a rigid object have to be detected in bulk and their pose (that is, their position and orientation) estimated so that a robot can manipulate them for industrial tasks such as machine feeding, assembly, and packing.
To this end, we propose a novel method for object detection and pose estimation from an input depth image. It aggregates hypotheses generated by a set of local estimators by means of a decision forest (a Hough forest technique) and is compatible with industrial constraints on performance and ease of use.
In the literature, the pose of a rigid object is usually modeled as a 6D rigid transformation. This representation proves inadequate for objects with symmetries, which are common among manufactured objects. To overcome this limitation, we introduce a definition of pose that is compatible with any physically admissible rigid object, and equip the space of poses with a distance quantifying the length of the smallest displacement between two poses. These notions provide a rigorous theoretical framework from which we develop tools for handling poses efficiently, for operations such as pose averaging or neighborhood queries, and they form the basis of our approach to bin-picking.
The evaluation benchmarks used in the literature suffer from intrinsic limitations and are only partially representative of our application scenario. We therefore propose an evaluation methodology suited to scenes containing a variable number of object instances in arbitrary poses, potentially occluded. We apply this methodology to synthetic and real data and demonstrate the viability of the proposed method, which is compatible with the cycle-time, performance, and ease-of-implementation constraints of industrial bin-picking, and its soundness compared with the state of the art.
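Why a plain rigid-transformation distance misbehaves for symmetric objects can be shown with a small sketch. It is not the distance defined in the thesis: it ignores translation and object geometry, and simply takes the rotation angle between two orientations, minimized over an assumed finite list of symmetry rotations. All function names are hypothetical.

```python
import math


def matmul(a, b):
    """Product of two 3x3 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def transpose(a):
    return [[a[j][i] for j in range(3)] for i in range(3)]


def rot_z(degrees):
    """Rotation about the z axis."""
    t = math.radians(degrees)
    c, s = math.cos(t), math.sin(t)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]


def rotation_angle_deg(r):
    """Angle of a rotation matrix, from its trace (clamped for safety)."""
    cos_theta = (r[0][0] + r[1][1] + r[2][2] - 1.0) / 2.0
    return math.degrees(math.acos(max(-1.0, min(1.0, cos_theta))))


def symmetric_rotation_distance(r1, r2, symmetries):
    """Smallest rotation angle between r2 and any symmetric equivalent of r1."""
    return min(rotation_angle_deg(matmul(transpose(matmul(r1, g)), r2))
               for g in symmetries)


# An object unchanged by a half turn about z (e.g. a rectangular plate).
half_turn_group = [rot_z(0), rot_z(180)]
naive = symmetric_rotation_distance(rot_z(0), rot_z(170), [rot_z(0)])
aware = symmetric_rotation_distance(rot_z(0), rot_z(170), half_turn_group)
```

With the symmetry taken into account, two orientations that look nearly identical are reported as close (`aware` is about 10 degrees), whereas the naive comparison reports them as almost opposite (`naive` is about 170 degrees).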
|
6 |
A Human Kinetic Dataset and a Hybrid Model for 3D Human Pose Estimation
Wang, Jianquan, 12 November 2020
Human pose estimation represents the skeleton of a person in color or depth images to improve a machine’s understanding of human movement. 3D human pose estimation uses a three-dimensional skeleton to represent the human body posture, which is more stereoscopic than a two-dimensional skeleton. Therefore, 3D human pose estimation can enable machines to play a role in physical education and health recovery, reducing labor costs and the risk of disease transmission. However, the existing datasets for 3D pose estimation do not involve fast motions that would cause optical blur for a monocular camera but would allow the subjects’ limbs to move in a more extensive range of angles. The existing models cannot guarantee both real-time performance and high accuracy, which are essential in physical education and health recovery applications. To improve real-time performance, researchers have tried to minimize the size of the model and have studied more efficient deployment methods. To improve accuracy, researchers have tried to use heat maps or point clouds to represent features, but this increases the difficulty of model deployment.
To address the lack of datasets that include fast movements and easy-to-deploy models, we present a human kinetic dataset called the Kivi dataset and a hybrid model that combines the benefits of a heat map-based model and an end-to-end model for 3D human pose estimation. We describe the process of data collection and cleaning in this thesis. Our proposed Kivi dataset contains large-scale movements of humans. In the dataset, 18 joint points represent the human skeleton. We collected data from 12 people, and each person performed 38 sets of actions. Therefore, each frame of data has a corresponding person and action label. We design a preliminary model and propose an improved model to infer 3D human poses in real time. When validating our method on the Invariant Top-View (ITOP) dataset, we found that compared with the initial model, our improved model improves the mAP@10cm by 29%. When testing on the Kivi dataset, our improved model improves the mAP@10cm by 15.74% compared to the preliminary model. Our improved model can reach 65.89 frames per second (FPS) on the TensorRT platform.
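The mAP@10cm figures quoted above can be read with the following sketch, which assumes the definition commonly used with the ITOP dataset: a predicted joint counts as correct when it lies within 10 cm of the ground truth, the per-joint hit rate is taken over all frames, and the result is the mean over joints. The function name and data layout are assumptions, and coordinates are taken to be in meters.

```python
import math


def map_at_threshold(predictions, ground_truth, threshold_m=0.10):
    """Mean over joints of the fraction of frames where the joint is a hit.

    Both arguments are lists of frames; each frame is a list of (x, y, z)
    joint positions in meters, in the same joint order.
    """
    num_frames = len(ground_truth)
    num_joints = len(ground_truth[0])
    per_joint = []
    for j in range(num_joints):
        hits = sum(1 for pred, truth in zip(predictions, ground_truth)
                   if math.dist(pred[j], truth[j]) <= threshold_m)
        per_joint.append(hits / num_frames)
    return sum(per_joint) / num_joints


truth = [[(0, 0, 0), (1, 0, 0)],
         [(0, 0, 0), (1, 0, 0)]]
preds = [[(0.05, 0, 0), (1.2, 0, 0)],   # joint 0 within 10 cm, joint 1 off by 20 cm
         [(0, 0.2, 0), (1, 0, 0.09)]]   # joint 0 off by 20 cm, joint 1 within 10 cm
score = map_at_threshold(preds, truth)
```

Each joint is detected in one of the two frames, so `score` is 0.5 for this toy input.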
|
7 |
Learned structural and temporal context for dynamic 3D pose optimization and tracking
Patel, Mahir, 30 September 2022
Accurate 3D tracking of animals from video recordings is critical for many behavioral studies. However, other than for humans, there is a lack of publicly available datasets of videos of animals that the computer vision community could use for model development. Furthermore, due to occlusion and the uncontrollable nature of the animals, existing pose estimation models suffer from inadequate precision. People rely on biomechanical expertise to design mathematical models to optimize poses to mitigate this issue at the cost of generalization. We propose OptiPose, a generalizable attention-based deep learning pose optimization model, as a part of a post-processing pipeline for refining 3D poses estimated by pre-existing systems. Our experiments show how OptiPose is highly robust to noise and occlusion and can be used to optimize pose sequences provided by state-of-the-art models for animal pose estimation. Furthermore, we will make Rodent3D, a multimodal (RGB, Thermal, and Depth) dataset for rats, publicly available.
|
8 |
3D reconstruction of a catheter path from a single view X-ray sequence
Weng, Ji Yao, January 2003
Thesis digitized by the Direction des bibliothèques de l'Université de Montréal.
|
9 |
A Deep 3D Object Pose Estimation Framework for Robots with RGB-D Sensors
Wagh, Ameya Yatindra, 24 April 2019
Object detection and pose estimation have traditionally been performed with template matching techniques. However, these algorithms are sensitive to outliers and occlusions, and have high latency due to their iterative nature. Recent research in computer vision and deep learning has greatly improved robustness on this task. However, a major drawback of these learning-based algorithms is that they are specific to the objects they were trained on. Moreover, their pose estimates depend significantly on RGB image features. Because these algorithms are trained on large datasets meticulously labeled with each object's ground-truth pose, they are difficult to re-train for real-world applications. To overcome this problem, we propose a two-stage pipeline of convolutional neural networks which uses RGB images to localize objects in 2D space and depth images to estimate a 6DoF pose. Thus the pose estimation network learns only the geometric features of the object and is not biased by its color features. We evaluate the performance of this framework on the LINEMOD dataset, which is widely used to benchmark object pose estimation frameworks. We found the results to be comparable with those of state-of-the-art algorithms using RGB-D images. Secondly, to show the transferability of the proposed pipeline, we implement it on the ATLAS robot for a pick-and-place experiment. As the distribution of images in the LINEMOD dataset differs from that of the images captured by the MultiSense sensor on ATLAS, we generate a synthetic dataset from very few real-world images captured with the MultiSense sensor. We use this dataset to train just the object detection networks used in the ATLAS robot experiment.
|
10 |
Simultaneous Pose and Correspondence Problem for Visual Servoing
Chiu, Raymond, January 2010
Pose estimation is a common problem in computer vision. The pose is the combination of the position and orientation of a particular object relative to some reference coordinate system. The pose estimation problem involves determining the pose of an object from one or multiple images of the object. This problem often arises in the area of robotics. It is necessary to determine the pose of an object before it can be manipulated by the robot. In particular, this research focuses on pose estimation for initialization of position-based visual servoing.
A closely related problem is the correspondence problem. This is the problem of finding a set of features in the image of an object that can be identified with the same features in a model of the object. Solving for pose without known correspondence is also referred to as the simultaneous pose and correspondence problem, and it is much more difficult than solving for pose with known correspondence.
This thesis explores a number of methods to solve the simultaneous pose and correspondence problem, with a focus on a method called SoftPOSIT. It uses the idea that the pose is easily determined if the correspondence is known. It first produces an initial guess of the pose and uses it to determine a correspondence. With the correspondence, it determines a new pose. This new pose is assumed to be a better estimate, so a better correspondence can be determined from it. The process is repeated until the algorithm converges to a correspondence and pose estimate. If this pose estimate is not good enough, the algorithm is restarted with a new initial guess.
An improvement is made to this algorithm. An early termination condition is added to detect conditions where the algorithm is unlikely to converge towards a good pose. This leads to a reduction in the runtime by as much as 50% and an improvement in the success rate of the algorithm by approximately 5%.
The proposed solution is tested and compared with the RANSAC method and simulated annealing in a simulation environment. It is shown that the proposed solution has the potential for use in commercial environments for pose estimation.
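The alternation between correspondence and pose, with restarts from new initial guesses, can be sketched on a toy 2D problem. This is not SoftPOSIT: it uses hard nearest-neighbor assignments and a closed-form 2D rigid fit instead of soft assignments and a 3D-to-2D pose solver, and the stall test standing in for early termination is a placeholder, since the abstract does not state the actual criterion. All names and tolerances are assumptions.

```python
import math


def transform(pose, point):
    """Apply a 2D rigid pose (angle, (tx, ty)) to a point."""
    angle, (tx, ty) = pose
    c, s = math.cos(angle), math.sin(angle)
    return (c * point[0] - s * point[1] + tx, s * point[0] + c * point[1] + ty)


def centroid(points):
    n = len(points)
    return (sum(p[0] for p in points) / n, sum(p[1] for p in points) / n)


def fit_rigid(src, dst):
    """Least-squares rotation and translation mapping src[i] onto dst[i]."""
    (sx, sy), (dx, dy) = centroid(src), centroid(dst)
    cross = sum((p[0] - sx) * (q[1] - dy) - (p[1] - sy) * (q[0] - dx)
                for p, q in zip(src, dst))
    dot = sum((p[0] - sx) * (q[0] - dx) + (p[1] - sy) * (q[1] - dy)
              for p, q in zip(src, dst))
    angle = math.atan2(cross, dot)
    c, s = math.cos(angle), math.sin(angle)
    return (angle, (dx - (c * sx - s * sy), dy - (s * sx + c * sy)))


def nearest(point, candidates):
    return min(candidates, key=lambda q: math.dist(point, q))


def estimate_pose_with_restarts(model, image, initial_angles,
                                max_iters=50, stall_tol=1e-12, good_enough=1e-9):
    """Alternate correspondence and pose; restart until the fit is good enough."""
    best_error, best_pose = math.inf, None
    image_center = centroid(image)
    for angle in initial_angles:                      # one restart per guess
        rc = transform((angle, (0.0, 0.0)), centroid(model))
        pose = (angle, (image_center[0] - rc[0], image_center[1] - rc[1]))
        previous = math.inf
        for _ in range(max_iters):
            matches = [nearest(transform(pose, p), image) for p in model]
            pose = fit_rigid(model, matches)          # pose from correspondence
            error = sum(math.dist(transform(pose, p), q) ** 2
                        for p, q in zip(model, matches))
            if previous - error < stall_tol:          # stalled: stop this run
                break
            previous = error
        if error < best_error:
            best_error, best_pose = error, pose
        if best_error < good_enough:
            break
    return best_pose, best_error


model = [(0, 0), (2, 0), (2, 1), (0, 3)]
true_pose = (0.2, (1.0, -2.0))
image = [transform(true_pose, p) for p in reversed(model)]   # unknown ordering
pose, error = estimate_pose_with_restarts(model, image, [math.pi, 0.0])
```

The list of initial angles plays the role of the restart schedule: a run that stalls at a poor fit is abandoned and the next guess is tried, and the best result across runs is returned.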
|