• Refine Query
  • Source
  • Publication year
  • to
  • Language
  • 123
  • 10
  • 7
  • 5
  • 2
  • 2
  • 2
  • 2
  • 1
  • 1
  • 1
  • 1
  • Tagged with
  • 186
  • 186
  • 97
  • 71
  • 48
  • 35
  • 33
  • 32
  • 30
  • 29
  • 28
  • 27
  • 26
  • 24
  • 24
  • About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
111

Human-Robot Interaction with Pose Estimation and Dual-Arm Manipulation Using Artificial Intelligence

Ren, Hailin 16 April 2020 (has links)
This dissertation focuses on applying artificial intelligence techniques to human-robot interaction, which involves human pose estimation and dual-arm robotic manipulation. The motivating application behind this work is autonomous victim extraction in disaster scenarios using a conceptual design of a Semi-Autonomous Victim Extraction Robot (SAVER). SAVER is equipped with an advanced sensing system and two powerful robotic manipulators as well as a head and neck stabilization system to achieve autonomous safe and effective victim extraction, thereby reducing the potential risk to field medical providers. This dissertation formulates the autonomous victim extraction process using a dual-arm robotic manipulation system for human-robot interaction. According to the general process of Human-Robot Interaction (HRI), which includes perception, control, and decision-making, this research applies machine learning techniques to human pose estimation, robotic manipulator modeling, and dual-arm robotic manipulation, respectively. In the human pose estimation, an efficient parallel ensemble-based neural network is developed to provide real-time human pose estimation on 2D RGB images. A 13-limb, 14-joint skeleton model is used in this perception neural network and each ensemble of the neural network is designed for a specific limb detection. The parallel structure poses two main benefits: (1) parallel ensembles architecture and multiple Graphics Processing Units (GPU) make distributed computation possible, and (2) each individual ensemble can be deployed independently, making the processing more efficient when the detection of only some specific limbs is needed for the tasks. Precise robotic manipulator modeling benefits from the simplicity of the controller design and improves the performance of trajectory following. Traditional system modeling relies on first principles, simplifying assumptions and prior knowledge. Any imperfection in the above could lead to an analytical model that is different from the real system. Machine learning techniques have been applied in this field to pursue faster computation and more accurate estimation. However, a large dataset is always needed for these techniques, while obtaining the data from the real system could be costly in terms of both time and maintenance. In this research, a series of different Generative Adversarial Networks (GANs) are proposed to efficiently identify inverse kinematics and inverse dynamics of the robotic manipulators. One four-Degree-of-Freedom (DOF) robotic manipulator and one six-DOF robotic manipulator are used with different sizes of the dataset to evaluate the performance of the proposed GANs. The general methods can also be adapted to other systems, whose dataset is limited using general machine learning techniques. In dual-arm robotic manipulation, basic behaviors such as reaching, pushing objects, and picking objects up are learned using Reinforcement Learning. A Teacher-Student advising framework is proposed to learn a single neural network to control dual-arm robotic manipulators with previous knowledge of controlling a single robotic manipulator. Simulation and experimental results present the efficiency of the proposed framework compared to the learning process from scratch. Another concern in robotic manipulation is safety constraints. A variable-reward hierarchical reinforcement learning framework is proposed to solve sparse reward and tasks with constraints. A task of picking up and placing two objects to target positions while keeping them in a fixed distance within a threshold is used to evaluate the performance of the proposed method. Comparisons to other state-of-the-art methods are also presented. Finally, all the three proposed components are integrated as a single system. Experimental evaluation with a full-size manikin was performed to validate the concept of applying artificial intelligence techniques to autonomous victim extraction using a dual-arm robotic manipulation system. / Doctor of Philosophy / Using mobile robots for autonomous victim extraction in disaster scenarios reduces the potential risk to field medical providers. This dissertation focuses on applying artificial intelligence techniques to this human-robot interaction task involving pose estimation and dual-arm manipulation for victim extraction. This work is based on a design of a Semi-Autonomous Victim Extraction Robot (SAVER). SAVER is equipped with an advanced sensing system and two powerful robotic manipulators as well as a head and neck stabilization system attached on an embedded declining stretcher to achieve autonomous safe and effective victim extraction. Therefore, the overall research in this dissertation addresses: human pose estimation, robotic manipulator modeling, and dual-arm robotic manipulation for human pose adjustment. To accurately estimate the human pose for real-time applications, the dissertation proposes a neural network that could take advantages of multiple Graphics Processing Units (GPU). Considering the cost in data collection, the dissertation proposed novel machine learning techniques to obtain the inverse dynamic model and the inverse kinematic model of the robotic manipulators using limited collected data. Applying safety constraints is another requirement when robots interacts with humans. This dissertation proposes reinforcement learning techniques to efficiently train a dual-arm manipulation system not only to perform the basic behaviors, such as reaching, pushing objects and picking up and placing objects, but also to take safety constraints into consideration in performing tasks. Finally, the three components mentioned above are integrated together as a complete system. Experimental validation and results are discussed at the end of this dissertation.
112

Video-based analysis of Gait pathologies

Nguyen, Hoang Anh 12 1900 (has links)
L’analyse de la marche a émergé comme l’un des domaines médicaux le plus im- portants récemment. Les systèmes à base de marqueurs sont les méthodes les plus fa- vorisées par l’évaluation du mouvement humain et l’analyse de la marche, cependant, ces systèmes nécessitent des équipements et de l’expertise spécifiques et sont lourds, coûteux et difficiles à utiliser. De nombreuses approches récentes basées sur la vision par ordinateur ont été développées pour réduire le coût des systèmes de capture de mou- vement tout en assurant un résultat de haute précision. Dans cette thèse, nous présentons notre nouveau système d’analyse de la démarche à faible coût, qui est composé de deux caméras vidéo monoculaire placées sur le côté gauche et droit d’un tapis roulant. Chaque modèle 2D de la moitié du squelette humain est reconstruit à partir de chaque vue sur la base de la segmentation dynamique de la couleur, l’analyse de la marche est alors effectuée sur ces deux modèles. La validation avec l’état de l’art basée sur la vision du système de capture de mouvement (en utilisant le Microsoft Kinect) et la réalité du ter- rain (avec des marqueurs) a été faite pour démontrer la robustesse et l’efficacité de notre système. L’erreur moyenne de l’estimation du modèle de squelette humain par rapport à la réalité du terrain entre notre méthode vs Kinect est très prometteur: les joints des angles de cuisses (6,29◦ contre 9,68◦), jambes (7,68◦ contre 11,47◦), pieds (6,14◦ contre 13,63◦), la longueur de la foulée (6.14cm rapport de 13.63cm) sont meilleurs et plus stables que ceux de la Kinect, alors que le système peut maintenir une précision assez proche de la Kinect pour les bras (7,29◦ contre 6,12◦), les bras inférieurs (8,33◦ contre 8,04◦), et le torse (8,69◦contre 6,47◦). Basé sur le modèle de squelette obtenu par chaque méthode, nous avons réalisé une étude de symétrie sur différentes articulations (coude, genou et cheville) en utilisant chaque méthode sur trois sujets différents pour voir quelle méthode permet de distinguer plus efficacement la caractéristique symétrie / asymétrie de la marche. Dans notre test, notre système a un angle de genou au maximum de 8,97◦ et 13,86◦ pour des promenades normale et asymétrique respectivement, tandis que la Kinect a donné 10,58◦et 11,94◦. Par rapport à la réalité de terrain, 7,64◦et 14,34◦, notre système a montré une plus grande précision et pouvoir discriminant entre les deux cas. / Gait analysis has emerged as one of the most important medical field recently due to its wide range of applications. Marker-based systems are the most favoured methods of human motion assessment and gait analysis, however, these systems require specific equipment and expertise, and are cumbersome, costly and difficult to use. Many re- cent computer-vision-based approaches have been developed to reduce the cost of the expensive motion capture systems while ensuring high accuracy result. In this thesis, we introduce our new low-cost gait analysis system that is composed of two low-cost monocular cameras (camcorders) placed on the left and right sides of a treadmill. Each 2D left or right human skeleton model is reconstructed from each view based on dy- namic color segmentation, the gait analysis is then performed on these two models. The validation with one state-of-the-art vision-based motion capture system (using the Mi- crosoft Kinect v.1) and one ground-truth (with markers) was done to demonstrate the robustness and efficiency of our system. The average error in human skeleton model estimation compared to ground-truth between our method vs. Kinect are very promis- ing: the joints angles of upper legs (6.29◦ vs. 9.68◦), lower legs (7.68◦ vs. 11.47◦), feet (6.14◦ vs. 13.63◦), stride lengths (6.14cm vs. 13.63cm) were better and more stable than those from the Kinect, while the system could maintain a reasonably close accu- racy to the Kinect for upper arms (7.29◦ vs. 6.12◦), lower arms (8.33◦ vs. 8.04◦), and torso (8.69◦ vs. 6.47◦). Based on the skeleton model obtained by each method, we per- formed a symmetry study on various joints (elbow, knee and ankle) using each method on two different subjects to see which method can distinguish more efficiently the sym- metry/asymmetry characteristic of gaits. In our test, our system reported a maximum knee angle of 8.97◦ and 13.86◦ for normal and asymmetric walks respectively, while the Kinect gave 10.58◦ and 11.94◦. Compared to the ground-truth, 7.64◦ and 14.34◦, our system showed more accuracy and discriminative power between the two cases.
113

Deep learning for human motion analysis / Apprentissage automatique de représentations profondes pour l’analyse du mouvement humain

Neverova, Natalia 08 April 2016 (has links)
L'objectif de ce travail est de développer des méthodes avancées d'apprentissage pour l’analyse et l'interprétation automatique du mouvement humain à partir de sources d'information diverses, telles que les images, les vidéos, les cartes de profondeur, les données de type “MoCap” (capture de mouvement), les signaux audio et les données issues de capteurs inertiels. A cet effet, nous proposons plusieurs modèles neuronaux et des algorithmes d’entrainement associés pour l’apprentissage supervisé et semi-supervisé de caractéristiques. Nous proposons des approches de modélisation des dépendances temporelles, et nous montrons leur efficacité sur un ensemble de tâches fondamentales, comprenant la détection, la classification, l’estimation de paramètres et la vérification des utilisateurs (la biométrie). En explorant différentes stratégies de fusion, nous montrons que la fusion des modalités à plusieurs échelles spatiales et temporelles conduit à une augmentation significative des taux de reconnaissance, ce qui permet au modèle de compenser les erreurs des classifieurs individuels et le bruit dans les différents canaux. En outre, la technique proposée assure la robustesse du classifieur face à la perte éventuelle d’un ou de plusieurs canaux. Dans un deuxième temps nous abordons le problème de l’estimation de la posture de la main en présentant une nouvelle méthode de régression à partir d’images de profondeur. Dernièrement, dans le cadre d’un projet séparé (mais lié thématiquement), nous explorons des modèles temporels pour l'authentification automatique des utilisateurs de smartphones à partir de leurs habitudes de tenir, de bouger et de déplacer leurs téléphones. Dans ce contexte, les données sont acquises par des capteurs inertiels embraqués dans les appareils mobiles. / The research goal of this work is to develop learning methods advancing automatic analysis and interpreting of human motion from different perspectives and based on various sources of information, such as images, video, depth, mocap data, audio and inertial sensors. For this purpose, we propose a several deep neural models and associated training algorithms for supervised classification and semi-supervised feature learning, as well as modelling of temporal dependencies, and show their efficiency on a set of fundamental tasks, including detection, classification, parameter estimation and user verification. First, we present a method for human action and gesture spotting and classification based on multi-scale and multi-modal deep learning from visual signals (such as video, depth and mocap data). Key to our technique is a training strategy which exploits, first, careful initialization of individual modalities and, second, gradual fusion involving random dropping of separate channels (dubbed ModDrop) for learning cross-modality correlations while preserving uniqueness of each modality-specific representation. Moving forward, from 1 to N mapping to continuous evaluation of gesture parameters, we address the problem of hand pose estimation and present a new method for regression on depth images, based on semi-supervised learning using convolutional deep neural networks, where raw depth data is fused with an intermediate representation in the form of a segmentation of the hand into parts. In separate but related work, we explore convolutional temporal models for human authentication based on their motion patterns. In this project, the data is captured by inertial sensors (such as accelerometers and gyroscopes) built in mobile devices. We propose an optimized shift-invariant dense convolutional mechanism and incorporate the discriminatively-trained dynamic features in a probabilistic generative framework taking into account temporal characteristics. Our results demonstrate, that human kinematics convey important information about user identity and can serve as a valuable component of multi-modal authentication systems.
114

Human pose estimation and action recognition by multi-robot systems / Estimation de pose humaine et reconnaissance d’action par un système multi-robots

Dogan, Emre 07 July 2017 (has links)
L'estimation de la pose humaine et la reconnaissance des activités humaines sont des étapes importantes dans de nombreuses applications comme la robotique, la surveillance et la sécurité, etc. Actuellement abordées dans le domaine, ces tâches ne sont toujours pas résolues dans des environnements non-coopératifs particulièrement. Ces tâches admettent de divers défis comme l'occlusion, les variations des vêtements, etc. Les méthodes qui exploitent des images de profondeur ont l’avantage concernant les défis liés à l'arrière-plan et à l'apparence, pourtant, l’application est limitée pour des raisons matérielles. Dans un premier temps, nous nous sommes concentrés sur la reconnaissance des actions complexes depuis des vidéos. Pour ceci, nous avons introduit une représentation spatio-temporelle indépendante du point de vue. Plus précisément, nous avons capturé le mouvement de la personne en utilisant un capteur de profondeur et l'avons encodé en 3D pour le représenter. Un descripteur 3D a ensuite été utilisé pour la classification des séquences avec la méthodologie bag-of-words. Pour la deuxième partie, notre objectif était l'estimation de pose articulée, qui est souvent une étape intermédiaire pour la reconnaissance de l'activité. Notre motivation était d'incorporer des informations à partir de capteurs multiples et de les fusionner pour surmonter le problème de l'auto-occlusion. Ainsi, nous avons proposé un modèle de flexible mixtures-of-parts multi-vues inspiré par la méthodologie classique de structure pictural. Nous avons démontré que les contraintes géométriques et les paramètres de cohérence d'apparence sont efficaces pour renforcer la cohérence entre les points de vue, aussi que les paramètres classiques. Finalement, nous avons évalué ces nouvelles méthodes sur des datasets publics, qui vérifie que l'utilisation de représentations indépendantes de la vue et l'intégration d'informations à partir de points de vue multiples améliore la performance pour les tâches ciblées dans le cadre de cette manuscrit. / Estimating human pose and recognizing human activities are important steps in many applications, such as human computer interfaces (HCI), health care, smart conferencing, robotics, security surveillance etc. Despite the ongoing effort in the domain, these tasks remained unsolved in unconstrained and non cooperative environments in particular. Pose estimation and activity recognition face many challenges under these conditions such as occlusion or self occlusion, variations in clothing, background clutter, deformable nature of human body and diversity of human behaviors during activities. Using depth imagery has been a popular solution to address appearance and background related challenges, but it has restricted application area due to its hardware limitations and fails to handle remaining problems. Specifically, we considered action recognition scenarios where the position of the recording device is not fixed, and consequently require a method which is not affected by the viewpoint. As a second prob- lem, we tackled the human pose estimation task in particular settings where multiple visual sensors are available and allowed to collaborate. In this thesis, we addressed these two related problems separately. In the first part, we focused on indoor action recognition from videos and we consider complex ac- tivities. To this end, we explored several methodologies and eventually introduced a 3D spatio-temporal representation for a video sequence that is viewpoint independent. More specifically, we captured the movement of the person over time using depth sensor and we encoded it in 3D to represent the performed action with a single structure. A 3D feature descriptor was employed afterwards to build a codebook and classify the actions with the bag-of-words approach. As for the second part, we concentrated on articulated pose estimation, which is often an intermediate step for activity recognition. Our motivation was to incorporate information from multiple sources and views and fuse them early in the pipeline to overcome the problem of self-occlusion, and eventually obtain robust estimations. To achieve this, we proposed a multi-view flexible mixture of parts model inspired by the classical pictorial structures methodology. In addition to the single-view appearance of the human body and its kinematic priors, we demonstrated that geometrical constraints and appearance- consistency parameters are effective for boosting the coherence between the viewpoints in a multi-view setting. Both methods that we proposed was evaluated on public benchmarks and showed that the use of view-independent representations and integrating information from multiple viewpoints improves the performance of action recognition and pose estimation tasks, respectively.
115

CAD-Based Pose Estimation - Algorithm Investigation

Lef, Annette January 2019 (has links)
One fundamental task in robotics is random bin-picking, where it is important to be able to detect an object in a bin and estimate its pose to plan the motion of a robotic arm. For this purpose, this thesis work aimed to investigate and evaluate algorithms for 6D pose estimation when the object was given by a CAD model. The scene was given by a point cloud illustrating a partial 3D view of the bin with multiple instances of the object. Two algorithms were thus implemented and evaluated. The first algorithm was an approach based on Point Pair Features, and the second was Fast Global Registration. For evaluation, four different CAD models were used to create synthetic data with ground truth annotations. It was concluded that the Point Pair Feature approach provided a robust localization of objects and can be used for bin-picking. The algorithm appears to be able to handle different types of objects, however, with small limitations when the object has flat surfaces and weak texture or many similar details. The disadvantage with the algorithm was the execution time. Fast Global Registration, on the other hand, did not provide a robust localization of objects and is thus not a good solution for bin-picking.
116

Vérification automatique des montages d'usinage par vision : application à la sécurisation de l'usinage / Vision-based automatic verification of machining setup : application to machine tools safety

Karabagli, Bilal 06 November 2013 (has links)
Le terme "usinage à porte fermée", fréquemment employé par les PME de l’aéronautique et de l’automobile, désigne l’automatisation sécurisée du processus d’usinage des pièces mécaniques. Dans le cadre de notre travail, nous nous focalisons sur la vérification du montage d’usinage, avant de lancer la phase d’usinage proprement dite. Nous proposons une solution sans contact, basée sur la vision monoculaire (une caméra), permettant de reconnaitre automatiquement les éléments du montage (brut à usiner, pions de positionnement, tiges de fixation,etc.), de vérifier que leur implantation réelle (réalisée par l’opérateur) est conforme au modèle 3D numérique de montage souhaité (modèle CAO), afin de prévenir tout risque de collision avec l’outil d’usinage. / In High Speed Machining it is of key importance to avoid any collision between the machining tool and the machining setup. If the machining setup has not been assembled correctly by the operator and is not conform to the 3D CAD model sent to the machining unit, such collisions can occur. We have developed a vision system, that utilizes a single camera, to automatically check the conformity of the actual machining setup within the desired 3D CAD model, before launching the machining operation. First, we propose a configuration of the camera within the machining setup to ensure a best acquisition of the scene. In the aim to segmente the image in regions of interest, e.g. regions of the clamping elements and piece, based-on 3D CAD model, we realise a matching between graphes, theorical and real graphe computed from theorical image of 3D-CAD model and real image given by real camera. The graphs are constructed from a simple feature, such as circles and lines, that are manely present in the machining setup. In the aim to define the regions of interest (ROI) in real image within ROI given by 3D CAD model, we project a 3D CAD model in the real image, e.g. augmented reality. To automatically check the accordance between every region defined, we propose to compute three parametres, such as skeleton to represente the form, edges to represent a geometry and Area to represent dimension. We compute a score of accordance between three parameters that will be analyzed in fuzzy system to get a decision of conformity of the clamping element within it definition given in the CAD model. Some cases of machining setup configurations require 3D information to test the trajectory of the machine tool. To get out this situation, we have proposed a new depth from defocus based-method to compute a depth map of the scene. Finally, we present the result of our solution and we show the feasibility and robustness of the proposed solution in differents case of machining setup.
117

Enriching Remote Labs with Computer Vision and Drones / Enrichir les laboratoires distants grâce à la vision par ordinateur avec drone.

Khattar, Fawzi 13 December 2018 (has links)
Avec le progrès technologique, de nouvelles technologies sont en cours de développement afin de contribuer à une meilleure expérience dans le domaine de l’éducation. En particulier, les laboratoires distants constituent un moyen intéressant et pratique qui peut motiver les étudiants à apprendre. L'étudiant peut à tout moment, et de n'importe quel endroit, accéder au laboratoire distant et faire son TP (travail pratique). Malgré les nombreux avantages, les technologies à distance dans l’éducation créent une distance entre l’étudiant et l’enseignant. Les élèves peuvent avoir des difficultés à faire le TP si aucune intervention appropriée ne peut être prise pour les aider. Dans cette thèse, nous visons à enrichir un laboratoire électronique distant conçu pour les étudiants en ingénierie et appelé «LaboREM» (pour remote laboratory) de deux manières: tout d'abord, nous permettons à l'étudiant d'envoyer des commandes de haut niveau à un mini-drone disponible dans le laboratoire distant. L'objectif est d'examiner les faces-avant des instruments de mesure électroniques, à l'aide de la caméra intégrée au drone. De plus, nous autorisons la communication élève-enseignant à distance à l'aide du drone, au cas où un enseignant serait présent dans le laboratoire distant. Enfin, le drone doit revenir pour atterrir sur la plate-forme de recharge automatique des batteries, quand la mission est terminée. Nous proposons aussi un système automatique pour estimer l'état de l'étudiant (frustré / concentré..) afin de prendre les interventions appropriées pour assurer un bon déroulement du TP distant. Par exemple, si l'élève a des difficultés majeures, nous pouvons lui donner des indications ou réduire le niveau de difficulté de l’exercice. Nous proposons de faire cela en utilisant des signes visuels (estimation de la pose de la tête et analyse de l'expression faciale). De nombreuses évidences sur l'état de l'étudiant peuvent être acquises, mais elles sont incomplètes, parfois inexactes et ne couvrent pas tous les aspects de l'état de l'étudiant. C'est pourquoi nous proposons dans cette thèse de fusionner les preuves en utilisant la théorie de Dempster-Shafer qui permet la fusion de preuves incomplètes. / With the technological advance, new learning technologies are being developed in order to contribute to better learning experience. In particular, remote labs constitute an interesting and a practical way that can motivate nowadays students to learn. The student can at any time, and from anywhere, access the remote lab and do his lab-work. Despite many advantages, remote technologies in education create a distance between the student and the teacher. Without the presence of a teacher, students can have difficulties, if no appropriate interventions can be taken to help them. In this thesis, we aim to enrich an existing remote electronic lab made for engineering students called “LaboREM” (for remote Laboratory) in two ways: first we enable the student to send high level commands to a mini-drone available in the remote lab facility. The objective is to examine the front panels of electronic measurement instruments, by the camera embedded on the drone. Furthermore, we allow remote student-teacher communication using the drone, in case there is a teacher present in the remote lab facility. Finally, the drone has to go back home when the mission is over to land on a platform for automatic recharge of the batteries. Second, we propose an automatic system that estimates the affective state of the student (frustrated/ confused/ flow..) in order to take appropriate interventions to ensure good learning outcomes. For example, if the student is having major difficulties we can try to give him hints or reduce the difficulty level. We propose to do this by using visual cues (head pose estimation and facial expression analysis). Many evidences on the state of the student can be acquired, however these evidences are incomplete, sometimes inaccurate, and do not cover all the aspects of the state of the student alone. This is why we propose to fuse evidences using the theory of Dempster-Shafer that allows the fusion of incomplete evidence.
118

Detecção de faces e rastreamento da pose da cabeça

Schramm, Rodrigo 20 March 2009 (has links)
Submitted by Mariana Dornelles Vargas (marianadv) on 2015-04-27T19:08:59Z No. of bitstreams: 1 deteccao_faces.pdf: 3878917 bytes, checksum: 2fbf8222ef54d5fc0b1df0bf3b3a5292 (MD5) / Made available in DSpace on 2015-04-27T19:08:59Z (GMT). No. of bitstreams: 1 deteccao_faces.pdf: 3878917 bytes, checksum: 2fbf8222ef54d5fc0b1df0bf3b3a5292 (MD5) Previous issue date: 2009-03-20 / HP - Hewlett-Packard Brasil Ltda / As câmeras de vídeo já fazem parte dos novos modelos de interação entre o homem e a máquina. Através destas, a face e a pose da cabeça podem ser detectadas promovendo novos recursos para o usuário. Entre o conjunto de aplicações que têm se beneficiado deste tipo de recurso estão a vídeo-conferência, os jogos educacionais e de entretenimento, o controle de atenção de motoristas e a medida de foco de atenção. Nesse contexto insere-se essa proposta de mestrado, a qual propõe um novo modelo para detectar e rastrear a pose da cabeça a partir de uma seqüência de vídeo obtida com uma câmera monocular. Para alcançar esse objetivo, duas etapas principais foram desenvolvidas: a detecção da face e o rastreamento da pose. Nessa etapa, a face é detectada em pose frontal utilizando-se um detector com haar-like features. Na segunda etapa do algoritmo, após a detecção da face em pose frontal, atributos específicos da mesma são rastreados para estimar a variação da pose de cabeça. / Video cameras are already part of the new man-machine interaction models. Through these, the face and pose of the head can be found, providing new resources for users. Among the applications that have benefited from this type of resource are video conference, educational and entertainment games, and measurement of attention focus. In this context, this Master's thesis proposes a new model to detect and track the pose of the head in a video sequence captured by a monocular camera. To achieve this goal, two main stages were developed: face detection and head pose tracking. The first stage is the starting point for tracking the pose. In this stage, the face is detected in frontal pose using a detector with Haar-like features. In the second step of the algorithm, after detecting the face in frontal pose, specific attributes of the read are tracked to estimate the change in the pose of the head.
119

HANDHELD LIDAR ODOMETRY ESTIMATION AND MAPPING SYSTEM

Holmqvist, Niclas January 2018 (has links)
Ego-motion sensors are commonly used for pose estimation in Simultaneous Localization And Mapping (SLAM) algorithms. Inertial Measurement Units (IMUs) are popular sensors but suffer from integration drift over longer time scales. To remedy the drift they are often used in combination with additional sensors, such as a LiDAR. Pose estimation is used when scans, produced by these additional sensors, are being matched. The matching of scans can be computationally heavy as one scan can contain millions of data points. Methods exist to simplify the problem of finding the relative pose between sensor data, such as the Normal Distribution Transform SLAM algorithm. The algorithm separates the point cloud data into a voxelgrid and represent each voxel as a normal distribution, effectively decreasing the amount of data points. Registration is based on a function which converges to a minimum. Sub-optimal conditions can cause the function to converge at a local minimum. To remedy this problem this thesis explores the benefits of combining IMU sensor data to estimate the pose to be used in the NDT SLAM algorithm.
120

Video See-Through Augmented Reality Application on a Mobile Computing Platform Using Position Based Visual POSE Estimation

Fischer, Daniel 22 August 2013 (has links)
A technique for real time object tracking in a mobile computing environment and its application to video see-through Augmented Reality (AR) has been designed, verified through simulation, and implemented and validated on a mobile computing device. Using position based visual position and orientation (POSE) methods and the Extended Kalman Filter (EKF), it is shown how this technique lends itself to be flexible to tracking multiple objects and multiple object models using a single monocular camera on different mobile computing devices. Using the monocular camera of the mobile computing device, feature points of the object(s) are located through image processing on the display. The relative position and orientation between the device and the object(s) is determined recursively by an EKF process. Once the relative position and orientation is determined for each object, three dimensional AR image(s) are rendered onto the display as if the device is looking at the virtual object(s) in the real world. This application and the framework presented could be used in the future to overlay additional informational onto displays in mobile computing devices. Example applications include robotic aided surgery where animations could be overlaid to assist the surgeon, in training applications that could aid in operation of equipment or in search and rescue operations where critical information such as floor plans and directions could be virtually placed onto the display. Current approaches in the field of real time object tracking are discussed along with the methods used for video see-through AR applications on mobile computing devices. The mathematical framework for the real time object tracking and video see-through AR rendering is discussed in detail along with some consideration to extension to the handling of multiple AR objects. A physical implementation for a mobile computing device is proposed detailing the algorithmic approach along with design decisions. The real time object tracking and video see-through AR system proposed is verified through simulation and details around the accuracy, robustness, constraints, and an extension to multiple object tracking are presented. The system is then validated using a ground truth measurement system and the accuracy, robustness, and its limitations are reviewed. A detailed validation analysis is also presented showing the feasibility of extending this approach to multiple objects. Finally conclusions from this research are presented based on the findings of this work and further areas of study are proposed.

Page generated in 0.9918 seconds