  • About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
71

Human-Robot Interaction with Pose Estimation and Dual-Arm Manipulation Using Artificial Intelligence

Ren, Hailin 16 April 2020
This dissertation focuses on applying artificial intelligence techniques to human-robot interaction, which involves human pose estimation and dual-arm robotic manipulation. The motivating application behind this work is autonomous victim extraction in disaster scenarios using a conceptual design of a Semi-Autonomous Victim Extraction Robot (SAVER). SAVER is equipped with an advanced sensing system and two powerful robotic manipulators, as well as a head and neck stabilization system, to achieve safe and effective autonomous victim extraction, thereby reducing the potential risk to field medical providers. This dissertation formulates the autonomous victim extraction process using a dual-arm robotic manipulation system for human-robot interaction. Following the general process of Human-Robot Interaction (HRI), which includes perception, control, and decision-making, this research applies machine learning techniques to human pose estimation, robotic manipulator modeling, and dual-arm robotic manipulation, respectively. For human pose estimation, an efficient parallel ensemble-based neural network is developed to provide real-time human pose estimation on 2D RGB images. A 13-limb, 14-joint skeleton model is used in this perception neural network, and each ensemble of the neural network is designed to detect a specific limb. The parallel structure offers two main benefits: (1) the parallel ensemble architecture and multiple Graphics Processing Units (GPUs) make distributed computation possible, and (2) each individual ensemble can be deployed independently, making the processing more efficient when a task requires detecting only some specific limbs. Precise robotic manipulator modeling simplifies controller design and improves trajectory-following performance. Traditional system modeling relies on first principles, simplifying assumptions, and prior knowledge.
Any imperfection in these could lead to an analytical model that differs from the real system. Machine learning techniques have been applied in this field to pursue faster computation and more accurate estimation. However, these techniques always need a large dataset, while obtaining the data from the real system can be costly in terms of both time and maintenance. In this research, a series of different Generative Adversarial Networks (GANs) is proposed to efficiently identify the inverse kinematics and inverse dynamics of robotic manipulators. One four-Degree-of-Freedom (DOF) robotic manipulator and one six-DOF robotic manipulator are used with datasets of different sizes to evaluate the performance of the proposed GANs. The methods can also be adapted to other systems for which the available dataset is too limited for general machine learning techniques. In dual-arm robotic manipulation, basic behaviors such as reaching, pushing objects, and picking objects up are learned using Reinforcement Learning. A Teacher-Student advising framework is proposed to train a single neural network to control dual-arm robotic manipulators using previous knowledge of controlling a single robotic manipulator. Simulation and experimental results demonstrate the efficiency of the proposed framework compared with learning from scratch. Another concern in robotic manipulation is safety constraints. A variable-reward hierarchical reinforcement learning framework is proposed to handle sparse rewards and tasks with constraints. A task of picking up two objects and placing them at target positions, while keeping the distance between them fixed to within a threshold, is used to evaluate the performance of the proposed method. Comparisons with other state-of-the-art methods are also presented. Finally, all three proposed components are integrated into a single system.
Experimental evaluation with a full-size manikin was performed to validate the concept of applying artificial intelligence techniques to autonomous victim extraction using a dual-arm robotic manipulation system. / Doctor of Philosophy / Using mobile robots for autonomous victim extraction in disaster scenarios reduces the potential risk to field medical providers. This dissertation focuses on applying artificial intelligence techniques to this human-robot interaction task, which involves pose estimation and dual-arm manipulation for victim extraction. This work is based on a design of a Semi-Autonomous Victim Extraction Robot (SAVER). SAVER is equipped with an advanced sensing system and two powerful robotic manipulators, as well as a head and neck stabilization system attached to an embedded declining stretcher, to achieve safe and effective autonomous victim extraction. The overall research in this dissertation therefore addresses human pose estimation, robotic manipulator modeling, and dual-arm robotic manipulation for human pose adjustment. To accurately estimate the human pose for real-time applications, the dissertation proposes a neural network that can take advantage of multiple Graphics Processing Units (GPUs). Considering the cost of data collection, the dissertation proposes novel machine learning techniques to obtain the inverse dynamic model and the inverse kinematic model of the robotic manipulators using limited collected data. Applying safety constraints is another requirement when robots interact with humans. This dissertation proposes reinforcement learning techniques to efficiently train a dual-arm manipulation system not only to perform basic behaviors, such as reaching, pushing objects, and picking up and placing objects, but also to take safety constraints into consideration when performing tasks. Finally, the three components mentioned above are integrated into a complete system.
Experimental validation and results are discussed at the end of this dissertation.
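The 13-limb, 14-joint skeleton and the idea of running one detector per limb can be illustrated with a small sketch. The abstract gives only the counts, so the joint names, the limb pairing, and the helper names `select_limbs` and `run_detectors` below are assumptions made for illustration; the per-limb detector is a caller-supplied placeholder, not the dissertation's network.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical joint names: the abstract states only "14 joints, 13 limbs".
JOINTS = [
    "head", "neck",
    "r_shoulder", "r_elbow", "r_wrist",
    "l_shoulder", "l_elbow", "l_wrist",
    "r_hip", "r_knee", "r_ankle",
    "l_hip", "l_knee", "l_ankle",
]

# Hypothetical limb layout: a tree over 14 joints has exactly 13 edges.
LIMBS = [
    ("head", "neck"),
    ("neck", "r_shoulder"), ("r_shoulder", "r_elbow"), ("r_elbow", "r_wrist"),
    ("neck", "l_shoulder"), ("l_shoulder", "l_elbow"), ("l_elbow", "l_wrist"),
    ("neck", "r_hip"), ("r_hip", "r_knee"), ("r_knee", "r_ankle"),
    ("neck", "l_hip"), ("l_hip", "l_knee"), ("l_knee", "l_ankle"),
]


def select_limbs(joints_of_interest):
    """Return only the limbs whose two end joints are both needed by a task."""
    wanted = set(joints_of_interest)
    return [limb for limb in LIMBS if limb[0] in wanted and limb[1] in wanted]


def run_detectors(limbs, detect):
    """Run one caller-supplied detector call per limb in parallel.

    `detect` stands in for a per-limb ensemble; it takes a limb tuple and
    returns whatever the detector produces for that limb.
    """
    with ThreadPoolExecutor() as pool:
        results = list(pool.map(detect, limbs))
    return dict(zip(limbs, results))
```

For a task that only needs the right arm, `select_limbs({"r_shoulder", "r_elbow", "r_wrist"})` returns two limbs, so only two detectors would have to run.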
72

Repousser les limites de l'identification faciale en contexte de vidéo-surveillance / Breaking the limits of facial identification in video-surveillance context.

Fiche, Cécile 31 January 2012
Person identification systems based on face recognition are becoming increasingly widespread and are used in very diverse applications, particularly in the field of video surveillance. In this context, the performance of facial recognition algorithms depends largely on the image acquisition conditions, especially because the pose can vary, but also because the acquisition methods themselves can introduce artifacts. The main issues are focus imprecision, which can lead to blurred images, and compression-related errors, which can introduce block artifacts. The work done during the thesis therefore focuses on facial recognition in images taken by video surveillance cameras, in cases where the images contain blur or block artifacts or show faces in various poses. First, we propose a new approach that significantly improves facial recognition in images with high blur levels or strong block artifacts. The method, which makes use of specific no-reference metrics, starts by evaluating the quality of the input image and then adapts the training database of the recognition algorithms accordingly. Second, we focused on facial pose estimation. It is generally very difficult to recognize a face in an image taken from a viewpoint other than the frontal one, and most facial identification algorithms that are considered robust to pose variation need to know the pose in order to achieve a satisfactory recognition rate in a relatively short time. We therefore developed a fast and sufficiently accurate pose estimation method based on recent recognition techniques.
73

Real-Time Head Pose Estimation in Low-Resolution Football Footage / Realtidsestimering av huvudets vridning i lågupplösta videosekvenser från fotbollsmatcher

Launila, Andreas January 2009
This report examines the problem of real-time head pose estimation in low-resolution football footage. A method is presented for inferring the head pose using a combination of footage and knowledge of the locations of the football and players. An ensemble of randomized ferns is compared with a support vector machine for processing the footage, while a support vector machine performs pattern recognition on the location data. Combining the two sources of information outperforms either in isolation. The location of the football turns out to be an important piece of information. / QC 20100707 / Capturing and Visualizing Large scale Human Action (ACTVIS)
74

Video-based analysis of Gait pathologies

Nguyen, Hoang Anh 12 1900
Gait analysis has recently emerged as one of the most important medical fields due to its wide range of applications. Marker-based systems are the most favoured methods of human motion assessment and gait analysis; however, these systems require specific equipment and expertise, and are cumbersome, costly, and difficult to use. Many recent computer-vision-based approaches have been developed to reduce the cost of expensive motion capture systems while ensuring highly accurate results. In this thesis, we introduce our new low-cost gait analysis system, which is composed of two low-cost monocular cameras (camcorders) placed on the left and right sides of a treadmill. A 2D model of the left or right half of the human skeleton is reconstructed from each view based on dynamic color segmentation, and the gait analysis is then performed on these two models. The system was validated against one state-of-the-art vision-based motion capture system (using the Microsoft Kinect v.1) and one ground truth (with markers) to demonstrate its robustness and efficiency. The average errors in human skeleton model estimation relative to the ground truth, for our method vs. the Kinect, are very promising: the joint angles of the upper legs (6.29° vs. 9.68°), lower legs (7.68° vs. 11.47°), and feet (6.14° vs. 13.63°), and the stride lengths (6.14 cm vs. 13.63 cm), were better and more stable than those from the Kinect, while the system maintained an accuracy reasonably close to that of the Kinect for the upper arms (7.29° vs. 6.12°), lower arms (8.33° vs. 8.04°), and torso (8.69° vs. 6.47°). Based on the skeleton model obtained by each method, we performed a symmetry study on various joints (elbow, knee, and ankle) using each method on two different subjects to see which method distinguishes the symmetry/asymmetry characteristics of gaits more effectively. In our test, our system reported a maximum knee angle of 8.97° and 13.86° for normal and asymmetric walks respectively, while the Kinect gave 10.58° and 11.94°. Compared to the ground truth, 7.64° and 14.34°, our system showed greater accuracy and discriminative power between the two cases.
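The closing claim about discriminative power can be checked with simple arithmetic on the knee values quoted in the abstract. The sketch below also includes a generic way to compute a joint angle from three 2D skeleton points; that function and its name `joint_angle` are illustrative assumptions, not the thesis's own procedure, and only the numbers in `REPORTED` come from the abstract.

```python
import math


def joint_angle(a, b, c):
    """Angle in degrees at point b formed by the 2D points a-b-c.

    Generic geometry (e.g. hip-knee-ankle); an illustrative assumption,
    not necessarily how the thesis computes its angles.
    """
    v1 = (a[0] - b[0], a[1] - b[1])
    v2 = (c[0] - b[0], c[1] - b[1])
    dot = v1[0] * v2[0] + v1[1] * v2[1]
    norm = math.hypot(*v1) * math.hypot(*v2)
    cos_angle = max(-1.0, min(1.0, dot / norm))  # clamp rounding noise
    return math.degrees(math.acos(cos_angle))


# Maximum knee values (degrees) quoted in the abstract: (normal, asymmetric).
REPORTED = {
    "proposed": (8.97, 13.86),
    "kinect": (10.58, 11.94),
    "ground_truth": (7.64, 14.34),
}


def gap(pair):
    """Difference between the asymmetric-walk and normal-walk values."""
    return round(pair[1] - pair[0], 2)


def abs_errors(method):
    """Absolute deviation from the ground truth for each walk type."""
    truth = REPORTED["ground_truth"]
    return tuple(round(abs(m - t), 2) for m, t in zip(REPORTED[method], truth))
```

With these numbers the normal-to-asymmetric gap is 4.89° for the proposed system, 1.36° for the Kinect, and 6.70° for the ground truth, so the proposed system separates the two walks more clearly than the Kinect and lies closer to the ground truth in both cases.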
75

Human pose estimation and action recognition by multi-robot systems / Estimation de pose humaine et reconnaissance d’action par un système multi-robots

Dogan, Emre 07 July 2017
Estimating human pose and recognizing human activities are important steps in many applications, such as human-computer interfaces (HCI), health care, smart conferencing, robotics, and security surveillance. Despite ongoing effort in the domain, these tasks remain unsolved, particularly in unconstrained and non-cooperative environments. Pose estimation and activity recognition face many challenges under these conditions, such as occlusion or self-occlusion, variations in clothing, background clutter, the deformable nature of the human body, and the diversity of human behaviors during activities. Using depth imagery has been a popular solution to appearance- and background-related challenges, but its application area is restricted by hardware limitations, and it fails to handle the remaining problems. Specifically, we considered action recognition scenarios where the position of the recording device is not fixed and which consequently require a method that is not affected by the viewpoint. As a second problem, we tackled the human pose estimation task in particular settings where multiple visual sensors are available and allowed to collaborate. In this thesis, we addressed these two related problems separately. In the first part, we focused on indoor action recognition from videos and considered complex activities. To this end, we explored several methodologies and eventually introduced a 3D spatio-temporal representation for a video sequence that is viewpoint independent. More specifically, we captured the movement of the person over time using a depth sensor and encoded it in 3D to represent the performed action with a single structure. A 3D feature descriptor was then employed to build a codebook and classify the actions with the bag-of-words approach. In the second part, we concentrated on articulated pose estimation, which is often an intermediate step for activity recognition. Our motivation was to incorporate information from multiple sources and views and fuse it early in the pipeline to overcome the problem of self-occlusion and eventually obtain robust estimates. To achieve this, we proposed a multi-view flexible mixture-of-parts model inspired by the classical pictorial structures methodology. In addition to the single-view appearance of the human body and its kinematic priors, we demonstrated that geometrical constraints and appearance-consistency parameters are effective for boosting the coherence between the viewpoints in a multi-view setting. Both proposed methods were evaluated on public benchmarks, and the results showed that using view-independent representations and integrating information from multiple viewpoints improve the performance of the action recognition and pose estimation tasks, respectively.
76

Vérification automatique des montages d'usinage par vision : application à la sécurisation de l'usinage / Vision-based automatic verification of machining setup : application to machine tools safety

Karabagli, Bilal 06 November 2013
The term "closed-door machining" (usinage à porte fermée), frequently used by SMEs in the aerospace and automotive sectors, refers to the secure automation of the machining process for mechanical parts. In our work, we focus on verifying the machining setup before launching the machining phase itself. We propose a contactless solution based on monocular vision (a single camera) that automatically recognizes the elements of the setup (the raw workpiece, positioning pins, clamping rods, etc.) and verifies that their actual placement (carried out by the operator) conforms to the desired 3D digital model of the setup (the CAD model), in order to prevent any risk of collision with the machining tool. / In High Speed Machining it is of key importance to avoid any collision between the machining tool and the machining setup. If the machining setup has not been assembled correctly by the operator and does not conform to the 3D CAD model sent to the machining unit, such collisions can occur. We have developed a vision system that uses a single camera to automatically check the conformity of the actual machining setup with the desired 3D CAD model before launching the machining operation. First, we propose a configuration of the camera within the machining setup that ensures the best acquisition of the scene. To segment the image into regions of interest (e.g. the regions of the clamping elements and the workpiece) based on the 3D CAD model, we match two graphs: a theoretical graph computed from a theoretical image of the 3D CAD model, and a real graph computed from the real image given by the camera. The graphs are constructed from simple features, such as circles and lines, that are predominantly present in the machining setup. To define the regions of interest (ROIs) in the real image from the ROIs given by the 3D CAD model, we project the 3D CAD model onto the real image, as in augmented reality.
To automatically check the agreement between every pair of regions defined in this way, we propose to compute three parameters: the skeleton to represent shape, the edges to represent geometry, and the area to represent dimension. We compute agreement scores from these three parameters, which a fuzzy system then analyzes to decide whether each clamping element conforms to its definition in the CAD model. Some machining setup configurations require 3D information to test the trajectory of the machine tool. To handle these cases, we propose a new depth-from-defocus method to compute a depth map of the scene. Finally, we present the results of our solution and show its feasibility and robustness in different machining setup cases.
77

Enriching Remote Labs with Computer Vision and Drones / Enrichir les laboratoires distants grâce à la vision par ordinateur avec drone.

Khattar, Fawzi 13 December 2018
With technological advances, new learning technologies are being developed to contribute to a better learning experience. In particular, remote labs constitute an interesting and practical means of motivating today's students to learn. Students can access the remote lab at any time and from anywhere to do their lab work. Despite their many advantages, remote technologies in education create a distance between the student and the teacher. Without the presence of a teacher, students can run into difficulties if no appropriate intervention can be made to help them. In this thesis, we aim to enrich an existing remote electronics lab designed for engineering students, called "LaboREM" (for remote laboratory), in two ways. First, we enable the student to send high-level commands to a mini-drone available in the remote lab facility. The objective is to examine the front panels of electronic measurement instruments using the camera embedded in the drone. Furthermore, we allow remote student-teacher communication through the drone when a teacher is present in the remote lab facility. Finally, when the mission is over, the drone has to return home and land on a platform that automatically recharges its batteries. Second, we propose an automatic system that estimates the affective state of the student (frustrated, confused, in flow, etc.) in order to make appropriate interventions that ensure good learning outcomes. For example, if the student is having major difficulties, we can give hints or reduce the difficulty level of the exercise. We propose to do this using visual cues (head pose estimation and facial expression analysis).
Much evidence about the state of the student can be acquired; however, this evidence is incomplete, sometimes inaccurate, and no single source covers all aspects of the student's state. This is why we propose to fuse the evidence using Dempster-Shafer theory, which allows the fusion of incomplete evidence.
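Dempster's rule of combination, the core operation of Dempster-Shafer theory, can be sketched in a few lines. The frame of discernment below reuses the three states named in the abstract, but the mass values assigned to the head-pose and facial-expression sources are invented for illustration; the thesis's actual mass assignments are not given in the abstract.

```python
def combine(m1, m2):
    """Combine two mass functions with Dempster's rule.

    Each mass function maps a frozenset of hypotheses to a mass; the masses
    of one function sum to 1. Products that land on an empty intersection
    are treated as conflict and normalized away.
    """
    combined = {}
    conflict = 0.0
    for a, mass_a in m1.items():
        for b, mass_b in m2.items():
            common = a & b
            if common:
                combined[common] = combined.get(common, 0.0) + mass_a * mass_b
            else:
                conflict += mass_a * mass_b
    if 1.0 - conflict < 1e-12:
        raise ValueError("sources are in total conflict; rule is undefined")
    return {k: v / (1.0 - conflict) for k, v in combined.items()}


FRAME = frozenset({"frustrated", "confused", "flow"})

# Hypothetical evidence: head pose supports "frustrated" with mass 0.6 and
# leaves the remaining 0.4 uncommitted (assigned to the whole frame).
head_pose = {frozenset({"frustrated"}): 0.6, FRAME: 0.4}

# Hypothetical evidence: facial expression cannot separate "frustrated"
# from "confused", so it supports the pair with mass 0.7.
expression = {frozenset({"frustrated", "confused"}): 0.7, FRAME: 0.3}

fused = combine(head_pose, expression)
```

Assigning mass to a set of states rather than a single one is what lets each source stay incomplete: the expression source never has to commit to "frustrated" or "confused" individually, yet the fused result still concentrates mass on "frustrated".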
78

Detecção de faces e rastreamento da pose da cabeça

Schramm, Rodrigo 20 March 2009
HP - Hewlett-Packard Brasil Ltda / Video cameras are already part of the new models of human-machine interaction. Through them, the face and the pose of the head can be detected, providing new resources for users. Among the applications that have benefited from this type of resource are video conferencing, educational and entertainment games, driver attention monitoring, and the measurement of attention focus. In this context, this Master's thesis proposes a new model to detect and track the pose of the head in a video sequence captured by a monocular camera. To achieve this goal, two main stages were developed: face detection and head pose tracking. The first stage is the starting point for tracking the pose. In this stage, the face is detected in frontal pose using a detector with Haar-like features. In the second stage of the algorithm, after the face has been detected in frontal pose, specific attributes of the face are tracked to estimate the change in the pose of the head.
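Haar-like features are usually evaluated on an integral image, which makes the sum over any rectangle a four-lookup operation. The sketch below shows that standard building block in plain Python with a single horizontal two-rectangle feature; it is a generic illustration under that assumption, and the function names are made up here, not taken from the thesis's detector.

```python
def integral_image(img):
    """Return a (h+1) x (w+1) table where ii[y][x] is the sum of img[:y][:x]."""
    h, w = len(img), len(img[0])
    ii = [[0] * (w + 1) for _ in range(h + 1)]
    for y in range(h):
        row_sum = 0
        for x in range(w):
            row_sum += img[y][x]
            ii[y + 1][x + 1] = ii[y][x + 1] + row_sum
    return ii


def rect_sum(ii, top, left, height, width):
    """Sum of pixels in a rectangle, using four integral-image lookups."""
    bottom, right = top + height, left + width
    return ii[bottom][right] - ii[top][right] - ii[bottom][left] + ii[top][left]


def two_rect_feature(ii, top, left, height, width):
    """Horizontal two-rectangle Haar-like feature: left half minus right half.

    `width` is expected to be even so that both halves have the same size.
    """
    half = width // 2
    left_sum = rect_sum(ii, top, left, height, half)
    right_sum = rect_sum(ii, top, left + half, height, half)
    return left_sum - right_sum
```

On a tiny image such as `[[1, 1, 5, 5], [1, 1, 5, 5]]` the feature over the whole image is 4 - 20 = -16, reflecting a dark-to-bright transition from left to right.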
79

Bring Your Body into Action : Body Gesture Detection, Tracking, and Analysis for Natural Interaction

Abedan Kondori, Farid January 2014
Due to the large influx of computers into our daily lives, human-computer interaction has become crucially important. For a long time, focusing on what users need has been critical for designing interaction methods. However, a new perspective extends this attitude to encompass how human desires, interests, and ambitions can be met and supported. This implies that the way we interact with computers should be revisited. Centering on human values rather than user needs is of the utmost importance for providing new interaction techniques. These values drive our decisions and actions, and are essential to what makes us human. This motivated us to introduce new interaction methods that support human values, particularly human well-being. The aim of this thesis is to design new interaction methods that empower humans to have a healthy, intuitive, and pleasurable interaction with tomorrow's digital world. To achieve this aim, this research is concerned with developing theories and techniques for exploring interaction methods beyond keyboard and mouse, utilizing the human body. This thesis therefore addresses a very fundamental problem: human motion analysis. The technical contributions of this thesis introduce computer-vision-based, marker-less systems to estimate and analyze body motion. The main focus of this research is on head and hand motion analysis, because these are the body parts most frequently used for interacting with computers. This thesis gives an insight into the technical challenges and provides new perspectives and robust techniques for solving the problem.
80

Melhorando a estimação de pose com o RANSAC preemptivo generalizado e múltiplos geradores de hipóteses

Gomes Neto, Severino Paulo 27 February 2014
Camera motion estimation is one of the fundamental problems in Computer Vision and may be solved by several methods. Preemptive RANSAC is one of them; despite its robustness and speed, it lacks flexibility with respect to the requirements of the applications and hardware platforms that use it. In this work, we propose an improvement to the structure of Preemptive RANSAC to overcome these limitations and make it feasible to execute on devices with heterogeneous resources (especially low-budget systems) under tighter time and accuracy constraints. We derived a function called BRUMA from Preemptive RANSAC, which is able to generalize several preemption schemes, allowing previously fixed parameters (block size and elimination factor) to be changed according to the application's constraints. We also propose the Generalized Preemptive RANSAC method, which additionally allows the maximum number of hypotheses to be generated to be changed. The experiments performed show the superiority of our method in the expected scenarios. Moreover, additional experiments show that multi-method hypothesis generation achieves more robust results with respect to the variability in the set of evaluated motion directions.
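For context, the preemption function of the original Preemptive RANSAC keeps floor(M * 2^(-floor(i/B))) hypotheses after i observations, where M is the initial number of hypotheses and B the block size. The sketch below parameterizes the base 2 as an elimination factor to show what making these parameters configurable could look like. This is an illustrative assumption only: the abstract does not define BRUMA, so the code should not be read as its definition, and the names `preemption` and `preemptive_select` are hypothetical.

```python
import math


def preemption(i, total, block_size, factor=2.0):
    """Number of hypotheses to keep after scoring i observations.

    With factor=2.0 this is the classic halving schedule; other factors are
    a hypothetical generalization, not the thesis's BRUMA function.
    """
    return math.floor(total * factor ** (-(i // block_size)))


def preemptive_select(hypotheses, observations, score, block_size, factor=2.0):
    """Breadth-first preemptive scoring: score every surviving hypothesis on
    each observation in turn and prune to the scheduled number of survivors."""
    if not hypotheses:
        raise ValueError("at least one hypothesis is required")
    total = len(hypotheses)
    alive = [[0.0, h] for h in hypotheses]  # [accumulated score, hypothesis]
    for i, obs in enumerate(observations, start=1):
        for entry in alive:
            entry[0] += score(entry[1], obs)
        keep = max(1, preemption(i, total, block_size, factor))
        if keep < len(alive):
            alive.sort(key=lambda e: e[0], reverse=True)
            del alive[keep:]
        if len(alive) == 1:
            break
    return max(alive, key=lambda e: e[0])[1]
```

As a toy usage, `preemptive_select([0, 5, 10, 3], [3, 3, 3, 3], lambda h, o: -abs(h - o), block_size=1)` prunes four scalar hypotheses down to the one closest to the observations. A larger elimination factor prunes more aggressively per block, trading accuracy for time, which is the kind of knob the abstract describes as previously fixed.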
