Global ETD Search

71	Synthetic Data Generation for 6D Object Pose and Grasping Estimation Martínez González, Pablo 16 March 2023 Teaching a robot how to behave so it becomes completely autonomous is not a simple task. When robotic systems become truly intelligent, interactions with them will feel natural and easy, but nothing could be further from the truth. Making a robot understand its surroundings is a huge task that the computer vision field tries to address, and deep learning techniques are bringing us closer, but at the cost of data. Synthetic data generation is the process of generating artificial data that is used to train machine learning models. This data is generated using computer algorithms and simulations, and is designed to resemble real-world data as closely as possible. The use of synthetic data has become increasingly popular in recent years, particularly in the field of deep learning, due to the shortage of high-quality annotated real-world data and the high cost of collecting it. For that reason, in this thesis we address the task of facilitating the generation of synthetic data by creating a framework that leverages advances in modern rendering engines. In this context, the generated synthetic data can be used to train models for tasks such as 6D object pose estimation or grasp estimation. 6D object pose estimation refers to the problem of determining the position and orientation of an object in 3D space, while grasp estimation involves predicting the position and orientation of a robotic hand or gripper that can be used to pick up and manipulate the object. These are important tasks in robotics and computer vision, as they enable robots to perform complex manipulation and grasping tasks. In this work we propose a way of extracting grasping information from hand-object interactions in virtual reality, so that synthetic data can also boost research in that area. 
Finally, we use this synthetically generated data to test the proposal of applying 6D object pose estimation architectures to grasping region estimation. This idea is based on both problems sharing several underlying concepts such as object detection and orientation. / Enseñar a un robot a ser completamente autónomo no es tarea fácil. Cuando los sistemas robóticos sean realmente inteligentes, las interacciones con ellos parecerán naturales y fáciles, pero nada más lejos de la realidad. Hacer que un robot comprenda y asimile su entorno es una difícil cruzada que el campo de la visión por ordenador intenta abordar, y las técnicas de aprendizaje profundo nos están acercando al objetivo. Pero el precio son los datos. La generación de datos sintéticos es el proceso de generar datos artificiales que se utilizan para entrenar modelos de aprendizaje automático. Estos datos se generan mediante algoritmos informáticos y simulaciones, y están diseñados para parecerse lo más posible a los datos del mundo real. El uso de datos sintéticos se ha vuelto cada vez más popular en los últimos años, especialmente en el campo del aprendizaje profundo, debido a la escasez de datos reales anotados de alta calidad y al alto coste de su recopilación. Por ello, en esta tesis abordamos la tarea de facilitar la generación de datos sintéticos con la creación de una herramienta que aprovecha los avances de los motores modernos de renderizado. En este contexto, los datos sintéticos generados pueden utilizarse para entrenar modelos para tareas como la estimación de la pose 6D de objetos o la estimación de agarres. La estimación de la pose 6D de objetos se refiere al problema de determinar la posición y orientación de un objeto en el espacio 3D, mientras que la estimación del agarre implica predecir la posición y orientación de una mano robótica o pinza que pueda utilizarse para coger y manipular el objeto. 
Se trata de tareas importantes en robótica y visión por computador, ya que permiten a los robots realizar tareas complejas de manipulación y agarre. En este trabajo proponemos una forma de extraer información de agarres a partir de interacciones mano-objeto en realidad virtual, de modo que los datos sintéticos también puedan impulsar la investigación en esa área. Por último, utilizamos estos datos generados sintéticamente para poner a prueba la propuesta de aplicar arquitecturas de estimación de pose 6D de objetos a la estimación de regiones de agarre. Esta propuesta se basa en que ambos problemas comparten varios conceptos subyacentes, como la detección y orientación de objetos. / This thesis has been funded by the Spanish Ministry of Education [FPU17/00166] 6D Object Pose estimation Synthetic Data Sim2real Grasping Virtual Reality / Estimación de pose 6D Datos Sintéticos Simulación a real Agarres Realidad Virtual
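As an illustration of what a 6D pose in record 71 encodes (three rotational plus three translational degrees of freedom), here is a minimal Python sketch. It is not taken from the thesis: the function names, the ZYX Euler-angle convention and the sample values are all assumptions chosen for the example.

```python
import math

def pose_matrix(yaw, pitch, roll, tx, ty, tz):
    """Build a 4x4 homogeneous transform from ZYX Euler angles (radians)
    and a translation. Hypothetical helper, for illustration only."""
    cy, sy = math.cos(yaw), math.sin(yaw)
    cp, sp = math.cos(pitch), math.sin(pitch)
    cr, sr = math.cos(roll), math.sin(roll)
    # Rotation part: R = Rz(yaw) * Ry(pitch) * Rx(roll)
    return [
        [cy * cp, cy * sp * sr - sy * cr, cy * sp * cr + sy * sr, tx],
        [sy * cp, sy * sp * sr + cy * cr, sy * sp * cr - cy * sr, ty],
        [-sp, cp * sr, cp * cr, tz],
        [0.0, 0.0, 0.0, 1.0],
    ]

def transform_point(T, p):
    """Map a 3D point from the object frame to the camera/world frame."""
    x, y, z = p
    return tuple(T[i][0] * x + T[i][1] * y + T[i][2] * z + T[i][3]
                 for i in range(3))

# Assumed example: object rotated 90 degrees about z and shifted by (1, 2, 3).
T = pose_matrix(math.pi / 2, 0.0, 0.0, 1.0, 2.0, 3.0)
moved = transform_point(T, (1.0, 0.0, 0.0))
```

A pose estimator in this sense outputs the six numbers that define such a transform; a grasp can be described by the same kind of transform for the gripper.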
72	Locally Tuned Nonlinear Manifold for Person Independent Head Pose Estimation Foytik, Jacob D. 22 August 2011 No description available. Computer Engineering Electrical Engineering Head Pose Estimation Piecewise Linear Manifold Pose Sensitive Representations Coarse to Fine Head Orientation Phase Congruency
73	Human-Robot Interaction with Pose Estimation and Dual-Arm Manipulation Using Artificial Intelligence Ren, Hailin 16 April 2020 This dissertation focuses on applying artificial intelligence techniques to human-robot interaction, which involves human pose estimation and dual-arm robotic manipulation. The motivating application behind this work is autonomous victim extraction in disaster scenarios using a conceptual design of a Semi-Autonomous Victim Extraction Robot (SAVER). SAVER is equipped with an advanced sensing system and two powerful robotic manipulators as well as a head and neck stabilization system to achieve autonomous safe and effective victim extraction, thereby reducing the potential risk to field medical providers. This dissertation formulates the autonomous victim extraction process using a dual-arm robotic manipulation system for human-robot interaction. According to the general process of Human-Robot Interaction (HRI), which includes perception, control, and decision-making, this research applies machine learning techniques to human pose estimation, robotic manipulator modeling, and dual-arm robotic manipulation, respectively. In the human pose estimation, an efficient parallel ensemble-based neural network is developed to provide real-time human pose estimation on 2D RGB images. A 13-limb, 14-joint skeleton model is used in this perception neural network and each ensemble of the neural network is designed for a specific limb detection. The parallel structure offers two main benefits: (1) the parallel ensemble architecture and multiple Graphics Processing Units (GPUs) make distributed computation possible, and (2) each individual ensemble can be deployed independently, making the processing more efficient when the detection of only some specific limbs is needed for the tasks. Precise robotic manipulator modeling simplifies the controller design and improves the performance of trajectory following. 
Traditional system modeling relies on first principles, simplifying assumptions and prior knowledge. Any imperfection in these could lead to an analytical model that differs from the real system. Machine learning techniques have been applied in this field to pursue faster computation and more accurate estimation. However, a large dataset is always needed for these techniques, while obtaining the data from the real system could be costly in terms of both time and maintenance. In this research, a series of different Generative Adversarial Networks (GANs) are proposed to efficiently identify inverse kinematics and inverse dynamics of the robotic manipulators. One four-Degree-of-Freedom (DOF) robotic manipulator and one six-DOF robotic manipulator are used with different sizes of the dataset to evaluate the performance of the proposed GANs. The general methods can also be adapted to other systems for which the data available to general machine learning techniques is limited. In dual-arm robotic manipulation, basic behaviors such as reaching, pushing objects, and picking objects up are learned using Reinforcement Learning. A Teacher-Student advising framework is proposed to learn a single neural network to control dual-arm robotic manipulators with previous knowledge of controlling a single robotic manipulator. Simulation and experimental results demonstrate the efficiency of the proposed framework compared to learning from scratch. Another concern in robotic manipulation is safety constraints. A variable-reward hierarchical reinforcement learning framework is proposed to solve tasks with sparse rewards and constraints. A task of picking up two objects and placing them at target positions while keeping the distance between them fixed within a threshold is used to evaluate the performance of the proposed method. Comparisons to other state-of-the-art methods are also presented. Finally, all three proposed components are integrated as a single system. 
Experimental evaluation with a full-size manikin was performed to validate the concept of applying artificial intelligence techniques to autonomous victim extraction using a dual-arm robotic manipulation system. / Doctor of Philosophy / Using mobile robots for autonomous victim extraction in disaster scenarios reduces the potential risk to field medical providers. This dissertation focuses on applying artificial intelligence techniques to this human-robot interaction task involving pose estimation and dual-arm manipulation for victim extraction. This work is based on a design of a Semi-Autonomous Victim Extraction Robot (SAVER). SAVER is equipped with an advanced sensing system and two powerful robotic manipulators as well as a head and neck stabilization system attached to an embedded declining stretcher to achieve autonomous safe and effective victim extraction. Therefore, the overall research in this dissertation addresses: human pose estimation, robotic manipulator modeling, and dual-arm robotic manipulation for human pose adjustment. To accurately estimate the human pose for real-time applications, the dissertation proposes a neural network that can take advantage of multiple Graphics Processing Units (GPUs). Considering the cost of data collection, the dissertation proposes novel machine learning techniques to obtain the inverse dynamic model and the inverse kinematic model of the robotic manipulators using limited collected data. Applying safety constraints is another requirement when robots interact with humans. This dissertation proposes reinforcement learning techniques to efficiently train a dual-arm manipulation system not only to perform the basic behaviors, such as reaching, pushing objects and picking up and placing objects, but also to take safety constraints into consideration in performing tasks. Finally, the three components mentioned above are integrated together as a complete system. 
Experimental validation and results are discussed at the end of this dissertation. Reinforcement Learning Deep learning (Machine learning) Human-Robot Interaction Human Pose Estimation Human Pose Manipulation Dual-arm Manipulation
74	Repousser les limites de l'identification faciale en contexte de vidéo-surveillance / Breaking the limits of facial identification in video-surveillance context. Fiche, Cécile 31 January 2012 Les systèmes d'identification de personnes basés sur le visage deviennent de plus en plus répandus et trouvent des applications très variées, en particulier dans le domaine de la vidéosurveillance. Or, dans ce contexte, les performances des algorithmes de reconnaissance faciale dépendent largement des conditions d'acquisition des images, en particulier lorsque la pose varie mais également parce que les méthodes d'acquisition elles-mêmes peuvent introduire des artéfacts. On parle principalement ici de maladresse de mise au point pouvant entraîner du flou sur l'image ou bien d'erreurs liées à la compression et faisant apparaître des effets de blocs. Le travail réalisé au cours de la thèse porte donc sur la reconnaissance de visages à partir d'images acquises à l'aide de caméras de vidéosurveillance, présentant des artéfacts de flou ou de bloc ou bien des visages avec des poses variables. Nous proposons dans un premier temps une nouvelle approche permettant d'améliorer de façon significative la reconnaissance des visages avec un niveau de flou élevé ou présentant de forts effets de bloc. La méthode, à l'aide de métriques spécifiques, permet d'évaluer la qualité de l'image d'entrée et d'adapter en conséquence la base d'apprentissage des algorithmes de reconnaissance. Dans un second temps, nous nous sommes focalisés sur l'estimation de la pose du visage. En effet, il est généralement très difficile de reconnaître un visage lorsque celui-ci n'est pas de face et la plupart des algorithmes d'identification de visages considérés comme peu sensibles à ce paramètre nécessitent de connaître la pose pour atteindre un taux de reconnaissance intéressant en un temps relativement court. 
Nous avons donc développé une méthode d'estimation de la pose en nous basant sur des méthodes de reconnaissance récentes afin d'obtenir une estimation rapide et suffisante de ce paramètre. / Person identification systems based on face recognition are becoming increasingly widespread and are being used in very diverse applications, particularly in the field of video surveillance. In this context, the performance of facial recognition algorithms largely depends on the image acquisition conditions, especially because the pose can vary, but also because the acquisition methods themselves can introduce artifacts. The main issues are focus imprecision, which can lead to blurred images, and errors related to compression, which can introduce block artifacts. The work done during the thesis focuses on facial recognition in images taken by video surveillance cameras, in cases where the images contain blur or block artifacts or show various poses. First, we propose a new approach that significantly improves facial recognition in images with high blur levels or with strong block artifacts. The method, which makes use of specific no-reference metrics, starts with the evaluation of the quality level of the input image and then adapts the training database of the recognition algorithms accordingly. Second, we have focused on facial pose estimation. Normally, it is very difficult to recognize a face in an image taken from a viewpoint other than the frontal one, and the majority of facial identification algorithms that are robust to pose variation need to know the pose in order to achieve a satisfactory recognition rate in a relatively short time. We have therefore developed a fast and sufficiently accurate pose estimation method based on recent recognition techniques. 
Reconnaissance de visages Conditions non contrôlées Estimateur de pose Métriques de qualité sans référence Flou Effets de bloc Face recognition Uncontrolled condition of acquisition Pose estimation No-reference quality metrics Blur Blocks artefacts Face pose estimation
75	Real-Time Head Pose Estimation in Low-Resolution Football Footage / Realtidsestimering av huvudets vridning i lågupplösta videosekvenser från fotbollsmatcher Launila, Andreas January 2009 This report examines the problem of real-time head pose estimation in low-resolution football footage. A method is presented for inferring the head pose using a combination of footage and knowledge of the locations of the football and players. An ensemble of randomized ferns is compared with a support vector machine for processing the footage, while a support vector machine performs pattern recognition on the location data. Combining the two sources of information outperforms either in isolation. The location of the football turns out to be an important piece of information. / Capturing and Visualizing Large scale Human Action (ACTVIS) head pose estimation football real-time coarse head pose estimation machine learning computer vision svm randomized ferns Computer science Datalogi Image analysis Bildanalys Computer engineering Datorteknik
76	Video-based analysis of Gait pathologies Nguyen, Hoang Anh 12 1900 L’analyse de la marche a émergé comme l’un des domaines médicaux le plus importants récemment. Les systèmes à base de marqueurs sont les méthodes les plus favorisées par l’évaluation du mouvement humain et l’analyse de la marche, cependant, ces systèmes nécessitent des équipements et de l’expertise spécifiques et sont lourds, coûteux et difficiles à utiliser. De nombreuses approches récentes basées sur la vision par ordinateur ont été développées pour réduire le coût des systèmes de capture de mouvement tout en assurant un résultat de haute précision. Dans cette thèse, nous présentons notre nouveau système d’analyse de la démarche à faible coût, qui est composé de deux caméras vidéo monoculaire placées sur le côté gauche et droit d’un tapis roulant. Chaque modèle 2D de la moitié du squelette humain est reconstruit à partir de chaque vue sur la base de la segmentation dynamique de la couleur, l’analyse de la marche est alors effectuée sur ces deux modèles. La validation avec l’état de l’art basée sur la vision du système de capture de mouvement (en utilisant le Microsoft Kinect) et la réalité du terrain (avec des marqueurs) a été faite pour démontrer la robustesse et l’efficacité de notre système. L’erreur moyenne de l’estimation du modèle de squelette humain par rapport à la réalité du terrain entre notre méthode vs Kinect est très prometteur: les joints des angles de cuisses (6,29° contre 9,68°), jambes (7,68° contre 11,47°), pieds (6,14° contre 13,63°), la longueur de la foulée (6.14cm rapport de 13.63cm) sont meilleurs et plus stables que ceux de la Kinect, alors que le système peut maintenir une précision assez proche de la Kinect pour les bras (7,29° contre 6,12°), les bras inférieurs (8,33° contre 8,04°), et le torse (8,69° contre 6,47°). 
Basé sur le modèle de squelette obtenu par chaque méthode, nous avons réalisé une étude de symétrie sur différentes articulations (coude, genou et cheville) en utilisant chaque méthode sur trois sujets différents pour voir quelle méthode permet de distinguer plus efficacement la caractéristique symétrie / asymétrie de la marche. Dans notre test, notre système a un angle de genou au maximum de 8,97° et 13,86° pour des promenades normale et asymétrique respectivement, tandis que la Kinect a donné 10,58° et 11,94°. Par rapport à la réalité de terrain, 7,64° et 14,34°, notre système a montré une plus grande précision et pouvoir discriminant entre les deux cas. / Gait analysis has recently emerged as one of the most important medical fields due to its wide range of applications. Marker-based systems are the most favoured methods of human motion assessment and gait analysis; however, these systems require specific equipment and expertise, and are cumbersome, costly and difficult to use. Many recent computer-vision-based approaches have been developed to reduce the cost of expensive motion capture systems while ensuring high-accuracy results. In this thesis, we introduce our new low-cost gait analysis system that is composed of two low-cost monocular cameras (camcorders) placed on the left and right sides of a treadmill. Each 2D left or right human skeleton model is reconstructed from each view based on dynamic color segmentation; the gait analysis is then performed on these two models. The validation with one state-of-the-art vision-based motion capture system (using the Microsoft Kinect v.1) and one ground-truth (with markers) was done to demonstrate the robustness and efficiency of our system. The average error in human skeleton model estimation compared to ground-truth for our method vs. the Kinect is very promising: the joint angles of upper legs (6.29° vs. 9.68°), lower legs (7.68° vs. 11.47°), feet (6.14° vs. 
13.63°), stride lengths (6.14cm vs. 13.63cm) were better and more stable than those from the Kinect, while the system could maintain a reasonably close accuracy to the Kinect for upper arms (7.29° vs. 6.12°), lower arms (8.33° vs. 8.04°), and torso (8.69° vs. 6.47°). Based on the skeleton model obtained by each method, we performed a symmetry study on various joints (elbow, knee and ankle) using each method on two different subjects to see which method can distinguish more efficiently the symmetry/asymmetry characteristic of gaits. In our test, our system reported a maximum knee angle of 8.97° and 13.86° for normal and asymmetric walks respectively, while the Kinect gave 10.58° and 11.94°. Compared to the ground-truth, 7.64° and 14.34°, our system showed more accuracy and discriminative power between the two cases. Pose estimation Gait Pathology Video-based gait analysis l'estimation de la pose Pathologie Gait
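The joint-angle errors quoted in record 76 compare angles measured on an estimated skeleton against a marker-based ground truth. The Python sketch below shows one plausible way such numbers could be computed from 2D keypoints. It is an assumption-laden illustration, not the thesis's method: the function names, the keypoint coordinates and the sample angle lists are all hypothetical.

```python
import math

def joint_angle(a, b, c):
    """Angle in degrees at 2D point b between segments b->a and b->c.
    For a knee, a/b/c could be hip, knee and ankle keypoints (assumed)."""
    v1 = (a[0] - b[0], a[1] - b[1])
    v2 = (c[0] - b[0], c[1] - b[1])
    dot = v1[0] * v2[0] + v1[1] * v2[1]
    cross = v1[0] * v2[1] - v1[1] * v2[0]
    # atan2 of cross and dot products is numerically stable near 0 and 180.
    return abs(math.degrees(math.atan2(cross, dot)))

def mean_abs_error(estimated, reference):
    """Average absolute difference between two equal-length angle series."""
    return sum(abs(e - r) for e, r in zip(estimated, reference)) / len(estimated)

# Hypothetical keypoints: a right angle and a fully extended (straight) joint.
bent = joint_angle((0.0, 1.0), (0.0, 0.0), (1.0, 0.0))
straight = joint_angle((0.0, 2.0), (0.0, 1.0), (0.0, 0.0))

# Hypothetical per-frame angles from a vision system vs. a marker system.
error = mean_abs_error([10.0, 20.0], [12.0, 17.0])
```

Averaging such per-frame differences over a recording gives a single figure per body segment, which is the form in which the abstract reports its comparison.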
77	Human pose estimation and action recognition by multi-robot systems / Estimation de pose humaine et reconnaissance d’action par un système multi-robots Dogan, Emre 07 July 2017 L'estimation de la pose humaine et la reconnaissance des activités humaines sont des étapes importantes dans de nombreuses applications comme la robotique, la surveillance et la sécurité, etc. Actuellement abordées dans le domaine, ces tâches ne sont toujours pas résolues dans des environnements non-coopératifs particulièrement. Ces tâches admettent de divers défis comme l'occlusion, les variations des vêtements, etc. Les méthodes qui exploitent des images de profondeur ont l’avantage concernant les défis liés à l'arrière-plan et à l'apparence, pourtant, l’application est limitée pour des raisons matérielles. Dans un premier temps, nous nous sommes concentrés sur la reconnaissance des actions complexes depuis des vidéos. Pour ceci, nous avons introduit une représentation spatio-temporelle indépendante du point de vue. Plus précisément, nous avons capturé le mouvement de la personne en utilisant un capteur de profondeur et l'avons encodé en 3D pour le représenter. Un descripteur 3D a ensuite été utilisé pour la classification des séquences avec la méthodologie bag-of-words. Pour la deuxième partie, notre objectif était l'estimation de pose articulée, qui est souvent une étape intermédiaire pour la reconnaissance de l'activité. Notre motivation était d'incorporer des informations à partir de capteurs multiples et de les fusionner pour surmonter le problème de l'auto-occlusion. Ainsi, nous avons proposé un modèle de flexible mixtures-of-parts multi-vues inspiré par la méthodologie classique de structure pictural. Nous avons démontré que les contraintes géométriques et les paramètres de cohérence d'apparence sont efficaces pour renforcer la cohérence entre les points de vue, aussi que les paramètres classiques. 
Finalement, nous avons évalué ces nouvelles méthodes sur des datasets publics, ce qui vérifie que l'utilisation de représentations indépendantes de la vue et l'intégration d'informations à partir de points de vue multiples améliore la performance pour les tâches ciblées dans le cadre de ce manuscrit. / Estimating human pose and recognizing human activities are important steps in many applications, such as human computer interfaces (HCI), health care, smart conferencing, robotics, security surveillance etc. Despite the ongoing effort in the domain, these tasks remain unsolved, particularly in unconstrained and non-cooperative environments. Pose estimation and activity recognition face many challenges under these conditions, such as occlusion or self-occlusion, variations in clothing, background clutter, the deformable nature of the human body and the diversity of human behaviors during activities. Using depth imagery has been a popular solution to address appearance and background related challenges, but its application area is restricted by hardware limitations and it fails to handle the remaining problems. Specifically, we considered action recognition scenarios where the position of the recording device is not fixed, and which consequently require a method that is not affected by the viewpoint. As a second problem, we tackled the human pose estimation task in particular settings where multiple visual sensors are available and allowed to collaborate. In this thesis, we addressed these two related problems separately. In the first part, we focused on indoor action recognition from videos and considered complex activities. To this end, we explored several methodologies and eventually introduced a 3D spatio-temporal representation for a video sequence that is viewpoint independent. More specifically, we captured the movement of the person over time using a depth sensor and encoded it in 3D to represent the performed action with a single structure. 
A 3D feature descriptor was employed afterwards to build a codebook and classify the actions with the bag-of-words approach. As for the second part, we concentrated on articulated pose estimation, which is often an intermediate step for activity recognition. Our motivation was to incorporate information from multiple sources and views and fuse them early in the pipeline to overcome the problem of self-occlusion, and eventually obtain robust estimations. To achieve this, we proposed a multi-view flexible mixture of parts model inspired by the classical pictorial structures methodology. In addition to the single-view appearance of the human body and its kinematic priors, we demonstrated that geometrical constraints and appearance-consistency parameters are effective for boosting the coherence between the viewpoints in a multi-view setting. Both proposed methods were evaluated on public benchmarks, and the results showed that the use of view-independent representations and the integration of information from multiple viewpoints improve the performance of action recognition and pose estimation tasks, respectively. Informatique Reconnaissance de mouvement Reconnaissance d'actions Estimation de la pose humaine Multivues IT - Information Technology Movement recognition Action recognition Articulated pose estimation Multiview settings 006.420 72
78	Vérification automatique des montages d'usinage par vision : application à la sécurisation de l'usinage / Vision-based automatic verification of machining setup : application to machine tools safety Karabagli, Bilal 06 November 2013 Le terme "usinage à porte fermée", fréquemment employé par les PME de l’aéronautique et de l’automobile, désigne l’automatisation sécurisée du processus d’usinage des pièces mécaniques. Dans le cadre de notre travail, nous nous focalisons sur la vérification du montage d’usinage, avant de lancer la phase d’usinage proprement dite. Nous proposons une solution sans contact, basée sur la vision monoculaire (une caméra), permettant de reconnaitre automatiquement les éléments du montage (brut à usiner, pions de positionnement, tiges de fixation,etc.), de vérifier que leur implantation réelle (réalisée par l’opérateur) est conforme au modèle 3D numérique de montage souhaité (modèle CAO), afin de prévenir tout risque de collision avec l’outil d’usinage. / In High Speed Machining it is of key importance to avoid any collision between the machining tool and the machining setup. If the machining setup has not been assembled correctly by the operator and does not conform to the 3D CAD model sent to the machining unit, such collisions can occur. We have developed a vision system that uses a single camera to automatically check the conformity of the actual machining setup with the desired 3D CAD model before launching the machining operation. First, we propose a configuration of the camera within the machining setup that ensures the best acquisition of the scene. To segment the image into regions of interest (e.g. the regions of the clamping elements and the workpiece) based on the 3D CAD model, we match two graphs: a theoretical graph computed from the theoretical image of the 3D CAD model and a real graph computed from the image given by the real camera. 
The graphs are constructed from simple features, such as circles and lines, that are mainly present in the machining setup. To define the regions of interest (ROI) in the real image from the ROI given by the 3D CAD model, we project the 3D CAD model onto the real image, as in augmented reality. To automatically check the agreement between corresponding regions, we propose to compute three parameters: the skeleton to represent the shape, the edges to represent the geometry, and the area to represent the dimensions. We compute an agreement score for these three parameters, which is analyzed by a fuzzy system to decide whether the clamping element conforms to its definition in the CAD model. Some machining setup configurations require 3D information to test the trajectory of the machine tool. To handle this situation, we propose a new method based on depth from defocus to compute a depth map of the scene. Finally, we present the results of our solution and show its feasibility and robustness in different machining setup cases. Machine outil à commande numérique Détection de contours Système expert flou Segmentation d'image Estimation de pose Machining setup Edges detection Fuzzy logic Image segmentation Pose estimation
79	Enriching Remote Labs with Computer Vision and Drones / Enrichir les laboratoires distants grâce à la vision par ordinateur avec drone. Khattar, Fawzi 13 December 2018 Avec le progrès technologique, de nouvelles technologies sont en cours de développement afin de contribuer à une meilleure expérience dans le domaine de l’éducation. En particulier, les laboratoires distants constituent un moyen intéressant et pratique qui peut motiver les étudiants à apprendre. L'étudiant peut à tout moment, et de n'importe quel endroit, accéder au laboratoire distant et faire son TP (travail pratique). Malgré les nombreux avantages, les technologies à distance dans l’éducation créent une distance entre l’étudiant et l’enseignant. Les élèves peuvent avoir des difficultés à faire le TP si aucune intervention appropriée ne peut être prise pour les aider. Dans cette thèse, nous visons à enrichir un laboratoire électronique distant conçu pour les étudiants en ingénierie et appelé «LaboREM» (pour remote laboratory) de deux manières: tout d'abord, nous permettons à l'étudiant d'envoyer des commandes de haut niveau à un mini-drone disponible dans le laboratoire distant. L'objectif est d'examiner les faces-avant des instruments de mesure électroniques, à l'aide de la caméra intégrée au drone. De plus, nous autorisons la communication élève-enseignant à distance à l'aide du drone, au cas où un enseignant serait présent dans le laboratoire distant. Enfin, le drone doit revenir pour atterrir sur la plate-forme de recharge automatique des batteries, quand la mission est terminée. Nous proposons aussi un système automatique pour estimer l'état de l'étudiant (frustré / concentré..) afin de prendre les interventions appropriées pour assurer un bon déroulement du TP distant. Par exemple, si l'élève a des difficultés majeures, nous pouvons lui donner des indications ou réduire le niveau de difficulté de l’exercice. 
Nous proposons de faire cela en utilisant des signes visuels (estimation de la pose de la tête et analyse de l'expression faciale). De nombreuses évidences sur l'état de l'étudiant peuvent être acquises, mais elles sont incomplètes, parfois inexactes et ne couvrent pas tous les aspects de l'état de l'étudiant. C'est pourquoi nous proposons dans cette thèse de fusionner les preuves en utilisant la théorie de Dempster-Shafer qui permet la fusion de preuves incomplètes. / With technological advances, new learning technologies are being developed in order to contribute to a better learning experience. In particular, remote labs constitute an interesting and practical way to motivate today's students to learn. The student can at any time, and from anywhere, access the remote lab and do his lab-work. Despite many advantages, remote technologies in education create a distance between the student and the teacher. Without the presence of a teacher, students can have difficulties if no appropriate interventions can be taken to help them. In this thesis, we aim to enrich an existing remote electronic lab made for engineering students called “LaboREM” (for remote Laboratory) in two ways. First, we enable the student to send high-level commands to a mini-drone available in the remote lab facility. The objective is to examine the front panels of electronic measurement instruments using the camera embedded on the drone. Furthermore, we allow remote student-teacher communication using the drone, in case there is a teacher present in the remote lab facility. Finally, the drone has to go back home when the mission is over to land on a platform for automatic recharge of the batteries. Second, we propose an automatic system that estimates the affective state of the student (frustrated, confused, flow, etc.) in order to take appropriate interventions to ensure good learning outcomes. 
For example, if the student is having major difficulties we can try to give him hints or reduce the difficulty level. We propose to do this by using visual cues (head pose estimation and facial expression analysis). Many evidences on the state of the student can be acquired, however these evidences are incomplete, sometimes inaccurate, and do not cover all the aspects of the state of the student alone. This is why we propose to fuse evidences using the theory of Dempster-Shafer that allows the fusion of incomplete evidence. Vision par ordinateur Drone Laboratoire distant Théorie de l’évidence Estimation de pose 3D Computer Vision Drones Remote Labs Evidence theory 3D pose estimation 004.6
80	Detecção de faces e rastreamento da pose da cabeça Schramm, Rodrigo 20 March 2009 (has links) HP - Hewlett-Packard Brasil Ltda / Video cameras are already part of new models of human-machine interaction. With them, the face and the head pose can be detected, enabling new features for users. Applications that have benefited from this capability include video conferencing, educational and entertainment games, driver attention monitoring, and measurement of attention focus. In this context, this Master's thesis proposes a new model to detect and track the head pose in a video sequence captured by a monocular camera. To achieve this goal, two main stages were developed: face detection and head pose tracking. The first stage is the starting point for tracking the pose: the face is detected in frontal pose using a detector with Haar-like features. In the second stage, once the face has been detected in frontal pose, specific features of the face are tracked to estimate the change in head pose. Computer vision, Face detection, Head pose estimation, Human-computer interaction
