Global ETD Search

71	Vers une reconnaissance des activités humaines non supervisées et des gestes dans les vidéos / Toward unsupervised human activity and gesture recognition in videos Negin, Farhood 15 October 2018
The main goal of this thesis is to propose a complete framework for the automatic discovery, modeling, and recognition of human activities in videos. To model and recognize activities in long-term videos, we propose a framework that combines global and local perceptual information from the scene and constructs hierarchical activity models accordingly. In the first variation of the framework, a supervised classifier based on Fisher vectors is trained, and the predicted semantic labels are embedded in the constructed hierarchical models. In the second variation, to obtain a completely unsupervised framework, the trained visual codebooks are stored in the models instead of the semantic labels. We evaluate the proposed frameworks on two realistic Activities of Daily Living datasets recorded from patients in a hospital environment. Furthermore, to model fine motions of the human body, we propose four gesture recognition frameworks, each of which accepts one data modality or a combination of modalities as input. We evaluate these frameworks in the context of a medical diagnostic test, the Praxis test, a gesture-based test that has been accepted as diagnostically indicative of cortical pathologies such as Alzheimer's disease. We also pose a new challenge in gesture recognition: obtaining an objective opinion about correct and incorrect performances of very similar gestures. The experiments show the effectiveness of our deep-learning-based approach in gesture recognition and performance assessment tasks.
Keywords: Computer vision, Activity recognition, Gesture recognition, Machine learning
72	VR interaktivní aplikace / VR Interactive Application Valenta, Marek January 2018
This document describes the development of an interactive virtual reality (VR) application. It opens with a history of VR, dating back to the 19th century, followed by a description of the devices currently used for VR, for which the application was created. It then presents the Unity 3D development environment, which, together with a basic description of neural networks, forms a major part of the project. The second half describes the design and implementation of a simulated magical world, including a new spell system not previously used in any VR game, in which a neural network recognizes hand gestures to create spells. The work also describes the application as a whole, including its rooms and objects. The conclusion discusses the problems encountered and the approaches that were tried but failed.
73	Ovládání počítače gesty / Gesture Based Human-Computer Interface Jaroň, Lukáš January 2012
This master's thesis describes the possibilities and principles of gesture-based computer interfaces. It surveys general approaches to gesture control, then implements a selected method for detecting hands and fingers in depth maps acquired from a Kinect sensor, together with gesture recognition based on hidden Markov models. For demonstration purposes, the thesis also describes a simple photo viewer controlled through the developed gesture-based interface. Finally, the work covers quality testing and accuracy evaluation of the selected gesture recognizer.
74	South African Sign Language Recognition Using Feature Vectors and Hidden Markov Models Naidoo, Nathan Lyle January 2010
Magister Scientiae - MSc
This thesis presents a system for whole-gesture recognition of South African Sign Language. The system uses feature vectors combined with hidden Markov models. To construct a feature vector, dynamic segmentation must be performed to extract the signer's hand movements. Techniques for normalising the variations that occur when recording a signer performing a gesture are investigated. The system achieves a classification rate of 69%.
Keywords: Hidden Markov models, Personal computer (PC), South African Sign Language (SASL), Recognition and Animation, University of the Western Cape (UWC), Gesture Recognition (GR)
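Entries 73 and 74 both rely on hidden Markov models. A common way to use HMMs for gesture classification, which neither abstract confirms in detail, is to keep one model per gesture and pick the model that assigns the observed sequence the highest likelihood. The sketch below illustrates that idea with the forward algorithm on discrete observations; the two models, their probabilities, and the symbol meanings are made up for illustration.

```python
def forward_likelihood(obs, start, trans, emit):
    """Return P(obs | model) for a discrete-observation HMM (forward algorithm)."""
    n_states = len(start)
    # alpha[i] = probability of the observations so far, ending in state i
    alpha = [start[i] * emit[i][obs[0]] for i in range(n_states)]
    for symbol in obs[1:]:
        alpha = [
            sum(alpha[j] * trans[j][i] for j in range(n_states)) * emit[i][symbol]
            for i in range(n_states)
        ]
    return sum(alpha)


def classify_gesture(obs, models):
    """Pick the gesture whose HMM gives the sequence the highest likelihood."""
    return max(models, key=lambda name: forward_likelihood(obs, *models[name]))


# Hypothetical two-state models over symbols 0 ("hand low") and 1 ("hand high").
# Each entry is (start probabilities, transition matrix, emission matrix).
MODELS = {
    "raise": ([1.0, 0.0], [[0.5, 0.5], [0.0, 1.0]], [[0.9, 0.1], [0.1, 0.9]]),
    "lower": ([1.0, 0.0], [[0.5, 0.5], [0.0, 1.0]], [[0.1, 0.9], [0.9, 0.1]]),
}
```

With these made-up parameters, `classify_gesture([0, 0, 1, 1], MODELS)` selects the "raise" model, because that model expects low symbols first and high symbols later.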
75	Méthodes d'apprentissage pour l'interaction homme-machine / Neural Learning Methods for Human-Computer Interaction Kopinski, Thomas 01 February 2016
This thesis aims to improve the complex task of hand gesture recognition by using machine learning techniques to learn from features calculated from 3D point cloud data. Its main contributions lie in the domains of machine learning and human-machine interaction. Since the goal is to demonstrate that a robust, real-time-capable system can be set up to provide a supportive means of interaction, the methods studied have to be lightweight: their descriptive power must be balanced against the computational overhead so that the system remains real-time capable. To this end, several approaches were tested.
Initially, the fusion of multiple time-of-flight (ToF) sensors to improve the overall recognition rate was studied. The thesis examines how employing more than one sensor can significantly boost recognition results in especially difficult cases, and gives a first look at the influence of the descriptors on this task and of the parameter choices on the calculation of the descriptor. The performance of multilayer perceptrons (MLPs) with standard parameters is compared with that of support vector machines (SVMs) whose parameters were obtained via grid search.
Building on these results, the integration of the system into a car interior is shown. The thesis demonstrates how such a system can easily be integrated into an outdoor environment subject to strongly varying lighting conditions without tedious calibration procedures. Furthermore, a modified lightweight version of the descriptor, coupled with an extended database, significantly boosts the frame rate of the whole recognition pipeline. Lastly, introducing confidence measures for the output of the MLPs allows for more stable classification results and gives insight into the innate challenges of this multiclass problem.
To improve the classification performance of the MLPs without sophisticated algorithm design or extensive parameter search, a simple method is proposed that reuses the existing recognition routines by exploiting information already present in the output neurons of the MLPs. A simple fusion technique combines descriptor features with neuron confidences from a previously trained network and achieves improved results in nearly all cases, for problem classes and individuals respectively.
These findings are analyzed in depth on a more theoretical level by comparing the effectiveness of learning solely on neural activities in the output layer with the previously introduced fusion approach. To take temporal information into account, the thesis describes a possible way to exploit the fact that the data are processed sequentially, so that successive samples are temporally coherent. This approach classifies a hand pose by fusing descriptor features with neural activities from previous time steps, and lays the groundwork for the transition to dynamic hand gestures. Furthermore, an infotainment system implemented on a mobile device is introduced and coupled with the preprocessing and recognition module, which in turn is integrated into an automotive setting, demonstrating a possible test environment for a gesture recognition system.
To extend the system to dynamic hand gesture interaction, a simplified approach is proposed. It demonstrates that dynamic hand gesture sequences can be recognized simply by defining a starting pose and an ending pose, provided the recognition module works with sufficient accuracy, and that this even permits relaxed restrictions on the parameters defining such a sequence.
Keywords: Human-Machine Interaction, Gesture recognition, Neural networks, 3D sensors, 006.3
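The idea in entry 75 of defining a dynamic gesture by a starting and an ending pose can be sketched as follows. This is a minimal illustration under stated assumptions, not the thesis's implementation: it assumes a per-frame pose classifier already supplies one label per frame, and the pose names, the gesture table, and the `max_gap` limit are hypothetical.

```python
def detect_dynamic_gestures(pose_labels, gestures, max_gap=30):
    """Return (gesture_name, start_frame, end_frame) tuples from per-frame pose labels.

    gestures maps a name to a (start_pose, end_pose) pair. A gesture fires when its
    end pose appears within max_gap frames of the most recent start pose.
    """
    last_seen = {}  # pose label -> most recent frame index where it was observed
    detections = []
    for frame, pose in enumerate(pose_labels):
        for name, (start_pose, end_pose) in gestures.items():
            started = last_seen.get(start_pose)
            if pose == end_pose and started is not None and frame - started <= max_gap:
                detections.append((name, started, frame))
                last_seen.pop(start_pose)  # consume the start so it fires only once
        last_seen[pose] = frame
    return detections


# Hypothetical label stream: an open hand followed by a fist counts as a "grab".
labels = ["none", "open", "open", "none", "fist", "fist", "none"]
print(detect_dynamic_gestures(labels, {"grab": ("open", "fist")}))
```

Because only the two end poses matter, intermediate frames may carry any label, which is one way to read the "relaxed restrictions" mentioned in the abstract.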
76	Pervasive Quantified-Self using Multiple Sensors January 2019
The advent of inexpensive commercial sensors and advances in information and communication technology (ICT) have brought forth the era of pervasive Quantified-Self. Automatic diet monitoring is one of the most important aspects of Quantified-Self because it is vital for ensuring the well-being of patients suffering from chronic diseases, as well as for providing a low-cost means of maintaining health for everyone else. Automatic dietary monitoring consists of: a) determining the type and amount of food intake, and b) monitoring eating behavior, i.e., the time, frequency, and speed of eating. Although some techniques exist toward these ends, they suffer from low accuracy and low adherence. To overcome these issues, multiple sensors were used, because affordable sensors that capture different aspects of information have the potential to increase the knowledge available for Quantified-Self. For a), I envision an intelligent dietary monitoring system that automatically identifies food items using knowledge obtained from a visible-spectrum camera and an infrared-spectrum camera. This system outperforms state-of-the-art systems for cooked food recognition by 25% while also minimizing user intervention. For b), I propose a novel methodology, IDEA, that performs accurate eating action identification within eating episodes with an average F1-score of 0.92. This is an improvement of 0.11 in precision and 0.15 in recall for the worst-case users compared with the state of the art. IDEA uses only a single wristband with four sensors and provides feedback on eating speed every 2 minutes without any manual input from the user.
Dissertation/Thesis / Doctoral Dissertation Computer Engineering 2019
Keywords: Computer engineering, Computer science, Electrical engineering, Diet monitoring, Gesture Recognition, Pervasive computing, Quantified-Self, Time-series Data Modeling, Wearable
77	Reconhecimento visual de gestos para imitação e correção de movimentos em fisioterapia guiada por robô / Visual gesture recognition for mimicking and correcting movements in robot-guided physiotherapy Gambirasio, Ricardo Fibe 16 November 2015
This dissertation develops a robotic system to guide patients through physiotherapy sessions. The proposed system uses the humanoid robot NAO and analyses patients' movements to guide, correct, and motivate them during a session. First, the system learns a correct physiotherapy exercise by observing a physiotherapist perform it; second, it demonstrates the exercise so that the patient can reproduce it; and finally, it corrects any mistakes the patient makes during the exercise. The correct exercise is captured with a Kinect sensor and divided into a sequence of states in the spatio-temporal dimension using k-means clustering. Those states compose a finite state machine that is used to verify whether the patient's movements are correct. The transition from one state to the next corresponds to a partial movement of the learned exercise and takes place only when the robot observes the patient execute that partial movement correctly. If the patient executes it incorrectly, the system suggests a correction and remains in the same state, asking the patient to try again. The system was tested with multiple patients undergoing physiotherapeutic treatment for motor impairments and achieved high precision and recall across all partial movements. The emotional impact of the treatment on patients was also measured, via questionnaires given before and after treatment and via software that recognizes emotions from video recorded during treatment; the results show a positive impact that could help motivate physiotherapy patients and support their recovery.
Keywords: Finite state machines, Gesture recognition, Human-robot interaction, Physiotherapy, Robotic vision
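The state-machine check described in entry 77 can be illustrated with a small sketch. It assumes the key poses (for example, k-means centroids of the therapist's recording) are already available as numeric vectors; the class name, the Euclidean distance test, the tolerance, and the joint-angle values are hypothetical choices, not details from the dissertation.

```python
import math


class ExerciseChecker:
    """Finite state machine whose states are key poses of a reference exercise."""

    def __init__(self, key_poses, tolerance):
        self.key_poses = key_poses  # ordered pose vectors, e.g. k-means centroids
        self.tolerance = tolerance  # max distance at which a pose counts as reached
        self.state = 0              # index of the next key pose to reach

    def observe(self, pose):
        """Feed one observed pose; return 'advance', 'done', or 'retry'."""
        if self.state >= len(self.key_poses):
            return "done"
        target = self.key_poses[self.state]
        if math.dist(pose, target) <= self.tolerance:
            self.state += 1
            return "done" if self.state == len(self.key_poses) else "advance"
        return "retry"  # stay in the same state; the caller can suggest a correction


# Hypothetical exercise: three key poses given as (shoulder, elbow) angles in degrees.
checker = ExerciseChecker([(0, 0), (90, 0), (90, 45)], tolerance=10)
for observed in [(2, 1), (40, 0), (88, 3), (91, 44)]:
    print(observed, checker.observe(observed))
```

Staying in the same state on a mismatch mirrors the behaviour in the abstract, where the patient is asked to repeat the partial movement before the exercise continues.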
78	Gestures in human-robot interaction Bodiroža, Saša 16 February 2017
Gestures consist of movements of body parts and are a means of communication that conveys information or intentions to an observer. They can therefore be used effectively in human-robot interaction, or in human-machine interaction more generally, as a way for a robot or a machine to infer meaning. For people to use gestures intuitively and to understand robot gestures, it is necessary to define mappings between gestures and their associated meanings: a gesture vocabulary. A human gesture vocabulary defines which gestures a group of people would intuitively use to convey information, while a robot gesture vocabulary shows which robot gestures are deemed fitting for a particular meaning. Effective use of vocabularies depends on techniques for gesture recognition, that is, the classification of body motion into discrete gesture classes using pattern recognition and machine learning. This thesis addresses both research areas, presenting the development of gesture vocabularies as well as gesture recognition techniques, with a focus on hand and arm gestures. Attentional models for humanoid robots were developed as a prerequisite for human-robot interaction and a precursor to gesture recognition. A method for defining gesture vocabularies for humans and robots, based on user observations and surveys, is explained, and experimental results are presented. Following the robot gesture vocabulary experiment, an evolutionary approach to refining robot gestures is introduced, based on interactive genetic algorithms. A robust and well-performing gesture recognition algorithm based on dynamic time warping has been developed. Most importantly, it employs one-shot learning, meaning that it can be trained with a small number of training samples and used in real-life scenarios, reducing the effect of environmental constraints and gesture features. Finally, an approach for learning the relation between self-motion and pointing gestures is presented.
Keywords: Human-Robot Interaction, Attentional Models, Gesture Recognition, Gesture Vocabulary
Classification: 004 Computer science; 28 Computer science, data processing; ST 308; ddc:004
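Entry 78 combines dynamic time warping with one-shot learning. A minimal sketch of that combination, assuming one stored template per gesture and 1-D trajectories, is a nearest-template classifier under the DTW distance. The function names and the example templates are hypothetical, and the thesis's actual algorithm may differ substantially.

```python
def dtw_distance(a, b):
    """Dynamic time warping distance between two 1-D sequences."""
    inf = float("inf")
    # cost[i][j] = best alignment cost of a[:i] against b[:j]
    cost = [[inf] * (len(b) + 1) for _ in range(len(a) + 1)]
    cost[0][0] = 0.0
    for i in range(1, len(a) + 1):
        for j in range(1, len(b) + 1):
            step = abs(a[i - 1] - b[j - 1])
            cost[i][j] = step + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[len(a)][len(b)]


def nearest_template(sample, templates):
    """Return the label of the template closest to sample under DTW."""
    return min(templates, key=lambda label: dtw_distance(sample, templates[label]))


# One hypothetical training example per gesture (one-shot): hand height over time.
TEMPLATES = {"wave": [0, 1, 0, 1, 0], "swipe": [0, 1, 2, 3, 4]}
print(nearest_template([0, 0, 1, 1, 2, 3, 4, 4], TEMPLATES))
```

DTW lets a slower or faster performance of the same gesture align with its template at no extra cost, which is why a single example per class can be enough in simple cases.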
79	Projeto de um módulo de aquisição e pré-processamento de imagem colorida baseado em computação reconfigurável e aplicado a robôs móveis / A project of a module for acquisition and color image pre-processing based on reconfigurable computation and applied to mobile robots Bonato, Vanderlei 14 May 2004
This work proposes a basic module for color image acquisition and pre-processing applied to mobile robots, implemented in reconfigurable hardware within the SoC (System-on-a-Chip) concept. The basic module is presented together with more specific image pre-processing functions, which serve as a basis for verifying the functionality implemented in the proposed work. The main functions performed by the basic module are: assembling frames from the pixels delivered by the CMOS digital camera, controlling the camera's configuration parameters, and converting between color spaces. The more specific functions cover the segmentation, centering, reduction, and interpretation of the acquired images. The reconfigurable device used in this work is an FPGA (Field-Programmable Gate Array), which allows the specific functions to be adapted to application needs, always building on the proposed module. The system was applied to gesture recognition and achieved a correct recognition rate of 99.57% at 31.88 frames per second.
Keywords: Boolean neural network, CMOS camera, Embedded computing, FPGA, Gesture recognition, Robotics, SoC, Computer vision, Vision system
