Global ETD Search

111	A Machine Learning Framework for Real-Time Gesture and Skeleton-Based Action Recognition in Unity : Exploring Human-Computer-Interaction in Game Design and Interaction Moeini, Arian January 2024 This master's thesis presents a machine learning framework for real-time gesture and skeleton-based action recognition, integrated with the Unity game engine. The system aims to enhance human-computer interaction (HCI) in gaming and 3D-related applications through natural movement recognition, by training a model on skeleton tracking data. The framework is trained to accurately categorize and identify gestures such as kicks and punches, enabling a more immersive gaming experience not available with traditional controllers. After studying the evolution of HCI and how machine learning has reshaped the interaction paradigm, the prototype system is built through data collection, augmentation, and preprocessing, followed by training and evaluating a Long Short-Term Memory (LSTM) neural network model for gesture classification. The model is integrated into Unity via Unity Sentis using the Open Neural Network Exchange (ONNX) format, enabling efficient real-time action recognition in 3D space. Each component of the pipeline is available and adaptable for future customization and needs; skeleton tracking and Unity integration are built using the ZED 2i camera and ZED SDK. Experimental results demonstrate that the system can achieve over 90% accuracy in identifying predefined gestures. As a bridging solution tailored for Unity, this framework offers a practical approach to action recognition that could prove useful in future applications. This work contributes to advancing human-computer interaction and offers a foundation for further development in gesture-based Unity game design.
Machine Learning Framework; Real-time Gesture Recognition; Skeleton-based Action Recognition; Unity Game Engine Integration; Human-Computer Interaction (HCI); Natural Movement Recognition; Skeleton Tracking Data; Computer Sciences; Datavetenskap (datalogi)
112	Analyse du geste dansé et retours visuels par modèles physiques : apport des qualités de mouvement à l'interaction avec le corps entier / Dance Gesture Analysis and Visual Feedback based on Physical Models : Contributions of Movement Qualities in Whole Body Interaction Fdili Alaoui, Sarah 19 December 2012 La présente thèse a pour but d’approfondir l’étude du geste dans le cadre de l’interaction Homme Machine. Il s’agit de créer de nouveaux paradigmes d’interaction qui offrent à l’utilisateur de plus amples possibilités d’expression basées sur le geste. Un des vecteurs d’expression du geste, très rarement traité en Interaction Homme Machine, qui lui confère sa coloration et son aspect, est ce que les théoriciens et praticiens de la danse appellent « les qualités de mouvement ». Nous mettons à profit des collaborations avec le domaine de la danse pour étudier la notion de qualités de mouvement et l’intégrer à des paradigmes d’interaction gestuelle. Notre travail analyse les apports de l’intégration des qualités de mouvement comme modalité d’interaction, fournit les outils propices à l’élaboration de cette intégration (en termes de méthodes d’analyse, de visualisation et de contrôle gestuel), en développe et évalue certaines techniques d’interaction. Les contributions de la thèse se situent d’abord dans la formalisation de la notion de qualités de mouvement et l’évaluation de son intégration dans un dispositif interactif en termes d’expérience utilisateur. Sur le plan de la visualisation des qualités de mouvement, les travaux menés pendant la thèse ont permis de démontrer que les modèles physiques masses-ressorts offrent de grandes possibilités de simulation de comportements dynamiques et de contrôle en temps réel. Sur le plan de l’analyse, la thèse a permis de développer des approches novatrices de reconnaissance automatique des qualités de mouvement de l’utilisateur.
Enfin, à partir des approches d’analyse et de visualisation des qualités de mouvement, la thèse a donné lieu à l’implémentation d’un ensemble de techniques d’interaction. Elle a appliqué et évalué ses techniques dans le contexte de la pédagogie de la danse et de la performance. / The thesis studies gesture in the context of Human-Computer Interaction. It aims at creating new interaction paradigms that offer the user further expressive possibilities based on gestures. Dance theorists and practitioners use the term "movement qualities" (MQ) for a notion that conveys expressive content describing the way a gesture is performed. This notion has rarely been taken into consideration in the field of HCI. Our work draws on collaborations with the field of dance to explore the notion of movement qualities and to integrate it as an interaction modality. The contributions of the thesis lie in the formalization of the notion of movement qualities and the evaluation of its integration as an interaction modality in terms of user experience. We also provide computational tools for considering MQ in interactive systems in terms of analysis, representation, and gesture control methods. On the representational level, our work has demonstrated that physical models based on mass-spring systems offer great opportunities for simulating dynamics related to MQ and for real-time gesture control. On the analysis level, we developed innovative approaches to automatic real-time recognition of movement qualities. Finally, we implemented a set of interaction techniques based on movement qualities that we applied and evaluated in the context of dance pedagogy and performance.
Art-science; Geste dansé; Geste expressif; Qualités de mouvement; Performance augmentée; Interaction du corps entier; Techniques d’interaction; Analyse de gestes; Reconnaissance de gestes; Retours visuels par modèles physiques; Modèles masses-ressorts; Expérience utilisateur; Art-science; Dance gesture; Expressive gesture; Movement qualities; Augmented performance; Whole body interaction; Interaction techniques; Gesture analysis; Gesture recognition; Mass-springs systems; User experience
113	Automatic non linear metric learning : Application to gesture recognition / Apprentissage automatique de métrique non linéaire : Application à la reconnaissance de gestes Berlemont, Samuel 11 February 2016 Cette thèse explore la reconnaissance de gestes à partir de capteurs inertiels pour Smartphone. Ces gestes consistent en la réalisation d'un tracé dans l'espace présentant une valeur sémantique, avec l'appareil en main. Notre étude porte en particulier sur l'apprentissage de métrique entre signatures gestuelles grâce à l'architecture "Siamoise" (réseau de neurones siamois, SNN), qui a pour but de modéliser les relations sémantiques entre classes afin d'extraire des caractéristiques discriminantes. Cette architecture est appliquée au perceptron multicouche (MultiLayer Perceptron). Les stratégies classiques de formation d'ensembles d'apprentissage sont essentiellement basées sur des paires similaires et dissimilaires, ou des triplets formés d'une référence et de deux échantillons respectivement similaires et dissimilaires à cette référence. Ainsi, nous proposons une généralisation de ces approches dans un cadre de classification, où chaque ensemble d'apprentissage est composé d’une référence, un exemple positif, et un exemple négatif pour chaque classe dissimilaire. Par ailleurs, nous appliquons une régularisation sur les sorties du réseau au cours de l'apprentissage afin de limiter les variations de la norme moyenne des vecteurs caractéristiques obtenus. Enfin, nous proposons une redéfinition du problème angulaire par une adaptation de la notion de « sinus polaire », aboutissant à une analyse en composantes indépendantes non-linéaire supervisée. A l'aide de deux bases de données inertielles, la base MHAD (Multimodal Human Activity Dataset) ainsi que la base Orange, composée de gestes symboliques inertiels réalisés avec un Smartphone, les performances de chaque contribution sont caractérisées.
Ainsi, des protocoles modélisant un monde ouvert, qui comprend des gestes inconnus par le système, mettent en évidence les meilleures capacités de détection et rejet de nouveauté du SNN. En résumé, le SNN proposé permet de réaliser un apprentissage supervisé de métrique de similarité non-linéaire, qui extrait des vecteurs caractéristiques discriminants, améliorant conjointement la classification et le rejet de gestes inertiels. / As consumer devices become more and more ubiquitous, new interaction solutions are required. In this thesis, we explore inertial-based gesture recognition on Smartphones, where gestures holding a semantic value are drawn in the air with the device in hand. In our research, speed and delay constraints required by an application are critical, leading us to the choice of neural-based models. Thus, our work focuses on metric learning between gesture sample signatures using the "Siamese" architecture (Siamese Neural Network, SNN), which aims at modelling semantic relations between classes to extract discriminative features, applied to the MultiLayer Perceptron. Contrary to some popular versions of this algorithm, we opt for a strategy that does not require additional parameter fine tuning, namely a set threshold on dissimilar outputs, during training. Indeed, after a preprocessing step where the data is filtered and normalised spatially and temporally, the SNN is trained from sets of samples, composed of similar and dissimilar examples, to compute a higher-level representation of the gesture, where features are collinear for similar gestures, and orthogonal for dissimilar ones. While the original model already works for classification, multiple mathematical problems which can impair its learning capabilities are identified. 
Consequently, in contrast to the classical input-set selection strategies, which use either a pair of similar or dissimilar samples or a triplet made of a reference, a similar sample, and a dissimilar sample, we propose to include samples from every available dissimilar class, resulting in a better structuring of the output space. Moreover, we apply a regularisation on the outputs to better determine the objective function. Furthermore, the notion of polar sine enables a redefinition of the angular problem by maximising a normalised volume induced by the outputs of the reference and dissimilar samples, which effectively results in a Supervised Non-Linear Independent Component Analysis. Finally, we assess the unexplored potential of the Siamese network and its higher-level representation for novelty and error detection and rejection. With the help of two real-world inertial datasets, the Multimodal Human Activity Dataset as well as the Orange Dataset, specifically gathered for the Smartphone inertial symbolic gesture interaction paradigm, we characterise the performance of each contribution, and demonstrate the higher novelty detection and rejection rate of our model, with protocols aiming at modelling unknown gestures and open world configurations. To summarise, the proposed SNN allows for supervised non-linear similarity metric learning, which extracts discriminative features, improving both inertial gesture classification and rejection.
Informatique; Informatique ambiante; Intelligence artificielle; Reconnaissance de gestes; Apprentissage automatique; Apprentissage métrique; Réseau de neurones artificiels; Réseau siamois; Capteur inertiel; Système micro électromécanique - MEMS; Information Technology; Ubiquitous computing; Artificial intelligence; Gesture recognition; Machine learning; Metric learning; Artificial neural network; Siamese network; Inertial sensor; MEMS - Micro Electro Mechanical System; 006.307 2
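The abstract of record 113 states that the Siamese network is trained so that output features are collinear for similar gestures and orthogonal for dissimilar ones. A minimal sketch of a cosine-target pair loss with that property follows. It is an illustration only, not the thesis's actual objective: the function names `cosine` and `cosine_target_loss` are hypothetical, the squared-error form is an assumption, and the regularisation and polar-sine extensions mentioned in the abstract are not modelled.

```python
import math
from typing import Sequence


def cosine(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine of the angle between two non-zero output vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def cosine_target_loss(u: Sequence[float], v: Sequence[float], similar: bool) -> float:
    """Squared error between the cosine and its target (assumed form).

    Target is 1 for a similar pair (collinear outputs) and 0 for a
    dissimilar pair (orthogonal outputs), so no margin or threshold
    parameter on dissimilar outputs is needed.
    """
    target = 1.0 if similar else 0.0
    return (target - cosine(u, v)) ** 2
```

With this form, a similar pair whose outputs point in the same direction and a dissimilar pair whose outputs are at right angles both incur zero loss, regardless of vector length.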
114	Sistema de visão computacional para detecção do uso de telefones celulares ao dirigir / A computer vision system for detecting use of mobile phones while driving Berri, Rafael Alceste 21 February 2014 Coordenação de Aperfeiçoamento de Pessoal de Nível Superior / In this work, three systems are proposed that use a front-facing camera to monitor the driver and identify whether a cell phone is being used while driving. It is estimated that 80% of crashes and 65% of near collisions involved drivers who were inattentive to traffic for three seconds before the event. Five videos were recorded in a real environment to test the systems. The pattern recognition system (RP) uses adaptive skin segmentation, feature extraction, and machine learning to detect cell phone usage in each frame. Cell phone use is detected when, within a 3-second period, 60% (the threshold) or more of the frames are individually classified as cell phone use. The average accuracy achieved on the videos was 87.25% with a Multilayer Perceptron (MLP) using a Gaussian activation function and two neurons in the intermediate layer. The movement detection system (DM) uses optical flow, filtering of the most relevant movements in the scene, and three successive frames to detect the movements of bringing the phone to the ear and taking it away. The DM proposal did not prove to be an effective solution for detecting cell phone use, reaching an accuracy of 52.86%. The third solution is a hybrid system. It uses the RP system for classification and the DM system for choosing the RP parameters. The parameters chosen for RP are the threshold and the classifier. These two parameters are set at the end of each period, based on the movement detected by the DM.
Experimentally, it was established that when the detected movement suggests cell phone use, a threshold of 60% and an MLP/Gaussian classifier with seven neurons in the intermediate layer are appropriate; otherwise, a threshold of 85% and an MLP/Gaussian classifier with two neurons in the intermediate layer are used. The hybrid solution is the most robust system, with an average accuracy of 91.68% in a real environment. / Neste trabalho, são desenvolvidas três propostas de sistemas que permitem identificar o uso de celular, durante o ato de dirigir um veículo, utilizando imagens capturadas de uma câmera posicionada em frente ao motorista. Estima-se que 80% das colisões e 65% das quase colisões envolveram motoristas que não estavam prestando a devida atenção ao trânsito por três segundos antes do evento. Cinco vídeos em ambiente real foram gerados com o intuito de testar os sistemas. A proposta de reconhecimento de padrões (RP) emprega segmentação de pele adaptativa, extração de características e aprendizado de máquina (classificador) na detecção do celular em cada quadro processado. A detecção do uso do celular ocorre quando, em períodos de 3 segundos, ao menos em 60% dos quadros (corte) são identificados com celular. A acurácia média nos vídeos alcançou 87,25% ao utilizar Perceptron Multi-camadas (MLP) com função de ativação gaussiana e dois neurônios na camada intermediária como classificador. A proposta de detecção de movimento (DM) utiliza o fluxo ótico, filtragem dos movimentos mais relevantes da cena e três quadros consecutivos para detectar os momentos de levar o celular ao ouvido e o retirá-lo. A aplicação do DM, como solução para detectar o uso do celular, não se demonstrou eficaz atingindo uma acurácia de 52,86%. A terceira proposta, uma solução híbrida, utiliza o sistema RP como classificador e o de DM como seu parametrizador. Os parâmetros escolhidos para o sistema de RP são o corte e o sistema classificador.
A definição desses dois parâmetros ocorre ao final de cada período, baseada na movimentação detectada pela DM. Com experimentações definiu-se que, caso a movimentação induza ao uso do celular, é adequado o uso do corte de 60% e o classificador MLP/Gaussiana com sete neurônios na camada intermediária, caso contrário, utiliza-se o corte de 85% e classificador MLP/Gaussiana com dois neurônios na mesma camada. A versão híbrida é a solução desenvolvida mais robusta, atingindo a melhor acurácia média de 91,68% em ambiente real.
Algoritmo genético; Aprendizado de máquina; Distração do motorista; Fluxo ótico; Máquina de vetor de suporte; Perceptron multicamada; Reconhecimento de gestos; Segmentação de pele; Telefones celulares; Visão computacional; Cell phones; Computer vision; Driver distraction; Genetic algorithm; Gesture recognition; Machine learning; Multilayer perceptron; Optical flow; Skin segmentation; Support vector machines
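The per-period decision rule in record 114 (a period is flagged when at least 60% of its frames are individually classified as phone use, and the hybrid system switches to an 85% threshold when the detected movement does not suggest phone use) can be sketched as below. This is a hedged illustration: the function names, the boolean per-frame input, and the choice to ignore a trailing partial period are assumptions, and the hybrid system's accompanying switch between classifiers is not modelled.

```python
from typing import List, Sequence


def period_flags(frame_hits: Sequence[bool],
                 frames_per_period: int,
                 threshold: float = 0.60) -> List[bool]:
    """Flag each full period whose share of positive frames meets the threshold.

    frame_hits holds one per-frame classifier decision (True = phone use).
    frames_per_period is the number of frames in a 3-second period and
    depends on the camera frame rate, which the abstract does not state.
    A trailing partial period is ignored (assumption).
    """
    flags = []
    last_start = len(frame_hits) - frames_per_period
    for start in range(0, last_start + 1, frames_per_period):
        window = frame_hits[start:start + frames_per_period]
        flags.append(sum(window) / frames_per_period >= threshold)
    return flags


def hybrid_threshold(movement_suggests_phone: bool) -> float:
    """Threshold selection reported for the hybrid system: 60% or 85%."""
    return 0.60 if movement_suggests_phone else 0.85
```

For example, with ten frames per period, a period containing six positive frames is flagged at the 60% threshold but not at the 85% threshold.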
115	Creating Good User Experience in a Hand-Gesture-Based Augmented Reality Game / Användbarhet i ett handgestbaserat AR-spel Lam, Benny, Nilsson, Jakob January 2019 The dissemination of new innovative technology requires feasibility and simplicity. The problem with marker-based augmented reality is similar to that of glove-based hand gesture recognition: both require an additional component to function. This thesis investigates the possibility of combining markerless augmented reality with appearance-based hand gesture recognition by implementing a game with good user experience. The methods employed in this research consist of a game implementation and a pre-study meant for measuring interactive accuracy and precision, and for deciding which gestures should be used in the game. A test environment was built in Unity using ARKit and the Manomotion SDK. The game itself was implemented with the same development tools, while Blender was used for creating the 3D models. The results from 15 testers showed that the pinching gesture was the most favorable one. The game was evaluated with the System Usability Scale (SUS) and received a score of 70.77 among 12 game testers, which indicates that the augmented reality game, whose interaction method is based solely on bare hands, can be quite enjoyable.
AR; Hand gestures; Bare-hand interaction; HGR; Augmented reality; usability; user experience; Manomotion; ARKit; Visual Odometry; SLAM; VO; VIO; 3D gestural interaction; gesture recognition; gesture tracking; augmented environments; Förstärkt verklighet; 3D gestinteraktion; gestigenkänning; visuell odometri; Manomotion; ARKit; Engineering and Technology; Teknik och teknologier; Interaction Technologies; Interaktionsteknik; Design; Design; Human Computer Interaction; Probability Theory and Statistics; Sannolikhetsteori och statistik; Computer Systems; Datorsystem; Software Engineering; Programvaruteknik
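Record 115 reports a System Usability Scale (SUS) score of 70.77 averaged over 12 testers. For readers unfamiliar with the scale, the standard SUS scoring rule for a single respondent can be sketched as follows. The function name `sus_score` is hypothetical, and the thesis's raw questionnaire responses are not available here, so no attempt is made to reproduce the reported figure.

```python
from typing import Sequence


def sus_score(responses: Sequence[int]) -> float:
    """Standard SUS score (0 to 100) for one respondent.

    responses holds the ten item ratings in questionnaire order, each on
    a 1 to 5 scale. Odd-numbered items contribute (rating - 1),
    even-numbered items contribute (5 - rating), and the sum is
    multiplied by 2.5.
    """
    if len(responses) != 10:
        raise ValueError("SUS requires exactly 10 item ratings")
    total = 0
    for item_number, rating in enumerate(responses, start=1):
        if not 1 <= rating <= 5:
            raise ValueError("each rating must be between 1 and 5")
        total += (rating - 1) if item_number % 2 == 1 else (5 - rating)
    return total * 2.5
```

A study-level score such as the one reported in the abstract is typically the mean of the per-respondent scores.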
