51

Advancing human pose and gesture recognition

Pfister, Tomas January 2015
This thesis presents new methods in two closely related areas of computer vision: human pose estimation, and gesture recognition in videos. In human pose estimation, we show that random forests can be used to estimate human pose in monocular videos. To this end, we propose a co-segmentation algorithm for segmenting humans out of videos, and an evaluator that predicts whether the estimated poses are correct or not. We further extend this pose estimator to new domains (with a transfer learning approach), and enhance its predictions by predicting the joint positions sequentially (rather than independently) in an image, and using temporal information in the videos (rather than predicting the poses from a single frame). Finally, we go beyond random forests, and show that convolutional neural networks can be used to estimate human pose even more accurately and efficiently. We propose two new convolutional neural network architectures, and show how optical flow can be employed in convolutional nets to further improve the predictions. In gesture recognition, we explore the idea of using weak supervision to learn gestures. We show that we can learn sign language automatically from signed TV broadcasts with subtitles by letting algorithms 'watch' the TV broadcasts and 'match' the signs with the subtitles. We further show that if even a small amount of strong supervision is available (as there is for sign language, in the form of sign language video dictionaries), this strong supervision can be combined with weak supervision to learn even better models.
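The optical-flow idea in this abstract can be pictured with a short sketch: joint heatmaps predicted in neighbouring frames are warped to the current frame along dense optical flow and then averaged, so temporal information sharpens the per-frame prediction. This is only a hedged numpy illustration of the general technique; the heatmap source, flow field, and pooling weights are stand-ins, not the architecture proposed in the thesis.

```python
import numpy as np

def warp_heatmap(heatmap, flow):
    """Warp a joint heatmap from a neighbouring frame to the current
    frame by following the dense optical flow (nearest-neighbour)."""
    h, w = heatmap.shape
    ys, xs = np.mgrid[0:h, 0:w]
    # flow[..., 0] is the x displacement, flow[..., 1] the y displacement
    xt = np.clip((xs + flow[..., 0]).round().astype(int), 0, w - 1)
    yt = np.clip((ys + flow[..., 1]).round().astype(int), 0, h - 1)
    warped = np.zeros_like(heatmap)
    # accumulate each source pixel's confidence at its flow target
    np.add.at(warped, (yt, xt), heatmap)
    return warped

def pool_heatmaps(heatmaps, flows, weights):
    """Average flow-aligned heatmaps from neighbouring frames and read
    off the pooled joint position (weights are assumed, e.g. uniform)."""
    aligned = [warp_heatmap(hm, fl) for hm, fl in zip(heatmaps, flows)]
    pooled = sum(w * a for w, a in zip(weights, aligned))
    joint_y, joint_x = np.unravel_index(pooled.argmax(), pooled.shape)
    return pooled, (joint_x, joint_y)
```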
52

Modelo abrangente e reconhecimento de gestos com as mãos livres para ambientes 3D. / Comprehensive model and gesture recognition with free hands for 3D environments.

Bernardes Júnior, João Luiz 18 November 2010
This work's main goal is to enable the recognition of free-hand gestures for interaction in 3D environments, allowing gestures to be selected, for each interaction context, from a large set of possible gestures. This large set should increase the probability of selecting gestures that already exist in the application's domain, or that have clear logical associations with the actions they command, and thus ease the learning, memorization and use of the gestures. These are important requirements for entertainment and education applications, this work's main targets. A gesture model is proposed that, based on a linguistic approach, divides gestures into three components: hand posture, hand movement, and the location where the gesture starts. By combining small numbers of each of these components, the model allows the definition of tens of thousands of gestures of different types. Recognition of gestures so modelled is implemented by a finite state machine with explicit rules that combines the recognition of each component. The machine assumes only that gestures are segmented in time by known postures, and makes no assumption about how each component is recognized, allowing its use with different algorithms and in different contexts. While this model and this state machine are the work's main contributions, the work also includes simple but novel algorithms for recognizing twelve basic movements and a large variety of postures using inexpensive equipment and little setup, as well as a modular framework for hand gesture recognition in general that can also be applied to other domains and with other algorithms. Tests with users raise several questions about this form of interaction, and show that the system satisfies the requirements set for it.
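A minimal sketch of the posture-segmented state machine idea, assuming upstream recognizers that emit per-frame posture, movement and location labels; the gesture table and label names below are invented for illustration, not taken from the thesis.

```python
# Gestures are modelled as (start posture, movement, start location) triples;
# a known posture segments gestures in time, per the abstract above.
GESTURES = {
    ("fist", "swipe_right", "near_chest"): "next_item",
    ("open_hand", "push_forward", "center"): "select",
}

class GestureFSM:
    IDLE, IN_GESTURE = range(2)

    def __init__(self):
        self.state = self.IDLE
        self.start = None  # (posture, location) at gesture onset

    def feed(self, posture, movement, location):
        """Consume one frame of component labels; return a gesture or None."""
        if self.state == self.IDLE:
            if posture is not None:  # a known posture marks gesture onset
                self.start = (posture, location)
                self.state = self.IN_GESTURE
            return None
        # inside a gesture: a recognized movement completes the triple
        if movement is not None:
            key = (self.start[0], movement, self.start[1])
            self.state = self.IDLE
            return GESTURES.get(key)
        return None
```

Because the machine consumes only component labels, the posture, movement and location recognizers behind it can be swapped freely, which is the decoupling the abstract emphasizes.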
53

Segmentação e reconhecimento de gestos em tempo real com câmeras e aceleração gráfica / Real-time segmentation and gesture recognition with cameras and graphical acceleration

Dantas, Daniel Oliveira 15 March 2010
Our aim in this work is to recognize gestures in real time using only cameras, without markers, special clothes or any other kind of sensor. The capture environment is simple to set up, requiring just two cameras and a computer. The background must be static and must contrast with the user. The absence of markers or special clothes makes locating the user's limbs harder. The motivation of this thesis is to create a virtual reality environment for goalkeeper training that helps correct errors of movement, positioning and choice of defensive technique, but the technique developed can be applied to any activity that involves gestures or body movements. Gesture recognition starts with background subtraction to detect the image region occupied by the user. Within that region, the most prominent areas are located as candidates for body extremities, that is, hands, feet and head. Each extremity found receives a label indicating the body part it is likely to represent, and a vector of extremity coordinates is generated. To classify the user's pose, this vector is compared against a set of key poses and the best match is selected. The final step is temporal classification, that is, recognition of the gesture itself. The technique is robust, working well even when the system is trained on one user and applied to another user's data.
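The pose classification step lends itself to a small sketch: the vector of extremity coordinates is matched against stored key poses by nearest neighbour, and the resulting label sequence is what the temporal classifier then consumes. The key-pose names and coordinates below are toy values, not the thesis's data.

```python
import numpy as np

def classify_pose(extremities, keyposes):
    """Return the nearest key pose for a vector of extremity coordinates.

    extremities: flat array of (x, y) pairs for head, hands and feet,
    assumed normalized to the user's bounding box so the comparison is
    tolerant to translation and scale.
    """
    best, best_dist = None, np.inf
    for name, ref in keyposes.items():
        d = np.linalg.norm(extremities - ref)
        if d < best_dist:
            best, best_dist = name, d
    return best

# toy key poses: 5 extremities -> 10 coordinates each
keyposes = {
    "arms_up":   np.array([.5, .0, .1, .1, .9, .1, .3, 1., .7, 1.]),
    "arms_wide": np.array([.5, .0, .0, .4, 1., .4, .3, 1., .7, 1.]),
}
frame = np.array([.5, .05, .12, .1, .88, .12, .3, 1., .7, 1.])
print(classify_pose(frame, keyposes))  # -> "arms_up"
```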
54

Técnica para interação com mãos em superficies planares utilizando uma câmera RGB-D / A technique for hand interaction with planar surfaces using an RGB-D camera

Weber, Henrique January 2016
Touch-based Human-Computer Interfaces (HCIs) are a widespread technology present in tablets, smartphones, and notebooks. They are a breakthrough that increases the ease of communication and at the same time reduces the need for interfaces such as mouse and keyboard. However, the interaction surface used by these systems is usually equipped with sensors that capture the movements made by the user, which makes it impossible to turn an arbitrary planar surface, such as a table, into an interaction surface. On the other hand, the popularization of commercial depth sensors since the launch of Microsoft's Kinect has increased the interest in hand gesture recognition using depth data. In this dissertation, we present a natural HCI for interaction with planar surfaces using a top-down RGB-D camera. Initially, the interaction plane is located in the 3D point cloud by a variation of the RANSAC algorithm with temporal coherence. Off-plane objects are segmented using the watershed transform based on an energy function that combines color, depth and confidence information. Skin color information is used to isolate the hand(s), and a novel 2D skeletonization process identifies the fingers that interact with the plane. Finally, the fingertips are tracked using the Hungarian algorithm, and a Kalman filter is applied to produce smoother trajectories. To demonstrate the usefulness of the technique, we developed a prototype in which the user can draw on the surface with lines and sprays in a natural and intuitive way.
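The fingertip-tracking step can be illustrated with SciPy's Hungarian solver: detections in each new frame are assigned to existing tracks by minimizing total pairwise distance, and each matched track would then be smoothed by its Kalman filter. A hedged sketch; the gating threshold is an assumed value, not the dissertation's setting.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def match_fingertips(tracks, detections, max_dist=40.0):
    """Associate tracked fingertip positions with new detections by
    solving the assignment problem on pairwise Euclidean distances
    (Hungarian algorithm). max_dist (pixels) gates implausible matches;
    unmatched detections would start new tracks."""
    if not tracks or not detections:
        return []
    cost = np.linalg.norm(
        np.asarray(tracks)[:, None, :] - np.asarray(detections)[None, :, :],
        axis=2,
    )
    rows, cols = linear_sum_assignment(cost)
    return [(r, c) for r, c in zip(rows, cols) if cost[r, c] <= max_dist]

tracks = [(100.0, 120.0), (200.0, 80.0)]
detections = [(205.0, 83.0), (101.0, 118.0)]
print(match_fingertips(tracks, detections))  # -> [(0, 1), (1, 0)]
```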
56

Real-time Hand Gesture Detection and Recognition for Human Computer Interaction

Dardas, Nasser Hasan Abdel-Qader 08 November 2012
This thesis focuses on bare hand gesture recognition, proposing a new architecture for real-time vision-based hand detection, tracking, and gesture recognition for interaction with an application via hand gestures. The first stage of our system detects and tracks a bare hand against a cluttered background using face subtraction, skin detection and contour comparison. The second stage recognizes hand gestures using bag-of-features and multi-class Support Vector Machine (SVM) algorithms. Finally, a grammar was developed to generate gesture commands for application control. Our hand gesture recognition system consists of two steps: offline training and online testing. In the training stage, after extracting keypoints from every training image with the Scale Invariant Feature Transform (SIFT), a vector quantization technique maps the keypoints of each training image into a fixed-length histogram vector (bag-of-words) after K-means clustering. This histogram is treated as the input vector for a multi-class SVM that builds the classifier. In the testing stage, for every frame captured from a webcam, the hand is detected using our algorithm; the keypoints extracted from the small image containing the detected hand posture are fed into the cluster model to map them into a bag-of-words vector, which is in turn fed into the multi-class SVM classifier to recognize the hand gesture. A second hand gesture recognition system was proposed using Principal Component Analysis (PCA): the most significant eigenvectors and the weights of the training images are determined; in the testing stage, the hand posture is detected in every frame, the small image containing the detected hand is projected onto those eigenvectors to form its test weights, and the gesture is recognized by the minimum Euclidean distance between the test weights and the training weights of each training image. Two applications of gesture-based interaction with a 3D gaming virtual environment were implemented. The exertion videogame uses a stationary bicycle as one of the main inputs for game playing, with the user controlling left-right movement and shooting actions through a set of hand gesture commands; in the second game, the user steers a helicopter over the city with hand gesture commands.
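A hedged sketch of the described bag-of-words training/testing pipeline using OpenCV SIFT and scikit-learn. The vocabulary size, SVM kernel and data handling are assumptions, not the thesis's settings, and the sketch assumes every training crop yields SIFT descriptors.

```python
import cv2
import numpy as np
from sklearn.cluster import KMeans
from sklearn.svm import SVC

K = 100  # assumed visual-vocabulary size
sift = cv2.SIFT_create()

def bow_histogram(gray_crop, kmeans):
    """Map one image's SIFT descriptors to a fixed-length word histogram."""
    _, desc = sift.detectAndCompute(gray_crop, None)
    if desc is None:
        return np.zeros(K)
    words = kmeans.predict(desc.astype(np.float64))
    hist = np.bincount(words, minlength=K).astype(float)
    return hist / max(hist.sum(), 1.0)  # normalize: keypoint counts vary

def train(images, labels):
    """images: grayscale hand-posture crops; labels: gesture class ids."""
    all_desc = np.vstack(
        [sift.detectAndCompute(im, None)[1] for im in images]
    ).astype(np.float64)
    kmeans = KMeans(n_clusters=K, n_init=4).fit(all_desc)
    X = np.array([bow_histogram(im, kmeans) for im in images])
    return kmeans, SVC(kernel="rbf").fit(X, labels)

def predict(gray_crop, kmeans, clf):
    """Classify one detected-hand crop from a webcam frame."""
    return clf.predict([bow_histogram(gray_crop, kmeans)])[0]
```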
57

Einsatz der elektronischen Patientenakte im Operationssaal am Beispiel der HNO-Chirurgie / Use of the electronic patient record in the operating room, exemplified by ENT surgery

Dressler, Christian 04 June 2013
When a surgeon needs information from the patient record during an operation today, he is forced either to break sterility or to instruct staff to make the relevant information accessible to him. From a technical point of view, a system for intraoperative control and display is very simple to realize; its basis is an electronic patient record (EPR) that manages, for example, software-generated or scanned documents. This thesis addresses the following questions: Is such a system used meaningfully in the operating room? Which methods of sterile operation are suitable? How must the graphical presentation be adapted to the operating room? Can all available patient data be accessed by implementing current communication standards? To this end, two pilot studies were carried out in an outpatient ENT clinic. The first study evaluated the first commercial product on the market, "MI-Report" by Karl Storz, which is operated via gesture recognition. For the second study, an EPR system (Doc-O-R) was developed that pre-selects the displayed documents depending on the procedure and can be operated with a foot switch. Around 50 procedures were documented per system, logging every document viewed and the reason for its use. The systems were used more than once per procedure on average, and the automatic pre-selection of documents to reduce the number of interactions showed very good results. Since the topic is still in its infancy, the thesis closes with the many possibilities that remain to be realized with respect to novel display methods, input devices and current standardization activities, which will also influence surgical workflows in the future.
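The procedure-dependent pre-selection can be pictured as a simple rule table. Doc-O-R's actual selection logic is not described in the abstract, so the procedure names, document types and ordering below are purely hypothetical.

```python
# Hypothetical rule table mapping a procedure to the record documents a
# surgeon is most likely to need intraoperatively; illustrative only.
PRESELECTION = {
    "tympanoplasty": ["audiogram", "ct_temporal_bone", "surgical_consent"],
    "septoplasty":   ["ct_paranasal_sinuses", "anesthesia_note"],
}

def preselect(procedure, record):
    """Return the patient's documents ordered so a foot switch steps
    through the most relevant ones first; nothing becomes unreachable."""
    wanted = PRESELECTION.get(procedure, [])
    ranked = [d for t in wanted for d in record if d["type"] == t]
    rest = [d for d in record if d not in ranked]
    return ranked + rest

record = [{"type": "audiogram", "id": 1}, {"type": "lab_report", "id": 2}]
print([d["id"] for d in preselect("tympanoplasty", record)])  # -> [1, 2]
```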
58

Hand Gesture Recognition System

Gingir, Emrah 01 September 2010
This thesis presents a hand gesture recognition system that replaces input devices like keyboard and mouse with static and dynamic hand gestures for interactive computer applications. Despite the growing attention such systems receive, the literature still imposes constraints: specific lighting conditions, a particular camera, a multi-colored glove worn by the user, or large amounts of training data. The system described in this study removes these restrictions and provides an adaptive, effort-free environment for the user. The study starts with an analysis of the performance of different color spaces for skin color extraction; this analysis is independent of the working system and was performed to obtain useful information about the color spaces. The working system is based on two steps, namely hand detection and hand gesture recognition. In the hand detection process, a normalized RGB color space skin locus is used to threshold the coarse skin pixels in the image. Then an adaptive skin locus, whose varying boundaries are estimated from the coarse skin region pixels, segments the distinct skin color in the image under the current conditions. Since the face has a distinctive shape, it is detected among the connected groups of skin pixels by shape analysis, and the non-face groups of skin pixels are taken to be hands. The hand gesture is recognized by an improved centroidal profile method applied around the detected hand. Using the theoretical background of this study, a 3D flight war game, a boxing game and a media player, all controlled remotely by static and dynamic hand gestures, were developed as human-machine interface applications. In the experiments, recorded videos were used to measure the performance of the system, and a correct recognition rate of about 90% was achieved with near real-time computation.
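The coarse skin-thresholding step works on chromaticity rather than raw RGB, which gives some robustness to illumination intensity. A minimal sketch follows; the locus bounds are typical literature values assumed for illustration, not the thesis's adaptively estimated boundaries.

```python
import numpy as np

def skin_mask(rgb, r_lo=0.36, r_hi=0.465, g_lo=0.28, g_hi=0.363):
    """Coarse skin segmentation in normalized RGB (chromaticity) space.

    rgb: H x W x 3 image array. Dividing each channel by the per-pixel
    intensity removes brightness, so only the color proportions are
    thresholded; the box bounds here are assumed, generic values.
    """
    rgb = rgb.astype(np.float64)
    s = rgb.sum(axis=2) + 1e-6       # per-pixel intensity
    r = rgb[..., 0] / s              # normalized red
    g = rgb[..., 1] / s              # normalized green
    return (r > r_lo) & (r < r_hi) & (g > g_lo) & (g < g_hi)
```

Connected components of this mask would then be split into face and hands by the shape analysis the abstract describes.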
59

A Generic Gesture Recognition Approach based on Visual Perception

Hu, Gang 22 June 2012
Current developments in hardware have allowed computer vision technologies to analyze complex human activities in real time. High-quality algorithms for human activity interpretation are required by many emerging applications, such as patient behavior analysis, surveillance, gesture-controlled video games, and other human-computer interface systems. Despite great efforts over the past decades, it is still challenging to provide a generic gesture recognition solution that can facilitate the development of different gesture-based applications. Human vision perceives scenes continuously, recognizes objects and grasps motion semantics effortlessly. Neuroscientists and psychologists have tried to understand and explain how the visual system works, and theories and hypotheses on visual perception such as visual attention and the Gestalt laws of perceptual organization (PO) have been established and shed light on the fundamental mechanisms of human visual perception. In this dissertation, inspired by those visual attention models, we model and integrate important visual perception findings into a generic gesture recognition framework, the fundamental component of full-tier human activity understanding tasks. Our approach handles challenging tasks by: (1) organizing the complex visual information into a hierarchical structure comprising low-level feature, object (human body), and 4D spatiotemporal layers; (2) extracting bottom-up shape-based visual salience entities at each layer according to PO grouping laws; (3) building shape-based hierarchical salience maps for high-level visual feature selection by manipulating the attention conditions of top-down knowledge about gestures and body structures; and (4) modeling gesture representations by a set of perceptual gesture salience entities (PGSEs) that provide qualitative gesture descriptions in 4D space for recognition tasks. Unlike existing approaches, our gesture representation method encodes both extrinsic and intrinsic properties and reflects the way humans perceive the visual world, reducing semantic gaps. Experimental results show our approach outperforms the others and has great potential in real-time applications.
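As a concrete example of a bottom-up salience map of the kind such frameworks build on, the spectral residual method (Hou & Zhang, 2007) computes salience from the "novel" part of an image's log amplitude spectrum. This is a stand-in for a generic bottom-up attention stage, not the author's shape-based hierarchical salience maps.

```python
import numpy as np
from scipy.ndimage import uniform_filter, gaussian_filter

def spectral_residual_saliency(gray):
    """Bottom-up saliency via the spectral residual method.

    gray: 2D grayscale image array. The residual is the log amplitude
    spectrum minus its local average; reconstructing with the original
    phase highlights statistically unexpected (salient) structure.
    log1p is used instead of log to avoid log(0), a minor variation."""
    f = np.fft.fft2(gray.astype(np.float64))
    log_amp = np.log1p(np.abs(f))
    phase = np.angle(f)
    residual = log_amp - uniform_filter(log_amp, size=3)
    sal = np.abs(np.fft.ifft2(np.exp(residual + 1j * phase))) ** 2
    sal = gaussian_filter(sal, sigma=2.5)
    return (sal - sal.min()) / (np.ptp(sal) + 1e-9)  # normalize to [0, 1]
```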
60

Usability Analysis in Locomotion Interface for Human Computer Interaction System Design

Farhadi-Niaki, Farzin 09 January 2019
In the past decade, more than ever before, new technologies have been broadly applied in various fields of interaction between human and machine. Despite many functionality studies, how such technologies should be evaluated within the context of human-computer interaction research remains unclear. This research proposes a mechanism to evaluate and predict the design of user interfaces and their interacting components. At the first level of analysis, an original concept extracts the usability results of components, such as effectiveness, efficiency, adjusted satisfaction, and overall acceptability, for comparison in the fields of interest. At the second level of analysis, another original concept defines new metrics based on the level of complexity in the interactions between input modality and the feedback of performing a task, framed in terms of classical solid mechanics. From these results, a set of hypotheses is tested to determine whether common satisfaction criteria can be predicted from their correlations with the components of performance, complexity, and overall acceptability. In the context of this research, three multimodal applications were implemented and experimentally tested to study the quality of interactions through the proposed hypotheses: (a) full-body gestures vs. mouse/keyboard in a Box game; (b) arm/hand gestures vs. a three-dimensional haptic controller in a Slingshot game; and (c) hand/finger gestures vs. mouse/keyboard in a Race game. Their graphical user interfaces were designed to cover a range of static/dynamic gestures, pulse/continuous touch-based controls, and discrete/analog measured tasks, quantified by a new metric, termed the index of complexity, which represents a concept of effort in the domain of locomotion interaction. Single and compound devices are also defined and studied to evaluate the effect of the user's attention in multi-tasking interactions. The proposed method of usability investigation is meant to assist human-computer interface developers in reaching proper overall acceptability, performance, and effort-based analyses before finalizing their user interface designs.
