91

Projeto de um módulo de aquisição e pré-processamento de imagem colorida baseado em computação reconfigurável e aplicado a robôs móveis / A project of a module for acquisition and color image pre-processing based on reconfigurable computation and applied to mobile robots

Vanderlei Bonato 14 May 2004 (has links)
Este trabalho propõe um módulo básico de aquisição e pré-processamento de imagem colorida aplicado a robôs móveis, implementado em hardware reconfigurável, dentro do conceito de sistemas SoC (System-on-a-Chip). O módulo básico é apresentado em conjunto com funções mais específicas de pré-processamento de imagem, que são utilizadas como base para a verificação das funcionalidades implementadas no trabalho proposto. As principais funções realizadas pelo módulo básico são: montagem de frames a partir dos pixels obtidos da câmera digital CMOS, controle dos diversos parâmetros de configuração da câmera e conversão de padrões de cores. Já as funções mais específicas abordam as etapas de segmentação, centralização, redução e interpretação das imagens adquiridas. O tipo de dispositivo reconfigurável utilizado neste trabalho é o FPGA (Field-Programmable Gate Array), que permite maior adequação das funções específicas às necessidades das aplicações, tendo sempre como base o módulo proposto. O sistema foi aplicado para reconhecer gestos e obteve a taxa 99,57% de acerto operando a 31,88 frames por segundo. / This work proposes a basic module for color image acquisition and pre-processing on mobile robots, implemented in reconfigurable hardware under the SoC (System-on-a-Chip) concept. The basic module is presented together with more specific image pre-processing functions, which are used as a basis for verifying the functionality implemented in this work. The main functions performed by the basic module are: assembling frames from the pixels delivered by the CMOS digital camera, controlling the camera's various configuration parameters, and converting between color spaces. The more specific functions cover the segmentation, centering, reduction, and interpretation of the acquired images. The reconfigurable device used in this work is an FPGA (Field-Programmable Gate Array), which allows the specific functions to be adapted to the needs of each application, always building on the proposed module. The system was applied to gesture recognition and achieved a 99.57% recognition rate while operating at 31.88 frames per second.
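The thesis implements this pipeline in FPGA hardware; as a rough software analogue, the sketch below shows two of the pre-processing steps named in the abstract, color-space conversion and color-based segmentation, in NumPy. The BT.601 conversion is standard, but the choice of YCbCr and the chroma thresholds are illustrative assumptions, not parameters taken from the thesis.

```python
import numpy as np

def rgb_to_ycbcr(frame: np.ndarray) -> np.ndarray:
    """Convert an HxWx3 uint8 RGB frame to YCbCr (ITU-R BT.601)."""
    rgb = frame.astype(np.float32)
    y  =  0.299 * rgb[..., 0] + 0.587 * rgb[..., 1] + 0.114 * rgb[..., 2]
    cb = 128.0 - 0.168736 * rgb[..., 0] - 0.331264 * rgb[..., 1] + 0.5 * rgb[..., 2]
    cr = 128.0 + 0.5 * rgb[..., 0] - 0.418688 * rgb[..., 1] - 0.081312 * rgb[..., 2]
    return np.clip(np.stack([y, cb, cr], axis=-1), 0, 255).astype(np.uint8)

def segment_color(ycbcr: np.ndarray, cb_range=(77, 127), cr_range=(133, 173)) -> np.ndarray:
    """Binary mask of pixels whose chroma falls in the given ranges.
    The default ranges are a common skin-tone heuristic, used here purely
    as a placeholder for the color classes the thesis actually segments."""
    cb, cr = ycbcr[..., 1], ycbcr[..., 2]
    return ((cb >= cb_range[0]) & (cb <= cb_range[1]) &
            (cr >= cr_range[0]) & (cr <= cr_range[1]))

# Example on a synthetic 64x64 frame standing in for a CMOS camera frame
frame = np.random.randint(0, 256, size=(64, 64, 3), dtype=np.uint8)
mask = segment_color(rgb_to_ycbcr(frame))
print(mask.mean())  # fraction of pixels matching the color class
```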
92

Contributions to Pen & Touch Human-Computer Interaction

Martín-Albo Simón, Daniel 01 September 2016 (has links)
[EN] Computers are now present everywhere, but their potential is not fully exploited due to some lack of acceptance. In this thesis, the pen computer paradigm is adopted, whose main idea is to replace all input devices by a pen and/or the fingers, given that the origin of the rejection comes from using unfriendly interaction devices that must be replaced by something easier for the user. This paradigm, which was proposed several years ago, has only recently been fully implemented in products such as smartphones. But computers are actual illiterates that do not understand gestures or handwriting, so a recognition step is required to "translate" the meaning of these interactions into computer-understandable language. For this input modality to be actually usable, its recognition accuracy must be high enough. In order to realistically think about the broader deployment of pen computing, it is necessary to improve the accuracy of handwriting and gesture recognizers. This thesis is devoted to studying different approaches to improve the recognition accuracy of those systems. First, we investigate how to take advantage of interaction-derived information to improve the accuracy of the recognizer. In particular, we focus on interactive transcription of text images. Here the system initially proposes an automatic transcript. If necessary, the user can make some corrections, implicitly validating a correct part of the transcript. The system must then take this validated prefix into account to suggest a suitable new hypothesis. Given that in such an application the user is constantly interacting with the system, it makes sense to adapt this interactive application to be used on a pen computer. User corrections are provided by means of pen strokes, and it is therefore necessary to introduce a recognizer in charge of decoding this kind of nondeterministic user feedback. However, this recognizer's performance can be boosted by taking advantage of interaction-derived information, such as the user-validated prefix. Then, this thesis focuses on the study of human movements, in particular hand movements, from a generation point of view, by tapping into the kinematic theory of rapid human movements and the Sigma-Lognormal model. Understanding how the human body generates movements, and particularly understanding the origin of human movement variability, is important in the development of a recognition system. The contribution of this thesis to this topic is significant, since a new technique (which improves on previous results) for extracting the Sigma-Lognormal model parameters is presented. Closely related to the previous work, this thesis studies the benefits of using synthetic data for training. The easiest way to train a recognizer is to provide "infinite" data representing all possible variations. In general, the more training data, the smaller the error. But it is usually not possible to increase the size of a training set indefinitely: recruiting participants, data collection, labeling, etc., can be time-consuming and expensive. One way to overcome this problem is to create and use synthetically generated data that look like human-produced data. We study how to create such synthetic data and explore different approaches to using them, both for handwriting and for gesture recognition. The different contributions of this thesis have obtained good results, producing several publications in international conferences and journals.
Finally, three applications related to the work of this thesis are presented. First, we created Escritorie, a digital desk prototype based on the pen computer paradigm for transcribing handwritten text images. Second, we developed "Gestures à Go Go", a web application for bootstrapping gestures. Finally, we studied another interactive application under the pen computer paradigm. In this case, we study how translation reviewing can be done more ergonomically using a pen. / [ES] Hoy en día, los ordenadores están presentes en todas partes pero su potencial no se aprovecha debido al "miedo" que se les tiene. En esta tesis se adopta el paradigma del pen computer, cuya idea fundamental es sustituir todos los dispositivos de entrada por un lápiz electrónico o, directamente, por los dedos. El origen del rechazo a los ordenadores proviene del uso de interfaces poco amigables para el humano. El origen de este paradigma data de hace más de 40 años, pero solo recientemente se ha comenzado a implementar en dispositivos móviles. La lenta y tardía implantación probablemente se deba a que es necesario incluir un reconocedor que "traduzca" los trazos del usuario (texto manuscrito o gestos) a algo entendible por el ordenador. Para pensar de forma realista en la implantación del pen computer, es necesario mejorar la precisión del reconocimiento de texto y gestos. El objetivo de esta tesis es el estudio de diferentes estrategias para mejorar esta precisión. En primer lugar, esta tesis investiga como aprovechar información derivada de la interacción para mejorar el reconocimiento, en concreto, en la transcripción interactiva de imágenes con texto manuscrito. En la transcripción interactiva, el sistema y el usuario trabajan "codo con codo" para generar la transcripción. El usuario valida la salida del sistema proporcionando ciertas correcciones, mediante texto manuscrito, que el sistema debe tener en cuenta para proporcionar una mejor transcripción. Este texto manuscrito debe ser reconocido para ser utilizado. En esta tesis se propone aprovechar información contextual, como por ejemplo, el prefijo validado por el usuario, para mejorar la calidad del reconocimiento de la interacción. Tras esto, la tesis se centra en el estudio del movimiento humano, en particular del movimiento de las manos, utilizando la Teoría Cinemática y su modelo Sigma-Lognormal. Entender como se mueven las manos al escribir, y en particular, entender el origen de la variabilidad de la escritura, es importante para el desarrollo de un sistema de reconocimiento, La contribución de esta tesis a este tópico es importante, dado que se presenta una nueva técnica (que mejora los resultados previos) para extraer el modelo Sigma-Lognormal de trazos manuscritos. De forma muy relacionada con el trabajo anterior, se estudia el beneficio de utilizar datos sintéticos como entrenamiento. La forma más fácil de entrenar un reconocedor es proporcionar un conjunto de datos "infinito" que representen todas las posibles variaciones. En general, cuanto más datos de entrenamiento, menor será el error del reconocedor. No obstante, muchas veces no es posible proporcionar más datos, o hacerlo es muy caro. Por ello, se ha estudiado como crear y usar datos sintéticos que se parezcan a los reales. Las diferentes contribuciones de esta tesis han obtenido buenos resultados, produciendo varias publicaciones en conferencias internacionales y revistas. Finalmente, también se han explorado tres aplicaciones relaciones con el trabajo de esta tesis. 
En primer lugar, se ha creado Escritorie, un prototipo de mesa digital basada en el paradigma del pen computer para realizar transcripción interactiva de documentos manuscritos. En segundo lugar, se ha desarrollado "Gestures à Go Go", una aplicación web para generar datos sintéticos y empaquetarlos con un reconocedor de forma rápida y sencilla. Por último, se presenta un sistema interactivo real bajo el paradigma del pen computer. En este caso, se estudia como la revisión de traducciones automáticas se puede realizar de forma más ergonómica. / [CAT] Avui en dia, els ordinadors són presents a tot arreu i es comunament acceptat que la seva utilització proporciona beneficis. No obstant això, moltes vegades el seu potencial no s'aprofita totalment. En aquesta tesi s'adopta el paradigma del pen computer, on la idea fonamental és substituir tots els dispositius d'entrada per un llapis electrònic, o, directament, pels dits. Aquest paradigma postula que l'origen del rebuig als ordinadors prové de l'ús d'interfícies poc amigables per a l'humà, que han de ser substituïdes per alguna cosa més coneguda. Per tant, la interacció amb l'ordinador sota aquest paradigma es realitza per mitjà de text manuscrit i/o gestos. L'origen d'aquest paradigma data de fa més de 40 anys, però només recentment s'ha començat a implementar en dispositius mòbils. La lenta i tardana implantació probablement es degui al fet que és necessari incloure un reconeixedor que "tradueixi" els traços de l'usuari (text manuscrit o gestos) a alguna cosa comprensible per l'ordinador, i el resultat d'aquest reconeixement, actualment, és lluny de ser òptim. Per pensar de forma realista en la implantació del pen computer, cal millorar la precisió del reconeixement de text i gestos. L'objectiu d'aquesta tesi és l'estudi de diferents estratègies per millorar aquesta precisió. En primer lloc, aquesta tesi investiga com aprofitar informació derivada de la interacció per millorar el reconeixement, en concret, en la transcripció interactiva d'imatges amb text manuscrit. En la transcripció interactiva, el sistema i l'usuari treballen "braç a braç" per generar la transcripció. L'usuari valida la sortida del sistema donant certes correccions, que el sistema ha d'usar per millorar la transcripció. En aquesta tesi es proposa utilitzar correccions manuscrites, que el sistema ha de reconèixer primer. La qualitat del reconeixement d'aquesta interacció és millorada, tenint en compte informació contextual, com per exemple, el prefix validat per l'usuari. Després d'això, la tesi se centra en l'estudi del moviment humà en particular del moviment de les mans, des del punt de vista generatiu, utilitzant la Teoria Cinemàtica i el model Sigma-Lognormal. Entendre com es mouen les mans en escriure és important per al desenvolupament d'un sistema de reconeixement, en particular, per entendre l'origen de la variabilitat de l'escriptura. La contribució d'aquesta tesi a aquest tòpic és important, atès que es presenta una nova tècnica (que millora els resultats previs) per extreure el model Sigma- Lognormal de traços manuscrits. De forma molt relacionada amb el treball anterior, s'estudia el benefici d'utilitzar dades sintètiques per a l'entrenament. La forma més fàcil d'entrenar un reconeixedor és proporcionar un conjunt de dades "infinit" que representin totes les possibles variacions. En general, com més dades d'entrenament, menor serà l'error del reconeixedor. No obstant això, moltes vegades no és possible proporcionar més dades, o fer-ho és molt car. 
Per això, s'ha estudiat com crear i utilitzar dades sintètiques que s'assemblin a les reals. Les diferents contribucions d'aquesta tesi han obtingut bons resultats, produint diverses publicacions en conferències internacionals i revistes. Finalment, també s'han explorat tres aplicacions relacionades amb el treball d'aquesta tesi. En primer lloc, s'ha creat Escritorie, un prototip de taula digital basada en el paradigma del pen computer per realitzar transcripció interactiva de documents manuscrits. En segon lloc, s'ha desenvolupat "Gestures à Go Go", una aplicació web per a generar dades sintètiques i empaquetar-les amb un reconeixedor de forma ràpida i senzilla. Finalment, es presenta un altre sistema inter- actiu sota el paradigma del pen computer. En aquest cas, s'estudia com la revisió de traduccions automàtiques es pot realitzar de forma més ergonòmica. / Martín-Albo Simón, D. (2016). Contributions to Pen & Touch Human-Computer Interaction [Tesis doctoral no publicada]. Universitat Politècnica de València. https://doi.org/10.4995/Thesis/10251/68482 / TESIS
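The Sigma-Lognormal model referenced in this abstract describes the speed of a rapid hand movement as a sum of lognormal pulses, one per stroke: for a stroke with amplitude D, activation time t0, and log-time parameters mu and sigma, |v(t)| = D / (sigma * sqrt(2*pi) * (t - t0)) * exp(-(ln(t - t0) - mu)^2 / (2 * sigma^2)) for t > t0. Below is a minimal sketch of this profile with made-up parameters; synthetic training data of the kind the thesis discusses can be produced by perturbing such parameters.

```python
import numpy as np

def lognormal_pulse(t, D, t0, mu, sigma):
    """Speed profile of one lognormal stroke: D scales the amplitude,
    t0 is the activation time, mu/sigma are the log-time delay and spread."""
    v = np.zeros_like(t)
    valid = t > t0
    dt = t[valid] - t0
    v[valid] = (D / (sigma * np.sqrt(2 * np.pi) * dt)
                * np.exp(-(np.log(dt) - mu) ** 2 / (2 * sigma ** 2)))
    return v

t = np.linspace(0.0, 1.5, 500)
# Two overlapping strokes -- parameter values are invented for illustration.
speed = (lognormal_pulse(t, D=5.0, t0=0.05, mu=-1.6, sigma=0.30)
         + lognormal_pulse(t, D=3.0, t0=0.35, mu=-1.4, sigma=0.25))
print(float(speed.max()))  # peak speed of the synthetic movement
```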
93

Deep-learning for high dimensional sequential observations : application to continuous gesture recognition / Modélisation par réseaux de neurones profonds pour l'apprentissage continu d'objets et de gestes par un robot

Granger, Nicolas 10 January 2019 (has links)
Cette thèse a pour but de contribuer à améliorer les interfaces Homme-machine. En particulier, nos appareils devraient répliquer notre capacité à traiter continûment des flux d'information. Cependant, le domaine de l'apprentissage statistique dédié à la reconnaissance de séries temporelles pose de multiples défis. Nos travaux utilisent la reconnaissance de gestes comme exemple applicatif, ces données offrent un mélange complexe de poses corporelles et de mouvements, encodées sous des modalités très variées. La première partie de notre travail compare deux modèles temporels de l'état de l'art pour la reconnaissance continue sur des séquences, plus précisément l'hybride réseau de neurones -- modèle de Markov caché (NN-HMM) et les réseaux de neurones récurrents bidirectionnels (BD-RNN) avec des unités commandées par des portes. Pour ce faire, nous avons implémenté un environnement de test partagé qui est plus favorable à une étude comparative équitable. Nous proposons des ajustements sur les fonctions de coût utilisées pour entraîner les réseaux de neurones et sur les expressions du modèle hybride afin de gérer un large déséquilibre des classes de notre base d'apprentissage. Bien que les publications récentes semblent privilégier l'architecture BD-RNN, nous démontrons que l'hybride NN-HMM demeure compétitif. Cependant, ce dernier est plus dépendant de son modèle d'entrées pour modéliser les phénomènes temporels à court terme. Enfin, nous montrons que les facteurs de variations appris sur les entrées par les deux modèles sont inter-compatibles. Dans un second temps, nous présentons une étude de l'apprentissage dit «en un coup» appliqué aux gestes. Ce paradigme d'apprentissage gagne en attention mais demeure peu abordé dans le cas de séries temporelles. Nous proposons une architecture construite autour d'un réseau de neurones bidirectionnel. Son efficacité est démontrée par la reconnaissance de gestes isolés issus d'un dictionnaire de langage des signes. À partir de ce modèle de référence, nous proposons de multiples améliorations inspirées par des travaux dans des domaines connexes, et nous étudions les avantages ou inconvénients de chacun / This thesis aims to improve the intuitiveness of human-computer interfaces. In particular, machines should try to replicate humans' ability to process streams of information continuously. However, the sub-domain of machine learning dedicated to recognition on time series still faces numerous challenges. Our studies use gesture recognition as an exemplar application; gestures intermix static body poses and movements in a complex manner, encoded in widely different modalities. The first part of our work compares two state-of-the-art temporal models for continuous sequence recognition, namely the hybrid Neural Network-Hidden Markov Model (NN-HMM) and Bidirectional Recurrent Neural Networks (BDRNN) with gated units. To do so, we reimplemented the two within a shared test bed that is more amenable to a fair comparison. We propose adjustments to the neural network training losses and to the hybrid NN-HMM expressions to accommodate highly imbalanced data classes. Although recent publications tend to prefer BDRNNs, we demonstrate that the hybrid NN-HMM remains competitive. However, the latter relies significantly on its input layers to model short-term patterns. Finally, we show that the input representations learned via both approaches are largely inter-compatible.
The second part of our work studies one-shot learning, which has so far received relatively little attention, in particular for sequential inputs such as gestures. We propose a model built around a Bidirectional Recurrent Neural Network. Its effectiveness is demonstrated on the recognition of isolated gestures from a sign language lexicon. We propose several improvements over this baseline, drawing inspiration from related work, and evaluate their performance, exhibiting the different advantages and disadvantages of each.
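As a sketch of the kind of class-imbalance adjustment the abstract mentions for neural network training losses, here is an inverse-frequency-weighted cross-entropy wrapped around a small bidirectional recurrent tagger in PyTorch. The architecture sizes, class counts, and data are invented for illustration; the thesis's actual models and loss expressions may differ.

```python
import torch
import torch.nn as nn

class BiGRUTagger(nn.Module):
    """Framewise gesture labeller: one class score per time step."""
    def __init__(self, n_features, n_classes, hidden=64):
        super().__init__()
        self.rnn = nn.GRU(n_features, hidden, batch_first=True,
                          bidirectional=True)
        self.head = nn.Linear(2 * hidden, n_classes)

    def forward(self, x):                  # x: (batch, time, features)
        h, _ = self.rnn(x)
        return self.head(h)                # (batch, time, classes)

# Inverse-frequency weights counter the dominance of the "no gesture" class.
class_counts = torch.tensor([9000., 300., 350., 350.])   # illustrative counts
weights = class_counts.sum() / (len(class_counts) * class_counts)
criterion = nn.CrossEntropyLoss(weight=weights)

model = BiGRUTagger(n_features=30, n_classes=4)
x = torch.randn(8, 120, 30)                # 8 sequences of 120 frames
y = torch.randint(0, 4, (8, 120))          # framewise labels
loss = criterion(model(x).reshape(-1, 4), y.reshape(-1))
loss.backward()
print(float(loss))
```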
94

FAZT: FEW AND ZERO-SHOT FRAMEWORK TO LEARN TEMPO-VISUAL EVENTS FROM LITTLE OR NO DATA

Naveen Madapana (11613925) 20 December 2021 (has links)
Supervised classification methods based on deep learning have achieved great success in many domains and tasks that were previously unimaginable. Such approaches build on learning paradigms that require hundreds of examples in order to learn to classify objects or events. Thus, their immediate application to domains with few or no observations is limited. This is because of the lack of ability to rapidly generalize to new categories from a few examples or from high-level descriptions of categories. This can be attributed to the significant gap between the way machines represent knowledge and the way humans represent categories in their minds and learn to recognize them. In this context, this research represents categories as semantic trees in a high-level attribute space and proposes an approach to utilize these representations to conduct N-Shot, Few-Shot, One-Shot, and Zero-Shot Learning (ZSL). This work refers to this paradigm as the general classification problem (GCP) and proposes a unified framework for GCP referred to as the Few and Zero-Shot Technique (FAZT). The FAZT framework is an end-to-end approach that uses trainable 3D convolutional neural networks and recurrent neural networks to simultaneously optimize for both the semantic and the classification tasks. Lastly, the problem of systematically obtaining semantic attributes by utilizing domain-specific ontologies is presented. The proposed framework is validated in the domains of hand gesture and action/activity recognition; however, this research can be applied to other domains such as video understanding, the study of human behavior, emotion recognition, etc. First, an attribute-based dataset for gestures is developed in a systematic manner by relying on the literature on gestures and semantics, and on crowdsourced platforms such as Amazon Mechanical Turk. To the best of our knowledge, this is the first ZSL dataset for hand gestures (ZSGL dataset). Next, our framework is evaluated in two experimental conditions: 1. within-category (to test the attribute recognition power) and 2. across-category (to test the ability to recognize an unknown category). In addition, we conducted experiments in zero-shot, one-shot, few-shot, and continuous learning conditions in both open-set and closed-set scenarios. Results showed that our framework performs favorably on the ZSGL, Kinetics, UIUC Action, UCF101, and HMDB51 action datasets in all the experimental conditions.
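The zero-shot decision rule underlying attribute-based frameworks like the one described can be sketched as: predict an attribute vector from the video, then assign the class whose attribute signature is nearest, including classes never seen in training. The attributes, prototypes, and scores below are invented placeholders for the framework's actual CNN+RNN outputs.

```python
import numpy as np

def zero_shot_predict(attr_pred, class_prototypes):
    """Assign the class whose attribute signature is closest (cosine)."""
    names = list(class_prototypes)
    P = np.stack([class_prototypes[n] for n in names])
    P = P / np.linalg.norm(P, axis=1, keepdims=True)
    a = attr_pred / np.linalg.norm(attr_pred)
    return names[int(np.argmax(P @ a))]

# Illustrative binary attribute signatures (e.g. "one hand", "circular path", ...)
prototypes = {
    "swipe_left":  np.array([1., 0., 1., 0.]),
    "swipe_right": np.array([1., 0., 0., 1.]),
    "circle":      np.array([1., 1., 0., 0.]),   # a class unseen in training
}
attr_from_video = np.array([0.9, 0.8, 0.1, 0.2])  # stand-in for the model output
print(zero_shot_predict(attr_from_video, prototypes))  # -> "circle"
```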
95

Optické metody rozeznání gest / Optical methods of gesture recognition

Netopil, Jan January 2016 (has links)
This thesis deals with optical devices and image processing methods for recognizing hand gestures. The types of gestures, possible applications, contact-based devices, and vision-based devices are described in the thesis. Next, a review of hand detection, feature extraction, and gesture classification is provided. The proposed gesture recognition system consists of an infrared camera (FLIR A655sc), an infrared FLIR Lepton module, a webcam (Logitech S7500), a method for hand gesture analysis, and a database of gestures for classification. For each of the devices, gesture recognition is evaluated in terms of speed and accuracy in different environments. The proposed method was implemented in MATLAB.
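As a minimal software sketch of the hand-detection and feature-extraction stages such a system might use (the thesis's implementation is in MATLAB; this uses Python/OpenCV, and the temperature threshold and Hu-moment features are illustrative assumptions, not the thesis's method):

```python
import cv2
import numpy as np

def hand_features(thermal: np.ndarray, temp_thresh: int = 200):
    """Segment the warmest region and describe its shape by Hu moments.
    `thermal` is a uint8 image; the threshold stands in for the skin
    temperature band an IR camera such as the FLIR A655sc would see."""
    _, mask = cv2.threshold(thermal, temp_thresh, 255, cv2.THRESH_BINARY)
    contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL,
                                   cv2.CHAIN_APPROX_SIMPLE)
    if not contours:
        return None
    hand = max(contours, key=cv2.contourArea)   # assume hand = largest warm blob
    hu = cv2.HuMoments(cv2.moments(hand)).ravel()
    return -np.sign(hu) * np.log10(np.abs(hu) + 1e-30)  # log scale for stability

# Synthetic "warm blob" frame for demonstration
img = np.zeros((120, 160), np.uint8)
cv2.circle(img, (80, 60), 25, 255, -1)
print(hand_features(img))
```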
96

HMMs and LSTMs for On-line Gesture Recognition on the Stylaero Board : Evaluating and Comparing Two Methods / Kontinuerlig Gestdetektering meddels LSTMer och HMMer

Sibelius Parmbäck, Sebastian January 2019 (has links)
In this thesis, methods of implementing an online gesture recognition system for the novel Stylaero Board device are investigated. Two methods are evaluated, one based on LSTMs and one based on HMMs, on three kinds of gestures: tap, circle, and flick motions. A method's performance was measured by its accuracy in determining, in an online single-pass scenario, both whether any of the listed gestures was performed and, if so, which one. Insight was gained into the technical challenges posed by the online aspect of the problem and possible solutions to them. Poor performance was, however, observed for both methods, with the likely culprit identified as low-quality training data, resulting from an arduous and complex gesture-capture process. Further research improving the data-gathering process is suggested.
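For the HMM side of such a comparison, a common recipe is to train one generative HMM per gesture class and label a candidate window with the class of highest log-likelihood (a likelihood threshold can then back the "no gesture" decision). A sketch using hmmlearn, with all features, dimensions, and data invented for illustration:

```python
import numpy as np
from hmmlearn import hmm

rng = np.random.default_rng(0)

def train_class_hmm(sequences, n_states=4):
    """Fit one Gaussian HMM on all training sequences of a single gesture."""
    X = np.vstack(sequences)
    lengths = [len(s) for s in sequences]
    model = hmm.GaussianHMM(n_components=n_states, covariance_type="diag",
                            n_iter=20, random_state=0)
    model.fit(X, lengths)
    return model

# Toy 3-D feature sequences for two gesture classes (invented data)
make = lambda mu: [rng.normal(mu, 1.0, size=(30, 3)) for _ in range(10)]
models = {"circle": train_class_hmm(make(0.0)),
          "flick":  train_class_hmm(make(3.0))}

window = rng.normal(3.0, 1.0, size=(30, 3))       # candidate online window
scores = {name: m.score(window) for name, m in models.items()}
print(max(scores, key=scores.get))                # -> "flick"
```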
97

Einsatz der elektronischen Patientenakte im Operationssaal am Beispiel der HNO-Chirurgie / Use of the electronic patient record in the operating room: the example of ENT surgery

Dressler, Christian 30 April 2013 (has links)
If a surgeon needs information from the patient record during an operation today, he is forced either to break sterility or to instruct staff to make the relevant information accessible to him. From a technical point of view, a system for intraoperative control and display is very easy to realize. The basis for this is an electronic patient record (EPR), which manages, for example, software-generated or scanned documents. The present work addresses the following questions: Is such a system actually put to meaningful use in the operating room? Which methods of sterile operation are suitable? How must the graphical presentation be adapted to the operating room? Can all available patient data be accessed by implementing current communication standards? To this end, two pilot studies were conducted in an outpatient ENT clinic. The first study evaluated the first commercial product on the market, "MI-Report" by Karl Storz, which is operated via gesture recognition. For the second study, an EPR system (Doc-O-R) was developed that pre-selected the displayed documents depending on the procedure and could be operated with a foot switch. About 50 procedures were documented per system, recording every document viewed and the reason for viewing it. The systems were used on average more than once per procedure. The automatic pre-selection of documents, intended to reduce the number of interactions, showed very good results. Since the topic addressed here is still in its infancy, the final part of the work discusses the many possibilities that remain to be realized with regard to novel display methods, input devices, and current standardization activities.
As a result, surgical workflows will also be influenced in the future.
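The thesis does not publish the Doc-O-R pre-selection algorithm; a plausible reconstruction, offered purely as an assumption, is a frequency-based ranking that orders document types by how often they were consulted in past cases of the same procedure:

```python
from collections import Counter, defaultdict

class DocPreselector:
    """Rank document types per procedure by historical viewing frequency.
    A hypothetical reconstruction of a Doc-O-R-style pre-selection rule."""
    def __init__(self):
        self.views = defaultdict(Counter)   # procedure -> doc-type view counts

    def log_view(self, procedure: str, doc_type: str) -> None:
        self.views[procedure][doc_type] += 1

    def preselect(self, procedure: str, k: int = 3) -> list[str]:
        return [doc for doc, _ in self.views[procedure].most_common(k)]

sel = DocPreselector()
for doc in ["audiogram", "CT", "audiogram", "tympanogram", "audiogram", "CT"]:
    sel.log_view("tympanoplasty", doc)      # invented procedure/document names
print(sel.preselect("tympanoplasty", k=2))  # -> ['audiogram', 'CT']
```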
98

Low Cost Open Source Modal Virtual Environment Interfaces Using Full Body Motion Tracking and Hand Gesture Recognition

Marangoni, Matthew J. 25 May 2013 (has links)
No description available.
99

Combining Eye Tracking and Gestures to Interact with a Computer System

Rådell, Dennis January 2016 (has links)
Eye tracking and gestures are relatively new input methods that are changing the way humans interact with computers. Gestures can be used for games or for controlling a computer through an interface. Eye tracking is another way of interacting with computers, often combined with other inputs such as a mouse or touch pad. Gestures and eye tracking have been used in commercially available products, but seldom combined to create a multimodal interaction. This thesis presents a prototype that combines eye tracking with gestures to interact with a computer. To accomplish this, the report investigates different methods of recognizing hand gestures. The aim is to combine the technologies in such a way that the gestures can be simple, with the location of the user's gaze deciding what a gesture does. The report concludes by presenting a final prototype, built around an IR camera and an eye tracker, in which gestures are combined with eye tracking to interact with a computer. The prototype is evaluated with regard to learnability, usefulness, and intuitiveness; the evaluation shows that usefulness is low, but learnability and intuitiveness are quite high. / Eye tracking och gester är relativt nya inmatningsmetoder som förändrar sättet människor interagerar med datorer. Gester kan användas för till exempel spel eller för att styra en dator via ett gränssnitt. Eye tracking är ett annat sätt att interagera med datorer, ofta genom att kombinera med andra styrenheter såsom en mus eller styrplatta. Gester och eye tracking har använts i kommersiellt tillgängliga produkter, men sällan kombinerats för att skapa en multimodal interaktion. Denna avhandling presenterar en prototyp som kombinerar eye tracking med gester för att interagera med en dator. För att åstadkomma detta undersöker rapporten olika metoder för att känna igen gester. Målet är att kombinera teknologierna på ett sådant sätt att gesterna kan vara enkla, och platsen för användarens blick kommer bestämma vad gesten gör. Rapporten avslutas genom att presentera en slutlig prototyp där gester kombineras med eye tracking för att interagera med en dator. Den slutliga prototypen använder en IR-kamera och en eye tracker. Den slutliga prototypen utvärderas med avseende på lärbarhet, användbarhet och intuition. Utvärderingen av prototypen visar att användbarheten är låg, men både lärbarhet och intuition är ganska höga.
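The core interaction idea, that the same simple gesture means different things depending on where the user is looking, reduces to a dispatch table keyed by the gaze-hit region. A minimal sketch with invented region and action names, not the prototype's actual code:

```python
from typing import Callable, Dict, Optional, Tuple

Rect = Tuple[int, int, int, int]            # x, y, width, height

def hit_region(gaze: Tuple[int, int], regions: Dict[str, Rect]) -> Optional[str]:
    """Return the name of the screen region containing the gaze point."""
    gx, gy = gaze
    for name, (x, y, w, h) in regions.items():
        if x <= gx < x + w and y <= gy < y + h:
            return name
    return None

# The same "swipe up" gesture acts on whatever the user is looking at.
regions: Dict[str, Rect] = {"volume_slider": (0, 0, 200, 600),
                            "photo_viewer": (200, 0, 800, 600)}
actions: Dict[Tuple[str, str], Callable[[], None]] = {
    ("volume_slider", "swipe_up"): lambda: print("volume up"),
    ("photo_viewer", "swipe_up"):  lambda: print("next photo"),
}

target = hit_region((450, 300), regions)     # gaze point from the eye tracker
if target and (target, "swipe_up") in actions:
    actions[(target, "swipe_up")]()          # -> "next photo"
```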
100

Reconnaissance des actions humaines à partir d'une séquence vidéo / Human action recognition from a video sequence

Touati, Redha 12 1900 (has links)
The work done in this master's thesis presents a new system for the recognition of human actions from a video sequence. The system takes as input a video sequence captured by a static camera. A binary segmentation of the video sequence is first performed, by a learning algorithm, in order to detect and extract the people from the background. To recognize an action, the system then exploits a set of prototypes generated by an MDS-based dimensionality reduction technique, from two different viewpoints in the video sequence. This dimensionality reduction, carried out for two different viewpoints, allows us to model each human action of the training set with a set of prototypes (expected to be similar within each class) represented in a low-dimensional non-linear space. The prototypes extracted from the two viewpoints are fed to a K-NN classifier, which identifies the human action taking place in the video sequence. Experiments with our model on the Weizmann human action dataset give interesting results compared to other, often more complicated, state-of-the-art methods. These experiments show, first, the sensitivity of our model to each viewpoint and its effectiveness in recognizing the different actions, with a variable but satisfactory recognition rate; they also show that fusing the two viewpoints yields a high recognition rate. / Le travail mené dans le cadre de ce projet de maîtrise vise à présenter un nouveau système de reconnaissance d'actions humaines à partir d'une séquence d'images vidéo. Le système utilise en entrée une séquence vidéo prise par une caméra statique. Une méthode de segmentation binaire est d'abord effectuée, grâce à un algorithme d'apprentissage, afin de détecter les différentes personnes de l'arrière-plan. Afin de reconnaitre une action, le système exploite ensuite un ensemble de prototypes générés, par une technique de réduction de dimensionnalité MDS, à partir de deux points de vue différents dans la séquence d'images. Cette étape de réduction de dimensionnalité, selon deux points de vue différents, permet de modéliser chaque action de la base d'apprentissage par un ensemble de prototypes (censé être relativement similaire pour chaque classe) représentés dans un espace de faible dimension non linéaire. Les prototypes extraits selon les deux points de vue sont amenés à un classifieur K-ppv qui permet de reconnaitre l'action qui se déroule dans la séquence vidéo. Les expérimentations de ce système sur la base d'actions humaines de Weizmann procurent des résultats assez intéressants comparés à d'autres méthodes plus complexes. Ces expériences montrent d'une part, la sensibilité du système pour chaque point de vue et son efficacité à reconnaitre les différentes actions, avec un taux de reconnaissance variable mais satisfaisant, ainsi que les résultats obtenus par la fusion de ces deux points de vue, qui permet l'obtention de taux de reconnaissance très performant.
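A toy sketch of the MDS-plus-K-NN pipeline described, using scikit-learn on synthetic data rather than the Weizmann set. Note that classic MDS has no out-of-sample mapping, so this sketch embeds training and test sequences jointly, a simplification of the thesis's prototype scheme:

```python
import numpy as np
from sklearn.manifold import MDS
from sklearn.neighbors import KNeighborsClassifier

rng = np.random.default_rng(1)

# Toy stand-in for per-sequence silhouette features from one viewpoint.
X_train = np.vstack([rng.normal(c, 0.5, size=(20, 40)) for c in (0., 2., 4.)])
y_train = np.repeat(["walk", "wave", "jump"], 20)
X_test = rng.normal(2., 0.5, size=(5, 40))        # should resemble "wave"

# Embed train and test together, then split the low-dimensional prototypes.
Z = MDS(n_components=3, random_state=0).fit_transform(np.vstack([X_train, X_test]))
Z_train, Z_test = Z[:len(X_train)], Z[len(X_train):]

knn = KNeighborsClassifier(n_neighbors=5).fit(Z_train, y_train)
print(knn.predict(Z_test))                        # expected: mostly "wave"
```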
