11

Αναγνώριση χειρονομιών-actions σε συνθήκες έντονου ανομοιόμορφου φωτισμού / Recognition of gestures (actions) under intense, non-uniform lighting conditions

Σωτηρόπουλος, Παναγιώτης 11 October 2013 (has links)
The rapid evolution of computer science in recent decades has extended the use of computers into many areas of everyday life, by an ever-growing number of people. Nevertheless, operating various computing systems still presents considerable difficulties, which orients research toward systems whose use rests on the "natural" means of communication that humans already employ, such as gestures. The subject of this thesis is the study and implementation of a system for recognizing human gestures in image sequences (video) using computer vision techniques. After a brief overview of human-computer interaction (HCI), three areas of computer vision are investigated in depth: color, in particular color-based image segmentation; template matching, for locating prototype images within other images; and methods for recognizing moving objects. These methodologies are combined to build the gesture recognition system, which can be used in applications such as interaction with computers, video game consoles, and everyday devices such as televisions and mobile phones.
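To make the described pipeline concrete, here is a minimal Python/OpenCV sketch of the two core steps the abstract names: skin-color segmentation and template matching. The file names, HSV thresholds, and confidence cutoff are illustrative assumptions, not values taken from the thesis.

```python
# Minimal sketch (assumed values throughout): skin-color segmentation
# followed by normalized template matching.
import cv2

frame = cv2.imread("frame.png")             # hypothetical video frame
template = cv2.imread("hand_template.png")  # hypothetical hand prototype

# 1. Color-based segmentation: an HSV skin-tone range is more tolerant
#    of uneven lighting than raw RGB thresholds.
hsv = cv2.cvtColor(frame, cv2.COLOR_BGR2HSV)
skin_mask = cv2.inRange(hsv, (0, 30, 60), (25, 180, 255))
segmented = cv2.bitwise_and(frame, frame, mask=skin_mask)

# 2. Template matching: slide the prototype over the segmented image and
#    keep the location with the highest normalized correlation.
result = cv2.matchTemplate(segmented, template, cv2.TM_CCOEFF_NORMED)
_, max_val, _, max_loc = cv2.minMaxLoc(result)
if max_val > 0.6:                           # illustrative confidence cutoff
    h, w = template.shape[:2]
    cv2.rectangle(frame, max_loc, (max_loc[0] + w, max_loc[1] + h),
                  (0, 255, 0), 2)           # mark the detected hand
```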
12

Human Action Recognition on Videos: Different Approaches

Mejia, Maria Helena January 2012 (has links)
The goal of human action recognition on videos is to determine automatically what is happening in a video. This work focuses on answering the question: given consecutive frames from a video in which one or more persons perform an action, can an automatic system recognize the action that each person is carrying out? Seven approaches are presented, most of them based on an alignment process that yields a distance or similarity measure used for classification. Some are based on fluents converted into qualitative sequences of Allen relations, so that the distance between a pair of sequences can be measured by aligning them. The fluents are generated in various ways: representations based on feature extraction of human pose propositions from a single image or a short image sequence, changes in the time series (mainly in the slope angle), changes in the time series focused on slope direction, and propositions based on symbolic sequences generated by SAX. Another alignment-based approach applies Dynamic Time Warping to subsets of highly dependent body parts. An additional approach is based on SAX symbolic sequences and their pairwise alignment. The last approach discretizes the multivariate time series but, instead of alignment, uses a spectrum kernel and an SVM, as employed to classify protein sequences in biology. Finally, a sliding window method is used to recognize the actions along the video. These approaches were tested on three datasets derived from RGB-D cameras (e.g., Microsoft Kinect) as well as ordinary video, and a selection of the approaches was compared with the results of other researchers.
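As an illustration of the alignment idea underlying most of these approaches, the following is a minimal Python sketch of dynamic time warping used as a 1-nearest-neighbour gesture classifier over univariate series (e.g., a slope-angle signal of one body part). It is a generic textbook formulation, not the author's exact method, which aligns Allen-relation and SAX sequences over body-part subsets.

```python
# Generic DTW distance plus a 1-NN classifier on top of it.
import numpy as np

def dtw_distance(a: np.ndarray, b: np.ndarray) -> float:
    n, m = len(a), len(b)
    D = np.full((n + 1, m + 1), np.inf)
    D[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = abs(a[i - 1] - b[j - 1])
            # Extend the cheapest of the three admissible alignments.
            D[i, j] = cost + min(D[i - 1, j], D[i, j - 1], D[i - 1, j - 1])
    return float(D[n, m])

def classify(query, train_seqs, train_labels):
    dists = [dtw_distance(query, s) for s in train_seqs]
    return train_labels[int(np.argmin(dists))]
```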
13

Uma abordagem para interoperabilização de dados de acelerômetros em aplicações interativas / An approach for accelerometer data interoperability on interactive applications

Carvalho, Jorge Rodrigues 25 April 2013 (has links)
Research in Natural Interfaces, a subarea of Ubiquitous Computing, investigates the use of non-traditional devices to support user interaction with applications in less intrusive ways (gestures, voice, and electronic-ink writing, for instance). With the increasing popularity of devices equipped with acceleration sensors, developers now have another tool for providing interaction between users and different applications, such as those found in interactive TV environments. However, applications that use accelerometers are currently developed for specific situations, and their implementations and data formats are dependent on the domain for which they were designed. This work presents a model that formalizes how accelerometer data may be handled, through a generic and extensible approach. In addition, the model enables the description of rules that add value to these data by attaching meanings to them. This is achieved with a layered architecture that allows such data to be structured and shared in a flexible way. Three prototypes were implemented in the Java programming language, using this architecture and an API designed to facilitate the use of the model. These prototypes demonstrate the feasibility of the proposed model as a solution to the issue of interoperability in the illustrated scenarios, and to the problem of data extensibility whenever requirements change.
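A minimal Python sketch may help picture the layered idea: a generic, device-independent representation of accelerometer samples plus a rule layer that attaches higher-level meanings to raw readings. All names and the threshold are hypothetical; the thesis' actual Java model and API differ.

```python
# Hypothetical sketch of a device-independent sample type and a rule layer
# that adds "meanings"; names and the threshold are illustrative only.
from dataclasses import dataclass, field
from typing import Callable, List, Optional

@dataclass
class AccelSample:
    timestamp_ms: int
    x: float
    y: float
    z: float

@dataclass
class InterpretedSample:
    sample: AccelSample
    meanings: List[str] = field(default_factory=list)

def shake_rule(s: AccelSample) -> Optional[str]:
    # Illustrative rule: a large acceleration magnitude means "shake".
    magnitude = (s.x ** 2 + s.y ** 2 + s.z ** 2) ** 0.5
    return "shake" if magnitude > 15.0 else None

def interpret(samples: List[AccelSample],
              rules: List[Callable[[AccelSample], Optional[str]]]):
    out = []
    for s in samples:
        meanings = [m for rule in rules if (m := rule(s)) is not None]
        out.append(InterpretedSample(s, meanings))
    return out
```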
14

A random forest approach to segmenting and classifying gestures

Joshi, Ajjen Das 12 March 2016 (has links)
This thesis investigates a gesture segmentation and recognition scheme that employs a random forest classification model. A complete gesture recognition system should localize and classify each gesture from a given gesture vocabulary, within a continuous video stream. Thus, the system must determine the start and end points of each gesture in time, as well as accurately recognize the class label of each gesture. We propose a unified approach that performs the tasks of temporal segmentation and classification simultaneously. Our method trains a random forest classification model to recognize gestures from a given vocabulary, as presented in a training dataset of video plus 3D body joint locations, as well as out-of-vocabulary (non-gesture) instances. Given an input video stream, our trained model is applied to candidate gestures using sliding windows at multiple temporal scales. The class label with the highest classifier confidence is selected, and its corresponding scale is used to determine the segmentation boundaries in time. We evaluated our formulation in segmenting and recognizing gestures from two different benchmark datasets: the NATOPS dataset of 9,600 gesture instances from a vocabulary of 24 aircraft handling signals, and the CHALEARN dataset of 7,754 gesture instances from a vocabulary of 20 Italian communication gestures. The performance of our method compares favorably with state-of-the-art methods that employ Hidden Markov Models or Hidden Conditional Random Fields on the NATOPS dataset. We conclude with a discussion of the advantages of using our model.
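The unified segmentation-and-classification scheme can be sketched as follows in Python: a random forest trained on fixed-length windows of joint features (including a non-gesture class) scores sliding windows at several temporal scales, and the most confident in-vocabulary label fixes both the class and the segment boundaries. The window length, scales, and resampling-based features are assumptions for illustration, not the thesis' actual parameters.

```python
# Sketch under stated assumptions: multi-scale sliding-window scoring with
# a random forest that includes a "non_gesture" class.
import numpy as np
from sklearn.ensemble import RandomForestClassifier

WINDOW_LEN = 30  # canonical window length in frames (assumed)

def window_features(frames: np.ndarray) -> np.ndarray:
    # Resample a (T, D) window of joint coordinates to WINDOW_LEN frames
    # and flatten it into a fixed-length feature vector.
    idx = np.linspace(0, len(frames) - 1, WINDOW_LEN).astype(int)
    return frames[idx].ravel()

def detect(stream: np.ndarray, model: RandomForestClassifier,
           scales=(20, 30, 45)):
    best = None  # (confidence, label, start, end)
    for scale in scales:
        for start in range(len(stream) - scale + 1):
            feats = window_features(stream[start:start + scale])
            probs = model.predict_proba(feats.reshape(1, -1))[0]
            label = model.classes_[np.argmax(probs)]
            if label != "non_gesture" and (best is None or probs.max() > best[0]):
                best = (probs.max(), label, start, start + scale)
    return best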
15

Virtual Mouse: Vision-Based Gesture Recognition

Chen, Chih-Yu 01 July 2003 (has links)
The thesis describes a method for human-computer interaction through vision-based gesture recognition and hand tracking, which consists of five phases: image grabbing, image segmentation, feature extraction, gesture recognition, and system mouse control. Unlike most previous works, our method recognizes the hand with just one camera and requires no color markers or mechanical gloves. The primary work of the thesis is improving the accuracy and speed of the gesture recognition. Furthermore, the gesture commands are used to replace the mouse interface on a standard personal computer, controlling application software in a more intuitive manner.
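A rough Python/OpenCV sketch of such a markerless, single-camera virtual mouse follows: segment the hand by skin color, take the largest contour, and drive the cursor with its centroid. The thresholds are assumptions, and pyautogui stands in for whatever mouse interface the thesis actually used.

```python
# Illustrative single-camera virtual mouse; all thresholds are assumed.
import cv2
import pyautogui

cap = cv2.VideoCapture(0)
while True:
    ok, frame = cap.read()
    if not ok:
        break
    hsv = cv2.cvtColor(frame, cv2.COLOR_BGR2HSV)
    mask = cv2.inRange(hsv, (0, 40, 60), (25, 255, 255))  # rough skin range
    contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL,
                                   cv2.CHAIN_APPROX_SIMPLE)
    if contours:
        hand = max(contours, key=cv2.contourArea)   # largest skin blob
        M = cv2.moments(hand)
        if M["m00"] > 0:
            cx, cy = M["m10"] / M["m00"], M["m01"] / M["m00"]
            sw, sh = pyautogui.size()
            h, w = frame.shape[:2]
            pyautogui.moveTo(cx / w * sw, cy / h * sh)  # centroid -> cursor
    cv2.imshow("hand mask", mask)
    if cv2.waitKey(1) == 27:  # Esc quits
        break
cap.release()
```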
16

Ανίχνευση κίνησης χεριού και αναγνώριση χειρονομίας σε πραγματικό χρόνο / Hand tracking and hand gesture recognition in real time

Γονιδάκης, Παναγιώτης 07 May 2015 (has links)
The rapid advances of technology in recent years have made multimedia devices ever "smarter", requiring real-time interaction with the user. The field of human-computer interaction (HCI) has moved well beyond the era when the mouse and keyboard were the only tools for communicating with a computer. One of the most interesting and fastest-growing areas is the use of hand gestures to interact with smart devices. This thesis proposes an automatic system through which a user communicates with a multimedia device, for example a television, by means of gestures in real time and under real-world conditions. Popular algorithms from computer vision and pattern recognition are presented and tested, and some of them are incorporated into the proposed system. The system can track the user's hand and recognize its gesture. The work presents an implementation in Matlab® (2014b) and constitutes a preliminary stage of a real-time implementation.
17

Who Moved My Slide? Recognizing Entities In A Lecture Video And Its Applications

Tung, Qiyam Junn January 2014 (has links)
Lecture videos have proliferated in recent years thanks to increasing Internet bandwidth and the availability of video cameras. Despite the massive volume of videos available, very few systems parse useful information from them. Extracting meaningful data can help with searching and indexing lecture videos, as well as improve understanding and usability for viewers. While video tags and user preferences are good indicators of relevant videos, they depend entirely on human-generated data. Furthermore, many lecture videos are technical by nature, and sparse video tags are too coarse-grained to relate parts of a video to a specific topic. While extracting the text from the presentation slides ameliorates this issue, a lecture video still contains significantly more information than is available on the slides alone: the actions and words of the speaker contribute to a richer and more nuanced understanding of the lecture material. The goal of the Semantically Linked Instructional Content (SLIC) project is to relate videos using more specific and relevant features, such as slide text and other entities. In this work, we present the algorithms used to recognize the entities of a lecture. Specifically, the entities in lecture videos are the laser and pointing hand gestures and the location of the slide, its text, and its images in the video. Our algorithms work under the assumption that the slide location (homography) is known for each frame, and they extend the knowledge of the scene; gestures inform when and where on a slide notable events occur. We also show how recognition of these entities can help with understanding lectures better and with energy savings on mobile devices. We conducted a user study showing that magnifying text based on laser gestures on a slide helps direct a viewer's attention to the relevant text. We also performed empirical measurements on real cellphones to confirm that selectively dimming less relevant regions of the video frame reduces energy consumption significantly.
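As an example of one such entity recognizer, here is a minimal Python/OpenCV sketch for detecting a red laser dot as a small, saturated, bright red blob within the frame. The thresholds and size limit are illustrative assumptions, not the system's actual parameters, and the known slide homography is taken for granted.

```python
# Illustrative laser-dot recognizer; thresholds and size limit are assumed.
import cv2
import numpy as np

def find_laser_dot(frame_bgr: np.ndarray):
    hsv = cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2HSV)
    # Red wraps around the hue axis, so combine the two hue bands; demand
    # high saturation and brightness, characteristic of a laser dot.
    lo = cv2.inRange(hsv, (0, 120, 200), (10, 255, 255))
    hi = cv2.inRange(hsv, (170, 120, 200), (180, 255, 255))
    mask = cv2.bitwise_or(lo, hi)
    contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL,
                                   cv2.CHAIN_APPROX_SIMPLE)
    if not contours:
        return None
    dot = max(contours, key=cv2.contourArea)
    (x, y), radius = cv2.minEnclosingCircle(dot)
    return (int(x), int(y)) if radius < 15 else None  # keep small blobs only
```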
19

Data collection of 3D spatial features of gestures from the static Peruvian sign language alphabet for sign language recognition

Nurena-Jara, Roberto, Ramos-Carrion, Cristopher, Shiguihara-Juarez, Pedro 21 October 2020 (has links)
The full text of this work is not available in the UPC Academic Repository due to restrictions imposed by the publisher. / Peruvian Sign Language Recognition (PSL) is approached as a classification problem. Previous work has employed 2D features from hand positions to tackle this problem. In this paper, we propose a method to construct a dataset consisting of the 3D spatial positions of static gestures from the PSL alphabet, using the HTC Vive device and a well-known technique for extracting 21 keypoints from the hand to obtain a feature vector. A dataset of 35,400 gesture instances for PSL was constructed, and a novel way of extracting the data was presented. To validate the appropriateness of this dataset, four baseline classifiers were compared on the Peruvian Sign Language Recognition (PSLR) task, achieving an average F1 measure of 99.32% in the best case. / Peer reviewed
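The feature construction and baseline comparison can be sketched as follows in Python: 21 3D hand keypoints flattened into a 63-dimensional vector, scored with several off-the-shelf classifiers via cross-validated F1. The data here is synthetic stand-in noise, and the classifier list is an assumption; the abstract does not name the four baselines.

```python
# Sketch with synthetic stand-in data; classifier choices are assumptions.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 21 * 3))   # stand-in for 3D keypoint vectors
y = rng.integers(0, 24, size=1000)    # stand-in for static alphabet labels

for name, clf in [("SVM", SVC()),
                  ("kNN", KNeighborsClassifier()),
                  ("RandomForest", RandomForestClassifier())]:
    f1 = cross_val_score(clf, X, y, cv=5, scoring="f1_macro")
    print(f"{name}: mean F1 = {f1.mean():.3f}")
```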
20

Dynamic Hand Gesture Recognition Using Ultrasonic Sonar Sensors and Deep Learning

Lin, Chiao-Shing 03 March 2022 (has links)
The space of hand gesture recognition using radar and sonar is dominated mostly by radar applications. In addition, the machine learning algorithms used by these systems are typically based on convolutional neural networks (CNNs), with some applications exploring the use of long short-term memory (LSTM) networks. The goal of this study was to design and build a sonar system that can classify hand gestures using a machine learning approach, and to compare convolutional neural networks with long short-term memory networks as a means of classifying hand gestures using sonar. A Doppler sonar system was designed and built to sense hand gestures. It is a multi-static system containing one transmitter and three receivers, and it measures the Doppler frequency shifts caused by dynamic hand gestures. Since the system uses three receivers, three different Doppler frequency channels are measured. Three additional differential frequency channels are formed by computing the differences between the frequencies of each pair of receivers. These six channels are used as inputs to the deep learning models. Two deep learning algorithms were used to classify the hand gestures: a Doppler biLSTM network [1] and a CNN [2]. Six basic hand gestures, two in each of the x-, y-, and z-axes, plus two rotational hand gestures, were recorded using both the left and right hand at different distances. Ten-fold cross-validation was used to evaluate the networks' performance and classification accuracy. The LSTM was able to classify the six basic gestures with an accuracy of at least 96%, but with the addition of the two rotational gestures the accuracy drops to 47%. This result is acceptable, since the basic gestures are more commonly used than rotational gestures. The CNN was able to classify all the gestures with an accuracy of at least 98%. Additionally, the LSTM network can classify separate left- and right-hand gestures with an accuracy of 80%, and the CNN with an accuracy of 83%. The study shows that the CNN, the most widely used algorithm for hand gesture recognition, can consistently classify gestures with various degrees of complexity, and that the LSTM network can also classify hand gestures with a high degree of accuracy. More experimentation, however, is needed to increase the complexity of recognizable gestures.
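A minimal Python sketch of the six-channel input construction described above: one Doppler spectrogram per receiver, plus three differential channels from pairwise differences between the receiver channels. The sample rate and STFT parameters are illustrative assumptions, not the system's actual values.

```python
# Sketch of six-channel input construction for the CNN/LSTM models;
# FS and STFT parameters are assumptions.
import numpy as np
from scipy.signal import spectrogram

FS = 48_000  # assumed receiver sample rate

def build_channels(rx1, rx2, rx3):
    specs = []
    for rx in (rx1, rx2, rx3):
        _, _, s = spectrogram(rx, fs=FS, nperseg=1024, noverlap=768)
        specs.append(np.log1p(s))   # log-magnitude Doppler map per receiver
    # Differential channels from pairwise receiver differences.
    diffs = [specs[0] - specs[1], specs[1] - specs[2], specs[0] - specs[2]]
    return np.stack(specs + diffs)  # shape: (6, n_freq, n_time)
```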
