91 |
Automated Interpretation of Abnormal Adult Electroencephalograms. Lopez de Diego, Silvia Isabel. January 2017 (has links)
Interpretation of electroencephalograms (EEGs) is a process that is still dependent on the subjective analysis of the examiner. The interrater agreement, even for relevant clinical events such as seizures, can be low. For instance, the differences between interictal, ictal, and post-ictal EEGs can be quite subtle. Before making such low-level interpretations of the signals, neurologists often classify EEG signals as either normal or abnormal. Even though the characteristics of a normal EEG are well defined, there are some factors, such as benign variants, that complicate this decision. However, neurologists can make this classification accurately by only examining the initial portion of the signal. Therefore, in this thesis, we explore the hypothesis that high performance machine classification of an EEG signal as abnormal can approach human performance using only the first few minutes of an EEG recording. The goal of this thesis is to establish a baseline for automated classification of abnormal adult EEGs using state of the art machine learning algorithms and a big data resource – The TUH EEG Corpus. A demographically balanced subset of the corpus was used to evaluate performance of the systems. The data was partitioned into a training set (1,387 normal and 1,398 abnormal files), and an evaluation set (150 normal and 130 abnormal files). A system based on hidden Markov Models (HMMs) achieved an error rate of 26.1%. The addition of a Stacked Denoising Autoencoder (SdA) post-processing step (HMM-SdA) further decreased the error rate to 24.6%. The overall best result (21.2% error rate) was achieved by a deep learning system that combined a Convolutional Neural Network and a Multilayer Perceptron (CNN-MLP). Even though the performance of our algorithm still lags human performance, which approaches a 1% error rate for this task, we have established an experimental paradigm that can be used to explore this application and have demonstrated a promising baseline using state of the art deep learning technology. / Electrical and Computer Engineering
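As a rough illustration of the HMM baseline described above, the sketch below trains one Gaussian-emission HMM per class (normal / abnormal) and labels a recording by the higher log-likelihood. It assumes the hmmlearn package and pre-computed per-recording feature sequences; the state count, feature choice, and names are illustrative rather than the configuration actually used in the thesis.

```python
# Sketch of two-class HMM screening (normal vs. abnormal), assuming the
# hmmlearn package and pre-computed feature sequences (e.g. spectral features
# extracted from the first minutes of each recording); names are illustrative.
import numpy as np
from hmmlearn import hmm

def train_class_hmm(sequences, n_states=8, n_iter=20):
    """Fit one Gaussian-emission HMM on all sequences of a single class."""
    X = np.vstack(sequences)                      # stack all frames
    lengths = [len(s) for s in sequences]         # per-recording lengths
    model = hmm.GaussianHMM(n_components=n_states,
                            covariance_type="diag", n_iter=n_iter)
    model.fit(X, lengths)
    return model

def classify(models, features):
    """Return the label whose HMM assigns the highest log-likelihood."""
    return max(models, key=lambda label: models[label].score(features))

# models = {"normal": train_class_hmm(normal_seqs),
#           "abnormal": train_class_hmm(abnormal_seqs)}
# label = classify(models, first_minutes_features)
```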
|
92 |
Recognition of off-line printed Arabic text using Hidden Markov Models. Al-Muhtaseb, Husni A., Mahmoud, Sabri A., Qahwaji, Rami S.R. January 2008 (has links)
yes / This paper describes a technique for automatic recognition of off-line printed Arabic text using Hidden Markov Models. In this work different sizes of overlapping and non-overlapping hierarchical windows are used to generate 16 features from each vertical sliding strip. Eight different Arabic fonts were used for testing (viz. Arial, Tahoma, Akhbar, Thuluth, Naskh, Simplified Arabic, Andalus, and Traditional Arabic). It was experimentally proven that different fonts have their highest recognition rates at different numbers of states (5 or 7) and codebook sizes (128 or 256).
Arabic text is cursive, and each character may have up to four different shapes based on its location in a word. This research work considered each shape as a different class, resulting in a total of 126 classes (compared to 28 Arabic letters). The achieved average recognition rates were between 98.08% and 99.89% for the eight experimental fonts.
The main contributions of this work are the novel hierarchical sliding window technique using only 16 features for each sliding window, considering each shape of Arabic characters as a separate class, bypassing the need for segmenting Arabic text, and its applicability to other languages.
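The exact 16-dimensional feature set is not spelled out above, so the following sketch only illustrates the general idea of hierarchical windows over a vertical sliding strip: ink density is computed over a hierarchy of horizontal bands (1, 2, 4, ... bands) until 16 values are collected. The band scheme, strip width, and normalization are assumptions, not the authors' design.

```python
# Illustrative sketch (not the authors' exact features): a 16-dimensional vector
# from one vertical strip of a binarized text-line image, using hierarchical
# non-overlapping horizontal bands of the strip.
import numpy as np

def strip_features(strip, n_features=16):
    """strip: 2-D {0,1} array (height x strip_width), 1 = ink."""
    feats = []
    n_bands = 1
    while len(feats) < n_features:
        for band in np.array_split(strip, n_bands, axis=0):
            feats.append(band.sum() / max(band.size, 1))  # normalized ink density
            if len(feats) == n_features:
                break
        n_bands *= 2
    return np.asarray(feats)

def sliding_strips(line_image, width=8, step=4):
    """Yield one feature vector per (possibly overlapping) vertical strip."""
    h, w = line_image.shape
    for x in range(0, w - width + 1, step):
        yield strip_features(line_image[:, x:x + width])
```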
|
93 |
Intent Recognition Of Rotation Versus Translation Movements In Human-Robot Collaborative Manipulation Tasks. Nguyen, Vinh Q. 07 November 2016 (links) (PDF)
The goal of this thesis is to enable a robot to actively collaborate with a person to move an object in an efficient, smooth and robust manner. For a robot to actively assist a person, it is key that the robot recognizes the actions or phases of a collaborative task. This requires the robot to have the ability to estimate a person’s movement intent. A hurdle in collaboratively moving an object is determining whether the partner is trying to rotate or translate the object (the rotation versus translation problem). In this thesis, Hidden Markov Models (HMMs) are used to recognize human intent of rotation or translation in real time. Based on this recognition, an appropriate impedance control mode is selected to assist the person. The approach is tested on a seven degree-of-freedom industrial robot, KUKA LBR iiwa 14 R820, working with a human partner during manipulation tasks. Results show the HMMs can estimate human intent with an accuracy of 87.5% by using only haptic data recorded from the robot. Integrated with impedance control, the robot is able to collaborate smoothly and efficiently with a person during the manipulation tasks. The HMMs are compared with a switching-function-based approach that uses interaction force magnitudes to recognize rotation versus translation. The results show that HMMs can predict correctly when fast rotation or slow translation is desired, whereas the switching function based on force magnitudes performs poorly.
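A minimal sketch of the two decision rules being compared, under assumed interfaces: hmm_rot and hmm_trans stand for pre-trained models exposing a score() log-likelihood over a short window of wrench (force/torque) samples, and the torque threshold of the baseline switching function is purely illustrative.

```python
# Sketch of HMM-based intent recognition vs. a force-magnitude switching
# function; model objects, window contents, and thresholds are assumptions.
import numpy as np

def intent_hmm(window, hmm_rot, hmm_trans):
    """Pick 'rotate' or 'translate' by comparing HMM log-likelihoods."""
    return "rotate" if hmm_rot.score(window) > hmm_trans.score(window) else "translate"

def intent_switching_function(wrench, torque_threshold=1.5):
    """Baseline: decide from the instantaneous torque magnitude alone."""
    torque = np.linalg.norm(wrench[3:])            # wrench = [fx, fy, fz, tx, ty, tz]
    return "rotate" if torque > torque_threshold else "translate"

def select_impedance_mode(intent):
    """Map the recognized intent to an impedance-control parameterization."""
    if intent == "rotate":
        return {"rot_stiffness": 0.0, "trans_stiffness": 800.0}   # free rotation
    return {"rot_stiffness": 50.0, "trans_stiffness": 0.0}        # free translation
```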
|
94 |
Iterative Decoding and Channel Estimation over Hidden Markov Fading Channels. Khan, Anwer Ali. 24 May 2000 (links)
Since the 1950s, hidden Markov models (HMMs) have seen widespread use in electrical engineering. Foremost has been their use in speech processing, pattern recognition, artificial intelligence, queuing theory, and communications theory. However, recent years have witnessed a renaissance in the application of HMMs to the analysis and simulation of digital communication systems. Typical applications have included signal estimation, frequency tracking, equalization, burst error characterization, and transmit power control. Of special significance to this thesis, however, has been the use of HMMs to model fading channels typical of wireless communications. This variegated use of HMMs is fueled by their ability to model time-varying systems with memory, their ability to yield closed-form solutions to otherwise intractable analytic problems, and their ability to help facilitate simple hardware- and/or software-based implementations of simulation test-beds.
The aim of this thesis is to employ and exploit hidden Markov fading models within an iterative (turbo) decoding framework. Of particular importance is the problem of channel estimation, which is vital for realizing the large coding gains inherent in turbo-coded schemes. This thesis shows that a Markov fading channel (MFC) can be conceptualized as a trellis, and that the transmission of a sequence over an MFC can be viewed as a trellis encoding process much like convolutional encoding. The thesis demonstrates that either maximum likelihood sequence estimation (MLSE) algorithms or maximum a posteriori (MAP) algorithms operating over the trellis defined by the MFC can be used for channel estimation. Furthermore, the thesis illustrates sequential and decision-directed techniques for using the aforementioned trellis-based channel estimators en masse with an iterative decoder. / Master of Science
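As a concrete reference for the MAP estimator mentioned above, the sketch below computes posterior channel-state probabilities with the scaled forward-backward recursion over a discretized Markov fading channel. The transition matrix, state prior, and per-symbol observation likelihoods are assumed given; the full iterative (turbo) receiver described in the thesis would wrap this step inside its decoding loop.

```python
# Minimal forward-backward (MAP) sketch for per-symbol channel-state posteriors
# over a discretized Markov fading channel; A, pi, and B are assumed given.
import numpy as np

def forward_backward(A, pi, B):
    """A: (S,S) transitions, pi: (S,) prior, B: (T,S) observation likelihoods."""
    T, S = B.shape
    alpha = np.zeros((T, S))
    beta = np.zeros((T, S))
    alpha[0] = pi * B[0]
    alpha[0] /= alpha[0].sum()                       # scale to avoid underflow
    for t in range(1, T):
        alpha[t] = (alpha[t - 1] @ A) * B[t]
        alpha[t] /= alpha[t].sum()
    beta[-1] = 1.0
    for t in range(T - 2, -1, -1):
        beta[t] = A @ (B[t + 1] * beta[t + 1])
        beta[t] /= beta[t].sum()
    gamma = alpha * beta
    return gamma / gamma.sum(axis=1, keepdims=True)  # posterior state probabilities
```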
|
95 |
Bird song recognition with hidden Markov models. Van der Merwe, Hugo Jacobus. 03 1900 (links)
Thesis (MScEng (Electrical and Electronic Engineering))--Stellenbosch University, 2008. / Automatic bird song recognition and transcription is a relatively new field. Reliable automatic recognition systems would be of great benefit to further research in ornithology and conservation, as well as commercially in the very large birdwatching subculture. This study investigated the use of Hidden Markov Models and duration modelling for bird call recognition. Through use of more accurate duration modelling, very promising results were achieved with feature vectors consisting of only pitch and volume. An accuracy of 51% was achieved for 47 calls from 39 birds, with the models typically trained from only one or two specimens. The ALS pitch tracking algorithm was adapted to bird song to extract the pitch. Bird song synthesis was employed to subjectively evaluate the features. Compounded Selfloop Duration Modelling was developed as an alternative duration modelling technique. For long durations, this technique can be more computationally efficient than Ferguson stacks. The application of approximate string matching to bird song was also briefly considered.
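For context on why explicit duration modelling matters here: a conventional HMM state with self-loop probability a implies a geometric duration distribution, which is often a poor match for sustained call elements, and constructions such as Ferguson stacks (or the compounded self-loop technique above) buy a better fit at the cost of extra states or computation. The implied distribution is the standard result:

```latex
% Duration of a single HMM state with self-loop probability a (geometric)
P(d) = a^{d-1}(1 - a), \qquad d = 1, 2, \ldots, \qquad \mathbb{E}[d] = \frac{1}{1-a}
```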
|
96 |
An HMM-based automatic singing transcription platform for a sight-singing tutor. Krige, Willie. 03 1900 (links)
Thesis (MScEng (Electrical and Electronic Engineering))--Stellenbosch University, 2008. / A singing transcription system transforming acoustic input into MIDI note sequences is presented. The transcription system is incorporated into a pronunciation-independent sight-singing tutor system, which provides note-level feedback on the accuracy with which each note in a sequence has been sung. Notes are individually modeled with hidden Markov models (HMMs) using untuned pitch and delta-pitch as feature vectors. A database consisting of annotated passages sung by 26 soprano subjects was compiled for the development of the system, since no existing data was available. Various techniques that allow efficient use of a limited dataset are proposed and evaluated. Several HMM topologies are also compared, in analogy with approaches often used in the field of automatic speech recognition. Context-independent note models are evaluated first, followed by the use of explicit transition models to better identify boundaries between notes. A non-repetitive grammar is used to reduce the number of insertions. Context-dependent note models are then introduced, followed by context-dependent transition models. The aim in introducing context-dependency is to improve transition region modeling, which in turn should increase note transcription accuracy, but also improve the time-alignment of the notes and the transition regions. The final system is found to be able to transcribe sung passages with around 86% accuracy. Finally, a note-level sight-singing tutor system based on the singing transcription system is presented and a number of note sequence scoring approaches are evaluated.
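A small sketch of how a pitch / delta-pitch feature stream might be assembled from a frame-level pitch track, assuming the standard regression-based delta formula from speech front ends; the log-frequency mapping and unvoiced-frame handling are assumptions, not the thesis's exact settings.

```python
# Sketch of a pitch / delta-pitch feature stream, assuming a frame-level pitch
# track in Hz (0 for unvoiced frames) is already available.
import numpy as np

def delta(x, window=2):
    """Regression deltas: sum_k k*(x[t+k]-x[t-k]) / (2*sum_k k^2), edge-padded."""
    x = np.asarray(x, dtype=float)
    padded = np.pad(x, window, mode="edge")
    denom = 2 * sum(k * k for k in range(1, window + 1))
    d = np.zeros_like(x)
    for k in range(1, window + 1):
        d += k * (padded[window + k:window + k + len(x)] -
                  padded[window - k:window - k + len(x)])
    return d / denom

def note_features(pitch_hz):
    """Stack (log-)pitch and delta-pitch as 2-D feature vectors per frame."""
    pitch_hz = np.asarray(pitch_hz, dtype=float)
    logp = np.where(pitch_hz > 0, np.log2(np.maximum(pitch_hz, 1e-6)), 0.0)
    return np.column_stack([logp, delta(logp)])
```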
|
97 |
Deep Neural Networks for Large Vocabulary Handwritten Text Recognition / Réseaux de Neurones Profonds pour la Reconnaissance de Texte Manuscrit à Large Vocabulaire. Bluche, Théodore. 13 May 2015 (links)
La transcription automatique du texte dans les documents manuscrits a de nombreuses applications, allant du traitement automatique des documents à leur indexation ou leur compréhension. L'une des approches les plus populaires de nos jours consiste à parcourir l'image d'une ligne de texte avec une fenêtre glissante, de laquelle un certain nombre de caractéristiques sont extraites, et modélisées par des Modèles de Markov Cachés (MMC). Quand ils sont associés à des réseaux de neurones, comme des Perceptrons Multi-Couches (PMC) ou Réseaux de Neurones Récurrents de type Longue Mémoire à Court Terme (RNR-LMCT), et à un modèle de langue, ces modèles produisent de bonnes transcriptions. D'autre part, dans de nombreuses applications d'apprentissage automatique, telles que la reconnaissance de la parole ou d'images, des réseaux de neurones profonds, comportant plusieurs couches cachées, ont récemment permis une réduction significative des taux d'erreur.Dans cette thèse, nous menons une étude poussée de différents aspects de modèles optiques basés sur des réseaux de neurones profonds dans le cadre de systèmes hybrides réseaux de neurones / MMC, dans le but de mieux comprendre et évaluer leur importance relative. Dans un premier temps, nous montrons que des réseaux de neurones profonds apportent des améliorations cohérentes et significatives par rapport à des réseaux ne comportant qu'une ou deux couches cachées, et ce quel que soit le type de réseau étudié, PMC ou RNR, et d'entrée du réseau, caractéristiques ou pixels. Nous montrons également que les réseaux de neurones utilisant les pixels directement ont des performances comparables à ceux utilisant des caractéristiques de plus haut niveau, et que la profondeur des réseaux est un élément important de la réduction de l'écart de performance entre ces deux types d'entrées, confirmant la théorie selon laquelle les réseaux profonds calculent des représentations pertinantes, de complexités croissantes, de leurs entrées, en apprenant les caractéristiques de façon automatique. Malgré la domination flagrante des RNR-LMCT dans les publications récentes en reconnaissance d'écriture manuscrite, nous montrons que des PMCs profonds atteignent des performances comparables. De plus, nous avons évalué plusieurs critères d'entrainement des réseaux. Avec un entrainement discriminant de séquences, nous reportons, pour des systèmes PMC/MMC, des améliorations comparables à celles observées en reconnaissance de la parole. Nous montrons également que la méthode de Classification Temporelle Connexionniste est particulièrement adaptée aux RNRs. Enfin, la technique du dropout a récemment été appliquée aux RNR. Nous avons testé son effet à différentes positions relatives aux connexions récurrentes des RNRs, et nous montrons l'importance du choix de ces positions.Nous avons mené nos expériences sur trois bases de données publiques, qui représentent deux langues (l'anglais et le français), et deux époques, en utilisant plusieurs types d'entrées pour les réseaux de neurones : des caractéristiques prédéfinies, et les simples valeurs de pixels. Nous avons validé notre approche en participant à la compétition HTRtS en 2014, où nous avons obtenu la deuxième place. Les résultats des systèmes présentés dans cette thèse, avec les deux types de réseaux de neurones et d'entrées, sont comparables à l'état de l'art sur les bases Rimes et IAM, et leur combinaison dépasse les meilleurs résultats publiés sur les trois bases considérées. 
/ The automatic transcription of text in handwritten documents has many applications, from automatic document processing, to indexing and document understanding. One of the most popular approaches nowadays consists in scanning the text line image with a sliding window, from which features are extracted, and modeled by Hidden Markov Models (HMMs). Associated with neural networks, such as Multi-Layer Perceptrons (MLPs) or Long Short-Term Memory Recurrent Neural Networks (LSTM-RNNs), and with a language model, these models yield good transcriptions. On the other hand, in many machine learning applications, including speech recognition and computer vision, deep neural networks consisting of several hidden layers recently produced a significant reduction of error rates. In this thesis, we have conducted a thorough study of different aspects of optical models based on deep neural networks in the hybrid neural network / HMM scheme, in order to better understand and evaluate their relative importance. First, we show that deep neural networks produce consistent and significant improvements over networks with one or two hidden layers, independently of the kind of neural network, MLP or RNN, and of input, handcrafted features or pixels. Then, we show that deep neural networks with pixel inputs compete with those using handcrafted features, and that depth plays an important role in the reduction of the performance gap between the two kinds of inputs, supporting the idea that deep neural networks effectively build hierarchical and relevant representations of their inputs, and that features are automatically learnt on the way. Despite the dominance of LSTM-RNNs in the recent literature of handwriting recognition, we show that deep MLPs achieve comparable results. Moreover, we evaluated different training criteria. With sequence-discriminative training, we report improvements for MLP/HMMs similar to those observed in speech recognition. We also show how the Connectionist Temporal Classification framework is especially suited to RNNs. Finally, the novel dropout technique to regularize neural networks was recently applied to LSTM-RNNs. We tested its effect at different positions in LSTM-RNNs, thus extending previous works, and we show that its position relative to the recurrent connections is important. We conducted the experiments on three public databases, representing two languages (English and French) and two epochs, using different kinds of neural network inputs: handcrafted features and pixels. We validated our approach by taking part in the HTRtS contest in 2014. The results of the final systems presented in this thesis, namely MLPs and RNNs, with handcrafted feature or pixel inputs, are comparable to the state-of-the-art on Rimes and IAM. Moreover, the combination of these systems outperformed all published results on the considered databases.
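One step common to the hybrid neural network / HMM scheme discussed above is converting the network's frame-level state posteriors into scaled likelihoods by dividing by the state priors before HMM decoding. The sketch below shows only that conversion, with assumed array shapes; it is not tied to the specific MLP or LSTM systems of the thesis.

```python
# Sketch of the standard hybrid NN/HMM step: turn frame-level state posteriors
# P(state | frame) into scaled likelihoods p(frame | state), proportional to
# P(state | frame) / P(state), which then replace the emission scores in
# HMM decoding. Shapes and names are assumptions.
import numpy as np

def posteriors_to_log_likelihoods(posteriors, state_priors, floor=1e-8):
    """posteriors: (T, S) network outputs; state_priors: (S,) from alignments."""
    post = np.maximum(posteriors, floor)
    priors = np.maximum(state_priors, floor)
    return np.log(post) - np.log(priors)   # (T, S) log scaled likelihoods
```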
|
98 |
RASTREAMENTO DE AGROBOTS EM ESTUFAS AGRÍCOLAS USANDO MODELOS OCULTOS DE MARKOV: Comparação do desempenho e da correção dos algoritmos de Viterbi e Viterbi com janela de observações deslizante [Tracking of agrobots in agricultural greenhouses using hidden Markov models: a comparison of the performance and correctness of the Viterbi algorithm and Viterbi with a sliding observation window]. Alves, Roberson Junior Fernandes. 17 September 2015 (links)
Previous issue date: 2015-09-17 / Developing mobile and autonomous agrobots for greenhouses requires procedures that allow robot auto-localization and tracking. The tracking problem can be modeled as finding the most likely sequence of states in a hidden Markov model whose states indicate the positions of an occupancy grid. This sequence can be estimated with Viterbi's algorithm. However, the processing time and memory consumed by this algorithm grow with the dimensions of the grid and the tracking duration, which can constrain its use for tracking agrobots. Considering this, this work presents a tracking procedure that uses two approximate implementations of Viterbi's algorithm, called Viterbi-JD (Viterbi's algorithm with a sliding window) and Viterbi-JD-MTE (Viterbi's algorithm with a sliding window over a hidden Markov model with a sparse transition matrix). The experimental results show that the time and memory performance of tracking with these two approximate implementations is significantly better than that of Viterbi-based tracking. The reported tracking hypothesis is suboptimal compared with the hypothesis generated by Viterbi, but the error does not grow substantially. The experiments were performed using simulated RSSI (Received Signal Strength Indicator) data.
/ O desenvolvimento de agrobots móveis e autônomos para operar em estufas agrícolas depende da implementação de procedimentos que permitam o rastreamento do robô no ambiente. O problema do rastreamento pode ser modelado como a determinação da sequência de estados mais prováveis de um modelo oculto de Markov cujos estados indicam posições de uma grade de ocupação. Esta sequência pode ser estimada pelo algoritmo de Viterbi. No entanto, o tempo de processamento e a memória consumida, por esse algoritmo, crescem com as dimensões da grade e com a duração do rastreamento, e isto pode limitar seu uso no rastreamento de agrobots em estufas. Considerando o exposto, este trabalho apresenta um procedimento de rastreamento que utiliza implementações aproximadas do algoritmo de Viterbi denominadas de Viterbi-JD (Viterbi com janela deslizante) e Viterbi-JD-MTE (Viterbi com janela deslizante sobre um modelo oculto de Markov com matriz de transição esparsa). Os experimentos mostram que o desempenho de tempo e memória do rastreamento baseado nessas implementações aproximadas é significativamente melhor que aquele do algoritmo original. A hipótese de rastreamento gerada é subótima em relação àquela calculada pelo algoritmo original, contudo, não há um aumento substancial do erro. Os experimentos foram realizados utilizando dados simulados de RSSI (Received Signal Strength Indicator).
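For reference, the sketch below gives a standard log-space Viterbi decoder over grid states, plus a naive wrapper that decodes only the most recent observations in the spirit of the sliding-window variants described above. It is an illustration of the idea, not the Viterbi-JD or Viterbi-JD-MTE implementations themselves (in particular, it ignores the sparse transition matrix).

```python
# Log-space Viterbi over occupancy-grid states, with a naive sliding-window
# wrapper; matrices and window size are illustrative assumptions.
import numpy as np

def viterbi(log_A, log_pi, log_B):
    """log_A: (S,S), log_pi: (S,), log_B: (T,S) observation log-likelihoods."""
    T, S = log_B.shape
    score = log_pi + log_B[0]
    back = np.zeros((T, S), dtype=int)
    for t in range(1, T):
        trans = score[:, None] + log_A           # (prev_state, next_state)
        back[t] = trans.argmax(axis=0)
        score = trans.max(axis=0) + log_B[t]
    path = [int(score.argmax())]
    for t in range(T - 1, 0, -1):
        path.append(int(back[t, path[-1]]))
    return path[::-1]

def sliding_window_viterbi(log_A, log_pi, log_B, window=20):
    """Decode only the last `window` observations to bound time and memory."""
    start = max(0, log_B.shape[0] - window)
    S = log_A.shape[0]
    prior = log_pi if start == 0 else np.full(S, -np.log(S))  # flat restart prior
    return viterbi(log_A, prior, log_B[start:])
```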
|
99 |
Dynamic Programming with Multiple Candidates and its Applications to Sign Language and Hand Gesture Recognition. Yang, Ruiduo. 07 March 2008
Dynamic programming has been widely used to solve various kinds of optimization problems. In this work, we show that two crucial problems in video-based sign language and gesture recognition systems can be attacked by dynamic programming with additional multiple observations. The first problem occurs at the higher (sentence) level. Movement epenthesis [1] (me), i.e., the necessary but meaningless movement between signs, can result in difficulties in modeling and scalability as the number of signs increases. The second problem occurs at the lower (feature) level. Ambiguity of hand detection and occlusion will propagate errors to the higher level. We construct a novel framework that can handle both of these problems based on a dynamic programming approach.
The me has previously only been modeled explicitly. Our proposed method instead handles me in a dynamic programming framework where we model the me implicitly. We call this the enhanced Level Building (eLB) algorithm. This formulation also allows the incorporation of statistical grammar models such as bigrams and trigrams. Another dynamic programming process that handles the problem of selecting among multiple hand candidates is also included at the feature level. This is different from most of the previous approaches, where a single observation is used. We also propose a grouping process that can generate multiple, overlapping hand candidates.
We demonstrate our ideas on three continuous American Sign Language data sets and one hand gesture data set. The ASL data sets include one with a simple background, one with a simple background but with the signer wearing short-sleeved clothes, and the last with a complex and changing background. The gesture data set contains color-gloved gestures with a complex background. With the automatically chosen me score, performance is within 5% of that obtained with the manually chosen me score. At the low level, we first over-segment each frame to get a list of segments. Then we use a greedy method to group the segments based on different grouping cues. We also show that the performance loss is within 5% when we compare this method with manually selected feature vectors.
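The feature-level dynamic program described above can be pictured as choosing one hand candidate per frame so that an appearance cost plus a frame-to-frame smoothness cost is minimized. The sketch below shows that selection step only; the cost functions are placeholders rather than the features used in this work.

```python
# Illustrative dynamic program over multiple hand candidates per frame: pick one
# candidate per frame minimizing appearance cost plus inter-frame transition cost.
import numpy as np

def best_candidate_path(candidates, appearance_cost, transition_cost):
    """candidates: list over frames, each a list of candidate descriptors."""
    T = len(candidates)
    costs = [np.array([appearance_cost(c) for c in candidates[0]])]
    back = []
    for t in range(1, T):
        prev, cur = candidates[t - 1], candidates[t]
        step = np.array([[transition_cost(p, c) for p in prev] for c in cur])
        total = costs[-1][None, :] + step           # rows: current, cols: previous
        back.append(total.argmin(axis=1))
        costs.append(total.min(axis=1) +
                     np.array([appearance_cost(c) for c in cur]))
    path = [int(costs[-1].argmin())]
    for t in range(T - 2, -1, -1):
        path.append(int(back[t][path[-1]]))
    return path[::-1]                               # chosen candidate index per frame
```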
|
100 |
Text Augmentation: Inserting markup into natural language text with PPM Models. Yeates, Stuart Andrew. January 2006 (links)
This thesis describes a new optimisation and new heuristics for automatically marking up XML documents, and CEM, a Java implementation, using PPM models. CEM is significantly more general than previous systems, marking up large numbers of hierarchical tags, using n-gram models for large n and a variety of escape methods. Four corpora are discussed, including the bibliography corpus of 14682 bibliographies laid out in seven standard styles using the BibTeX system and marked up in XML with every field from the original BibTeX. Other corpora include the ROCLING Chinese text segmentation corpus, the Computists' Communique corpus and the Reuters' corpus. A detailed examination is presented of the methods of evaluating markup algorithms, including computational complexity measures and correctness measures from the fields of information retrieval, string processing, machine learning and information theory. A new taxonomy of markup complexities is established and the properties of each taxon are examined in relation to the complexity of marked-up documents. The performance of the new heuristics and optimisation is examined using the four corpora.
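A toy sketch of the PPM-style prediction underlying such systems: character probabilities are estimated from the longest matching context, escaping to shorter contexts when the character has not been seen there (a simplified PPM-C-like escape, without exclusions). This is only meant to illustrate the mechanism; it is not CEM or the escape methods evaluated in the thesis.

```python
# Toy PPM-style character model with escape to shorter contexts; the escape
# estimate is a simplified PPM-C variant and the uniform fallback assumes a
# 256-symbol alphabet.
from collections import defaultdict

class ToyPPM:
    def __init__(self, order=3):
        self.order = order
        self.counts = defaultdict(lambda: defaultdict(int))   # context -> char counts

    def train(self, text):
        for i in range(len(text)):
            for k in range(self.order + 1):
                if i - k >= 0:
                    self.counts[text[i - k:i]][text[i]] += 1

    def prob(self, context, char):
        """Blend orders: start at the longest matching context and escape down."""
        p_escape = 1.0
        for k in range(min(self.order, len(context)), -1, -1):
            ctx = context[len(context) - k:]
            seen = self.counts.get(ctx)
            if not seen:
                continue
            total = sum(seen.values())
            escape = len(seen) / (total + len(seen))           # PPM-C style escape
            p_char = seen.get(char, 0) / (total + len(seen))
            if p_char > 0:
                return p_escape * p_char
            p_escape *= escape
        return p_escape / 256.0                                # uniform fallback
```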
|