61 |
PELICAN: a PipELIne, including a novel redundancy-eliminating algorithm, to Create and maintain a topicAl family-specific Non-redundant protein database. Andersson, Christoffer (January 2005)
The increasing number of biological databases requires that users be able to search more efficiently both across and within individual databases. One of the most widespread problems is redundancy, i.e. duplicated information in sets of data. This thesis implements an algorithm that differs from related attempts by using the genomic positions of sequences, instead of similarity-based sequence comparisons, to make a sequence data set non-redundant. Used in an automatic updating procedure, the algorithm drastically improves the ability to update a non-redundant database and keep it current. The procedure creates a biologically sound non-redundant data set with accuracy comparable to that of other algorithms for making data sets non-redundant.
|
62 |
Integrative assistive system for dyslexic learners using hidden Markov models. Ndombo, Mpia Daniel (January 2013)
D. Tech. Computer Science and Data Processing / The general research question is how to implement an integrative assistive system for dyslexic learners (IASD) that addresses all three of their major literacy barriers (phonological awareness, reading and writing skills) in one system. The main research question is therefore: how can a framework for an integrative assistive system be developed to mitigate learning barriers (DLB) using hidden Markov model (HMM) machine learning techniques?
|
63 |
Hidden Markov Models Predict Epigenetic Chromatin Domains. Larson, Jessica (20 December 2012)
Epigenetics is an important layer of transcriptional control necessary for cell-type specific gene regulation. We developed computational methods to analyze the combinatorial effect and large-scale organization of genome-wide distributions of epigenetic marks. Throughout this dissertation, we show that regions containing multiple genes with similar epigenetic patterns are found throughout the genome, suggesting the presence of several chromatin domains. In Chapter 1, we develop a hidden Markov model (HMM) for detecting the types and locations of epigenetic domains from multiple histone modifications. We use this method to analyze a published ChIP-seq dataset of five histone modification marks in mouse embryonic stem cells. We successfully detect domains of consistent epigenetic patterns from ChIP-seq data, providing new insights into the role of epigenetics in long-range gene regulation. In Chapter 2, we expand our model to investigate the genome-wide patterns of histone modifications in multiple human cell lines. We find that chromatin states can be used to accurately classify cell differentiation stage, and that three cancer cell lines can be classified as differentiated cells. We also find that genes whose chromatin states change dynamically in accordance with differentiation stage are not randomly distributed across the genome, but tend to be embedded in multi-gene chromatin domains. Moreover, many specialized gene clusters are associated with stably occupied domains. In the last chapter, we develop a more sophisticated, tiered HMM to include a domain structure in our chromatin annotation. We find that a model with three domains and five sub-states per domain best fits our data. Each state has a unique epigenetic pattern, while still staying true to its domain’s specific functional aspects and expression profiles. The majority of the genome (including most introns and intergenic regions) has low epigenetic signals and is assigned to the same domain. Our model outperforms current chromatin state models due to its increased domain coherency and interpretability.
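As a rough illustration of how an HMM assigns a hidden state to each genomic bin, the following is a minimal sketch of Viterbi decoding for a toy two-state model. It is not the dissertation's model: the state names, the single binarized mark, and every probability in the usage note are assumptions chosen only for illustration, and the function expects a non-empty observation sequence.

```python
import math


def viterbi(obs, states, start_p, trans_p, emit_p):
    """Return the most probable hidden-state path for a non-empty sequence."""
    # Work in log space so long sequences do not underflow.
    score = [{s: math.log(start_p[s]) + math.log(emit_p[s][obs[0]])
              for s in states}]
    back = []  # back[t][s] = best predecessor of state s at step t + 1
    for o in obs[1:]:
        prev = score[-1]
        cur, ptr = {}, {}
        for s in states:
            best = max(states, key=lambda r: prev[r] + math.log(trans_p[r][s]))
            cur[s] = (prev[best] + math.log(trans_p[best][s])
                      + math.log(emit_p[s][o]))
            ptr[s] = best
        score.append(cur)
        back.append(ptr)
    # Trace back from the best final state.
    path = [max(states, key=lambda s: score[-1][s])]
    for ptr in reversed(back):
        path.append(ptr[path[-1]])
    path.reverse()
    return path


# Hypothetical toy parameters: 1 = mark present in a bin, 0 = mark absent.
STATES = ("active", "silent")
START = {"active": 0.5, "silent": 0.5}
TRANS = {"active": {"active": 0.9, "silent": 0.1},
         "silent": {"active": 0.1, "silent": 0.9}}
EMIT = {"active": {1: 0.8, 0: 0.2},
        "silent": {1: 0.1, 0: 0.9}}
```

With these assumed parameters, `viterbi([1, 1, 1, 0, 0, 0], STATES, START, TRANS, EMIT)` labels the first three bins `active` and the last three `silent`. The high self-transition probability is what favors contiguous domains over bin-by-bin flips.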
|
64 |
Bayesian Inference Approaches for Particle Trajectory Analysis in Cell Biology. Monnier, Nilah (28 August 2013)
Despite the importance of single particle motion in biological systems, systematic inference approaches to analyze particle trajectories and evaluate competing motion models are lacking. An automated approach for robust evaluation of motion models that does not require manual intervention is highly desirable to enable analysis of datasets from high-throughput imaging technologies that contain hundreds or thousands of trajectories of biological particles, such as membrane receptors, vesicles, chromosomes or kinetochores, mRNA particles, or whole cells in developing embryos. Bayesian inference is a general theoretical framework for performing such model comparisons that has proven successful in handling noise and experimental limitations in other biological applications. The inherent Bayesian penalty on model complexity, which avoids overfitting, is particularly important for particle trajectory analysis given the highly stochastic nature of particle diffusion. This thesis presents two complementary approaches for analyzing particle motion using Bayesian inference. The first method, MSD-Bayes, discriminates a wide range of motion models (including diffusion, directed motion, anomalous and confined diffusion) based on mean-square displacement analysis of a set of particle trajectories, while the second method, HMM-Bayes, identifies dynamic switching between diffusive and directed motion along individual trajectories using hidden Markov models. These approaches are validated on biological particle trajectory datasets from a wide range of experimental systems, demonstrating their broad applicability to research in cell biology.
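For readers unfamiliar with mean-square displacement (MSD), the quantity that MSD-Bayes builds on, the sketch below computes a time-averaged MSD curve for one 2D trajectory. It covers only the MSD curve and none of the Bayesian model comparison. The function name and the trajectory format (a list of `(x, y)` tuples sampled at a fixed interval) are assumptions made for this example.

```python
def mean_square_displacement(track, max_lag):
    """Time-averaged MSD of a 2D track for lags 1..max_lag (in frames).

    `track` is a list of (x, y) positions at a fixed sampling interval;
    `max_lag` must be smaller than len(track).
    """
    msd = []
    for lag in range(1, max_lag + 1):
        # Squared displacement between every pair of points `lag` frames apart.
        squared = [(x2 - x1) ** 2 + (y2 - y1) ** 2
                   for (x1, y1), (x2, y2) in zip(track, track[lag:])]
        msd.append(sum(squared) / len(squared))
    return msd


# Purely directed motion at unit speed along x: MSD grows as lag squared.
directed = [(t, 0) for t in range(10)]
print(mean_square_displacement(directed, 3))  # [1.0, 4.0, 9.0]
```

The shape of this curve is what separates the motion models: for directed motion with speed v it grows as v²τ², whereas for free diffusion in two dimensions it grows linearly as 4Dτ.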
|
65 |
Diagnostics and Generalizations for Parametric State Estimation. Nearing, Grey Stephen (January 2013)
This dissertation comprises five distinct research projects that apply, evaluate and extend common methods for land surface data assimilation. The introduction of novel diagnostics and extensions of existing algorithms is motivated by an example, related to estimating agricultural productivity, in which current methods fail. We subsequently develop methods, based on Shannon's theory of communication, to quantify the contributions from all possible factors to the residual uncertainty in state estimates after data assimilation, and to measure the amount of information contained in observations that is lost due to erroneous assumptions in the assimilation algorithm. Additionally, we discuss an appropriate interpretation of Shannon information that allows us to measure the amount of information contained in a model, and use this interpretation to measure the amount of information introduced during data assimilation-based system identification. Finally, we propose a generalization of the ensemble Kalman filter designed to relax one of its primary assumptions: that the observation function is linear.
|
66 |
An examination of predator habitat usage: movement analysis in a marine fishery and freshwater fish. Charles, Colin (03 July 2013)
This thesis investigates the influence of predator movements upon habitat selection and foraging success. It deals with two very distinct datasets: one from a marine system, the snow crab (Chionoecetes opilio) fishery, and the second from a freshwater system, an experimental rainbow trout (Oncorhynchus mykiss) aquaculture operation. Deriving a standardized measure of catch from logbook data is important because catch per unit effort (CPUE) is used in fisheries analysis to estimate abundance, but in some cases CPUE is a biased estimate. For the snow crab fishery, a relative abundance measure was developed using fisher movements and logbook data that reflected commercially available biomass and produced an improved relative abundance estimate. Results from the aquaculture dataset indicate that escaped farmed rainbow trout continue to use the cage site when waste feed is available, while native lake trout do not interact with the cage. Once access to waste feed is removed, neither lake trout nor escaped rainbow trout use the cage site. This thesis uses methods that identify patterns and behaviours from movement tracks to increase our understanding of predator habitat usage.
|
67 |
Automatic Driver Fatigue Monitoring Using Hidden Markov Models and Bayesian Networks. Rashwan, Abdullah (11 December 2013)
The automotive industry is growing bigger each year. The central concern for any automotive company is driver and passenger safety. Many automotive companies have developed driver assistance systems to help the driver and to ensure driver safety. These systems include adaptive cruise control, lane departure warning, lane change assistance, collision avoidance, night vision, automatic parking, traffic sign recognition, and driver fatigue detection.
In this thesis, we aim to build a driver fatigue detection system that advances the research in this area. Vision is commonly the key component of driver fatigue detection systems. We have decided to investigate a different direction: we examine the driver's voice, heart rate, and driving performance to assess fatigue level. The system consists of three main modules: the audio module, the heart rate and other signals module, and the Bayesian network module.
The audio module analyzes an audio recording of a driver and tries to estimate the driver's level of fatigue. A Voice Activity Detection (VAD) module is used to extract driver speech from the audio recording. Mel-frequency cepstral coefficient (MFCC) features are extracted from the speech signal, and then Support Vector Machine (SVM) and Hidden Markov Model (HMM) classifiers are used to detect driver fatigue. Both classifiers are tuned for best performance, and the performance of both classifiers is reported and compared.
The heart rate and other signals module uses heart rate, steering wheel position, and the positions of the accelerator, brake, and clutch pedals to detect the level of fatigue. These signals' sample rates are then adjusted to match, allowing simple features to be extracted from the signals, and SVM and HMM classifiers are used to detect fatigue level. The performance of both classifiers is reported and compared.
Bayesian networks' ability to capture dependencies and uncertainty makes them a sound choice for performing the data fusion. Prior information (day/night driving and the previous decision) is also incorporated into the network to improve the final decision. The accuracies of the audio module and of the heart rate and other signals module are used to calculate certain conditional probability tables (CPTs) for the Bayesian network, while the rest of the CPTs are set subjectively. The inference queries are calculated using the variable elimination algorithm. For those time steps where the audio module decision is absent, a window is defined and the last decision within this window is used as the current decision. The performance of the system is assessed based on the average accuracy per second.
A dataset was built to train and test the system. The experimental results show that the system is very promising. The performance of the system was assessed based on the average accuracy per second; the total accuracy of the system is 90.5%. The system design can be readily improved by integrating more modules into the Bayesian network.
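The fallback for time steps without an audio decision, described above, can be sketched as follows. This is only one plausible reading of the abstract: the function name, the use of `None` for a missing decision, and the rule that a decision may be reused for at most `window` later time steps are all assumptions, since the abstract does not give the exact window definition.

```python
def fill_missing_decisions(decisions, window):
    """Reuse the most recent decision for up to `window` later time steps.

    `decisions` holds one entry per time step; None marks a step where the
    audio module produced no decision. Steps that cannot be filled stay None.
    """
    filled = []
    last_value, last_index = None, None
    for i, decision in enumerate(decisions):
        if decision is not None:
            last_value, last_index = decision, i
            filled.append(decision)
        elif last_index is not None and i - last_index <= window:
            # A recent decision exists inside the window: carry it forward.
            filled.append(last_value)
        else:
            filled.append(None)
    return filled


raw = ["alert", None, None, None, "fatigued", None]
print(fill_missing_decisions(raw, window=2))
# ['alert', 'alert', 'alert', None, 'fatigued', 'fatigued']
```

Steps that remain `None` would have to be handled by the fusion stage itself, for example by treating the audio node as unobserved.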
|
69 |
Voice query-by-example for resource-limited languages using an ergodic hidden Markov model of speech. Ali, Asif (13 January 2014)
An ergodic hidden Markov model (EHMM) can be useful in extracting underlying structure embedded in connected speech without the need for a time-aligned transcribed corpus. In this research, we present a query-by-example (QbE) spoken term detection system based on an ergodic hidden Markov model of speech.
An EHMM-based representation of speech is not invariant to speaker-dependent variations due to the unsupervised nature of the training. Consequently, a single phoneme may be mapped to a number of EHMM states. The effects of speaker-dependent and context-induced variation in speech on its EHMM-based representation have been studied and used to devise schemes to minimize these variations.
Speaker-invariance can be introduced into the system by identifying states with similar perceptual characteristics. In this research, two unsupervised clustering schemes have been proposed to identify perceptually similar states in an EHMM.
A search framework, consisting of a graphical keyword modeling scheme and a modified Viterbi algorithm, has also been implemented. An EHMM-based QbE system has been compared with the state of the art and shown to achieve higher precision than systems based on static clustering schemes.
|
70 |
Applications of hidden Markov models in financial modelling. Erlwein, Christina (January 2008)
Various models driven by a hidden Markov chain in discrete or continuous time are developed to capture the stylised features of market variables whose levels or values constitute the underliers of financial derivative contracts or investment portfolios. Since the parameters switch between regimes, changes and developments in the economy are reflected in these models as soon as they arise. The change of probability measure technique and the EM algorithm are fundamental techniques utilised in the optimal parameter estimation. Recursive adaptive filters for the state of the Markov chain and other auxiliary processes related to the Markov chain are derived, which in turn yield self-tuning dynamic financial models. A hidden Markov model (HMM)-based modelling set-up for commodity prices is developed and the predictability of the gold market under this setting is examined. An Ornstein-Uhlenbeck (OU) model with HMM parameters is proposed and, under this set-up, we address two statistical inference issues: the sensitivity of the model to small changes in parameter estimates and the selection of the optimal number of states. The extended OU model is implemented on a data set of 30-day Canadian T-bill yields. An exponential of a Markov-switching OU process plus a compound Poisson process is put forward as a model for the evolution of electricity spot prices. Using a data set compiled by Nord Pool, we illustrate the vast improvements gained by incorporating regimes in the model. A multivariate HMM is employed as a framework for solving two asset allocation problems: one involves the mean-variance utility function and the other entails the CVaR constraint. Finally, the valuation of credit default swaps highlights the important considerations necessitated by pricing in a regime-switching environment. Certain numerical schemes are applied to obtain approximations for the default probabilities and swap rates.
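To make the idea of an OU model with regime-switching parameters concrete, here is a minimal simulation sketch. It is not the thesis's estimation procedure (no filtering and no EM). It uses a plain Euler-Maruyama discretisation with a discrete-time Markov chain for the regime, and the function name, parameter names and all numeric values are assumptions for illustration.

```python
import random


def simulate_regime_switching_ou(n_steps, dt, x0, kappa, theta, sigma,
                                 trans_p, seed=0):
    """Simulate an OU path whose parameters depend on a hidden regime.

    kappa, theta, sigma: per-regime mean-reversion speed, long-run level
    and volatility. trans_p[i][j]: probability of moving from regime i to
    regime j in one step. Returns (path, regimes).
    """
    rng = random.Random(seed)
    regime, x = 0, x0
    path, regimes = [x0], [regime]
    for _ in range(n_steps):
        # Euler-Maruyama step using the current regime's parameters.
        shock = rng.gauss(0.0, 1.0)
        x = (x + kappa[regime] * (theta[regime] - x) * dt
             + sigma[regime] * dt ** 0.5 * shock)
        # Draw the next regime from the current row of the transition matrix.
        regime = rng.choices(range(len(trans_p)), weights=trans_p[regime])[0]
        path.append(x)
        regimes.append(regime)
    return path, regimes


# Hypothetical two-regime example: a calm regime and a volatile regime.
path, regimes = simulate_regime_switching_ou(
    n_steps=250, dt=1 / 250, x0=0.03,
    kappa=[2.0, 5.0], theta=[0.03, 0.06], sigma=[0.005, 0.02],
    trans_p=[[0.98, 0.02], [0.05, 0.95]],
)
```

In the estimation setting the abstract describes, the regime sequence is unobserved and has to be inferred from the path. The simulation simply shows what data generated by such a model looks like.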
|