Global ETD Search

81	Hidden Markov Models Predict Epigenetic Chromatin Domains Larson, Jessica 20 December 2012 Epigenetics is an important layer of transcriptional control necessary for cell-type specific gene regulation. We developed computational methods to analyze the combinatorial effect and large-scale organizations of genome-wide distributions of epigenetic marks. Throughout this dissertation, we show that regions containing multiple genes with similar epigenetic patterns are found throughout the genome, suggesting the presence of several chromatin domains. In Chapter 1, we develop a hidden Markov model (HMM) for detecting the types and locations of epigenetic domains from multiple histone modifications. We use this method to analyze a published ChIP-seq dataset of five histone modification marks in mouse embryonic stem cells. We successfully detect domains of consistent epigenetic patterns from ChIP-seq data, providing new insights into the role of epigenetics in long-range gene regulation. In Chapter 2, we expand our model to investigate the genome-wide patterns of histone modifications in multiple human cell lines. We find that chromatin states can be used to accurately classify cell differentiation stage, and that three cancer cell lines can be classified as differentiated cells. We also find that genes whose chromatin states change dynamically in accordance with differentiation stage are not randomly distributed across the genome, but tend to be embedded in multi-gene chromatin domains. Moreover, many specialized gene clusters are associated with stably occupied domains. In the last chapter, we develop a more sophisticated, tiered HMM to include a domain structure in our chromatin annotation. We find that a model with three domains and five sub-states per domain best fits our data. Each state has a unique epigenetic pattern, while still staying true to its domain’s specific functional aspects and expression profiles. 
The majority of the genome (including most introns and intergenic regions) has low epigenetic signals and is assigned to the same domain. Our model outperforms current chromatin state models due to its increased domain coherency and interpretation. biostatistics bioinformatics chromatin chromatin domains epigenetics hidden Markov models histone modifications
82	Bayesian Inference Approaches for Particle Trajectory Analysis in Cell Biology Monnier, Nilah 28 August 2013 Despite the importance of single particle motion in biological systems, systematic inference approaches to analyze particle trajectories and evaluate competing motion models are lacking. An automated approach for robust evaluation of motion models that does not require manual intervention is highly desirable to enable analysis of datasets from high-throughput imaging technologies that contain hundreds or thousands of trajectories of biological particles, such as membrane receptors, vesicles, chromosomes or kinetochores, mRNA particles, or whole cells in developing embryos. Bayesian inference is a general theoretical framework for performing such model comparisons that has proven successful in handling noise and experimental limitations in other biological applications. The inherent Bayesian penalty on model complexity, which avoids overfitting, is particularly important for particle trajectory analysis given the highly stochastic nature of particle diffusion. This thesis presents two complementary approaches for analyzing particle motion using Bayesian inference. The first method, MSD-Bayes, discriminates a wide range of motion models--including diffusion, directed motion, anomalous and confined diffusion--based on mean-square displacement analysis of a set of particle trajectories, while the second method, HMM-Bayes, identifies dynamic switching between diffusive and directed motion along individual trajectories using hidden Markov models. These approaches are validated on biological particle trajectory datasets from a wide range of experimental systems, demonstrating their broad applicability to research in cell biology. Biophysics Bayesian inference cell biology hidden Markov models mean-square displacement particle trajectories
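As a rough illustration of the mean-square displacement quantity that the abstract above refers to, the following minimal sketch computes a time-averaged MSD for a single 2-D trajectory. It is not the MSD-Bayes method itself (the Bayesian model comparison is not shown), and the function name `mean_square_displacement`, the 2-D tuple representation, and the toy trajectory are assumptions made for this example only.

```python
def mean_square_displacement(track, max_lag):
    """Time-averaged MSD of one 2-D trajectory for lags 1..max_lag.

    `track` is a list of (x, y) positions sampled at equal time steps
    (a hypothetical representation chosen for this sketch).
    """
    if not 1 <= max_lag < len(track):
        raise ValueError("max_lag must satisfy 1 <= max_lag < len(track)")
    msd = []
    for lag in range(1, max_lag + 1):
        # Squared displacement between every pair of points `lag` steps apart.
        squared = [
            (track[i + lag][0] - track[i][0]) ** 2
            + (track[i + lag][1] - track[i][1]) ** 2
            for i in range(len(track) - lag)
        ]
        msd.append(sum(squared) / len(squared))
    return msd


# Toy example: purely directed motion with constant velocity (1, 0).
track = [(float(t), 0.0) for t in range(10)]
print(mean_square_displacement(track, 3))  # [1.0, 4.0, 9.0]
```

For this directed-motion toy track the MSD grows quadratically with the lag, whereas pure diffusion would give roughly linear growth; differences of this kind in the MSD curve are what motion-model discrimination relies on.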
83	Diagnostics and Generalizations for Parametric State Estimation Nearing, Grey Stephen January 2013 This dissertation comprises five distinct research projects which apply, evaluate and extend common methods for land surface data assimilation. The introduction of novel diagnostics and extensions of existing algorithms is motivated by an example, related to estimating agricultural productivity, of failed application of current methods. We subsequently develop methods, based on Shannon's theory of communication, to quantify the contributions from all possible factors to the residual uncertainty in state estimates after data assimilation, and to measure the amount of information contained in observations which is lost due to erroneous assumptions in the assimilation algorithm. Additionally, we discuss an appropriate interpretation of Shannon information which allows us to measure the amount of information contained in a model, and use this interpretation to measure the amount of information introduced during data assimilation-based system identification. Finally, we propose a generalization of the ensemble Kalman filter designed to alleviate one of the primary assumptions - that the observation function is linear. Data Assimilation Hidden Markov Models Information Theory Remote Sensing Soil Moisture Hydrology Bayesian Analysis
84	Active Control Strategies for Chemical Sensors and Sensor Arrays Gosangi, Rakesh 16 December 2013 Chemical sensors are generally used as one-dimensional devices, where one measures the sensor’s response at a fixed setting, e.g., infrared absorption at a specific wavelength, or conductivity of a solid-state sensor at a specific operating temperature. In many cases, additional information can be extracted by modulating some internal property (e.g., temperature, voltage) of the sensor. However, this additional information comes at a cost (e.g., sensing times, power consumption), so offline optimization techniques (such as feature-subset selection) are commonly used to identify a subset of the most informative sensor tunings. An alternative to offline techniques is active sensing, where the sensor tunings are adapted in real-time based on the information obtained from previous measurements. Prior work in domains such as vision, robotics, and target tracking has shown that active sensing can schedule agile sensors to manage their sensing resources more efficiently than passive sensing, and also balance between sensing costs and performance. Inspired by this prior work on active sensing, in this dissertation we developed active sensing algorithms that address three different computational problems in chemical sensing. First, we consider the problem of classification with a single tunable chemical sensor. We formulate the classification problem as a partially observable Markov decision process, and solve it with a myopic algorithm. At each step, the algorithm estimates the utility of each sensing configuration as the difference between expected reduction in Bayesian risk and sensing cost, and selects the configuration with maximum utility. We evaluated this approach on simulated Fabry-Perot interferometers (FPI), and experimentally validated it on metal-oxide (MOX) sensors. 
Our results show that the active sensing method obtains better classification performance than passive sensing methods, and also is more robust to additive Gaussian noise in sensor measurements. Second, we consider the problem of estimating concentrations of the constituents in a gas mixture using a tunable sensor. We formulate this multicomponent-analysis problem as that of probabilistic state estimation, where each state represents a different concentration profile. We maintain a belief distribution that assigns a probability to each profile, and update the distribution by incorporating the latest sensor measurements. To select the sensor’s next operating configuration, we use a myopic algorithm that chooses the operating configuration expected to best reduce the uncertainty in the future belief distribution. We validated this approach on both simulated and real MOX sensors. The results again demonstrate improved estimation performance and robustness to noise. Lastly, we present an algorithm that extends active sensing to sensor arrays. This algorithm borrows concepts from feature subset selection to enable an array of tunable sensors to operate collaboratively for the classification of gas samples. The algorithm constructs an optimized action vector at each sensing step, which contains separate operating configurations for each sensor in the array. When dealing with sensor arrays, one needs to account for the correlation among sensors. To this end, we developed two objective functions: weighted Fisher scores, and dynamic mutual information, which can quantify the discriminatory information and redundancy of a given action vector with respect to the measurements already acquired. Once again, we validated the approach on simulated FPI arrays and experimentally tested it on an array of MOX sensors. The results show improved classification performance and robustness to additive noise. 
Active sensing Chemical sensors Tunable sensors Adaptive sensing Markov models Feature selection
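The belief update and myopic configuration choice described in the abstract above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the dissertation's algorithm: states and observations are discrete, uncertainty is measured with Shannon entropy (an assumed stand-in for the abstract's risk and uncertainty criteria), and all names (`update_belief`, `expected_entropy`, `choose_configuration`) and the toy observation models are hypothetical.

```python
import math


def update_belief(belief, likelihoods):
    """Bayes rule over discrete states: posterior is prior times likelihood, normalized."""
    unnormalized = [b * l for b, l in zip(belief, likelihoods)]
    total = sum(unnormalized)
    if total == 0:
        raise ValueError("observation has zero probability under the belief")
    return [u / total for u in unnormalized]


def entropy(belief):
    """Shannon entropy in bits (assumed uncertainty measure for this sketch)."""
    return -sum(p * math.log2(p) for p in belief if p > 0)


def expected_entropy(belief, obs_model):
    """Expected posterior entropy; obs_model[s][o] = P(observation o | state s)."""
    expected = 0.0
    for o in range(len(obs_model[0])):
        likelihoods = [obs_model[s][o] for s in range(len(belief))]
        p_obs = sum(b * l for b, l in zip(belief, likelihoods))
        if p_obs > 0:
            expected += p_obs * entropy(update_belief(belief, likelihoods))
    return expected


def choose_configuration(belief, obs_models, costs):
    """Myopic choice: maximize expected entropy reduction minus sensing cost."""
    current = entropy(belief)
    utilities = [
        current - expected_entropy(belief, model) - cost
        for model, cost in zip(obs_models, costs)
    ]
    return max(range(len(utilities)), key=utilities.__getitem__)


# Toy example: two states, two sensor configurations.
belief = [0.5, 0.5]
uninformative = [[0.5, 0.5], [0.5, 0.5]]  # free, but tells us nothing
informative = [[0.9, 0.1], [0.1, 0.9]]    # costs 0.1, separates the states
best = choose_configuration(belief, [uninformative, informative], [0.0, 0.1])
print(best)  # 1
```

In this toy setting the informative configuration is selected despite its cost, because its expected reduction in uncertainty (about 0.53 bits) exceeds the 0.1 penalty.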
85	An examination of predator habitat usage: movement analysis in a marine fishery and freshwater fish Charles, Colin 03 July 2013 This thesis investigates the influence of predator movements upon habitat selection and foraging success. It deals with two very distinct datasets: one from a marine system, the snow crab (Chionoecetes opilio) fishery, and the second from a freshwater system, an experimental rainbow trout (Oncorhynchus mykiss) aquaculture operation. Deriving a standardized measure of catch from logbook data is important because catch per unit effort (CPUE) is used in fisheries analysis to estimate abundance, but in some cases CPUE is a biased estimate. For the snow crab fishery, a relative abundance measure was developed using fisher movements and logbook data that reflected commercially available biomass and produced an improved relative abundance estimate. Results from the aquaculture dataset indicate that escaped farmed rainbow trout continue to use the cage site when waste feed is available, while native lake trout do not interact with the cage. Once access to waste feed is removed, neither lake trout nor escaped rainbow trout use the cage site. This thesis uses methods to identify patterns and behaviours using movement tracks to increase our understanding of predator habitat usage. Abundance Index Aquaculture GLM UDOI Movement Hidden Markov Models Snow Crab Rainbow Trout CPUE
86	Automatic Driver Fatigue Monitoring Using Hidden Markov Models and Bayesian Networks Rashwan, Abdullah 11 December 2013 The automotive industry is growing bigger each year. The central concern for any automotive company is driver and passenger safety. Many automotive companies have developed driver assistance systems to help the driver and to ensure driver safety. These systems include adaptive cruise control, lane departure warning, lane change assistance, collision avoidance, night vision, automatic parking, traffic sign recognition, and driver fatigue detection. In this thesis, we aim to build a driver fatigue detection system that advances the research in this area. Vision is commonly the key component of driver fatigue detection systems. We have decided to investigate a different direction. We examine the driver's voice, heart rate, and driving performance to assess fatigue level. The system consists of three main modules: the audio module, the heart rate and other signals module, and the Bayesian network module. The audio module analyzes an audio recording of a driver and tries to estimate the level of fatigue for the driver. A Voice Activity Detection (VAD) module is used to extract driver speech from the audio recording. Mel-Frequency Cepstrum Coefficients (MFCC) features are extracted from the speech signal, and then Support Vector Machines (SVM) and Hidden Markov Models (HMM) classifiers are used to detect driver fatigue. Both classifiers are tuned for best performance, and the performance of both classifiers is reported and compared. The heart rate and other signals module uses heart rate, steering wheel position, and the positions of the accelerator, brake, and clutch pedals to detect the level of fatigue. These signals' sample rates are then adjusted to match, allowing simple features to be extracted from the signals, and SVM and HMM classifiers are used to detect fatigue level. 
The performance of both classifiers is reported and compared. Bayesian networks' abilities to capture dependencies and uncertainty make them a sound choice to perform the data fusion. Prior information (Day/Night driving and previous decision) is also incorporated into the network to improve the final decision. The accuracies of the audio and heart rate and other signals modules are used to calculate certain conditional probability tables (CPTs) for the Bayesian network, while the rest of the CPTs are calculated subjectively. The inference queries are calculated using the variable elimination algorithm. For those time steps where the audio module decision is absent, a window is defined and the last decision within this window is used as the current decision. A dataset was built to train and test the system. The experimental results show that the system is very promising. The performance of the system was assessed based on the average accuracy per second; the total accuracy of the system is 90.5%. The system design can be easily improved by integrating more modules into the Bayesian network. Driver fatigue detection Hidden Markov Models Support Vector Machines Bayesian networks
88	Voice query-by-example for resource-limited languages using an ergodic hidden Markov model of speech Ali, Asif 13 January 2014 An ergodic hidden Markov model (EHMM) can be useful in extracting underlying structure embedded in connected speech without the need for a time-aligned transcribed corpus. In this research, we present a query-by-example (QbE) spoken term detection system based on an ergodic hidden Markov model of speech. An EHMM-based representation of speech is not invariant to speaker-dependent variations due to the unsupervised nature of the training. Consequently, a single phoneme may be mapped to a number of EHMM states. The effects of speaker-dependent and context-induced variation in speech on its EHMM-based representation have been studied and used to devise schemes to minimize these variations. Speaker-invariance can be introduced into the system by identifying states with similar perceptual characteristics. In this research, two unsupervised clustering schemes have been proposed to identify perceptually similar states in an EHMM. A search framework, consisting of a graphical keyword modeling scheme and a modified Viterbi algorithm, has also been implemented. An EHMM-based QbE system has been compared to the state-of-the-art and has been demonstrated to have higher precision than systems based on static clustering schemes. Speech recognition Hidden Markov model Ergodic theory Hidden Markov models Automatic speech recognition
89	Applications of hidden Markov models in financial modelling Erlwein, Christina January 2008 Various models driven by a hidden Markov chain in discrete or continuous time are developed to capture the stylised features of market variables whose levels or values constitute the underliers of financial derivative contracts or investment portfolios. Since the parameters switch between regimes, changes and developments in the economy are reflected in these models as soon as they arise. The change of probability measure technique and the EM algorithm are fundamental techniques utilised in the optimal parameter estimation. Recursive adaptive filters for the state of the Markov chain and other auxiliary processes related to the Markov chain are derived which in turn yield self-tuning dynamic financial models. A hidden Markov model (HMM)-based modelling set-up for commodity prices is developed and the predictability of the gold market under this setting is examined. An Ornstein-Uhlenbeck (OU) model with HMM parameters is proposed and under this set-up, we address two statistical inference issues: the sensitivity of the model to small changes in parameter estimates and the selection of the optimal number of states. The extended OU model is implemented on a data set of 30-day Canadian T-bill yields. An exponential of a Markov-switching OU process plus a compound Poisson process is put forward as a model for the evolution of electricity spot prices. Using a data set compiled by Nord Pool, we illustrate the vast improvements gained in incorporating regimes in the model. A multivariate HMM is employed as a framework in providing the solutions of two asset allocation problems; one involves the mean-variance utility function and the other entails the CVaR constraint. Finally, the valuation of credit default swaps highlights the important considerations necessitated by pricing in a regime-switching environment. 
Certain numerical schemes are applied to obtain approximations for the default probabilities and swap rates. 332.01519233
90	Actuarial Inference and Applications of Hidden Markov Models Till, Matthew Charles January 2011 Hidden Markov models have become a popular tool for modeling long-term investment guarantees. Many different variations of hidden Markov models have been proposed over the past decades for modeling indexes such as the S&P 500, and they capture the tail risk inherent in the market to varying degrees. However, goodness-of-fit testing, such as residual-based testing, for hidden Markov models is a relatively undeveloped area of research. This work focuses on hidden Markov model assessment, and develops a stochastic approach to deriving a residual set that is ideal for standard residual tests. This result allows hidden-state models to be tested for goodness-of-fit with the well developed testing strategies for single-state models. This work also focuses on parameter uncertainty for the popular long-term equity hidden Markov models. There is a special focus on underlying states that represent lower returns and higher volatility in the market, as these states can have the largest impact on investment guarantee valuation. A Bayesian approach for the hidden Markov models is applied to address the issue of parameter uncertainty and the impact it can have on investment guarantee models. Also in this thesis, the areas of portfolio optimization and portfolio replication under a hidden Markov model setting are further developed. Different strategies for optimization and portfolio hedging under hidden Markov models are presented and compared using real world data. The impact of parameter uncertainty, particularly with model parameters that are connected with higher market volatility, is once again a focus, and the effects of not taking parameter uncertainty into account when optimizing or hedging under a hidden Markov model are demonstrated. Hidden Markov Models Regime-Switching Models Residual Analysis MCMC Portfolio Optimization Portfolio Replication Actuarial Science
