41 |
Simulation Algorithms for Continuous Time Markov Chain Models
Banks, H. T., Broido, Anna, Canter, Brandi, Gayvert, Kaitlyn, Hu, Shuhua, Joyner, Michele, Link, Kathryn, 01 December 2012
Continuous time Markov chains are often used in the literature to model the dynamics of a system with low species counts and uncertainty in transitions. In this paper, we investigate three particular algorithms that can be used to numerically simulate continuous time Markov chain models (a stochastic simulation algorithm and explicit and implicit tau-leaping algorithms). To compare these methods, we used them to analyze two stochastic infection models with different levels of complexity. One of these models describes the dynamics of Vancomycin-Resistant Enterococcus (VRE) infection in a hospital, and the other is for the early infection of Human Immunodeficiency Virus (HIV) within a host. The relative efficiency of each algorithm is determined based on computational time and the degree of precision required. The numerical results suggest that all three algorithms have similar computational efficiency for the VRE model due to the low number of species and small number of transitions. However, we found that with the larger and more complex HIV model, implementation and modification of tau-leaping methods are preferred.
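The stochastic simulation algorithm compared above can be sketched in a few lines. The following is a minimal illustration of Gillespie's direct method on a simple SIS-type hospital colonization model; the reactions and rate values are hypothetical placeholders, not the VRE model of the paper:

```python
import math
import random

def ssa_sis(beta, gamma, N, c0, t_end, rng):
    """Gillespie direct method for a toy SIS-type colonization model.
    Reactions: U + C -> 2C at rate beta*U*C/N (transmission),
               C -> U at rate gamma*C (decolonization/discharge)."""
    t, c = 0.0, c0
    times, counts = [t], [c]
    while t < t_end:
        u = N - c
        a1 = beta * u * c / N   # transmission propensity
        a2 = gamma * c          # decolonization propensity
        a0 = a1 + a2
        if a0 == 0.0:           # absorbing state: no colonized patients left
            break
        t += -math.log(rng.random()) / a0   # exponential waiting time
        if rng.random() * a0 < a1:
            c += 1
        else:
            c -= 1
        times.append(t)
        counts.append(c)
    return times, counts

rng = random.Random(42)
times, counts = ssa_sis(beta=0.8, gamma=0.3, N=50, c0=5, t_end=100.0, rng=rng)
```

Because the method simulates every reaction event, its cost grows with the number of firings, which is exactly why tau-leaping variants become attractive for larger models like the HIV system.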
|
42 |
Complexity penalized methods for structured and unstructured data
Goeva, Aleksandrina, 08 November 2017
A fundamental goal of statisticians is to make inferences from the sample about characteristics of the underlying population. This is an inverse problem, since we are trying to recover a feature of the input from available observations of an output. Towards this end, we consider complexity penalized methods, because they balance goodness of fit and generalizability of the solution. The data from the underlying population may come in diverse formats - structured or unstructured - such as probability distributions, text tokens, or graph characteristics. Depending on the defining features of the problem, we can choose the appropriate complexity penalized approach and assess the quality of the estimate it produces. Favorable characteristics are strong theoretical guarantees of closeness to the true value, and interpretability. Our work fits within this framework and spans the areas of simulation optimization, text mining, and network inference. The first problem we consider is model calibration under the assumption that, given a hypothesized input model, we can use stochastic simulation to obtain its corresponding output observations. We formulate it as a stochastic program by maximizing the entropy of the input distribution subject to moment matching. We then propose an iterative scheme to solve it approximately via simulation. We prove convergence of the proposed algorithm under appropriate conditions and demonstrate its performance via numerical studies. The second problem we consider is summarizing text documents through an inferred set of topics. We propose a frequentist reformulation of a Bayesian regularization scheme. Through our complexity-penalized perspective we lend further insight into the nature of the loss function and the regularization achieved through the priors in the Bayesian formulation. The third problem is concerned with the impact of sampling on the degree distribution of a network.
Under many sampling designs, we have a linear inverse problem characterized by an ill-conditioned matrix. We investigate the theoretical properties of an approximate solution for the degree distribution, found by regularizing the solution of the ill-conditioned least squares objective. In particular, we study the rate at which the penalized solution tends to the true value as a function of network size and sampling rate.
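The regularized least squares idea in the third problem can be illustrated on a tiny system. The sketch below applies Tikhonov (ridge) penalization to a nearly singular 2x2 forward operator, solving the normal equations (AᵀA + λI)x = Aᵀb in closed form; the matrix, data, and penalty value are hypothetical toy choices, not the degree-distribution estimator of the thesis:

```python
def ridge_solve_2x2(A, b, lam):
    """Minimize ||A x - b||^2 + lam * ||x||^2 for a 2x2 system,
    via the normal equations (A^T A + lam I) x = A^T b with an
    explicit 2x2 inverse."""
    # M = A^T A + lam I
    m11 = A[0][0] ** 2 + A[1][0] ** 2 + lam
    m12 = A[0][0] * A[0][1] + A[1][0] * A[1][1]
    m22 = A[0][1] ** 2 + A[1][1] ** 2 + lam
    # rhs = A^T b
    r1 = A[0][0] * b[0] + A[1][0] * b[1]
    r2 = A[0][1] * b[0] + A[1][1] * b[1]
    det = m11 * m22 - m12 * m12
    return ((m22 * r1 - m12 * r2) / det, (m11 * r2 - m12 * r1) / det)

# Nearly singular forward operator: without the penalty, tiny noise in b
# is amplified enormously; a small lam keeps the estimate stable.
A = [[1.0, 1.0], [1.0, 1.0001]]
b = [2.0, 2.0001]   # consistent with the true solution x = (1, 1)
x_reg = ridge_solve_2x2(A, b, lam=1e-3)
```

The trade-off studied in the thesis is visible even here: λ biases the estimate slightly away from the truth but bounds the variance amplification caused by the ill-conditioned matrix.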
|
43 |
Stochastic simulation of near-surface atmospheric forcings for distributed hydrology / Simulation stochastique des forçages atmosphériques utiles aux modèles hydrologiques spatialisés
Chen, Sheng, 01 February 2018
This PhD work proposes new concepts and tools for stochastic weather simulation activities targeting the specific needs of hydrology. As a demonstration, we used a climatically contrasted area in the South-East of France, Cévennes-Vivarais, which is highly exposed to hydrological hazards and climate change. Our perspective is that the physical features (soil moisture, discharge) relevant to everyday concerns (water resources assessment and/or hydrological hazard) are directly linked to atmospheric variability at the basin scale, meaning firstly that the relevant ranges of time and space scales must be respected by the rainfall simulation technique. Since hydrological purposes are the target, other near-surface variates must also be considered; they may exhibit a less striking variability, but it does exist.
To build the multi-variable model, co-variability with rainfall is considered first. The first step of the PhD work is dedicated to accounting for the heterogeneity of precipitation within the rainfall simulator SAMPO [Leblois and Creutin, 2013]. We cluster time steps into rainfall types that are organized in time. Two approaches are tested for simulation: a semi-Markov model and a resampling of the historical sequence of rainfall types. Thanks to clustering, every kind of rainfall is served by a specific rainfall type. Over a larger area, where the assumption of climatic homogeneity is no longer valid, a coordination must be introduced between the rainfall type sequences of delineated sub-areas, forming rainy patterns at the larger scale. We first investigated a coordination of Markov models, enforcing observed lengths-of-stay by a greedy algorithm. This approach respects long-duration aggregates and inter-annual variability, but the extreme rainfall values are too low. By contrast, the joint resampling of historically observed sequences is easier to implement and gives a satisfactory behavior for short-term variability; however, it lacks inter-annual variability. Both approaches suffer from the strict delineation of homogeneous zones and homogeneous rainfall types.
For these reasons, a completely different approach is also considered, where the areal rainfall totals are jointly modeled using a spatio-temporal copula approach, then disaggregated to the user grid using a non-deterministic, geostatistically based conditional simulation technique. In the copula approach, the well-known problem of rainfall distributions having an atom at zero is handled by replacing historical rainfall with an appropriate atmosphere-based rainfall index having a continuous distribution; simulated values of this index can be turned into rainfall by quantile-quantile mapping. Finally, the copula technique is used to link other meteorological variables (temperature, solar radiation, humidity, wind speed) to rainfall. Since the multivariate simulation is meant to be driven by the rainfall simulation, the copula needs to be run in conditional mode. The resulting toolbox has already been used in scientific explorations and is now available for testing in full-scale applications. As a data-driven approach, it is also adaptable to other climatic conditions. The presence of large-scale atmospheric precursors in some key steps may enable the simulation tools to be converted into a climate simulation disaggregation tool.
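The semi-Markov simulation of rainfall-type sequences can be sketched compactly: draw the next type from a transition law, then an empirical length-of-stay for that type. In the toy below, the three rainfall types, the transition probabilities, and the sojourn samples are invented placeholders, not fitted SAMPO parameters:

```python
import random

def semi_markov_sequence(transitions, sojourns, start, n_steps, rng):
    """Semi-Markov sketch: alternate between (a) staying in the current
    rainfall type for an empirically sampled number of time steps and
    (b) jumping to the next type according to a transition law."""
    seq, state = [], start
    while len(seq) < n_steps:
        stay = rng.choice(sojourns[state])          # empirical length-of-stay
        seq.extend([state] * stay)
        r, acc = rng.random(), 0.0
        for nxt, p in transitions[state].items():   # draw next rainfall type
            acc += p
            if r < acc:
                state = nxt
                break
    return seq[:n_steps]

# Three hypothetical rainfall types: dry, stratiform, convective.
transitions = {
    "dry":        {"stratiform": 0.7, "convective": 0.3},
    "stratiform": {"dry": 0.6, "convective": 0.4},
    "convective": {"dry": 0.8, "stratiform": 0.2},
}
sojourns = {"dry": [2, 3, 5, 8], "stratiform": [1, 2, 3], "convective": [1, 1, 2]}
rng = random.Random(7)
seq = semi_markov_sequence(transitions, sojourns, "dry", 200, rng)
```

The resampling alternative discussed above would instead replay (blocks of) the historical type sequence directly, which preserves short-term structure by construction but cannot generate inter-annual variability beyond what the record contains.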
|
44 |
Stochastic Simulation of the Phage Lambda System and the Bioluminescence System Using the Next Reaction Method
Ananthanpillai, Balaji, January 2009
No description available.
|
45 |
A computational framework for analyzing chemical modification and limited proteolysis experimental data used for high confidence protein structure prediction
Anderson, Paul E., 08 December 2006
No description available.
|
46 |
The Origin of Life by Means of Autocatalytic Sets of Biopolymers
Wu, Meng, 10 1900
A key problem in the origin of life is to understand how an autocatalytic, self-replicating biopolymer system may have originated from a non-living chemical system. This thesis presents mathematical and computational models that address this issue. We consider a reaction system in which monomers (nucleotides) and polymers (RNAs) can be formed by chemical reactions at a slow spontaneous rate, and can also be formed at a high rate by catalysis, if polymer catalysts (ribozymes) are present. The system has two steady states: a ‘dead’ state with a low concentration of ribozymes and a ‘living’ state with a high concentration of ribozymes. Using stochastic simulations, we show that if a small number of ribozymes is formed spontaneously, this can drive the system from the dead to the living state. In the well mixed limit, this transition occurs most easily in volumes of intermediate size. In a spatially-extended two-dimensional system with finite diffusion rate, there is an optimal diffusion rate at which the transition to life is very much faster than in the well-mixed case. We therefore argue that the origin of life is a spatially localized stochastic transition. Once life has arisen in one place by a rare stochastic event, the living state spreads deterministically through the rest of the system. We show that similar autocatalytic states can be controlled by nucleotide synthases as well as by polymerase ribozymes, and that the same mechanism can also work with recombinases, if the recombination reaction is not perfectly reversible. Chirality is introduced into the polymerization model by considering simultaneous synthesis and polymerization of left- and right-handed monomers. We show that there is a racemic non-living state and two chiral living states. In this model, the origin of life and the origin of homochirality may occur simultaneously due to the same stochastic transition. / Doctor of Philosophy (PhD)
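The dead/living bistability described above can be illustrated with a toy mean-field model: spontaneous formation plus saturating autocatalysis minus decay, dr/dt = s + k r²/(K² + r²) - d r. The parameter values below are hypothetical, chosen only so that the equation has two stable fixed points; this is not the thesis's polymerization model:

```python
def simulate(r0, t_end=200.0, dt=0.05):
    """Euler integration of a toy mean-field ribozyme model
        dr/dt = s + k * r^2 / (K^2 + r^2) - d * r
    (spontaneous formation + saturating autocatalysis - decay).
    Parameters are illustrative and chosen to exhibit bistability."""
    s, k, K, d = 0.001, 1.0, 0.5, 0.35
    r = r0
    for _ in range(int(t_end / dt)):
        r += dt * (s + k * r * r / (K * K + r * r) - d * r)
    return r

dead = simulate(0.0)     # no initial ribozymes: settles in the low 'dead' state
living = simulate(0.5)   # a few spontaneously formed ribozymes: jumps to the 'living' state
```

In the deterministic picture the two states are separated by an unstable threshold; the stochastic transition argued for in the thesis corresponds to a rare fluctuation carrying the system over that threshold.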
|
47 |
LHD operations in sublevel caving mines: a productivity perspective
Tariq, Muhammad, January 1900
Mining is a high-risk industry, so efficiency and safety are key priorities. As mines continue to go deeper and exploit low-grade deposits, bulk mining methods, such as sublevel caving (SLC), have become increasingly important. SLC is suitable for massive, steeply dipping ore bodies and is known for its high degree of mechanisation, high productivity, and low operational cost. Moreover, technological developments and mechanisation have allowed these methods to be applied at greater depths. In modern mechanised mines, load-haul-dump (LHD) machines are central to achieving the desired productivity. Therefore, the automation of LHDs and their increasing use in mines make it crucial to understand the performance of these machines in actual mining environments. The aim of this research was to understand the differences in productivity between semi-autonomous and manual LHDs and to identify how external factors impact the performance of these machines in SLC operations. The research also investigated how LHD operator training could improve loading efficiency. Performance data for semi-autonomous and manual LHDs were collected from LKAB’s Kiirunavaara mine’s central database, GIRON. These data were used to compare cycle times and payloads of semi-autonomous and manual LHDs. The data were filtered and sorted so that only data where both machine types were operating in the same area (crosscut, ring, and ore pass) were used. To understand the impact of external factors, data on the occurrence of boulders were collected from LKAB’s Malmberget mine by recording videos of LHD buckets, while data on operator training were obtained by performing baseline mapping and conducting a questionnaire study with the LHD operators at LKAB’s Kiirunavaara mine. The results of the comparative analysis of manual and semi-autonomous LHDs showed the mean payload was 0.34 tonnes higher for manual LHD machines. However, the differences were not consistent across different areas of the mine.
Similarly, when comparing cycle times, manual LHDs had lower cycle times in 57% of the studied areas, while the opposite was true in the remaining 43%. Therefore, the differences in cycle time and payload due to mode of operation are not conclusive, meaning that one machine type does not completely outperform the other. This highlights the importance of understanding the external factors that cause such differences. Moreover, the findings emphasize the need to upgrade LHD operator training based on pedagogical principles and the inclusion of new technologies to enhance loading efficiency and increase overall productivity.
|
48 |
Stochastic Modeling and Simulation of Multiscale Biochemical Systems
Chen, Minghan, 02 July 2019
Numerous challenges arise in modeling and simulation as biochemical networks are discovered with increasing complexity and unknown mechanisms. With improvements in experimental techniques, biologists are able to quantify genes and proteins and their dynamics in a single cell, which calls for quantitative stochastic models for gene and protein networks at cellular levels that match well with the data and account for cellular noise.
This dissertation studies a stochastic spatiotemporal model of the Caulobacter crescentus cell cycle. A two-dimensional model based on a Turing mechanism is investigated to illustrate the bipolar localization of the protein PopZ. However, stochastic simulations are often impeded by expensive computational cost for large and complex biochemical networks. The hybrid stochastic simulation algorithm is a combination of differential equations for traditional deterministic models and Gillespie's algorithm (SSA) for stochastic models. The hybrid method can significantly improve the efficiency of stochastic simulations for biochemical networks with multiscale features, which contain both species populations and reaction rates with widely varying magnitude. The populations of some reactant species might be driven negative if they are involved in both deterministic and stochastic systems. This dissertation investigates the negativity problem of the hybrid method, proposes several remedies, and tests them with several models including a realistic biological system.
As a key factor that affects the quality of biological models, parameter estimation in stochastic models is challenging because the amount of empirical data must be large enough to obtain statistically valid parameter estimates. To optimize system parameters, a quasi-Newton algorithm for stochastic optimization (QNSTOP) was studied and applied to a stochastic budding yeast cell cycle model by matching multivariate probability distributions between simulated results and empirical data. Furthermore, to reduce model complexity, this dissertation simplifies the fundamental cooperative binding mechanism with a stochastic Hill equation model with optimized system parameters. Considering that many parameter vectors generate similar system dynamics and results, this dissertation proposes a general α-β-γ rule to return an acceptable parameter region of the stochastic Hill equation based on QNSTOP. Different objective functions are explored targeting different features of the empirical data. / Doctor of Philosophy / Modeling and simulation of biochemical networks faces numerous challenges as biochemical networks are discovered with increased complexity and unknown mechanisms. With improvements in experimental techniques, biologists are able to quantify genes and proteins and their dynamics in a single cell, which calls for quantitative stochastic models, or numerical models based on probability distributions, for gene and protein networks at cellular levels that match well with the data and account for randomness. This dissertation studies a stochastic model in space and time of a bacterium’s life cycle, Caulobacter. A two-dimensional model based on a natural pattern mechanism is investigated to illustrate the changes in space and time of a key protein population. However, stochastic simulations are often complicated by the expensive computational cost for large and sophisticated biochemical networks.
The hybrid stochastic simulation algorithm is a combination of traditional deterministic models, or analytical models with a single output for a given input, and stochastic models. The hybrid method can significantly improve the efficiency of stochastic simulations for biochemical networks that contain both species populations and reaction rates with widely varying magnitude. The populations of some species may become negative in the simulation under some circumstances. This dissertation investigates negative population estimates from the hybrid method, proposes several remedies, and tests them with several cases including a realistic biological system. As a key factor that affects the quality of biological models, parameter estimation in stochastic models is challenging because the amount of observed data must be large enough to obtain valid results. To optimize system parameters, the quasi-Newton algorithm for stochastic optimization (QNSTOP) was studied and applied to a stochastic (budding) yeast life cycle model by matching different distributions between simulated results and observed data. Furthermore, to reduce model complexity, this dissertation simplifies the fundamental molecular binding mechanism by the stochastic Hill equation model with optimized system parameters. Considering that many parameter vectors generate similar system dynamics and results, this dissertation proposes a general α-β-γ rule to return an acceptable parameter region of the stochastic Hill equation based on QNSTOP. Different optimization strategies are explored targeting different features of the observed data.
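The negativity problem discussed above can be illustrated with a deliberately simplified hybrid step: a fast reaction advanced deterministically by an Euler update alongside a slow stochastic firing. The species, rates, and the clamp-style remedy below are illustrative assumptions, not the dissertation's algorithms:

```python
import math
import random

def hybrid_step(x, k_fast, a_slow, dt, rng):
    """One step of a toy hybrid scheme. The fast reaction X -> 0
    (rate k_fast * x) is advanced deterministically by an Euler step;
    the slow reaction X -> 0 (propensity a_slow) fires stochastically.
    Without safeguards, combining the continuous decrement with a
    discrete firing can push x below zero; here the deterministic
    update is clamped at zero and the stochastic firing is only
    allowed when at least one whole unit of X remains."""
    x = max(0.0, x - dt * k_fast * x)     # deterministic part, clamped
    if x >= 1.0 and rng.random() < 1.0 - math.exp(-a_slow * dt):
        x -= 1.0                          # slow stochastic firing
    return x

rng = random.Random(1)
x = 5.0
traj = [x]
for _ in range(500):
    x = hybrid_step(x, k_fast=2.0, a_slow=0.5, dt=0.05, rng=rng)
    traj.append(x)
```

The clamp is the crudest possible remedy; more careful fixes repartition reactions between the deterministic and stochastic regimes or shorten the step when a population approaches zero, which is the kind of strategy the dissertation evaluates.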
|
49 |
Computational Techniques for the Analysis of Large Scale Biological Systems
Ahn, Tae-Hyuk, 27 August 2012
An accelerated pace of discovery in biological sciences is made possible by a new generation of computational biology and bioinformatics tools. In this dissertation we develop novel computational, analytical, and high performance simulation techniques for biological problems, with applications to the yeast cell division cycle, and to the RNA-Sequencing of the yellow fever mosquito.
The cell cycle system exhibits stochastic effects when small numbers of molecules react with each other. Consequently, the stochastic effects of the cell cycle are important, and the evolution of cells is best described statistically. The stochastic simulation algorithm (SSA), the standard stochastic method for chemical kinetics, is often slow because it accounts for every individual reaction event. This work develops a stochastic version of a deterministic cell cycle model in order to capture the stochastic aspects of the evolution of budding yeast wild-type and mutant strain cells. In order to efficiently run large ensembles to compute statistics of cell evolution, the dissertation investigates parallel simulation strategies and presents a new probabilistic framework to analyze the performance of dynamic load balancing algorithms. This work also proposes new accelerated stochastic simulation algorithms based on a fully implicit approach and on stochastic Taylor expansions.
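The load balancing question for large ensembles can be illustrated with a toy scheduling comparison: static round-robin assignment of simulation tasks versus a dynamic pull model where each idle worker takes the next task from a shared queue. This is a hypothetical illustration of the scheduling trade-off, not the dissertation's probabilistic framework:

```python
import random

def makespans(task_times, n_workers):
    """Compare makespans of static round-robin assignment and a
    dynamic pull model (each task goes to the currently least
    loaded worker, equivalent to 'next idle worker pulls next task')."""
    # static: tasks pre-assigned round-robin, ignoring run times
    static_loads = [0.0] * n_workers
    for i, t in enumerate(task_times):
        static_loads[i % n_workers] += t
    static = max(static_loads)
    # dynamic: greedy list scheduling onto the least loaded worker
    dyn_loads = [0.0] * n_workers
    for t in task_times:
        j = min(range(n_workers), key=lambda w: dyn_loads[w])
        dyn_loads[j] += t
    dynamic = max(dyn_loads)
    return static, dynamic

rng = random.Random(3)
# heavy-tailed run times, as when a few ensemble members simulate slowly
tasks = [rng.expovariate(1.0) ** 2 for _ in range(200)]
static, dynamic = makespans(tasks, n_workers=8)
```

With heavy-tailed task times the dynamic policy typically finishes sooner, because static assignment can trap several slow runs on the same worker; quantifying that gap probabilistically is the kind of analysis the dissertation pursues.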
Next Generation RNA-Sequencing, a high-throughput technology to sequence cDNA in order to get information about a sample's RNA content, is becoming an efficient genomic approach to uncover new genes and to study gene expression and alternative splicing. This dissertation develops efficient algorithms and strategies to find new genes in Aedes aegypti, which is the most important vector of dengue fever and yellow fever. We report the discovery of a large number of new gene transcripts, and the identification and characterization of genes that showed male-biased expression profiles. This basic information may open important avenues to control mosquito borne infectious diseases. / Ph. D.
|
50 |
A Stochastic Model for The Transmission Dynamics of Toxoplasma Gondii
Gao, Guangyue, 01 June 2016
Toxoplasma gondii (T. gondii) is an intracellular protozoan parasite. The parasite can infect all warm-blooded vertebrates, and up to 30% of the world's human population carries a Toxoplasma infection. However, the transmission dynamics of T. gondii are not well understood, although many mathematical models have been built. In this thesis, we adopt a complex life cycle model developed by Turner et al. and extend their work to include diffusion of hosts. Most research has focused on deterministic models. However, scientists have reported that deterministic models are sometimes inaccurate, or even inapplicable, for describing reaction-diffusion systems such as gene expression; in such cases stochastic models may have qualitatively different properties from their deterministic limits. Consequently, we investigate the transmission pathways of T. gondii and potential control mechanisms with both deterministic and stochastic models. A stochastic algorithm due to Gillespie, based on the chemical master equation, is introduced. A compartment-based model and a Smoluchowski equation model are described to simulate the diffusion of hosts. Parameter analyses are conducted based on the reproduction number. The analyses based on the deterministic model are verified by stochastic simulation near the thresholds of the parameters. / Master of Science
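The compartment-based treatment of host diffusion can be sketched in one dimension: each host in compartment i hops to a neighbouring compartment with rate d = D/h² per direction, and the hops are simulated with Gillespie's direct method. The geometry, diffusion coefficient, and host count below are illustrative placeholders, not the thesis's model:

```python
import math
import random

def compartment_diffusion(n_comp, counts, D, h, t_end, rng):
    """1D compartment-based stochastic diffusion via Gillespie's
    direct method: a host in compartment i hops left or right with
    rate d = D / h^2 per available direction."""
    d = D / (h * h)
    counts = list(counts)
    total = sum(counts)
    t = 0.0
    while t < t_end:
        # propensities: interior compartments hop both ways, ends one way
        props = []
        for i, n in enumerate(counts):
            props.append((i, -1, d * n if i > 0 else 0.0))
            props.append((i, +1, d * n if i < n_comp - 1 else 0.0))
        a0 = sum(p for _, _, p in props)
        if a0 == 0.0:
            break
        t += -math.log(rng.random()) / a0
        r = rng.random() * a0
        for i, step, p in props:
            if r < p:
                counts[i] -= 1      # one host hops to a neighbour
                counts[i + step] += 1
                break
            r -= p
    assert sum(counts) == total     # hops conserve the number of hosts
    return counts

rng = random.Random(0)
final = compartment_diffusion(n_comp=10, counts=[100] + [0] * 9,
                              D=1.0, h=1.0, t_end=5.0, rng=rng)
```

Starting from all hosts in one compartment, the counts spread toward a flat profile, which is the stochastic counterpart of the diffusion term in the deterministic reaction-diffusion system.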
|