  • About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations, provided by the Networked Digital Library of Theses and Dissertations (NDLTD). Metadata is collected from universities around the world.
41

Markov Bases for Noncommutative Harmonic Analysis of Partially Ranked Data

Johnston, Ann 01 May 2011 (has links)
Given the result $v_0$ of a survey and a nested collection of summary statistics that could be used to describe that result, it is natural to ask which of these summary statistics best describe $v_0$. In 1998 Diaconis and Sturmfels presented an approach for determining the conditional significance of a higher order statistic, after sampling a space conditioned on the value of a lower order statistic. Their approach involves the computation of a Markov basis, followed by the use of a Markov process with stationary hypergeometric distribution to generate a sample. This technique for data analysis has become an accepted tool of algebraic statistics, particularly for the study of fully ranked data. In this thesis, we explore the extension of this technique to the study of partially ranked data, focusing on data from surveys in which participants are asked to identify their top $k$ choices of $n$ items. Before we move on to our own data analysis, though, we present a thorough discussion of the Diaconis–Sturmfels algorithm and its use in data analysis. In this discussion, we attempt to collect together all of the background on Markov bases, Markov processes, Gröbner bases, implicitization theory, and elimination theory that is necessary for a full understanding of this approach to data analysis.
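The simplest instance of a Markov basis, and a useful illustration of the sampling step described above, is the set of ±1 moves on 2×2 minors of a contingency table, which connect all tables sharing the same row and column sums. The sketch below is a hedged toy, not code from the thesis, and omits the Metropolis acceptance step needed to target the hypergeometric distribution; it only shows the walk preserving the margins:

```python
import random

def markov_basis_step(table, rng):
    """One move of the basic Markov-basis walk on an I x J contingency
    table: add +1/-1 on the corners of a random 2x2 minor, which
    preserves every row sum and column sum."""
    I, J = len(table), len(table[0])
    r1, r2 = rng.sample(range(I), 2)
    c1, c2 = rng.sample(range(J), 2)
    sign = rng.choice([1, -1])
    move = [(r1, c1, sign), (r1, c2, -sign), (r2, c1, -sign), (r2, c2, sign)]
    # Stay on the fibre: reject moves that would make a cell negative.
    if any(table[r][c] + d < 0 for r, c, d in move):
        return table
    new = [row[:] for row in table]
    for r, c, d in move:
        new[r][c] += d
    return new

rng = random.Random(0)
t = [[3, 2, 1], [1, 4, 2]]
for _ in range(1000):
    t = markov_basis_step(t, rng)
print([sum(row) for row in t])  # row sums stay [6, 7] throughout
```

In the Diaconis–Sturmfels procedure, a walk of this kind, run with the appropriate acceptance ratio, yields a sample from the conditional distribution given the lower order statistic.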
42

Aspects of exchangeable coalescent processes

Pitters, Hermann-Helmut January 2015 (has links)
In mathematical population genetics a multiple merger <i>n</i>-coalescent process, or <i>Λ</i> <i>n</i>-coalescent process, {<i>Π<sup>n</sup>(t), t</i> ≥ 0} models the genealogical tree of a sample of size <i>n</i> (e.g. of DNA sequences) drawn from a large population of haploid individuals. We study various properties of <i>Λ</i> coalescents. Novel in our approach is that we introduce the partition lattice as well as cumulants into the study of functionals of coalescent processes. We illustrate the success of this approach on several examples. Cumulants allow us to reveal the relation between the tree height <i>T<sub>n</sub></i> (respectively, the total branch length <i>L<sub>n</sub></i>) of the genealogical tree of Kingman's <i>n</i>-coalescent, arguably the most celebrated coalescent process, and the Riemann zeta function. Drawing on results from lattice theory, we give a spectral decomposition for the generator of both the Kingman and the Bolthausen-Sznitman <i>n</i>-coalescent, the latter of which emerges as a genealogy in models of populations undergoing selection. Taking mutations into account, let <i>M<sub>j</sub></i> count the number of mutations that are shared by <i>j</i> individuals in the sample. The random vector (<i>M<sub>1</sub></i>,...,<i>M<sub>n-1</sub></i>), known as the site frequency spectrum, can be measured from genetic data and is therefore an important statistic from the point of view of applications. Fu worked out the expected value, the variance and the covariance of the marginals of the site frequency spectrum. Using the partition lattice we derive a formula for the cumulants of arbitrary order of the marginals of the site frequency spectrum. Following another line of research, we provide a law of large numbers for a family of <i>Λ</i> coalescents.
To be more specific, we show that the process {<i>#Π<sup>n</sup>(t), t</i> ≥ 0} recording the number <i>#Π<sup>n</sup>(t)</i> of individuals in the coalescent at time <i>t</i> converges, after a suitable rescaling, towards a deterministic limit as the sample size <i>n</i> grows without bound. In the statistical physics literature this limit is known as a hydrodynamic limit. To date, the hydrodynamic limit was known for Kingman's coalescent, but not for other <i>Λ</i> coalescents. We work out the hydrodynamic limit for beta coalescents that come down from infinity, an important subclass of the <i>Λ</i> coalescents.
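Kingman's n-coalescent mentioned above is easy to simulate: while k lineages remain, the next merger occurs after an Exponential(k(k−1)/2) waiting time and reduces k by one. A minimal sketch (illustrative only, not code from the thesis) that checks the classical identity E[T_n] = 2(1 − 1/n) for the tree height:

```python
import random

def kingman_times(n, rng):
    """Inter-coalescence times of Kingman's n-coalescent: with k
    lineages the merger rate is k-choose-2, so the waiting time is
    Exponential(k*(k-1)/2); the block count drops from n to 1."""
    times = []
    k = n
    while k > 1:
        rate = k * (k - 1) / 2
        times.append(rng.expovariate(rate))
        k -= 1
    return times

rng = random.Random(1)
n = 10
# Tree height T_n is the sum of the waiting times; E[T_n] = 2(1 - 1/n).
samples = [sum(kingman_times(n, rng)) for _ in range(20000)]
mean_height = sum(samples) / len(samples)
print(round(mean_height, 2))  # close to 2 * (1 - 1/10) = 1.8
```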
43

Stochastic modeling and methods for portfolio management in cointegrated markets

Angoshtari, Bahman January 2014 (has links)
In this thesis we study the utility maximization problem for assets whose prices are cointegrated, which arises from the investment practice of convergence trading and its special forms, pairs trading and spread trading. The major theme in the first two chapters of the thesis is to investigate the assumption of market-neutrality of optimal convergence trading strategies, a ubiquitous assumption among practitioners and academics alike. This assumption lacks a theoretical justification and, to the best of our knowledge, the only relevant study is Liu and Timmermann (2013), which implies that the optimal convergence strategies are, in general, not market-neutral. We start by considering a minimalistic pairs-trading scenario with two cointegrated stocks and solve the Merton investment problem with power and logarithmic utilities. We pay special attention to whether the stochastic control problem is well-posed, an issue overlooked by Liu and Timmermann (2013). In particular, we show that the problem is ill-posed if and only if the agent's risk-aversion is less than a constant which is an explicit function of the market parameters. This condition, in turn, yields the necessary and sufficient condition for well-posedness of the Merton problem for all possible values of the agent's risk-aversion. The resulting well-posedness condition is surprisingly strict and, in particular, is equivalent to assuming the optimal investment strategy in the stocks to be market-neutral. Furthermore, it is shown that the well-posedness condition is equivalent to applying Novikov's condition to the market price of risk, a ubiquitous sufficient condition for absence of arbitrage. To the best of our knowledge, these are the only theoretical results supporting the assumption of market-neutrality of convergence trading strategies.
We then generalise the results to the more realistic setting of multiple cointegrated assets, risk factors that affect the asset returns, and general utility functions for the investor's preferences. In the process of generalising the bivariate results, we also obtain some well-posedness conditions for matrix Riccati differential equations which are, to the best of our knowledge, new. In the last chapter, we set up and justify a Merton problem related to spread trading with two futures contracts under proportional transaction costs. The model possesses three characteristics whose combination sets it apart from the existing literature on proportional transaction costs: (1) a finite time horizon, (2) multiple risky assets, and (3) a stochastic opportunity set. We introduce the HJB equation and provide rigorous arguments showing that the corresponding value function is the viscosity solution of the HJB equation. We end the chapter by devising a numerical scheme, based on the penalty method of Forsyth and Vetzal (2002), to approximate the viscosity solution of the HJB equation.
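The mean-reverting spread at the heart of convergence trading can be sketched with a toy discretisation in which two log-prices share a random-walk factor while their difference follows Ornstein–Uhlenbeck dynamics. All parameters below are hypothetical and the model is a simplification for illustration, not the cointegrated market model of the thesis:

```python
import math
import random

def simulate_cointegrated_pair(n_steps=5000, dt=0.01, kappa=2.0,
                               sigma_s=0.3, sigma_m=0.2, seed=2):
    """Toy cointegrated pair: a common random-walk factor m drives both
    log-prices x and y, while the spread z = x - y mean-reverts as an
    OU process, dz = -kappa * z dt + sigma_s dW (hypothetical params)."""
    rng = random.Random(seed)
    m, z = 0.0, 1.0  # start with the spread displaced from its mean of 0
    spread_path = []
    for _ in range(n_steps):
        m += sigma_m * math.sqrt(dt) * rng.gauss(0.0, 1.0)
        z += -kappa * z * dt + sigma_s * math.sqrt(dt) * rng.gauss(0.0, 1.0)
        spread_path.append(z)
    x = m + z / 2  # two nonstationary log-prices with a stationary spread
    y = m - z / 2
    return spread_path, x, y

spread, x, y = simulate_cointegrated_pair()
# Convergence-trading premise: the spread forgets its initial displacement.
late_mean = sum(spread[-1000:]) / 1000
print(abs(late_mean) < 0.5)  # True: the spread has mean-reverted
```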
44

Probabilistic inference in ecological networks : graph discovery, community detection and modelling dynamic sociality

Psorakis, Ioannis January 2013 (has links)
This thesis proposes a collection of analytical and computational methods for inferring an underlying social structure of a given population, observed only via timestamped occurrences of its members across a range of locations. It shows that such data streams have a modular and temporally-focused structure, neither fully ordered nor completely random, with individuals appearing in "gathering events". By exploiting such structure, the thesis proposes an appropriate mapping of those spatio-temporal data streams to a social network, based on the co-occurrences of agents across gathering events, while capturing the uncertainty over social ties via the use of probability distributions. Given the extracted graphs mentioned above, an approach is proposed for studying their community organisation. The method considers communities as explanatory variables for the observed interactions, producing overlapping partitions and node membership scores to groups. The aforementioned models are motivated by a large ongoing experiment at Wytham Woods, Oxford, where a population of Parus major wild birds is tagged with RFID devices and a grid of feeding locations generates thousands of spatio-temporal records each year. The proposed methods are applied to this data set to demonstrate how they can be used to explore wild bird sociality, reveal its internal organisation across a variety of different scales, and provide insights into important biological processes relating to mating pair formation.
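A crude version of the mapping from timestamped records to a weighted social network can be sketched as follows. The thesis infers gathering-event boundaries probabilistically; this toy instead bins detections with a fixed time window, and the records are made up:

```python
from collections import Counter
from itertools import combinations

def cooccurrence_graph(records, window):
    """Map (individual, timestamp, location) records to weighted edges:
    group detections at the same location whose timestamps fall in the
    same window into a gathering event, then weight each pair of
    individuals by the number of events they share."""
    events = {}
    for ind, t, loc in records:
        events.setdefault((loc, int(t // window)), set()).add(ind)
    edges = Counter()
    for members in events.values():
        for a, b in combinations(sorted(members), 2):
            edges[(a, b)] += 1
    return edges

# Hypothetical RFID detections at two feeders.
records = [
    ("bird_A", 0.5, "feeder_1"), ("bird_B", 0.7, "feeder_1"),
    ("bird_A", 5.1, "feeder_2"), ("bird_B", 5.4, "feeder_2"),
    ("bird_C", 5.6, "feeder_2"),
]
pair_counts = cooccurrence_graph(records, window=1.0)
print(pair_counts[("bird_A", "bird_B")])  # 2: they share two gathering events
```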
45

Numerical methods for approximating solutions to rough differential equations

Gyurko, Lajos Gergely January 2008 (has links)
The main motivation behind writing this thesis was to construct numerical methods to approximate solutions to differential equations driven by rough paths, where the solution is considered in the rough-path sense. Rough paths of inhomogeneous degree of smoothness as driving noise are considered. We also aimed to find applications of these numerical methods to stochastic differential equations. After sketching the core ideas of the theory of rough paths in Chapter 1, the versions of the core theorems corresponding to the inhomogeneous degree of smoothness case are stated and proved in Chapter 2, along with some auxiliary claims on the continuity of the solution in a certain sense, including an RDE version of Gronwall's lemma. In Chapter 3, numerical schemes for approximating solutions to differential equations driven by rough paths of inhomogeneous degree of smoothness are constructed. We start by setting up some principles of approximations. Then a general class of local approximations is introduced. This class is used to construct global approximations by pasting together the local ones. A general sufficient condition on the local approximations implying global convergence is given and proved. The next step is to construct particular local approximations in finite dimensions based on solutions to ordinary differential equations derived locally and satisfying the sufficient condition for global convergence. These local approximations require strong conditions on the one-form defining the rough differential equation. Finally, we show that when the local ODE-based schemes are applied in combination with rough polynomial approximations, the conditions on the one-form can be weakened. In Chapter 4, the results of Gyurko & Lyons (2010) on path-wise approximation of solutions to stochastic differential equations are recalled and extended to the truncated signature level of the solution.
Furthermore, some practical considerations related to the implementation of high order schemes are described. The effectiveness of the derived schemes is demonstrated on numerical examples. In Chapter 5, the background theory of the Kusuoka-Lyons-Victoir (KLV) family of weak approximations is recalled and linked to the results of Chapter 4. We highlight how the different versions of the KLV family are related. Finally, a numerical evaluation of the autonomous ODE-based versions of the family is carried out, focusing on SDEs in dimensions up to 4, using cubature formulas of different degrees and several high order numerical ODE solvers. We demonstrate the effectiveness and the occasional non-effectiveness of the numerical approximations in cases when the KLV family is used in its original version and also when used in combination with partial sampling methods (Monte-Carlo, TBBA) and Romberg extrapolation.
46

Importance sampling on the coalescent with recombination

Jenkins, Paul A. January 2008 (has links)
Performing inference on contemporary samples of homologous DNA sequence data is an important task. By assuming a stochastic model for ancestry, one can make full use of observed data by sampling from the distribution of genealogies conditional upon the sample configuration. A natural such model is Kingman's coalescent, with numerous extensions to account for additional biological phenomena. However, in this model the distribution of interest cannot be written down analytically, and so one solution is to utilize importance sampling. In this context, importance sampling (IS) simulates genealogies from an artificial proposal distribution, and corrects for this by weighting each resulting genealogy. In this thesis I investigate in detail approaches for developing efficient proposal distributions on coalescent histories, with a particular focus on a two-locus model mutating under the infinite-sites assumption and in which the loci are separated by a region of recombination. This model was originally studied by Griffiths (1981), and is a useful simplification for considering the correlated ancestries of two linked loci. I show that my proposal distribution generally outperforms an existing IS method which could be recruited to this model. Given today's sequencing technologies it is not difficult to find volumes of data for which even the most efficient proposal distributions might struggle. I therefore appropriate resampling mechanisms from the theory of sequential Monte Carlo in order to effect substantial improvements in IS applications. In particular, I propose a new resampling scheme and confirm that it ensures a significant gain in the accuracy of likelihood estimates. It outperforms an existing scheme which can actually diminish the quality of an IS simulation unless it is applied to coalescent models with care. Finally, I apply the methods developed here to an example dataset, and discuss a new measure for the way in which two gene trees are correlated.
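The importance-sampling mechanism described above can be stated in a few lines: draw from a tractable proposal distribution and reweight each draw by the target-to-proposal density ratio. The toy below uses one-dimensional densities that can be written down explicitly, whereas the thesis applies the same identity to proposal distributions over genealogies:

```python
import math
import random

def importance_sample(target_pdf, proposal_pdf, proposal_draw, f, n, rng):
    """Estimate E_target[f(X)] by sampling X from a proposal and
    weighting each draw by w(x) = target(x) / proposal(x)."""
    total = 0.0
    for _ in range(n):
        x = proposal_draw(rng)
        total += (target_pdf(x) / proposal_pdf(x)) * f(x)
    return total / n

# Toy check: target Exp(2), proposal Exp(1), so E_target[X] = 0.5.
target = lambda x: 2.0 * math.exp(-2.0 * x)
proposal = lambda x: math.exp(-x)
rng = random.Random(3)
est = importance_sample(target, proposal,
                        lambda r: r.expovariate(1.0),
                        lambda x: x, 50000, rng)
print(round(est, 2))  # close to 0.5
```

The estimator is unbiased for any proposal whose support covers the target's; the art, as in the thesis, lies in choosing a proposal whose weights have low variance.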
47

Partly exchangeable fragmentations

Chen, Bo January 2009 (has links)
We introduce a simple tree growth process that gives rise to a new two-parameter family of discrete fragmentation trees that extends Ford's alpha model to multifurcating trees and includes the trees obtained by uniform sampling from Duquesne and Le Gall's stable continuum random tree. We call these new trees the alpha-gamma trees. In this thesis, we obtain their splitting rules and dislocation measures, both in ranked order and in size-biased order, and we study their limiting behaviour. We further extend the underlying exchangeable fragmentation processes of such trees into partly exchangeable fragmentation processes by weakening the exchangeability. We obtain integral representations for the measures associated with partly exchangeable fragmentation processes and for the subordinator of the tagged fragments. We also embed the trees associated with such processes into continuum random trees and study their limiting behaviour. Finally, we generate a three-parameter family of partly exchangeable trees which contains the family of alpha-gamma trees and another important two-parameter family based on Poisson-Dirichlet distributions.
48

Theoretical advances in the modelling and interrogation of biochemical reaction systems : alternative formulations of the chemical Langevin equation and optimal experiment design for model discrimination

Mélykúti, Bence January 2010 (has links)
This thesis is concerned with methodologies for the accurate quantitative modelling of molecular biological systems. The first part is devoted to the chemical Langevin equation (CLE), a stochastic differential equation driven by a multidimensional Wiener process. The CLE is an approximation to the standard discrete Markov jump process model of chemical reaction kinetics. It is valid in the regime where molecular populations are abundant enough to assume their concentrations change continuously, but stochastic fluctuations still play a major role. We observe that the CLE is not a single equation, but a family of equations with shared finite-dimensional distributions. On the theoretical side, we prove that as many Wiener processes are sufficient to formulate the CLE as there are independent variables in the equation, which is just the rank of the stoichiometric matrix. On the practical side, we show that in the case where there are m_1 pairs of reversible reactions and m_2 irreversible reactions, there is another, simple formulation of the CLE with only m_1+m_2 Wiener processes, whereas the standard approach uses 2m_1+m_2. Considerable computational savings are achieved with this latter formulation. A flaw of the CLE model is identified: trajectories may leave the nonnegative orthant with positive probability. The second part addresses the challenge when alternative, structurally different ordinary differential equation models of similar complexity fit the available experimental data equally well. We review optimal experiment design methods for choosing the initial state and structural changes on the biological system to maximally discriminate between the outputs of rival models in terms of L_2-distance. We determine the optimal stimulus (input) profile for externally excitable systems. The numerical implementation relies on sum of squares decompositions and is demonstrated on two rival models of signal processing in starving Dictyostelium amoebae. 
Such experiments accelerate the refinement of our understanding of biochemical mechanisms.
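The count of Wiener processes discussed above can be illustrated on the smallest example, a single reversible pair A ⇌ B (so m_1 = 1, m_2 = 0): the standard CLE drives each reaction channel by its own Wiener increment, while a reduced formulation combines the pair's noise, which acts along a common stoichiometric direction, into a single increment of matching variance. The Euler–Maruyama sketch below is illustrative, with hypothetical rate constants, and shows one way to realise such a reduction:

```python
import math
import random

N = 1000  # conserved total molecule count for A <-> B

def cle_step_standard(x, k1, k2, dt, rng):
    """Euler-Maruyama step for the CLE of A <-> B with x = #A:
    one Wiener increment per channel (2*m1 + m2 = 2 increments)."""
    a1, a2 = k1 * x, k2 * (N - x)  # forward / backward propensities
    dW1 = math.sqrt(dt) * rng.gauss(0.0, 1.0)
    dW2 = math.sqrt(dt) * rng.gauss(0.0, 1.0)
    return x + (a2 - a1) * dt - math.sqrt(a1) * dW1 + math.sqrt(a2) * dW2

def cle_step_reduced(x, k1, k2, dt, rng):
    """Same one-step distribution with a single Wiener increment
    (m1 + m2 = 1): the two channel noises act on x along the same
    direction, so they combine into one term of variance (a1+a2)*dt."""
    a1, a2 = k1 * x, k2 * (N - x)
    dW = math.sqrt(dt) * rng.gauss(0.0, 1.0)
    return x + (a2 - a1) * dt + math.sqrt(a1 + a2) * dW

def terminal_mean(step, n_paths=2000, seed=4):
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_paths):
        x = 800.0
        for _ in range(100):  # integrate to t = 1 with dt = 0.01
            x = step(x, 1.0, 1.0, 0.01, rng)
        total += x
    return total / n_paths

m_std = terminal_mean(cle_step_standard)
m_red = terminal_mean(cle_step_reduced)
print(abs(m_std - m_red) < 3.0)  # True: same law, half the noise sources
```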
49

Bayesian methods for estimating human ancestry using whole genome SNP data

Churchhouse, Claire January 2012 (has links)
The past five years have seen the discovery of a wealth of genetic variants associated with an incredible range of diseases and traits in genome-wide association studies (GWAS). These GWAS have typically been performed in individuals of European descent, prompting a call for such studies to be conducted over a more diverse range of populations. These include groups such as African Americans and Latinos, as they are recognised as bearing a disproportionately large burden of disease in the U.S. population. The variation in ancestry among such groups must be correctly accounted for in association studies to avoid spurious hits arising due to differences in ancestry between cases and controls. Such ancestral variation is not all problematic, as it may also be exploited to uncover loci associated with disease in an approach known as admixture mapping, or to estimate recombination rates in admixed individuals. Many models have been proposed to infer genetic ancestry, and they differ in their accuracy, the type of data they employ, their computational efficiency, and whether or not they can handle multi-way admixture. Despite the number of existing models, there is an unfulfilled requirement for a model that performs well even when the ancestral populations are closely related, is extendible to multi-way admixture scenarios, and can handle whole-genome data while remaining computationally efficient. In this thesis we present a novel method of ancestry estimation named MULTIMIX that satisfies these criteria. The underlying model we propose uses a multivariate normal to approximate the distribution of a haplotype at a window of contiguous SNPs given the ancestral origin of that part of the genome. The observed allele types and the ancestry states that we aim to infer are incorporated into a hidden Markov model to capture the correlations in ancestry that we expect to exist between neighbouring sites.
We show via simulation studies that its performance on two-way and three-way admixture is competitive with state-of-the-art methods, and apply it to several real admixed samples of the International HapMap Project and the 1000 Genomes Project.
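The window-level hidden Markov model described above can be sketched with the standard forward recursion. Here the multivariate-normal emissions of MULTIMIX are replaced by toy discrete emissions, and all state names and probabilities are hypothetical:

```python
import math

def forward_loglik(obs, states, start_p, trans_p, emit_p):
    """Forward algorithm for an HMM: propagate alpha[s], the joint
    probability of the observations so far and hidden state s,
    summing over transitions at each step."""
    alpha = {s: start_p[s] * emit_p[s][obs[0]] for s in states}
    for o in obs[1:]:
        alpha = {s: emit_p[s][o] * sum(alpha[r] * trans_p[r][s] for r in states)
                 for s in states}
    return math.log(sum(alpha.values()))

# Toy two-ancestry example: hidden states are the ancestral origin of
# each SNP window; sticky transitions encode the correlation in
# ancestry between neighbouring windows.
states = ["pop1", "pop2"]
start_p = {"pop1": 0.5, "pop2": 0.5}
trans_p = {"pop1": {"pop1": 0.9, "pop2": 0.1},
           "pop2": {"pop1": 0.1, "pop2": 0.9}}
emit_p = {"pop1": {0: 0.8, 1: 0.2},   # stand-in for the multivariate-
          "pop2": {0: 0.3, 1: 0.7}}   # normal window emissions
ll = forward_loglik([0, 0, 1, 1], states, start_p, trans_p, emit_p)
print(ll < 0)  # True: log-likelihood of the observed windows
```

Ancestry estimates then follow from the companion backward pass (posterior state probabilities per window), which is omitted here for brevity.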
50

Colouring, centrality and core-periphery structure in graphs

Rombach, Michaela Puck January 2013 (has links)
Krivelevich and Patkós conjectured in 2009 that χ(G(n, p)) ∼ χ_=(G(n, p)) ∼ χ_=^*(G(n, p)) for C/n < p < 1 − ε, where ε > 0. We prove this conjecture for n^(−1+ε_1) < p < 1 − ε_2, where ε_1, ε_2 > 0. We investigate several measures that have been proposed to indicate centrality of nodes in networks, and find examples of networks where they fail to distinguish any of the nodes from one another. We develop a new method to investigate core-periphery structure, which entails identifying densely-connected core nodes and sparsely-connected periphery nodes. Finally, we present an experiment and an analysis of empirical networks, namely functional human brain networks. We found that reconfiguration patterns of dynamic communities can be used to classify nodes into a stiff core, a flexible periphery, and a bulk. The separation between this stiff core and flexible periphery changes as a person learns a simple motor skill and, importantly, it is a good predictor of how successful the person is at learning the skill. This temporally defined core-periphery organisation corresponds well with the core-periphery structure detected, by the method we proposed earlier, in the static networks created by averaging over the subjects' dynamic functional brain networks.
