531 |
Discrete and statistical approaches to genetics
Bruen, Trevor Cormac Vincent. January 2006
This thesis presents a number of major innovations in related but different areas of research. The contributions range along a continuum from mathematical phylogenetics, to the development of statistical methodology for detecting recombination, and finally to the application of statistical techniques to understand Feline Immunodeficiency Virus (FIV), an important pathogen. An underlying theme is the application of combinatorial and statistical ideas to problems in evolutionary biology and genetics. / Chapter 2 and Chapter 3 give a number of results relevant to mathematical phylogenetics, in particular maximum parsimony. Chapter 2 presents a new formulation of maximum parsimony in terms of character subdivision, providing a direct link with the character compatibility problem, also known as the perfect phylogeny problem. Specialization of this result to two characters gives a simple formula, based on the intersection graph, for calculating the parsimony score for a pair of characters. Chapter 3 further explores maximum parsimony. In particular, it is shown that a maximum parsimony tree for a sequence of characters minimizes a subtree prune and regraft (SPR) distance to the sets of trees on which each character is convex. Similar connections are also drawn between the Robinson-Foulds distance and a new variant of Dollo parsimony. / Chapter 4 presents an application of the work in Chapters 2 and 3 to develop a statistical test for detecting recombination. An extensive coalescent-based simulation study shows that this new test is both robust and powerful in a variety of different circumstances compared to a number of current methods. In fact, a simple model of mutation rate correlation is shown to mislead a number of competing tests, causing recombination to be falsely inferred. Analysis of empirical data sets confirms that the new test is one of the best approaches to distinguish recurrent mutation from recombination. / Finally, Chapter 5 uses the test developed in Chapter 4 to localize recombinant breakpoints in 14 genomic strains of FIV taken from a wild population of cougars. Based on the technique, three recombinant strains of FIV are identified. Previous studies have focused on the epidemiology and population structure of the virus, and this study shows that recombination has also played an important role in the evolution of FIV.
|
532 |
Statistical Inference on Stochastic Graphs
Hosseinkashi, Yasaman 17 June 2011
This thesis considers modelling and applications of random graph processes.
A brief review of contemporary random graph models and a general Birth-Death
model with the relevant maximum likelihood inference procedure are provided in chapter
one. The main result in this thesis is the construction of an epidemic model by
embedding a competing hazard model within a stochastic graph process (chapter
2). This model includes both individual characteristics and the population connectivity
pattern in analyzing the infection propagation. The dynamic outdegrees and
indegrees, estimated by the model, provide insight into important epidemiological
concepts such as the reproductive number. A dynamic reproductive number based
on the disease graph process is developed and applied in several simulated and actual
epidemic outbreaks. In addition, graph-based statistical measures are proposed
to quantify the effect of individual characteristics on the disease propagation. The
epidemic model is applied to two real outbreaks: the 2001 foot-and-mouth epidemic
in the United Kingdom (chapter 3) and the 1861 measles outbreak in Hagelloch,
Germany (chapter 4). Both applications provide valuable insight into the behaviour
of infectious disease propagation with different connectivity patterns and human
interventions.
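
The idea of reading a reproductive number off the outdegrees of a disease graph can be illustrated with a minimal sketch. It assumes the transmission graph is already known (the thesis estimates it with a model), and it is not the thesis's estimator: the function name, its arguments and the fixed-width time windows are all hypothetical choices made for illustration.

```python
from collections import defaultdict


def dynamic_reproductive_number(infection_times, transmissions, window):
    """Mean outdegree of individuals grouped by the time window of their infection.

    infection_times: dict mapping individual -> infection time (hypothetical input)
    transmissions:   list of (infector, infectee) edges of the disease graph
    window:          width of each time window, in the same units as the times
    """
    # Outdegree = number of secondary infections caused by each individual.
    outdegree = defaultdict(int)
    for infector, _infectee in transmissions:
        outdegree[infector] += 1

    # Group the outdegrees by the window in which each individual was infected.
    groups = defaultdict(list)
    for person, time in infection_times.items():
        groups[int(time // window)].append(outdegree[person])

    # The average outdegree per window acts as a time-varying reproductive number.
    return {k: sum(v) / len(v) for k, v in sorted(groups.items())}
```

For example, calling it with times `{"a": 0, "b": 1, "c": 1, "d": 2}`, edges `[("a", "b"), ("a", "c"), ("b", "d")]` and `window=1` averages the secondary infections caused by each infection cohort separately.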
|
533 |
Geostatistics with location-dependent statistics
Machuca-Mory, David Francisco 11 1900
In geostatistical modelling of the spatial distribution of rock attributes, the multivariate distribution of a Random Function defines the range of possible values and the spatial relationships among them. Under a decision of stationarity, the Random Function distribution and its statistics are inferred from data within a spatial domain deemed statistically homogeneous. Assuming stationary multiGaussianity allows spatial prediction techniques to take advantage of this simple parametric distribution model. These techniques compute the local distributions with surrounding data and global spatially invariant statistics. They often fail to reproduce local changes in the mean, variability and, particularly, the spatial continuity, that are required for geologically realistic modelling of rock attributes. The proposed alternative is to build local Random Function models that are deemed stationary only in relation to the locations where they are defined. The corresponding location-dependent distributions and statistics are inferred by weighting the samples in inverse proportion to their distance to anchor locations. These distributions are locally Gaussian transformed. The transformation models carry information on the local histogram. The distance-weighted experimental measures of spatial correlation are able to adapt to local changes in the spatial continuity and are semi-automatically fitted by locally defined variogram models. The fields of local variogram and transformation parameters are used in locally stationary spatial prediction algorithms. The resulting attribute models are rich in non-stationary spatial features. This process implies a higher computational demand than the traditional techniques, but, if data are abundant enough to allow a reliable inference of the local statistics, the proposed locally stationary techniques outperform their stationary counterparts in terms of accuracy and precision. These improved models have the potential to provide better decision support for engineering design. / Mining Engineering
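
The distance weighting described in the abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the thesis's implementation: the abstract does not give the exact weighting function, so the inverse-distance form, the `power` exponent and the `eps` term that avoids division by zero are all hypothetical.

```python
import math


def location_dependent_stats(samples, anchor, power=1.0, eps=1e-9):
    """Distance-weighted mean and variance of sample values at an anchor location.

    samples: list of ((x, y), value) pairs
    anchor:  (x, y) location where the local statistics are wanted
    power:   exponent of the inverse-distance weight (assumed, not from the source)
    eps:     small offset so a sample located at the anchor does not divide by zero
    """
    # Weight each sample in inverse proportion to its distance from the anchor.
    weights = []
    for (x, y), _value in samples:
        distance = math.hypot(x - anchor[0], y - anchor[1])
        weights.append(1.0 / (distance + eps) ** power)

    # Normalise so the weights sum to one.
    total = sum(weights)
    weights = [w / total for w in weights]

    # Weighted mean and weighted variance around that mean.
    mean = sum(w * v for w, (_loc, v) in zip(weights, samples))
    variance = sum(w * (v - mean) ** 2 for w, (_loc, v) in zip(weights, samples))
    return mean, variance
```

Evaluating the function on a grid of anchor locations gives fields of local means and variances; samples close to each anchor dominate its statistics, which is how local changes in the data can show up in the result.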
|
534 |
Rescuing Statistics from the Mathematicians
Bedwell, Mike 12 April 2012 (PDF)
Drawing on some 30 years’ experience in the UK and Central Europe, the author offers four assertions, three about education generally and the fourth that of the title. There the case is argued that statistics is a branch of logic, and therefore should be taught by experts in such subjects as philosophy and law and not exclusively by mathematicians. Education in both Statistics and these other subjects would profit in consequence.
|
535 |
Statistical Diffusion Tensor Imaging
Heim, Susanne 20 April 2007 (PDF)
|
536 |
Statistical Issues in Machine Learning
Strobl, Carolin 02 July 2008 (PDF)
|
537 |
Billiards and statistical mechanics
Grigo, Alexander 18 May 2009
In this thesis we consider mathematical problems related to different aspects of hard sphere systems.
In the first part we study planar billiards, which arise in the context of hard sphere systems when only one or two spheres are present. In particular we investigate the possibility of elliptic periodic orbits in the general construction of hyperbolic billiards. We show that if non-absolutely focusing components are present there can be elliptic periodic orbits with arbitrarily long free paths. Furthermore, we show that smooth stadium-like billiards have elliptic periodic orbits for a large range of separation distances.
In the second part we consider hard sphere systems with a large number of particles, which we model by the Boltzmann equation. We develop a new approach to derive hydrodynamic limits, which is based on classical methods of geometric singular perturbation theory of ordinary differential equations. This method provides new geometric and dynamical interpretations of hydrodynamic limits, in particular for the dissipative Boltzmann equation.
|
538 |
Investigations in Graphical Statistics
Murrell, Paul R. January 1998
This thesis is concerned with the design and development of statistical graphics software – programs to help draw graphs. Graphs serve two major functions in statistics. Firstly, graphs are used for exploratory data analysis – for detecting the message in a set of data – and secondly, graphs are used for data display – for presenting the message in a set of data. The most important feature of software for exploratory data analysis is extensibility. This is the ability to quickly and easily develop new graphical images and is vital for being able to explore a data set in many different ways. The most important feature of software for data display is customisation. This is the ability to fine-tune a graphical image in great detail and is vital for the production of presentation-quality graphics. In both cases it is important that a graphical image should be constructed to best explore or show off the peculiarities of a specific data set. A pervading theme of this thesis is that statistical graphics software should be flexible. The software tools described herein allow graphical images to be modified in arbitrary ways; the structure of graphical images is also arbitrary and not restricted to standard graph formats; a simple, coherent method, based on a general constraint system, for developing novel graphical images is explored; and a mechanism for specifying the arrangement of the components of a graphical image is introduced. Some of these ideas are incorporated within an existing statistical analysis package.
|
540 |
Statistical modeling of multiword expressions
Su, Kim Nam January 2008
In natural languages, words can occur in single units called simplex words or in a group of simplex words that function as a single unit, called multiword expressions (MWEs). Although MWEs are similar to simplex words in their syntax and semantics, they pose their own sets of challenges (Sag et al. 2002). MWEs are arguably one of the biggest roadblocks in computational linguistics due to the bewildering range of syntactic, semantic, pragmatic and statistical idiomaticity they are associated with, and their high productivity. In addition, the large numbers in which they occur demand specialized handling. Moreover, dealing with MWEs has a broad range of applications, from syntactic disambiguation to semantic analysis in natural language processing (NLP) (Wacholder and Song 2003; Piao et al. 2003; Baldwin et al. 2004; Venkatapathy and Joshi 2006). / Our goals in this research are: to use computational techniques to shed light on the underlying linguistic processes giving rise to MWEs across constructions and languages; to generalize existing techniques by abstracting away from individual MWE types; and finally to exemplify the utility of MWE interpretation within general NLP tasks. / In this thesis, we target English MWEs due to resource availability. In particular, we focus on noun compounds (NCs) and verb-particle constructions (VPCs) due to their high productivity and frequency. / Challenges in processing noun compounds are: (1) interpreting the semantic relation (SR) that represents the underlying connection between the head noun and modifier(s); (2) resolving syntactic ambiguity in NCs comprising three or more terms; and (3) analyzing the impact of word sense on noun compound interpretation. 
Our basic approach to interpreting NCs relies on the semantic similarity of the NC components using firstly a nearest-neighbor method (Chapter 5), then verb semantics based on the observation that it is often an underlying verb that relates the nouns in NCs (Chapter 6), and finally semantic variation within NC sense collocations, in combination with bootstrapping (Chapter 7). / Challenges in dealing with verb-particle constructions are: (1) identifying VPCs in raw text data (Chapter 8); and (2) modeling the semantic compositionality of VPCs (Chapter 5). We place particular focus on identifying VPCs in context, and measuring the compositionality of unseen VPCs in order to predict their meaning. Our primary approach to the identification task is to adapt localized context information derived from linguistic features of VPCs to distinguish between VPCs and simple verb-PP combinations. To measure the compositionality of VPCs, we use semantic similarity among VPCs by testing the semantic contribution of each component. / Finally, we conclude the thesis with a chapter-by-chapter summary and outline of the findings of our work, suggestions of potential NLP applications, and a presentation of further research directions (Chapter 9).
|