• Refine Query
  • Source
  • Publication year
  • to
  • Language
  • 9
  • 1
  • Tagged with
  • 13
  • 13
  • 13
  • 4
  • 4
  • 3
  • 3
  • 3
  • 2
  • 2
  • 2
  • 2
  • 2
  • 2
  • 2
  • About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
1

Statistical evaluation of mixed DNA stains

Choy, Yan-tsun., 蔡恩浚. January 2009 (has links)
published_or_final_version / Statistics and Actuarial Science / Master / Master of Philosophy
2

Robust tests under genetic model uncertainty in case-control association studies

Zang, Yong, 臧勇 January 2011 (has links)
published_or_final_version / Statistics and Actuarial Science / Doctoral / Doctor of Philosophy
3

Statistical methods for Mendelian randomization using GWAS summary data

Hu, Xianghong 23 August 2019 (has links)
Mendelian Randomization (MR) is a powerful tool for accessing causality of exposure on an outcome using genetic variants as the instrumental variables. Much of the recent developments is propelled by the increasing availability of GWAS summary data. However, the accuracy of the MR causal effect estimates could be challenged in case of the MR assumptions are violated. The source of biases could attribute to the weak effects arising because of polygenicity, the presentence of horizontal pleiotropy and other biases, e.g., selection bias. In this thesis, we proposed two works, expecting to deal with these issues.In the first part, we proposed a method named 'Bayesian Weighted Mendelian Randomization (BMWR)' for causal inference using summary statistics from GWAS. In BWMR, we not only take into account the uncertainty of weak effects owning to polygenicity of human genomics but also models the weak horizontal pleiotropic effects. Moreover, BWMR adopts a Bayesian reweighting strategy for detection of large pleiotropic outliers. An efficient algorithm based on variational inference was developed to make BWMR computationally efficient and stable. Considering the underestimated variance provided by variational inference, we further derived a closed form variance estimator inspired by a linear response method. We conducted several simulations to evaluate the performance of BWMR, demonstrating the advantage of BWMR over other methods. Then, we applied BWMR to access causality between 126 metabolites and 90 complex traits, revealing novel causal relationships. In the second part, we further developed BWMR-C: Statistical correction of selection bias for Mendelian Randomization based on a Bayesian weighted method. Based on the framework of BWMR, the probability model in BWMR-C is built conditional on the IV selection criteria. In such way, BWMR-C delicated to reduce the influence of the selection process on the causal effect estimates and also preserve the good properties of BWMR. To make the causal inference computationally stable and efficient, we developed a variational EM algorithm. We conducted several comprehensive simulations to evaluate the performance of BWMR-C for correction of selection bias. Then, we applied BWMR-C on seven body fat distribution related traits and 140 UK Biobank traits. Our results show that BWMR-C achieves satisfactory performance for correcting selection bias. Keywords: Mendelian Randomization, polygenicity, horizontal pleiotropy, selection bias, variation inference.
4

Some topics in the statistical analysis of forensic DNA and genetic family data

Hu, Yueqing., 胡躍清. January 2007 (has links)
published_or_final_version / abstract / Statistics and Actuarial Science / Doctoral / Doctor of Philosophy
5

The likelihood of gene trees under selective models

Coop, Graham M. January 2004 (has links)
The extent to which natural selection shapes diversity within populations is a key question for population genetics. Thus, there is considerable interest in quantifying the strength of selection. In this thesis a full likelihood approach for inference about selection at a single site within an otherwise neutral fully-linked sequence of sites is developed. Integral to many of the ideas introduced in this thesis is the reversibility of the diffusion process, and some past approaches to this concept are reviewed. A coalescent model of evolution is used to model the ancestry of a sample of DNA sequences which have the selected site segregating. A novel method for simulating the coalescent with selection, acting at a single biallelic site, is described. Selection is incorporated through modelling the frequency of the selected and neutral allelic classes stochastically back in time. The ancestry is then simulated using a subdivided population model considering the population frequencies through time as variable population sizes. The approach is general and can be used for any selection scheme at a biallelic locus. The mutation model, for the selected and neutral sites, is the infinitely-many-sites model where there is no back or parallel mutation at sites. This allows a unique perfect phylogeny, a gene tree, to be constructed from the configuration of mutations on the sample sequences. An importance sampling algorithm is described to explore over coalescent tree space consistent with this gene tree. The method is used to assess the evidence for selection in a number of data sets. These are as follows: a partial selective sweep in the G6PD gene (Verrelli et al., 2002); a recent full sweep in the Factor IX gene (Harris and Hey, 2001); and balancing selection in the DCP1 gene (Rieder et al., 1999). Little evidence of the action of selection is found in the data set of Verrelli et al. (2002) and the data set of Rieder et al. (1999) seems inconsistent with the model of balancing selection. The patterns of diversity in the data set of Harris and Hey (2001) offer support of the hypothesis of a full sweep.
6

Reference-free identification of genetic variation in metagenomic sequence data using a probabilistic model

Ahiska, Bartu January 2012 (has links)
Microorganisms are an indispensable part of our ecosystem, yet the natural metabolic and ecological diversity of these organisms is poorly understood due to a historical reliance of microbiology on laboratory grown cultures. The awareness that this diversity cannot be studied by laboratory isolation, together with recent advances in low cost scalable sequencing technology, have enabled the foundation of culture-independent microbiology, or metagenomics. The study of environmental microbial samples with metagenomics has led to many advances, but a number of technological and methodological challenges still remain. A potentially diverse set of taxa may be represented in anyone environmental sample. Existing tools for representing the genetic composition of such samples sequenced with short-read data, and tools for identifying variation amongst them, are still in their infancy. This thesis makes the case that a new framework based on a joint-genome graph can constitute a powerful tool for representing and manipulating the joint genomes of population samples. I present the development of a collection of methods, called SCRAPS, to construct these efficient graphs in small communities without the availability or bias of a reference genome. A key novelty is that genetic variation is identified from the data structure using a probabilistic algorithm that can provide a measure of the confidence in each call. SCRAPS is first tested on simulated short read data for accuracy and efficiency. At least 95% of non-repetitive small-scale genetic variation with a minor allele read depth greater than 10x is correctly identified; the number false positives per conserved nucleotide is consistently better than 1 part in 333 x 103. SCRAPS is then applied to artificially pooled experimental datasets. As part of this study, SCRAPS is used to identify genetic variation in an epidemiological 11 sample Neisseria meningitidis dataset collected from the African meningitis belt". In total 14,000 sites of genetic variation are identified from 48 million Illumina/Solexa reads. The results clearly show the genetic differences between two waves of infection that has plagued northern Ghana and Burkina Faso.
7

Limits to the rate of adaptation

Cuthbertson, Charles January 2007 (has links)
No description available.
8

Statistical approaches to population genetics

Chun, Dan January 1962 (has links)
This paper is concerned with the statistical approach to population genetics. The genetical characteristics of the population under consideration are the population size and the gene frequencies. The limitations on the population are (a) the individuals are either all haploids or all diploids; (b) the alleles are only two in number, either A or a; (c) there are no selection and migration pressures occurring in the population. Three main model types, the branching processes, the Markov chains, and the diffusion processes are discussed. Published results on the subject are presented along with some new investigations. Comparable results obtained by various authors are checked, some of which are found to be invalid. Continuous approximations are used to derive some of the results of discrete processes, but a chapter is devoted to the justification of such approximations. / M.S.
9

Analysis for segmental sharing and linkage disequilibrium: a genomewide association study on myopia

Lee, Yiu-fai., 李耀暉. January 2009 (has links)
published_or_final_version / Psychiatry / Doctoral / Doctor of Philosophy
10

On Identifying Rare Variants for Complex Human Traits

Fan, Ruixue January 2015 (has links)
This thesis focuses on developing novel statistical tests for rare variants association analysis incorporating both marginal effects and interaction effects among rare variants. Compared with common variants, rare variants have lower minor allele frequencies (typically less than 5%), and hence traditional association tests for common variants will lose power for rare variants. Therefore, there is a pressing need of new analytical tools to tackle the problem of rare variants association with complex human traits. Several collapsing methods have been proposed that aggregate information of rare variants in a region and test them together. They can be divided into burden tests and non-burden tests based on their aggregation strategies. They are all variations of regression-based methods with the assumption that the phenotype is associated with the genotype via a (linear) regression model. Most of these methods consider only marginal effects of rare variants and fail to take into account gene-gene and gene-environmental interactive effects, which are ubiquitous and are of utmost importance in biological systems. In this thesis, we propose a summation of partition approach (SPA) -- a nonparametric strategy for rare variants association analysis. Extensive simulation studies show that SPA is powerful in detecting not only marginal effects but also gene-gene interaction effects of rare variants. Moreover, extensions of SPA are able to detect gene-environment interactions and other interactions existing in complicated biological system as well. We are also able to obtain the asymptotic behavior of the marginal SPA score, which guarantees the power of the proposed method. Inspired by the idea of stepwise variable selection, a significance-based backward dropping algorithm(SDA) is proposed to locate truly influential rare variants in a genetic region that has been identified significant. Unlike traditional backward dropping approaches which remove the least significant variables first, SDA introduces the idea of eliminating the most significant variable at each round. The removed variables are collected and their effects are evaluated by an influence ratio score -- the relative p-value change. Our simulation studies show that SDA is powerful to detect causal variables and SDA has lower false discovery rate than LASSO. We also demonstrate our method using the dataset provided by Genetic Analysis Workshop (GAW) 17 and the results support the superiority of SDA over LASSO. The general partition-retention framework can also be applied to detect gene-environmental interaction effects for common variants. We demonstrate this method using the dataset from Genetic Analysis Workshop (GAW) 18. Our nonparametric approach is able to identify a lot more possible influential gene-environmental pairs than traditional linear regression models. We propose in this thesis a "SPA-SDA" two step approach for rare variants association analysis at genomic scale: first identify significant regions of moderate sizes using SPA, and then apply SDA to the identified regions to pinpoint truly influential variables. This approach is computationally efficient for genomic data and it has the capacity to detect gene-gene and gene-environmental interactions.

Page generated in 0.1113 seconds