  • About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
31

On genetic variants underlying common disease

Hechter, Eliana January 2011
Genome-wide association studies (GWAS) exploit the correlation in genetic diversity along chromosomes in order to detect effects on disease risk without having to type causal loci directly. The inevitable downside of this approach is that, when the correlation between the marker and the causal variant is imperfect, the risk associated with carrying the predisposing allele is diluted and its effect is underestimated. This thesis explores four different facets of this risk dilution: (1) estimating true effect sizes from those observed in GWAS; (2) asking how the context of a GWAS, including the population studied, the genotyping chip employed, and the use of imputation, affects risk estimates; (3) assessing how often the best-associated SNP in a GWAS coincides with the causal variant; and (4) quantifying how departures from the simplest disease risk model at a causal variant distort the observed disease risk model. Using simulations, where we have information about the true risk at the causal locus, we show that the correlation between the marker and the causal variant is the primary driver of effect size underestimation. The extent of the underestimation depends on a number of factors, including the population in which the study is conducted, the genotyping chip employed, whether imputation is used, and the strength, frequency, and disease model of the risk allele. Suppose that a GWAS study is conducted in a European population, with an Affymetrix 6.0 genotyping chip, without imputation, and that the causal loci have a modest effect on disease risk, are common in the population, and follow an additive disease risk model.
In such a study, we show that the risk estimated from the most associated SNP is very close to the truth approximately two-thirds of the time (although we predict that fine mapping of GWAS loci will infrequently identify causal variants with considerably higher risk), and that the best-associated variant is very often perfectly or nearly-perfectly correlated with, and almost always within 0.1cM of, the causal variant. However, the strong correlations among nearby loci mean that the causal and best-associated variants coincide infrequently, less than one-fifth of the time, even if the causal variant is genotyped. We explore ways in which these results change quantitatively depending on the parameters of the GWAS study. Additionally, we demonstrate that we expect to identify substantial deviations from the additive disease risk model among loci where association is detected, even though power to detect departures from the model drops off very quickly as the correlation between the marker and causal loci decreases. Finally, we discuss the implications of our results for the design and interpretation of future GWAS studies.
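
The risk dilution described in the abstract above can be illustrated with a small calculation. The sketch below is not taken from the thesis. It assumes a rare disease, a haplotype-level multiplicative risk model, and a single marker in linkage disequilibrium (correlation r) with the causal allele. The function names and parameter values are hypothetical and chosen only for illustration.

```python
import math


def haplotype_freqs(p_causal, p_marker, r):
    """Frequencies of the four causal/marker haplotypes (AB, Ab, aB, ab).

    A/a are the causal alleles, B/b the marker alleles, and r is the
    signed correlation between carrying A and carrying B.
    """
    d = r * math.sqrt(p_causal * (1 - p_causal) * p_marker * (1 - p_marker))
    freqs = (
        p_causal * p_marker + d,
        p_causal * (1 - p_marker) - d,
        (1 - p_causal) * p_marker - d,
        (1 - p_causal) * (1 - p_marker) + d,
    )
    if min(freqs) < -1e-12:
        raise ValueError("r is not attainable for these allele frequencies")
    # Clamp tiny negative values caused by floating-point rounding.
    return tuple(max(f, 0.0) for f in freqs)


def marker_odds_ratio(p_causal, p_marker, r, causal_or):
    """Allelic odds ratio observed at the marker (illustrative assumptions).

    Assumes controls have population haplotype frequencies (rare disease)
    and that each copy of the causal allele multiplies risk by causal_or.
    """
    f_AB, f_Ab, f_aB, f_ab = haplotype_freqs(p_causal, p_marker, r)
    # In cases, each haplotype is reweighted by its relative risk.
    case_B = f_AB * causal_or + f_aB
    case_b = f_Ab * causal_or + f_ab
    control_odds = p_marker / (1 - p_marker)
    return (case_B / case_b) / control_odds


if __name__ == "__main__":
    for r in (1.0, 0.8, 0.5, 0.0):
        print(r, round(marker_odds_ratio(0.3, 0.3, r, 1.5), 3))
```

Under these assumptions the marker odds ratio equals the causal odds ratio only when r = 1, and shrinks towards 1 as the correlation weakens.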
32

Bayesian methods for multivariate phenotype analysis in genome-wide association studies

Iotchkova, Valentina Valentinova January 2013
Most genome-wide association studies search for genetic variants associated with a single trait of interest, despite the main interest usually being the understanding of a complex genotype-phenotype network. Furthermore, many studies collect data on multiple phenotypes, each measuring a different aspect of the biological system under consideration, so it often makes sense to analyze the phenotypes jointly. However, this is rarely done, and there is a lack of well-developed methods for multiple-phenotype analysis. Here we propose novel approaches for genome-wide association analysis, which scan the genome one SNP at a time for association with multivariate traits. The first half of this thesis focuses on an analytic model averaging approach which bi-partitions traits into associated and unassociated, fits all such models and measures evidence of association using a Bayes factor. The discrete nature of the model allows very fine control of prior beliefs about which sets of traits are more likely to be jointly associated. Using simulated data we show that this method can have much greater power than simpler approaches that do not explicitly model residual correlation between traits. On real data of six hematological parameters in 3 population cohorts (KORA, UKNBS and TwinsUK) from the HaemGen consortium, this model allows us to uncover an association at the RCL locus that was not identified in the original analysis but has been validated in a much larger study. In the second half of the thesis we propose and explore the properties of models that use priors encouraging sparse solutions, in the sense that genetic effects on phenotypes are shrunk towards zero when there is little evidence of association. To do this we explore and use spike and slab (SAS) priors. All methods combine both hypothesis testing, via calculation of a Bayes factor, and model selection, which occurs implicitly via the sparsity priors.
We have successfully implemented a Variational Bayesian approach to fit this model, which provides a tractable approximation to the posterior distribution, and allows us to approximate the very high-dimensional integral required for the Bayes factor calculation. This approach has a number of desirable properties. It can handle missing phenotype data, which is a real feature of most studies. It allows for both correlation due to relatedness between subjects or population structure and residual phenotype correlation. It can be viewed as a sparse Bayesian multivariate generalization of the mixed model approaches that have become popular recently in the GWAS literature. In addition, the method is computationally fast and can be applied to millions of SNPs for a large number of phenotypes. Furthermore we apply our method to 15 glycans from 3 isolated population cohorts (ORCADES, KORCULA and VIS), where we uncover association at a known locus, not identified in the original study but discovered later in a larger one. We conclude by discussing future directions.
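
As a much simplified illustration of the spike-and-slab idea mentioned in the abstract above, the sketch below computes a Bayes factor and a posterior inclusion probability for a single SNP and a single trait. It is not the thesis's multivariate variational method. It assumes the effect estimate is normally distributed around the true effect with a known standard error, a point-mass "spike" at zero, and a normal "slab" with standard deviation `slab_sd`. All names and values are hypothetical.

```python
import math


def log_bayes_factor(beta_hat, se, slab_sd):
    """Log Bayes factor for slab (effect ~ N(0, slab_sd^2)) versus spike (effect = 0).

    Under the spike, beta_hat ~ N(0, se^2).
    Under the slab, beta_hat ~ N(0, se^2 + slab_sd^2).
    The log Bayes factor is the log ratio of these two densities.
    """
    v0 = se ** 2
    v1 = se ** 2 + slab_sd ** 2
    return 0.5 * math.log(v0 / v1) + 0.5 * beta_hat ** 2 * (1.0 / v0 - 1.0 / v1)


def posterior_inclusion_prob(beta_hat, se, slab_sd, prior_inclusion):
    """Posterior probability that the effect comes from the slab."""
    log_bf = log_bayes_factor(beta_hat, se, slab_sd)
    prior_odds = prior_inclusion / (1.0 - prior_inclusion)
    posterior_odds = prior_odds * math.exp(log_bf)
    return posterior_odds / (1.0 + posterior_odds)


if __name__ == "__main__":
    # Hypothetical estimates: one near zero, one several standard errors away.
    for beta_hat in (0.0, 0.5):
        pip = posterior_inclusion_prob(beta_hat, se=0.1, slab_sd=0.2, prior_inclusion=0.01)
        print(beta_hat, pip)
```

With an estimate near zero the Bayes factor falls below one and the posterior inclusion probability drops below its prior value, which is the shrinkage behaviour that sparsity priors are designed to produce.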
33

Bayesian methods for estimating human ancestry using whole genome SNP data

Churchhouse, Claire January 2012
The past five years have seen the discovery of a wealth of genetic variants associated with an incredible range of diseases and traits that have been identified in genome-wide association studies (GWAS). These GWAS have typically been performed in individuals of European descent, prompting a call for such studies to be conducted over a more diverse range of populations. These include groups such as African Americans and Latinos as they are recognised as bearing a disproportionately large burden of disease in the U.S. population. The variation in ancestry among such groups must be correctly accounted for in association studies to avoid spurious hits arising due to differences in ancestry between cases and controls. Such ancestral variation is not all problematic as it may also be exploited to uncover loci associated with disease in an approach known as admixture mapping, or to estimate recombination rates in admixed individuals. Many models have been proposed to infer genetic ancestry and they differ in their accuracy, the type of data they employ, their computational efficiency, and whether or not they can handle multi-way admixture. Despite the number of existing models, there is an unfulfilled requirement for a model that performs well even when the ancestral populations are closely related, is extendible to multi-way admixture scenarios, and can handle whole-genome data while remaining computationally efficient. In this thesis we present a novel method of ancestry estimation named MULTIMIX that satisfies these criteria. The underlying model we propose uses a multivariate normal to approximate the distribution of a haplotype at a window of contiguous SNPs given the ancestral origin of that part of the genome. The observed allele types and the ancestry states that we aim to infer are incorporated into a hidden Markov model to capture the correlations in ancestry that we expect to exist between neighbouring sites.
We show via simulation studies that its performance on two-way and three-way admixture is competitive with state-of-the-art methods, and apply it to several real admixed samples of the International HapMap Project and the 1000 Genomes Project.
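
The hidden Markov model structure described in the abstract above can be sketched with a generic forward-backward pass. This is not the MULTIMIX implementation. As simplifying assumptions, each window is summarised by a single number with a univariate normal emission per ancestry (the thesis uses a multivariate normal over a window of SNPs), and ancestry switches between adjacent windows with one fixed probability. All names and values are hypothetical.

```python
import math


def normal_pdf(x, mean, sd):
    """Density of a univariate normal distribution."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def posterior_ancestry(obs, means, sds, switch_prob):
    """Posterior probability of each ancestry state at each window.

    obs: one summary value per window; means/sds: emission parameters per
    ancestry (at least two ancestries); switch_prob: probability that the
    ancestry changes between adjacent windows.
    """
    k, n = len(means), len(obs)

    def trans(i, j):
        # Stay with probability 1 - switch_prob, otherwise move uniformly.
        return 1.0 - switch_prob if i == j else switch_prob / (k - 1)

    emit = [[normal_pdf(x, means[s], sds[s]) for s in range(k)] for x in obs]

    # Forward pass, rescaled at each step to avoid numerical underflow.
    fwd = []
    cur = [emit[0][s] / k for s in range(k)]  # uniform initial distribution
    total = sum(cur)
    fwd.append([v / total for v in cur])
    for t in range(1, n):
        cur = [emit[t][j] * sum(fwd[t - 1][i] * trans(i, j) for i in range(k))
               for j in range(k)]
        total = sum(cur)
        fwd.append([v / total for v in cur])

    # Backward pass, also rescaled at each step.
    bwd = [[1.0] * k for _ in range(n)]
    for t in range(n - 2, -1, -1):
        cur = [sum(trans(i, j) * emit[t + 1][j] * bwd[t + 1][j] for j in range(k))
               for i in range(k)]
        total = sum(cur)
        bwd[t] = [v / total for v in cur]

    # Combine and normalise to obtain per-window posteriors.
    post = []
    for t in range(n):
        w = [fwd[t][s] * bwd[t][s] for s in range(k)]
        total = sum(w)
        post.append([v / total for v in w])
    return post


if __name__ == "__main__":
    windows = [0.1, 0.0, 0.2, 0.9, 1.1, 1.0]  # hypothetical window summaries
    for row in posterior_ancestry(windows, means=[0.0, 1.0], sds=[0.3, 0.3], switch_prob=0.05):
        print([round(p, 3) for p in row])
```

The transition term is what lets neighbouring windows inform each other, so an ambiguous window tends to inherit the ancestry of its neighbours.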
34

Statistical and computational methodology for the analysis of forensic DNA mixtures with artefacts

Graversen, Therese January 2014
This thesis proposes and discusses a statistical model for interpreting forensic DNA mixtures. We develop methods for estimation of model parameters and assessing the uncertainty of the estimated quantities. Further, we discuss how to interpret the mixture in terms of predicting the set of contributors. We emphasise the importance of challenging any interpretation of a particular mixture, and for this purpose we develop a set of diagnostic tools that can be used in assessing the adequacy of the model to the data at hand as well as in a systematic validation of the model on experimental data. An important feature of this work is that all methodology is developed entirely within the framework of the adopted model, ensuring a transparent and consistent analysis. To overcome the challenge that lies in handling the large state space for DNA profiles, we propose a representation of a genotype that exhibits a Markov structure. Further, we develop methods for efficient and exact computation in a Bayesian network. An implementation of the model and methodology is available through the R package DNAmixtures.
35

Genetics of ankylosing spondylitis

Karaderi, Tugce January 2012
Ankylosing spondylitis (AS) is a common inflammatory arthritis of the spine and other affected joints, which is highly heritable, being strongly influenced by the HLA-B27 status, as well as hundreds of mostly unknown genetic variants of smaller effect. The aim of my research was to confirm some of the previously observed genetic associations and to identify new associations, many of which are in biological pathways relevant to AS pathogenesis, most notably the IL-23/TH17 axis (IL23R) and antigen presentation (ERAP1 and ERAP2). Studies presented in this thesis include replication and refinement of several potential associations initially identified by earlier GWAS (WTCCC-TASC, 2007 and TASC, 2010). I conducted an extended study of IL23R association with AS and undertook a meta-analysis, confirming the association between AS and IL23R (non-synonymous SNP rs11209026, p = 1.5 × 10⁻⁹, OR = 0.61). An extensive re-sequencing and fine mapping project, including a meta-analysis, to replicate and refine the association of TNFRSF1A with AS was also undertaken; a novel variant in intron 6 was identified and a weak association with a low frequency variant, rs4149584 (p = 0.01, OR = 1.58), was detected. Somewhat stronger associations were seen with rs4149577 (p = 0.002, OR = 0.91) and rs4149578 (p = 0.015, OR = 1.14) in the meta-analysis. Associations at several additional loci had been identified by a more recent GWAS (WTCCC2-TASC, 2011). I used in silico techniques, including imputation using a denser panel of variants from the 1000 Genomes Project, conditional analysis and rare/low frequency variant analysis, to refine these associations. Imputation analysis (1782 cases/5167 controls) revealed novel associations with ERAP2 (rs4869313, p = 7.3 × 10⁻⁸, OR = 0.79) and several additional candidate loci including IL6R, UBE2L3 and 2p16.3.
Ten SNPs were then directly typed in an independent sample (1804 cases/1848 controls) to replicate selected associations and to determine the imputation accuracy. I established that imputation using the 1000 Genomes Project pilot data was largely reliable, specifically for common variants (genotype concordance ~97%). However, more accurate imputation of low frequency variants may require larger reference populations, like the most recent 1000 Genomes reference panels. The results of my research provide a better understanding of the complex genetics of AS, and help identify future targets for genetic and functional studies.
