Global ETD Search

31	Bayesian Model Uncertainty and Prior Choice with Applications to Genetic Association Studies Wilson, Melanie Ann January 2010 The Bayesian approach to model selection allows for uncertainty in both model-specific parameters and in the models themselves. Much of the recent Bayesian model uncertainty literature has focused on defining these prior distributions in an objective manner, providing conditions under which Bayes factors lead to the correct model selection, particularly in the situation where the number of variables, p, increases with the sample size, n. This is certainly the case in our area of motivation: the biological application of genetic association studies involving single nucleotide polymorphisms. While the most common approach to this problem has been to apply a marginal test to all genetic markers, we employ analytical strategies that improve upon these marginal methods by modeling the outcome variable as a function of a multivariate genetic profile using Bayesian variable selection. In doing so, we perform variable selection on a large number of correlated covariates within studies involving modest sample sizes.
In particular, we present an efficient Bayesian model search strategy that searches over the space of genetic markers and their genetic parametrization. The resulting method for Multilevel Inference of SNP Associations (MISA) allows computation of multilevel posterior probabilities and Bayes factors at the global, gene and SNP level. We use simulated data sets to characterize MISA's statistical power, and show that MISA has higher power to detect association than standard procedures. Using data from the North Carolina Ovarian Cancer Study (NCOCS), MISA identifies variants that were not identified by standard methods and have been externally 'validated' in independent studies.
In the context of Bayesian model uncertainty for problems involving a large number of correlated covariates, we characterize commonly used prior distributions on the model space and investigate their implicit multiplicity correction properties, first in the extreme case where the model includes an increasing number of redundant covariates and then under the case of full rank design matrices. We provide conditions on the asymptotic (in n and p) behavior of the model space prior required to achieve consistent selection of the global hypothesis of at least one associated variable in the analysis using global posterior probabilities (i.e. under 0-1 loss). In particular, under the assumption that the null model is true, we show that the commonly used uniform prior on the model space leads to inconsistent selection of the global hypothesis via global posterior probabilities (the posterior probability of at least one association goes to 1) when the rank of the design matrix is finite. In the full rank case, we also show inconsistency when p goes to infinity faster than the square root of n. Alternatively, we show that any model space prior such that the global prior odds of association increases at a rate slower than the square root of n results in consistent selection of the global hypothesis in terms of posterior probabilities. / Dissertation Statistics Biology, Genetics Applied Mathematics Bayes Model Uncertainty Genetic Association Studies Model Space Priors
32	Gene Expression Analyses and Association Studies of Wood Development Genes in Loblolly Pine (Pinus taeda L.) Palle, Sreenath Reddy 2010 August 1900 Gene expression analyses using native populations can provide information on the genetic and molecular mechanisms that determine intraspecific variation and contribute to the understanding of plant development and adaptation in multiple ways. Using quantitative real-time polymerase chain reaction (qRT-PCR), we analyzed the expression of 111 genes with probable roles in wood development in 400 loblolly pine individuals belonging to a population covering much of the natural range. Association mapping techniques are increasingly being used in plants to dissect complex genetic traits and identify genes responsible for the quantitative variation of these traits. We used candidate-gene based association studies to associate single nucleotide polymorphisms (SNPs) in candidate genes with the variation in gene expression. The specific objectives established for this study were to study natural variation in expression of xylem development genes in loblolly pine (Pinus taeda L.) using qRT-PCR, to associate SNPs in candidate genes with the variation in gene expression using candidate-gene based association analyses, and to detect loblolly pine promoter polymorphisms and study their effect on gene expression. Out of the 111 genes analyzed using qRT-PCR, there were significant differences in expression among clones for 106 genes. Candidate-gene based association studies were performed between 3937 SNPs and gene expression to associate SNPs in candidate genes with the variation in gene expression. To the best of our knowledge, this is the first association genetic study where expression of a large number of genes, analyzed in a natural population, has been the phenotypic trait of interest.
We cloned and sequenced promoters of 19 genes, 16 of which are transcription factors involved in wood development and drought response. SNP discovery was done in 13 of these promoters using a panel of 24 loblolly pine clones (unique genotypes). SNP genotyping is underway in the entire association population and association analyses will be done to study the effects of promoter SNPs on gene expression. The results from this project are promising and once these associations have been tested and proved, we believe that they will help in our understanding of the genetics of complex traits. Gene expression analysis Association studies Wood development Loblolly pine Promoter polymorphisms
33	Functional Analysis of the Ovarian Cancer Susceptibility Locus at 9p22.2 Reveals a Transcription Regulatory Network Mediated by BNC2 in Ovarian Cells Buckley, Melissa 01 January 2015 GWAS have identified several chromosomal loci associated with ovarian cancer risk. However, the mechanism underlying these associations remains elusive. We identify candidate functional single nucleotide polymorphisms (SNPs) at the 9p22.2 ovarian cancer susceptibility locus, several of which map to transcriptional regulatory elements active in ovarian cells identified by FAIRE-seq (formaldehyde-assisted isolation of regulatory elements followed by sequencing) and ChIP-seq (chromatin immunoprecipitation followed by sequencing) in relevant cell types. Reporter and electrophoretic mobility shift assays (EMSA) determined the extent to which candidate SNPs had allele-specific effects. Chromosome conformation capture (3C) reveals a physical association between Basonuclin 2 (BNC2) and SNPs with functional properties. This establishes BNC2 as a major target of four candidate functional SNPs in at least two distinct elements. BNC2 codes for a putative transcription regulator containing three pairs of zinc finger (ZF) domains. Furthermore, bnc2 mutation in zebrafish leads to developmental defects including dysmorphic ovaries and sterility, clearly implicating this protein in cellular processes associated with ovarian development. Through DNA binding assays, ChIP-seq, and expression analysis, we show that BNC2 is a transcriptional regulator with a specific DNA recognition sequence whose targets are enriched in genes involved in cell communication. This study reveals a comprehensive regulatory landscape at the 9p22.2 locus and indicates that a likely mechanism of susceptibility to ovarian cancer may include multiple allele-specific changes in DNA regulatory elements, some of which alter BNC2 expression.
This study begins to identify the underlying mechanisms of the 9p22.2 locus association with ovarian cancer and aims to provide data to support advances in care based on one’s genetic composition. Allele Genome Wide Association Studies Single Nucleotide Polymorphism Transcription Zinc Finger Domain Biology Genetics Molecular Biology
34	SNP-set Tests for Sequencing and Genome-Wide Association Studies Barnett, Ian 06 June 2014 In this dissertation we propose methodology for testing SNP-sets for genetic associations, both for sequencing and genome-wide association studies. Due to the large scale of this kind of data, there is an emphasis on producing methodology that is not only accurate and powerful, but also computationally efficient. Biostatistics genome-wide association studies higher criticism sequencing SNP-set testing
35	An Examination of Hardy-Weinberg Disequilibrium and Statistical Testing in Genetic Association Studies Grover, Vaneeta Kaur 18 June 2010 In an unpublished study in Toronto it was observed that cases were in Hardy-Weinberg Equilibrium at a locus whereas their family members were in Hardy-Weinberg Disequilibrium (HWD). This led to an investigation of relatives of affected individuals to see whether the multiplicative model could be revealed by a nonzero HWD coefficient in relatives. Genotypic frequencies and HWD coefficients were derived for affected individuals and their affected and unaffected relatives. Methods were also developed to test for association using data from affected individuals and their relatives. In addition, a model was developed to assess whether the HWD observed in a data set from a stratified population can be explained by both genetic association and stratification. Parameter estimates for these models can be obtained using maximum likelihood methods, and used to deduce the mode of inheritance of the disease. / Departure from HWE (HWD) in a sample may indicate genotyping error, population stratification, selection bias, or some combination thereof. Therefore, loci exhibiting HWD are often excluded from association studies. However, it has been shown that in case-control studies HWD can result from a genetic effect at the locus, and HWD at a marker locus can be interpreted as evidence for association with a disease. In an unpublished study in Toronto it was observed that cases were in Hardy-Weinberg equilibrium at a locus whereas their family members were in HWD. It has been shown that the HWD coefficient for a multiplicative genetic model is zero. This led to an investigation of relatives of affected individuals to see whether the multiplicative model could be revealed by a nonzero HWD coefficient in relatives.
Genotypic frequencies and HWD coefficients were derived for affected individuals and their affected and unaffected relatives. A substantial HWD was found in both individuals in dominant and recessive genetic models, but HWD is only slightly nonzero for additive and multiplicative models. Methods were also developed to test for association using data from affected individuals and their relatives. Parameter estimates for these models can be obtained using maximum likelihood methods, and estimates provide valuable information regarding the mode of inheritance of the disease. The methods were applied to 112 discordant sib pairs with Alzheimer’s disease typed for the ApoE polymorphism, and a significant association was observed between the ε4 ApoE allele and Alzheimer’s disease. Case-control studies may indicate spurious association with a marker locus in a stratified population. Methods were developed to determine if the HWD observed in a data set from a stratified population can be explained by both genetic association and stratification. Parameter estimates for these models can be obtained using maximum likelihood methods, and used to deduce the mode of inheritance of the disease. Applying the model to the R990G SNP of the CASR gene, it was found that the HWD was adequately explained by a recessive genetic association and a stratification proportion of 10%, consistent with the population of Toronto.
36	Estimating the Overlap of Top Instances in Lists Ranked by Correlation to Label Damavandi, Babak Unknown Date No description available. Machine Learning Bioinformatics Gene signatures Genome wide association studies GWAS Microarray
37	Approaches Incorporating Evidence for Population Stratification Bias in Genetic Association Analyses Combining Individual and Family Data Mirea, Olguta Lucia 13 June 2011 Statistical methods that integrate between-individual (IA) and within-family (FA) genetic association analyses can increase statistical power to identify disease susceptibility genes; however, combining IA and FA is valid only when the IA are free of population stratification bias (PSB). Existing methods initially test for PSB by comparing IA and FA results using an arbitrary testing level αPSB, typically 5%. Combined analyses are performed if no significant PSB is detected; otherwise analyses are restricted to FA. As a novel alternative, we propose a weighted (WGT) framework that combines the estimate from the most powerful analysis subject to PSB with the most powerful robust FA estimate, using weights based on the p-value from the PSB test. The WGT approach generalizes existing methods by using a continuous weighting function that depends only on the observed PSB p-value instead of a binary one that also depends on specification of an arbitrary PSB testing level αPSB. Simulations of quantitative trait and case-control data show that in comparison to existing methods, the WGT approach has 5% type I error closer to the nominal level, increased (decreased) accuracy for larger (smaller) PSB levels, and overall increased positive predictive value. The resulting PSB correction is SNP-specific and provides a good compromise between type I error control and power in candidate gene or confirmation studies limited to few loci, when PSB is likely and there are no additional empirical data available to correct PSB. We applied the WGT approach to a case-control study of childhood leukemia and a study of diabetes complications with time-to-event outcomes derived from repeated measurements obtained over 17 years of follow-up.
To directly analyze the longitudinal measurements without specification of event thresholds, we developed fully Bayesian latent change-point time (LCPT) models for IA and FA. In analogy with the WGT approach, we also considered an extended LCPT model incorporating PSB evidence in analyses combining IA and FA. genetic association studies population stratification bias individual-level analyses family-based analyses 0308
39	Genetic susceptibility to invasive Nontyphoidal Salmonella disease in African children Gilchrist, James January 2016 Nontyphoidal Salmonella (NTS) causes invasive, and frequently fatal, disease in African children. The burden of disease secondary to NTS reflects inadequacy of Salmonella-control strategies in Africa, with expanding antibiotic resistance, and no licensed anti-NTS vaccine. The delivery of improved interventions to prevent, diagnose, and treat invasive NTS (iNTS) infection, will be facilitated by an improved understanding of the biological determinants of susceptibility to iNTS, including host genetic factors. To identify host genetic determinants of iNTS disease, we performed a GWAS and replication analysis of NTS bacteraemia in African children. This analysis identified and validated a common genetic variant in STAT4 associated with increased iNTS risk. To characterise the function of the NTS-associated STAT4 variant, we utilised a genotype-selectable bioresource of healthy European adults and samples from African children with iNTS disease. In these experiments, the risk genotype at STAT4 is associated with reduced STAT4 RNA expression in stimulated leukocytes, and reduced IFNγ production in both ex vivo stimulated natural killer cells and in the serum of African children with acute NTS bacteraemia. To validate genetic variation suggestively associated with NTS bacteraemia in the GWAS, NTS-associated loci with evidence of regulatory function were prioritised for functional characterisation. Using in vitro models of intracellular Salmonella infection and RNA interference, I characterise the role of a candidate NTS-susceptibility determinant, EVI5L, in Salmonella infections. Finally, applying a pathway enrichment analysis to the NTS bacteraemia GWAS demonstrated that NTS-associated genetic variation in African children is enriched for methionine salvage enzymes.
I further investigate the potential for host-pathogen interaction in this pathway, generating and characterising Salmonella mutants deficient in methionine metabolism. Taken together, these data represent the first unbiased assessment of genetic susceptibility to iNTS disease in unselected populations. These results have important implications for the design of Salmonella-control strategies for use in Africa.
40	Genetic association methods for multiple types of traits in family samples Wang, Shuai 08 April 2016 Statistical association tests of quantitative traits have been widely used in the past decade to locate loci associated with a disease trait. For instance, Genome Wide Association Studies (GWAS) have led to tremendous success in finding susceptibility genes or associated loci. However, most of the past studies were based on unrelated samples focusing on quantitative or qualitative traits. The analysis of polychotomous traits in family samples is very challenging. This dissertation describes three projects related to methods to conduct association tests beyond continuous traits, such as multinomial traits, bivariate traits, and tests involving haplotypes. The first project focuses on developing a statistical approach to test the association between common or low-frequency variants with a multinomial trait in family samples. It is an important issue because there is no computationally efficient software available for this type of question. We employ Laplace approximation in conjunction with an efficient grid-search strategy to obtain an approximate maximum log-likelihood function and the Maximum Likelihood Estimate (MLE) of the variance component. We also successfully incorporate the kinship matrix to adjust for the familial correlation, based on a regression framework. Extensive simulation studies are performed to evaluate the type-I error rate and power in scenarios with causal variants of differing Minor Allele Frequency (MAF). In the second project, we propose an approach to test the association between a genetic variant and a bivariate trait arising from a combination of a quantitative and a binary trait in family samples, based on Extended Generalized Estimating Equations (EGEE). Multiple phenotype-genotype association tests are often reduced to univariate tests, decreasing efficiency and power.
Our approach is shown to be much more powerful and efficient than univariate association tests adjusted for multiple testing. The third project involves the development of a general framework for meta-analysis of haplotype association tests, applicable to both unrelated and family samples. Although meta-analysis has been widely used in single-variant and gene-based tests, there are few existing methods to meta-analyze haplotype association tests. A predominant advantage of our novel approach is that it accommodates cohort-specific haplotypes as well as haplotypes common to all cohorts. The cohort participants may be either related or unrelated. Our approach consists of two stages: in the first stage, each cohort performs a haplotype association test, reports the estimates of effect size, variance, haplotypes, and their frequency. In the second stage, a generalized least square method is applied to combine the results of all the cohorts into one vector of meta-analysis coefficients. Our approach is shown to have the correct type-I error rate in scenarios with different between and within cohort variation. We also present an application to exome-chip data from a large consortium. Through the three projects, we are able to tackle the problem of conducting association tests for non-continuous traits in family samples. All the approaches achieve the correct type-I error rate and are computationally efficient. We hope these approaches will not only facilitate analyses of categorical traits in family samples, but will also provide a basis for future methodological development of statistical approaches for non-continuous traits. Biostatistics EGEE Family samples Genetic association studies Haplotype association test Meta-analysis Multinomial trait
