1 |
Genetics of a color polymorphism in Heliconius doris
Benson, Caleb 07 August 2020
Balancing selection refers to the maintenance of multiple phenotypic variants within a population. A number of mechanisms have been proposed to explain the origin and persistence of polymorphisms, but in most of the specific instances in which polymorphisms occur the mechanism remains unresolved. This study aims to identify the genetic basis of a polymorphism in the butterfly Heliconius doris, which displays four distinct color patterns on the dorsal hindwings. While Müllerian mimicry theory predicts that phenotypes will converge on a common aposematic pattern, this is not the case in Heliconius doris. We identify an interval perfectly associated with the presence/absence of the red ray phenotype, and propose potential mechanisms and a genetic architecture through which this polymorphism has been allowed to persist.
|
2 |
ITGB5 and AGFG1 variants are associated with severity of airway responsiveness
Himes, Blanca; Qiu, Weiliang; Klanderman, Barbara; Ziniti, John; Senter-Sylvia, Jody; Szefler, Stanley; Lemanske, Jr, Robert; Zeiger, Robert; Strunk, Robert; Martinez, Fernando; Boushey, Homer; Chinchilli, Vernon; Israel, Elliot; Mauger, David; Koppelman, Gerard; Nieuwenhuis, Maartje; Postma, Dirkje; Vonk, Judith; Rafaels, Nicholas; Hansel, Nadia; Barnes, Kathleen; Raby, Benjamin; Tantisira, Kelan; Weiss, Scott January 2013
BACKGROUND: Airway hyperresponsiveness (AHR), a primary characteristic of asthma, involves increased airway smooth muscle contractility in response to certain exposures. We sought to determine whether common genetic variants were associated with AHR severity.
METHODS: A genome-wide association study (GWAS) of AHR, quantified as the natural log of the dosage of methacholine causing a 20% drop in FEV1, was performed with 994 non-Hispanic white asthmatic subjects from three drug clinical trials: CAMP, CARE, and ACRN. Genotyping was performed on Affymetrix 6.0 arrays, and imputed data based on HapMap Phase 2 were used to measure the association of SNPs with AHR using a linear regression model. Replication of primary findings was attempted in 650 white subjects from DAG and 3,354 white subjects from LHS. Evidence that the top SNPs were eQTL of their respective genes was sought using expression data available for 419 white CAMP subjects.
RESULTS: The top primary GWAS associations were in rs848788 (P-value 7.2E-07) and rs6731443 (P-value 2.5E-06), located within the ITGB5 and AGFG1 genes, respectively. The AGFG1 result replicated at a nominally significant level in one independent population (LHS P-value 0.012), and the SNP had a nominally significant unadjusted P-value (0.0067) for being an eQTL of AGFG1.
CONCLUSIONS: Based on current knowledge of ITGB5 and AGFG1, our results suggest that variants within these genes may be involved in modulating AHR. Future functional studies are required to confirm that our associations represent true biologically significant findings.
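The single-SNP test described in the METHODS can be sketched as an ordinary least squares regression of the log-transformed methacholine dose on allele dosage. This is a minimal illustration only: the function name and the toy data are hypothetical, and the published analysis may well have included covariates that are not modeled here.

```python
import math

def simple_linear_regression(x, y):
    """Ordinary least squares fit of y = intercept + slope * x.

    x: allele dosages (0, 1, 2, or imputed fractional values)
    y: phenotype values, e.g. ln(methacholine dose causing a 20% FEV1 drop)
    Returns (slope, intercept, t_statistic_for_slope).
    """
    n = len(x)
    if n != len(y) or n < 3:
        raise ValueError("need at least 3 paired observations")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("SNP is monomorphic in this sample")
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # Residual sum of squares and standard error of the slope
    rss = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    se = math.sqrt(rss / (n - 2) / sxx)
    t_stat = slope / se if se > 0 else math.copysign(math.inf, slope)
    return slope, intercept, t_stat

# Toy data (made up for illustration): six subjects, one SNP
dosages = [0, 1, 2, 0, 1, 2]
ln_dose = [1.0, 1.4, 2.1, 0.8, 1.6, 1.9]
slope, intercept, t_stat = simple_linear_regression(dosages, ln_dose)
```

In a genome-wide scan this fit would be repeated for every SNP, with the slope's t statistic converted to a P-value.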
|
3 |
Analysis of high-density SNP data from complex populations
Floyd, James A. B. January 2011
Data from a Croatian isolate population are analysed in a genome-wide association study (GWAS) for a variety of disease-related quantitative traits. A novel genome-wide approach to analysing pedigree-based association data called GRAMMAR is utilised. One of the significant findings, for uric acid, is followed up in greater detail, and is replicated in another isolate population, from Orkney. The associated SNPs are located in the SLC2A9 gene, coding for a known glucose transporter, which leads to identification of SLC2A9 as a urate transporter too (Vitart et al., 2008). These SNPs are later implicated in affecting gout, a disease known to be linked with high serum uric acid levels, in an independent study (Dehghan et al., 2008). Subsequently, different ways of using SNP data to identify quantitative trait loci (QTL) in genome-wide association (GWA) studies are investigated. Several multi-marker approaches are compared to single SNP analysis using simulated phenotypes and real genotype data, and results show that for rare variants haplotype analysis is the most effective method of detection. Finally, the multi-marker methods are compared with single SNP analysis on the real uric acid data. Interpretation of the real data results was complicated by low sample size, since only founder and unrelated individuals may be used for population-based haplotype analysis. Nonetheless, results of the prior analyses of simulated data indicate that multi-marker methods, in particular haplotypes, may greatly facilitate detection of QTL with low minor allele frequency in GWA studies.
|
4 |
The information bottleneck method for genome-wide association studies.
Fang, Shenying. Xiong, Momiao; Boerwinkle, Eric; Kapadia, Asha Seth. Unknown Date
Source: Dissertation Abstracts International, Volume: 69-10, Section: B, page: 5857. Adviser: Momiao Xiong. Includes bibliographical references (leaves xx-xx).
|
5 |
Polygenic prediction and GWAS of depression, PTSD, and suicidal ideation/self-harm in a Peruvian cohort
Shen, Hanyang; Gelaye, Bizu; Huang, Hailiang; Rondon, Marta B.; Sanchez, Sixto; Duncan, Laramie E. 01 September 2020
LED and HS have been funded by startup funds from Stanford and a pilot grant to LED from the Stanford Center for Clinical and Translation Research and Education (UL1 TR001085, PI Greenberg). LED has also been funded by Cohen Veterans Bioscience (CVB), and she is part of the CVB Working Group for PTSD Adaptive Platform Trial. BG has been funded by the NIH (R01-HD-059835, PI Williams) and CVB. HH has been funded by the NIH (NIH K01DK114379 and NIH R21AI139012), the Zhengxu and Ying He Foundation, and the Stanley Center for Psychiatric Research. MBR received funds from WPA Congress Mexico City 2018, Guayaquil CEPAM 2019, Asunción X CONGRESO LATINOAMERICANO DE LA FLAPB 2018, Guayaquil 2019 (Bago), and Lancet Psychiatry, London (commission on Violence against women) 2019. SS declares no potential conflict of interest. / Genome-wide approaches including polygenic risk scores (PRSs) are now widely used in medical research; however, few studies have been conducted in low- and middle-income countries (LMICs), especially in South America. This study was designed to test the transferability of psychiatric PRSs to individuals with different ancestral and cultural backgrounds and to provide genome-wide association study (GWAS) results for psychiatric outcomes in this sample. The PrOMIS cohort (N = 3308) was recruited from prenatal care clinics at the Instituto Nacional Materno Perinatal (INMP) in Lima, Peru. Three major psychiatric outcomes (depression, PTSD, and suicidal ideation and/or self-harm) were scored by interviewers using valid Spanish questionnaires. The Illumina Multi-Ethnic Global chip was used for genotyping. Standard procedures for PRSs and GWAS were used along with extra steps to rule out confounding due to ancestry. Depression PRSs significantly predicted depression, PTSD, and suicidal ideation/self-harm and explained up to 0.6% of phenotypic variation (minimum p = 3.9 × 10⁻⁶). The associations were robust to sensitivity analyses using more homogeneous subgroups of participants and alternative choices of principal components. Successful polygenic prediction of three psychiatric phenotypes in this Peruvian cohort suggests that genetic influences on depression, PTSD, and suicidal ideation/self-harm are at least partially shared across global populations. These PRS and GWAS results from this large Peruvian cohort advance genetic research (and the potential for improved treatments) for diverse global populations. / National Institutes of Health / Peer reviewed
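For readers unfamiliar with polygenic risk scores, the core computation is a weighted sum of an individual's effect-allele dosages, with weights taken from a discovery GWAS. The sketch below assumes that standard additive form; the function name, SNP identifiers, and weights are hypothetical, and the study's actual pipeline (SNP selection, ancestry adjustment) is not reproduced.

```python
def polygenic_risk_score(dosages, weights):
    """Weighted sum of effect-allele dosages for one individual.

    dosages: dict mapping SNP id -> effect-allele count (0, 1, 2, or an imputed dosage)
    weights: dict mapping SNP id -> per-allele effect size from a discovery GWAS
    SNPs that are absent from the individual's genotypes are skipped.
    """
    return sum(w * dosages[snp] for snp, w in weights.items() if snp in dosages)

# Hypothetical weights and genotypes, for illustration only
weights = {"snp1": 0.1, "snp2": -0.3, "snp3": 0.05}
genotypes = {"snp1": 2, "snp2": 1}  # snp3 was not genotyped
score = polygenic_risk_score(genotypes, weights)
```

The resulting scores would then be tested as predictors of each phenotype, typically alongside ancestry principal components.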
|
6 |
New Statistical Methods and Computational Tools for Mining Big Data, with Applications in Plant Sciences
Michels, Kurt Andrew January 2016
The purpose of this dissertation is to develop new statistical tools for mining big data in plant sciences. In particular, the dissertation consists of four inter-related projects that address various methodological and computational challenges in phylogenetic methods. Project 1 aims to systematically test different optimization tools and provide useful strategies to improve optimization in practice. Project 2 develops a new R package, rPlant, which provides a friendly and convenient toolbox for users of iPlant. Project 3 presents a fast and effective group-screening method to identify important genetic factors in GWAS, with theoretical justifications and nice asymptotic properties. Project 4 develops a new statistical tool to identify gene-gene interactions, with the ability to handle interactions between groups of covariates.
|
7 |
Design and analysis of genome-wide association studies
Barrett, Jeffrey C. January 2008
Despite many years of effort, linkage and candidate gene association studies have yielded disappointingly few risk loci for common human diseases such as diabetes, auto-immune disorders and cancers. Large sample sizes, increased understanding of the patterns of correlation in genetic variation, and plunging genotyping costs have enabled genome-wide association studies, which have good power to detect common risk alleles of modest effect. I present an evaluation of SNP choice in study design and show that overall, despite substantial differences in genotyping technologies, marker selection strategies and number of markers assayed, the first generation platforms all offer good levels of genome coverage (∼ 70%). I next describe the largest such project undertaken to date, the Wellcome Trust Case Control Consortium, which consisted of 2000 cases from each of seven common diseases and 3000 shared controls. It identified nearly two dozen new associations. I demonstrate the importance of careful data quality control, including both standard and unorthodox analyses. I next focus on the association results therein for Crohn’s disease. I present a replication experiment in over 1000 additional Crohn’s patients which unambiguously confirmed six previously published loci and four new loci. Next I describe, in a general context, several issues impeding the combination of genome-wide scans, including data annotation, population structure and differences in genotyping platform. Each of these problems is shown to be tractable with available methods, provided that these methods are applied prudently. I present the results of a meta-analysis of three genome-wide scans for Crohn’s disease. The data showed a striking excess of significant associations, and a replication experiment involving over 4000 independent Crohn’s patients verified twenty new risk loci. 
Finally, I discuss the early success of genome-wide association and its consequences for further understanding the biology of human disease.
|
8 |
Dissecting genetic interactions in complex traits
Hemani, Gibran January 2012
Of central importance in the dissection of the components that govern complex traits is understanding the architecture of natural genetic variation. Genetic interaction, or epistasis, constitutes one aspect of this, but epistatic analysis has been largely avoided in genome wide association studies because of statistical and computational difficulties. This thesis explores both issues in the context of two-locus interactions. Initially, through simulation and deterministic calculations it was demonstrated that not only can epistasis maintain deleterious mutations at intermediate frequencies when under selection, but that it may also have a role in the maintenance of additive variance. Based on the epistatic patterns that are evolutionarily persistent, and the frequencies at which they are maintained, it was shown that exhaustive two dimensional search strategies are the most powerful approaches for uncovering both additive variance and the other genetic variance components that are co-precipitated. However, while these simulations demonstrate encouraging statistical benefits, two dimensional searches are often computationally prohibitive, particularly with the marker densities and sample sizes that are typical of genome wide association studies. To address this issue different software implementations were developed to parallelise the two dimensional triangular search grid across various types of high performance computing hardware. Of these, particularly effective was using the massively-multi-core architecture of consumer level graphics cards. While the performance will continue to improve as hardware improves, at the time of testing the speed was 2-3 orders of magnitude faster than CPU based software solutions that are in current use. Not only does this software enable epistatic scans to be performed routinely at minimal cost, but it is now feasible to empirically explore the false discovery rates introduced by the high dimensionality of multiple testing. 
Through permutation analysis it was shown that the significance threshold for epistatic searches is a function of both marker density and population sample size, and that because of the correlation structure that exists between tests the threshold estimates currently used are overly stringent. Although the relaxed threshold estimates constitute an improvement in the power of two dimensional searches, detection is still most likely limited to relatively large genetic effects. Through direct calculation it was shown that, in contrast to the additive case where the decay of estimated genetic variance was proportional to falling linkage disequilibrium between causal variants and observed markers, for epistasis this decay was exponential. One way to rescue poorly captured causal variants is to parameterise association tests using haplotypes rather than single markers. A novel statistical method that uses a regularised parameter selection procedure on two locus haplotypes was developed, and through extensive simulations it can be shown that it delivers a substantial gain in power over single marker based tests. Ultimately, this thesis seeks to demonstrate that many of the obstacles in epistatic analysis can be ameliorated, and with the current abundance of genomic data gathered by the scientific community direct search may be a viable method to qualify the importance of epistasis.
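The computational cost noted in the abstract follows from the size of the triangular search grid: with m markers there are m(m-1)/2 unordered pairs to test. The sketch below only illustrates that count and the enumeration order; the function names and the marker count of 500,000 are illustrative assumptions, not figures from the thesis.

```python
from itertools import combinations

def n_pairwise_tests(n_markers):
    """Number of unordered marker pairs in an exhaustive two-locus scan."""
    return n_markers * (n_markers - 1) // 2

def pair_indices(n_markers):
    """Iterate over each marker pair (i, j) with i < j exactly once,
    i.e. the upper triangle of the marker-by-marker grid."""
    return combinations(range(n_markers), 2)

# An assumed panel of 500,000 markers gives roughly 1.25e11 pairwise tests
total_tests = n_pairwise_tests(500_000)
```

Because each pair can be tested independently of the others, the grid divides naturally into blocks that can be distributed across many cores.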
|
9 |
Imputing Genotypes Using Regularized Generalized Linear Regression Models
Griesman, Joshua 14 June 2012
As genomic sequencing technologies continue to advance, researchers are furthering their understanding of the relationships between genetic variants and expressed traits (Hirschhorn and Daly, 2005). However, missing data can significantly limit the power of a genetic study. Here, the use of a regularized generalized linear model, denoted GLMNET, is proposed to impute missing genotypes. The method aims to address certain limitations of earlier regression approaches to genotype imputation, particularly multicollinearity among predictors. The performance of the GLMNET-based method is compared to the performance of the phase-based method fastPHASE. Two simulation settings were evaluated: a sparse-missing model and a small-panel expansion model. The sparse-missing model simulated a scenario where SNPs were missing in a random fashion across the genome. In the small-panel expansion model, a set of test individuals was genotyped at only a small subset of the SNPs of the large panel. Each imputation method was tested in the context of two data sets: Canadian Holstein cattle data and human HapMap CEU data. Although the proposed method was able to perform with high accuracy (>90% in all simulations), fastPHASE performed with higher accuracy (>94%). However, the new method, which was coded in R, was able to impute genotypes with better time efficiency than fastPHASE, and this could be further improved by optimizing in a compiled language.
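The abstract does not say how imputation accuracy was scored. One common choice, assumed here, is the concordance rate: the fraction of deliberately masked genotypes that the method recovers exactly. The function name and the toy genotype vectors below are hypothetical.

```python
def imputation_accuracy(true_genotypes, imputed_genotypes, masked_positions):
    """Fraction of masked genotypes that were imputed correctly.

    true_genotypes, imputed_genotypes: sequences of genotype codes (e.g. 0, 1, 2)
    masked_positions: indices that were hidden before imputation
    """
    if not masked_positions:
        raise ValueError("no masked positions to evaluate")
    correct = sum(
        1 for i in masked_positions if true_genotypes[i] == imputed_genotypes[i]
    )
    return correct / len(masked_positions)

# Toy example: four genotypes were masked, three were recovered
truth = [0, 1, 2, 1, 0, 2]
imputed = [0, 1, 2, 2, 0, 2]
accuracy = imputation_accuracy(truth, imputed, [1, 2, 3, 5])
```

Under this assumed definition, the sparse-missing setting would mask positions at random, while the small-panel expansion setting would mask every SNP outside the small panel.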
|
10 |
Exploiting Historical Data and Diverse Germplasm to Increase Maize Grain Yield in Texas
Barrero Farfan, Ivan D. 16 December 2013
The U.S. is the largest maize producer in the world, with a production of 300 million tons in 2012. Approximately 86% of the maize production is concentrated in the Midwestern states. The rest of the production is concentrated in the Southern states, where Texas is the largest maize producer. Grain yield in Texas ranges from 18 tons/ha in the irrigated production zones to 3 tons/ha in the dryland production zones. As a result, grain yield has increased slowly, held back by poor production on the non-irrigated acres. One way to improve grain yield in Texas is to breed maize varieties adapted to Texas growing conditions, which includes mapping genes that can be incorporated into germplasm through marker-assisted selection. This dissertation includes two separate projects that exploit historical data and maize diversity to increase grain yield in Texas.
For the first project, a large dataset collected by the Texas AgriLife program was analyzed to elucidate past trends and to suggest how maize yield within Texas might be improved. This study confirmed previous reports that the rate of increase for grain yield in Texas is less than the rate observed in the Midwestern US.
For the second project, a candidate gene and whole genome association mapping analysis was performed for drought and aflatoxin resistance in maize. In order to do so, maize inbred lines from a diversity panel were testcrossed to isogenic versions of Tx714. The hybrids were evaluated under irrigated and non-irrigated conditions. The irrigated trials were inoculated with Aspergillus flavus and the aflatoxin level was quantified. This study found that the gene ZmLOX4 was associated with days to silk, and the gene ZmLOX5 was associated with plant and ear height. In addition, this study identified 13 QTL variants for grain yield, plant height, days to anthesis and days to silk. Furthermore, this study shows that diverse maize inbred lines can make hybrids that outyield commercial hybrids under heat and drought stress. Therefore, there are useful genes present in these diverse lines that can be exploited in maize breeding programs.
|