41 |
Multivariate linear mixed models for statistical genetics
Casale, Francesco Paolo January 2016
In the last decade, genome-wide association studies have helped to advance our understanding of the genetic architecture of many important traits, including diseases. However, the statistical analysis of genotype-phenotype associations remains challenging due to multiple factors. First, many traits have polygenic architectures, which means that they are controlled by a large number of variants with small individual effects. Second, as increasingly deep phenotype data are being generated there is a need for multivariate analysis approaches to leverage multiple related phenotypes while retaining computational efficiency. Additionally, genetic analyses are confronted by strong confounding factors that can create spurious associations when not properly accounted for in the statistical model. We here derive more flexible methods that allow integrating genetic effects across variants and multiple quantitative traits. To do so, we build on the classical linear mixed model (LMM), a widely adopted framework for genetic studies. The first contribution of this thesis is mtSet, an efficient mixed-model approach that enables genome-wide association testing between sets of genetic variants and multiple traits while accounting for confounding factors. In both simulations and real-data applications we demonstrate that mtSet effectively combines the advantages of variant-set and multi-trait analyses. Next, we present a new model for gene-context interactions that builds on mtSet. The proposed interaction set test (iSet) yields increased statistical power for detecting polygenic interactions. Additionally, iSet enables the identification of genetic loci that are associated with different configurations of causal variants across contexts. After benchmarking the proposed method using simulated data, we consider two applications to real datasets, where we investigate genetic effects on gene expression across different cellular contexts and sex-specific genetic effects on lipid levels. 
Finally, we describe LIMIX, a software framework for the flexible implementation of different LMMs. Most of the models considered in this thesis, including mtSet and iSet, are implemented and available in LIMIX. A unique aspect of the software is an inference framework that allows a large class of genetic models to be defined and, in many cases, to be efficiently fitted by exploiting specific algebraic properties. We demonstrate the utility of this software suite in two applied collaboration projects. Taken together, this thesis demonstrates the value of flexible and integrative modelling in genetics and contributes new statistical methods for genetic analysis. These approaches generalise previous models, yet retain the computational efficiency that is needed to tackle large genetic datasets.
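For orientation, the classical single-trait LMM that this abstract builds on is commonly written in the form below. The notation is standard textbook notation and is an assumption here, not taken from the thesis: y is the phenotype vector, X holds the fixed-effect covariates (including any tested variant), K is a genetic relatedness matrix, and the two variance components are estimated from the data.

```latex
\mathbf{y} = \mathbf{X}\boldsymbol{\beta} + \mathbf{g} + \boldsymbol{\varepsilon},
\qquad
\mathbf{g} \sim \mathcal{N}\!\left(\mathbf{0},\, \sigma_g^{2}\,\mathbf{K}\right),
\qquad
\boldsymbol{\varepsilon} \sim \mathcal{N}\!\left(\mathbf{0},\, \sigma_e^{2}\,\mathbf{I}\right)
```

The random effect g is what absorbs relatedness and population structure, which is how models of this family account for the confounding mentioned above.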
|
42 |
ANALYSIS OF BIOMASS COMPOSITION IN A SORGHUM DIVERSITY PANEL
Patrick K. Sweet (5930888) 16 January 2019
Plant biomass is an abundant source of renewable energy, but the efficiency of its conversion into liquid fuels is low. One reason for this inefficiency is the recalcitrance of biomass to extraction and saccharification of cell wall polysaccharides. This recalcitrance is due to the complex and rigid structure of the plant cell wall. A better understanding of the genes affecting cell wall composition in bioenergy crops could improve feedstock quality and increase conversion efficiency. To identify genetic loci associated with biomass quality traits, we utilized genome-wide association studies (GWAS) in an 840-line Sorghum diversity panel. We identified several QTL from these GWAS, including some for lignin composition and saccharification. Linkage disequilibrium (LD) analysis suggested that multiple polymorphisms are driving the association of SNPs within these QTL. Sequencing and further analysis led to the identification of a SNP within the coding region of a gene encoding phenylalanine ammonia-lyase (PAL) that creates a premature stop codon and co-segregates with an increase in the ratio of syringyl (S) to guaiacyl (G) lignin. A comparison of net PAL activity between lines with and without the mutation revealed that this mutation results in decreased PAL activity.
|
43 |
Analysis of high-density SNP data from complex populations
Floyd, James A. B. January 2011
Data from a Croatian isolate population are analysed in a genome-wide association study (GWAS) for a variety of disease-related quantitative traits. A novel genome-wide approach to analysing pedigree-based association data, called GRAMMAR, is utilised. One of the significant findings, for uric acid, is followed up in greater detail and is replicated in another isolate population, from Orkney. The associated SNPs are located in the SLC2A9 gene, which codes for a known glucose transporter, leading to the identification of SLC2A9 as a urate transporter as well (Vitart et al., 2008). These SNPs are later implicated in gout, a disease known to be linked with high serum uric acid levels, in an independent study (Dehghan et al., 2008). Subsequently, different ways of using SNP data to identify quantitative trait loci (QTL) in genome-wide association (GWA) studies are investigated. Several multi-marker approaches are compared to single-SNP analysis using simulated phenotypes and real genotype data, and the results show that for rare variants haplotype analysis is the most effective method of detection. Finally, the multi-marker methods are compared with single-SNP analysis on the real uric acid data. Interpretation of the real-data results was complicated by the low sample size, since only founder and unrelated individuals may be used for population-based haplotype analysis. Nonetheless, the results of the prior analyses of simulated data indicate that multi-marker methods, in particular haplotypes, may greatly facilitate detection of QTL with low minor allele frequency in GWA studies.
|
44 |
Using functional annotation to characterize genome-wide association results
Fisher, Virginia Applegate 11 December 2018
Genome-wide association studies (GWAS) have successfully identified thousands of variants robustly associated with hundreds of complex traits, but the biological mechanisms driving these results remain elusive. Functional annotation, describing the roles of known genes and regulatory elements, provides additional information about associated variants. This dissertation explores the potential of these annotations to explain the biology behind observed GWAS results.
The first project develops a random-effects approach to genetic fine mapping of trait-associated loci. Functional annotation and estimates of the enrichment of genetic effects in each annotation category are integrated with linkage disequilibrium (LD) within each locus and GWAS summary statistics to prioritize variants with plausible functionality. Applications of this method to simulated and real data show good performance in a wider range of scenarios than previous approaches. The second project focuses on the estimation of enrichment by annotation categories. I derive the distribution of GWAS summary statistics as a function of annotations and LD structure and perform maximum likelihood estimation of enrichment coefficients in two simulated scenarios. The resulting estimates are less variable than those from previous methods, but the asymptotic theory of standard errors is often not applicable due to non-convexity of the likelihood function. In the third project, I investigate the problem of selecting an optimal set of tissue-specific annotations with greatest relevance to a trait of interest. I consider three selection criteria defined in terms of the mutual information between functional annotations and GWAS summary statistics. These algorithms correctly identify enriched categories in simulated data. In the application to a GWAS of BMI, however, the penalty for redundant features outweighs the modest relationships with the outcome, yielding empty selected feature sets; this is due to the weaker overall association and the high similarity between tissue-specific regulatory features.
All three projects require little in the way of prior hypotheses regarding the mechanism of genetic effects. These data-driven approaches have the potential to illuminate unanticipated biological relationships, but are also limited by the high dimensionality of the data relative to the moderate strength of the signals under investigation. These approaches advance the set of tools available to researchers to draw biological insights from GWAS results.
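As a reminder of the quantity underlying the third project's selection criteria, the mutual information between a discrete annotation A and a (discretised) summary statistic Z has the standard definition below. The symbols A, Z, and p are illustrative notation assumed here; the abstract does not specify the exact form of the three criteria.

```latex
I(A; Z) \;=\; \sum_{a}\sum_{z} p(a, z)\,\log\frac{p(a, z)}{p(a)\,p(z)}
```

I(A; Z) is zero exactly when A and Z are independent, so an annotation that carries no information about the association statistics contributes nothing under a criterion of this kind.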
|
45 |
A zebrafish model system for drug screening in diabetes
Mathews, Bobby January 2019
Genome-wide association studies (GWAS) have aided in the discovery of various novel variants associated with diabetes. However, a detailed study is required to uncover the role of these genes and to determine how their dysfunction affects pathophysiology. Previous work in the lab has been successful in establishing zebrafish as an efficient model to characterise the effects of these candidate genes. Consequently, efforts have also been made to establish zebrafish as an efficient model system for drug screening. The current proof-of-principle (POP) study aims to find whether treatment with the drug tolbutamide in zebrafish carrying maturity-onset diabetes of the young (MODY) mutations has effects similar to those seen in humans. The study employed zebrafish carrying CRISPR-induced mutations in five MODY orthologues (gck, hnf1a, hnf1ba, hnf1bb, pdx1). The zebrafish larvae were supplemented with tolbutamide from 5 dpf until 10 dpf (days post fertilisation). At 10 dpf, larvae were screened for various glycaemic traits, whole-body glucose and lipids, as well as body size. CRISPR-Cas9-induced mutations were quantified using paired-end sequencing. The results showed that treatment with tolbutamide had a significant effect on the hyperglycaemic outcome induced by hnf1bb, hnf1a, and pdx1 mutations, which was in line with the known effects of the drug in humans. In conclusion, the POP study proved to be successful in leveraging zebrafish as an efficient model system for in vivo characterisation of drugs, and it can likely help to identify novel targets for therapeutic interventions.
|
46 |
On the Prediction of Warfarin Dose
Eriksson, Niclas January 2012
Warfarin is one of the most widely used anticoagulants in the world. Treatment is complicated by a large inter-individual variation in the dose needed to reach adequate levels of anticoagulation, i.e. INR 2.0–3.0. The objective of this thesis was to evaluate which factors, mainly genetic but also non-genetic, affect the response to warfarin in terms of required maintenance dose, efficacy and safety, with special focus on warfarin dose prediction. Through candidate gene and genome-wide studies, we have shown that the genes CYP2C9 and VKORC1 are the major determinants of warfarin maintenance dose. By combining the SNPs CYP2C9 *2, CYP2C9 *3 and VKORC1 rs9923231 with the clinical factors age, height, weight, ethnicity, amiodarone and use of inducers (carbamazepine, phenytoin or rifampicin) into a prediction model (the IWPC model), we can explain 43 % to 51 % of the variation in warfarin maintenance dose. Patients requiring doses < 29 mg/week and doses ≥ 49 mg/week benefitted the most from pharmacogenetic dosing. Further, we have shown that the difference across ethnicities in percent variance explained by VKORC1 was largely accounted for by the allele frequency of rs9923231. Other novel genes affecting maintenance dose (NEDD4 and DDHD1), as well as the replicated CYP4F2 gene, have small effects on dose predictions and are not likely to be cost-effective unless inexpensive genotyping is available. Three types of prediction models for warfarin dosing exist: maintenance dose models, loading dose models and dose revision models. The combination of these three models is currently being used in the warfarin treatment arm of the European Pharmacogenetics of Anticoagulant Therapy (EU-PACT) study. Other clinical trials aiming to prove the clinical validity and utility of pharmacogenetic dosing are also underway.
The future of pharmacogenetic warfarin dosing relies on results from these ongoing studies, the availability of inexpensive genotyping and the cost-effectiveness of pharmacogenetic driven warfarin dosing compared with new oral anticoagulant drugs.
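The observation that differences in variance explained by VKORC1 track the allele frequency of rs9923231 can be read against a standard population-genetics identity. This is a sketch under textbook assumptions (a purely additive effect and Hardy-Weinberg proportions), with symbols chosen here for illustration rather than taken from the thesis: p is the allele frequency, β the per-allele effect on dose, and σ_y² the total variance in dose.

```latex
\operatorname{Var}(\text{SNP effect}) = 2p(1-p)\,\beta^{2},
\qquad
R^{2}_{\text{SNP}} = \frac{2p(1-p)\,\beta^{2}}{\sigma_y^{2}}
```

Under these assumptions, the same per-allele effect β yields a smaller share of variance explained in a population where p is close to 0 or 1 than in one where p is near 0.5.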
|
47 |
Genome-Wide Studies of Transcriptional Regulation in Human Liver Cells by High-throughput Sequencing
Bysani, Madhusudhan Reddy January 2013
The human genome contains slightly more than 20 000 genes that are expressed in a tissue-specific manner. Transcription factors play a key role in gene regulation. By mapping transcription factor binding sites genome-wide, we can understand their role in different biological processes. In this thesis we have mapped transcription factors and histone marks along with nucleosome positions and RNA levels. In papers I and II, we used ChIP-seq to map five liver-specific transcription factors that are crucial for liver development and function. We showed that the mapped transcription factors are involved in metabolism and other cellular processes. We showed that ChIP-seq can also be used to detect protein-protein interactions and functional SNPs. Finally, we showed that the epigenetic histone mark studied in paper I is associated with transcriptional activity at promoters. In paper III, we mapped nucleosome positions before and after treatment with transforming growth factor β (TGFβ) and found that many nucleosomes changed positions when expression changed. After treatment with TGFβ, the transcription factor HNF4α was replaced by a nucleosome in some regions. In paper IV, we mapped the USF1 transcription factor and three active chromatin marks in normal liver tissue and in liver tissue of patients diagnosed with alcoholic steatohepatitis. Using gene ontology, we identified, as expected, many metabolism-related genes as active in normal samples, whereas genes in cancer pathways were active in steatohepatitis tissue. Cancer is a common complication of the disease, and early signs of this were found. We also found many novel and GWAS catalogue SNPs that are candidates to be functional. In conclusion, our results have provided information on the location and structure of regulatory elements, which will lead to better knowledge of liver function and disease.
|
48 |
Resequencing and Association Analysis of the KALRN and EPHB1 Genes And Their Contribution to Schizophrenia Susceptibility
Ozaki, Norio, Iwata, Nakao, Kaibuchi, Kozo, Takeda, Masatoshi, Hashimoto, Ryota, Inada, Toshiya, Suzuki, Michio, Ujike, Hiroshi, Fukuo, Yasuhisa, Okochi, Tomo, Shiino, Tomoko, Ito, Yoshihito, Ikeda, Masashi, Aleksic, Branko, Nakamura, Yukako, Kushima, Itaru 03 1900
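First published online: November 1, 2010 / Nagoya University doctoral dissertation. Degree type: Doctor of Medicine (course-based doctorate). Date of conferral: March 25, 2011. Submitted as the doctoral dissertation of Itaru Kushima.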
|
49 |
Genetic Studies in Dogs Implicate Novel Genes Involved in Atopic Dermatitis and IgA Deficiency
Tengvall, Katarina January 2015
This thesis presents genetic studies of atopic dermatitis (AD) and IgA deficiency in dogs. AD is a chronic inflammatory and pruritic skin disorder caused by allergic reactions against environmental allergens. Both genetic and environmental factors are involved in the development of canine AD (CAD) and human AD. In Paper I, we performed genome-wide association studies (GWAS) and identified a locus on chromosome 27 significantly associated with CAD in German shepherd dogs (GSDs). The locus contains several genes, and fine-mapping indicated the strongest association close to the candidate gene PKP2. In Paper II, we performed additional fine-mapping and identified four highly associated SNPs located in regions with transcriptional regulatory potential in epithelial and immune cells. The risk alleles were associated with increased transcriptional activity, and the effect on expression was cell-type dependent. These data indicate that multiple cell-type-specific enhancers regulate the expression of PKP2, and/or the neighboring genes YARS2, DNM1L and FGD4, and predispose GSDs to CAD. IgA deficiency is the most common primary immune deficiency disorder in both humans and dogs, characterized by a higher risk of recurrent mucosal tract infections and of allergic and other immune-mediated diseases. In Paper III, we performed the widest screening to date of serum IgA levels in dog breeds (1267 dogs, 22 breeds) and defined eight breeds as predisposed to low IgA levels. In Paper IV, we performed GWAS in four of the breeds defined as prone to low IgA levels. We used a novel percentile-groups approach to establish breed-specific cut-offs and to perform analyses in a close-to-continuous manner. In total, 35 genomic loci were suggestively associated (p<0.0005) with IgA levels, and three genomic regions (including the genes KIRREL3 and SERPINA9) were genome-wide significantly associated with IgA levels in GSDs.
A ~20 kb haplotype on chromosome 28, significantly associated with IgA levels in Shar-Pei dogs, was positioned within the first intron of the gene SLIT1, overlapping with a possible dog domestication sweep. This thesis suggests novel candidate genes involved in two immune-mediated disorders in the dog. Hopefully, these results will become an important resource for genetic research into the corresponding human diseases.
|
50 |
Rare and common genetic variant associations with quantitative human phenotypes
Zhao, Jing 21 September 2015
This dissertation investigates the association between genotypes and phenotypes in humans. Both common and rare regulatory variants have been studied. The phenotypes include disease risk, clinical traits and gene expression levels. This dissertation describes three different types of association study. The first study investigated the relationship between common variants and three sub-clinical traits as well as three complex diseases in the Center for Health Discovery and Well Being (CHDWB) study. The second study is a GWAS analysis of TNF-α and BMI/CRP, conducted as a contribution to meta-GWAS analyses of these traits with investigators at the University of Groningen in the Netherlands and the 1000 Genomes Consortium. The third study was the most original contribution of my thesis, as it assessed the association between rare regulatory variants in promoter regions and gene expression levels. The results clearly show an enrichment of rare variants at both extremes of gene expression. This dissertation provides insight into how common and rare variants associate with broadly defined quantitative phenotypes. The demonstration that rare regulatory variants make a substantial contribution to gene expression variation has important implications for personalized medicine, as it implies that de novo and other rare alleles need to be considered as candidate effectors of rare disease risk.
|