Global ETD Search

21	MAVEN: a tool for Visualization and Functional Analysis of Genome-Wide Association Studies Narayanan, Kanchana 17 May 2010 Bioinformatics Computer Science Visualization Genome-wide association studies Functional Analysis
22	The Genetic Predisposition of Paralytic Poliomyelitis Using Genome-Wide Association Studies Olagunju, Tinuke O. January 2019 Poliomyelitis is a leading cause of paralysis among preventable diseases of children and adolescents globally. It is caused by persistent infection with poliovirus (PV). PV infection does not always cause paralysis. A lack of immunization always increases the risk of paralytic polio. Genetic factors have also been shown to affect the risk of developing the disease. The aim of this thesis is to investigate whether there are any genetic associations with paralytic poliomyelitis. This is based on a model for understanding its nature as a complex disease, where many genes are involved in contributing to the disease state. This is a population-based case-control study to identify genetic loci that influence disease risk. The study examined the association of genetic variation in single nucleotide polymorphisms (SNPs) across the genome with paralytic poliomyelitis susceptibility in a population of United States and Canadian poliomyelitis survivors, using a genome-wide association study (GWAS) approach. No association was observed. Loci that have been previously implicated were not found to affect the susceptibility to poliomyelitis in this study. The thesis consists of four chapters. Chapter 1 describes the epidemiology, pathogenesis and management of poliomyelitis. Chapter 2 gives an overview of the genomics of infectious diseases in general. Chapter 3 introduces the study population and presents the genome-wide analysis, which uses logistic regression to identify loci and explore genes that might be associated with paralytic poliomyelitis, along with the results. Chapter 4 discusses the implications of the results and explains future directions. / Thesis / Master of Science (MSc) Poliomyelitis Genome-wide association studies (GWAS) Case-control Logistic regression
23	GWAS for quantitative resistance phenotypes in Mycobacterium tuberculosis reveals resistance genes and regulatory regions Farhat, M.R., Freschi, L., Calderon, R., Ioerger, T., Snyder, M., Meehan, Conor J., de Jong, B.C., Rigouts, L., Sloutsky, A., Kaur, D., Sunyaev, S., van Soolingen, D., Shendure, J., Sacchettini, J., Murray, M. 16 September 2019 Drug resistance diagnostics that rely on the detection of resistance-related mutations could expedite patient care and TB eradication. We perform minimum inhibitory concentration testing for 12 anti-TB drugs together with Illumina whole-genome sequencing on 1452 clinical Mycobacterium tuberculosis (MTB) isolates. We evaluate genome-wide associations between mutations in MTB genes or non-coding regions and resistance, followed by validation in an independent data set of 792 patient isolates. We confirm associations at 13 non-canonical loci, with two involving non-coding regions. Promoter mutations are measured to have smaller average effects on resistance than gene body mutations. We estimate the heritability of the resistance phenotype to 11 anti-TB drugs and identify a lower than expected contribution from known resistance genes. This study highlights the complexity of the genomic mechanisms associated with the MTB resistance phenotype, including the relatively large number of potentially causal loci, and emphasizes the contribution of the non-coding portion of the genome. / Biomedical research grant from the American Lung Association (PI MF, RG-270912-N), a K01 award from the BD2K initiative (PI MF, ES026835), and an NIAID U19 CETR grant (P.I. M.M., AI109755), the Belgian Science Policy (Belspo) (L.R., C.J.M.). Antimicrobial resistance Bacterial genetics Genome-wide association studies Tuberculosis
24	Inherited copy number variation in the chicken genome and association with breast muscle traits / Variação de número de cópias herdadas no genoma da galinha e associação com características de músculo de peito Godoy, Thaís Fernanda 08 March 2018 Copy number variation (CNV) is an important polymorphism that is associated with a wide range of traits in human, wild and livestock species. In chicken, an important source of animal protein and a developmental model organism, CNV is associated with several phenotypes and evolutionary footprints. However, the identification and characterization of CNV inheritance in the chicken genome requires further investigation. We screened CNVs in chicken using two distinct populations with known pedigree. In 826 broilers we identified 25,819 CNVs (4,299 deletions and 21,520 duplications), of which 21,077 were inherited, 201 showed no inheritance and 4,541 were classified as de novo CNVs. In 514 F2 animals (layer and broiler cross) we identified 21,796 CNVs (2,254 deletions and 19,543 duplications), of which 18,230 were inherited, 587 were not inherited and 2,979 were classified as de novo CNVs. After a strict filtering step to remove potential false positive and negative CNVs, only 220 (4.84%) and 430 (14.43%) de novo CNVs remained in the broiler and F2 populations, respectively. A total of 33.11% (50 out of 151) of the inherited CNVs identified in ten animals were validated by sequencing data. Of the validated CNVs, 64% had more than 80% of their size (bp) validated. A total of 59% and 48.8% were classified as novel CNV regions (CNVRs) in the broiler and F2 populations, respectively. Using Bonferroni-corrected p-values for multiple testing and a significance threshold of p ≤ 0.01, we found two CNV segments significantly associated with breast weight, one with breast weight yield, six with breast meat weight, 18 with breast meat yield, four with breast filet weight and two with breast filet yield. The significantly associated CNV segments overlapped with 181 protein-coding genes. CNVseg 300, which was associated with all traits and encompasses six CNVRs, overlapped a total of 26 protein-coding genes. Among these genes, MYL1 (Myosin Light Chain 1) is expressed in fast skeletal muscle fibers, and MLPH (Melanophilin), PRLH (Prolactin Releasing Hormone) and RAB17 (Member RAS Oncogene Family) were previously associated with the lavender phenotype (blue-grey feather color) and with the regulation of homeothermy and metabolism. The present study improves our knowledge of CNV in the chicken genome and provides insight into the distribution of different classes of CNVs, i.e. inherited and de novo CNVs, in two experimental chicken populations. In addition, the genome-wide association analyses were the first performed on a broiler population for breast muscle traits, which are important characteristics for poultry production. The GWAS results allow us to understand the probable relationship between some genes and CNVRs that are significantly associated with breast muscle traits. Gallus gallus Genome-wide association Genotyping PennCNV Inherited CNVs De novo
26	New Statistical Methods and Computational Tools for Mining Big Data, with Applications in Plant Sciences Michels, Kurt Andrew January 2016 The purpose of this dissertation is to develop new statistical tools for mining big data in plant sciences. In particular, the dissertation consists of four inter-related projects that address various methodological and computational challenges in phylogenetic methods. Project 1 aims to systematically test different optimization tools and provide useful strategies to improve optimization in practice. Project 2 develops a new R package, rPlant, which provides a friendly and convenient toolbox for users of iPlant. Project 3 presents a fast and effective group-screening method to identify important genetic factors in GWAS, with theoretical justifications and favorable asymptotic properties. Project 4 develops a new statistical tool to identify gene-gene interactions, with the ability to handle interactions between groups of covariates. Forward Regression Genome Wide Association Study Group Data Interactions R Statistics Big Data
27	Design and analysis of genome-wide association studies Barrett, Jeffrey C. January 2008 Despite many years of effort, linkage and candidate gene association studies have yielded disappointingly few risk loci for common human diseases such as diabetes, auto-immune disorders and cancers. Large sample sizes, increased understanding of the patterns of correlation in genetic variation, and plunging genotyping costs have enabled genome-wide association studies, which have good power to detect common risk alleles of modest effect. I present an evaluation of SNP choice in study design and show that overall, despite substantial differences in genotyping technologies, marker selection strategies and number of markers assayed, the first generation platforms all offer good levels of genome coverage (∼ 70%). I next describe the largest such project undertaken to date, the Wellcome Trust Case Control Consortium, which consisted of 2000 cases from each of seven common diseases and 3000 shared controls. It identified nearly two dozen new associations. I demonstrate the importance of careful data quality control, including both standard and unorthodox analyses. I next focus on the association results therein for Crohn’s disease. I present a replication experiment in over 1000 additional Crohn’s patients which unambiguously confirmed six previously published loci and four new loci. Next I describe, in a general context, several issues impeding the combination of genome-wide scans, including data annotation, population structure and differences in genotyping platform. Each of these problems is shown to be tractable with available methods, provided that these methods are applied prudently. I present the results of a meta-analysis of three genome-wide scans for Crohn’s disease. The data showed a striking excess of significant associations, and a replication experiment involving over 4000 independent Crohn’s patients verified twenty new risk loci. 
Finally, I discuss the early success of genome-wide association and its consequences for further understanding the biology of human disease. 616.042
28	Dissecting genetic interactions in complex traits Hemani, Gibran January 2012 Of central importance in the dissection of the components that govern complex traits is understanding the architecture of natural genetic variation. Genetic interaction, or epistasis, constitutes one aspect of this, but epistatic analysis has been largely avoided in genome wide association studies because of statistical and computational difficulties. This thesis explores both issues in the context of two-locus interactions. Initially, through simulation and deterministic calculations it was demonstrated that not only can epistasis maintain deleterious mutations at intermediate frequencies when under selection, but that it may also have a role in the maintenance of additive variance. Based on the epistatic patterns that are evolutionarily persistent, and the frequencies at which they are maintained, it was shown that exhaustive two dimensional search strategies are the most powerful approaches for uncovering both additive variance and the other genetic variance components that are co-precipitated. However, while these simulations demonstrate encouraging statistical benefits, two dimensional searches are often computationally prohibitive, particularly with the marker densities and sample sizes that are typical of genome wide association studies. To address this issue different software implementations were developed to parallelise the two dimensional triangular search grid across various types of high performance computing hardware. Of these, particularly effective was using the massively-multi-core architecture of consumer level graphics cards. While the performance will continue to improve as hardware improves, at the time of testing the speed was 2-3 orders of magnitude faster than CPU based software solutions that are in current use. 
Not only does this software enable epistatic scans to be performed routinely at minimal cost, but it is now feasible to empirically explore the false discovery rates introduced by the high dimensionality of multiple testing. Through permutation analysis it was shown that the significance threshold for epistatic searches is a function of both marker density and population sample size, and that because of the correlation structure that exists between tests the threshold estimates currently used are overly stringent. Although the relaxed threshold estimates constitute an improvement in the power of two dimensional searches, detection is still most likely limited to relatively large genetic effects. Through direct calculation it was shown that, in contrast to the additive case where the decay of estimated genetic variance was proportional to falling linkage disequilibrium between causal variants and observed markers, for epistasis this decay was exponential. One way to rescue poorly captured causal variants is to parameterise association tests using haplotypes rather than single markers. A novel statistical method that uses a regularised parameter selection procedure on two locus haplotypes was developed, and through extensive simulations it can be shown that it delivers a substantial gain in power over single marker based tests. Ultimately, this thesis seeks to demonstrate that many of the obstacles in epistatic analysis can be ameliorated, and with the current abundance of genomic data gathered by the scientific community direct search may be a viable method to qualify the importance of epistasis. 572.8
29	Model selection strategies in genome-wide association studies Keildson, Sarah January 2011 Unravelling the genetic architecture of common diseases is a continuing challenge in human genetics. While genome-wide association studies (GWAS) have proven to be successful in identifying many new disease susceptibility loci, the extension of these studies beyond single-SNP methods of analysis has been limited. The incorporation of multi-locus methods of analysis may, however, increase the power of GWAS to detect genes of smaller effect size, as well as genes that interact with each other and the environment. This investigation carried out large-scale simulations of four multi-locus model selection techniques, namely forward and backward selection, Bayesian model averaging (BMA) and least angle regression with a lasso modification (lasso), in order to compare the type I error rates and power of each method. At a type I error rate of ~5%, lasso showed the highest power across varied effect sizes, disease frequencies and genetic models. Lasso penalized regression was then used to perform three different types of analysis on GWAS data. Firstly, lasso was applied to the Wellcome Trust Case Control Consortium (WTCCC) data and identified many of the WTCCC SNPs that had a moderate-strong association (p < 10^-5) with type 2 diabetes (T2D), as well as some of the moderate WTCCC associations (p < 10^-4) that have since been replicated in a large-scale meta-analysis. Secondly, lasso was used to fine-map the 17q21 childhood asthma risk locus and identified putative secondary signals in the 17q21 region that may further contribute to childhood asthma risk. Finally, lasso identified three interaction effects potentially contributing towards coronary artery disease (CAD) risk. 
While the validity of these findings hinges on their replication in follow-up studies, the results suggest that lasso may provide scientists with exciting new methods of dissecting, and ultimately understanding, the complex genetic framework underlying common human diseases. 616.027
30	Genetic and genomic studies on wheat pre-harvest sprouting resistance Lin, Meng January 1900 Doctor of Philosophy / Department of Agronomy / Guihua Bai / Allan K. Fritz / Wheat pre-harvest sprouting (PHS), germination of physiologically matured grains in a wheat spike before harvesting, can cause significant reduction in grain yield and end-use quality. Many quantitative trait loci (QTL) for PHS resistance have been reported in different sources. To determine the genetic architecture of PHS resistance and its relationship with grain color (GC) in US hard winter wheat, a genome-wide association study (GWAS) on both PHS resistance and GC was conducted using a panel of 185 U.S. elite breeding lines and cultivars and 90K wheat SNP arrays. PHS resistance was assessed by evaluating sprouting rates in wheat spikes harvested from both greenhouse and field experiments. Thirteen QTLs for PHS resistance were identified on 11 chromosomes in at least two experiments, and the effects of these QTLs varied among different environments. Common QTLs for PHS resistance and GC were identified on the long arms of chromosomes 3A and 3D, indicating a pleiotropic effect of the two QTLs. Significant QTLs were also detected on chromosome arms 3AS and 4AL, which were not related to GC, suggesting that it is possible to improve PHS resistance in white wheat. To identify markers closely linked to the 4AL QTL, genotyping-by-sequencing (GBS) technology was used to analyze a population of recombinant inbred lines (RILs) developed from a cross between two parents, “Tutoumai A” and “Siyang 936”, contrasting in the 4AL QTL. Several GBS SNP markers closely linked to the 4AL QTL were identified, and some of them were converted to KASP markers for marker-assisted breeding. 
To investigate the effects of the two non-GC related QTLs on 3AS and 4AL, both QTLs were transferred from “Tutoumai A” and “AUS1408” into a susceptible US hard winter wheat breeding line, NW97S186, through marker-assisted backcrossing using the gene marker TaPHS1 for the 3AS QTL and a tightly linked KASP marker we developed for the 4AL QTL. The 3AS QTL (TaPHS1) significantly interacted with environments and genetic backgrounds, whereas the 4AL QTL (TaMKK3-A) interacted with environments only. The two QTLs showed additive effects on PHS resistance, indicating that pyramiding these two QTLs can increase PHS resistance. To improve breeding selection efficiency, genomic prediction using genome-wide markers and marker-based prediction (MBP) using selected trait-linked markers were conducted in the association panel. Among the four genomic prediction methods evaluated (rrBLUP, BayesB, BayesC and BayesC0), ridge regression best linear unbiased prediction (rrBLUP) provided the best prediction. However, MBP using 11 significant SNPs identified in the association study provided a better prediction than genomic prediction. Therefore, for traits that are controlled by a few major QTLs, MBP may be more effective than genomic selection. Triticum aestivum Pre-harvest sprouting resistance Quantitative trait locus Genome-wide association studies Genomic prediction
