31 |
Novel Bioinformatics Applications for Protein Allergology, Genome-Wide Association and Retrovirology Studies
Martínez Barrio, Álvaro January 2010
Recently, the number of data sources within the life sciences has grown exponentially, to the point where managing their integration efficiently has become a difficult problem. The data avalanche we are experiencing may mark a turning point in science, with a shift from proprietary to publicly available data and a concomitant acceptance of studies based on the latter. To investigate these issues, a Network of Excellence (EMBRACE) was launched with the aim of integrating the major databases and the most popular bioinformatics software tools. The focus of this thesis is therefore the problem of seamlessly integrating varied data sources and/or distributed research tools. In paper I, we developed a web service to facilitate allergenicity risk assessment, based on allergen descriptors, in order to characterize proteins with the potential for sensitization and cross-reactivity. In paper II, a web service was developed which uses a lightweight protocol to integrate human endogenous retrovirus (ERV) data within a public genome browser. This new data catalogue and many other publicly available sources were integrated and tested in a bioinformatics-rich client application. In paper III, GeneFinder, a distributed tool for genome-wide association studies, was developed and tested. Useful information based on a particular genomic region can be easily retrieved and assessed. Finally, in paper IV, we developed a prototype pipeline to mine the dog genome for endogenous retroviruses and to display the transcriptional landscape of these retroviral integrations. Moreover, we further characterized a group that until this point was believed to be primate-specific. Our results also revealed that the dog has been very effective in protecting itself from such integrations. This work integrates different applications in the fields of protein allergology, biotechnology, genome association studies and endogenous retroviruses.
/ EMBRACE NoE EU FP6
|
32 |
GENOME-WIDE ASSOCIATION STUDIES AT THE INTERFACE OF ALZHEIMER’S DISEASE AND EPIDEMIOLOGICALLY RELATED DISORDERS
Simmons, Christopher Ryan 01 January 2011
Genome-wide association studies (GWAS) provide an unbiased means of exploring the landscape of complex genetic disease. As such, these studies have identified genetic variants that are robustly associated with a multitude of conditions. I hypothesize that these genetic variants serve as excellent tools for evaluation of the genetic interface between epidemiologically related conditions. Herein, I test the association between SNPs associated with either (i) plasma lipids, (ii) rheumatoid arthritis (RA) or (iii) diabetes mellitus (DM) and late-onset Alzheimer’s disease (AD) to identify shared genetic variants. Regarding the most significantly AD-associated variants, I have also attempted to elucidate their molecular function.
Only cholesterol-associated SNPs, as a group, are significantly associated with AD. This association remains after excluding APOE SNPs and suggests that peripheral and/or central cholesterol metabolism contributes to AD risk. The general lack of association between RA-associated SNPs and AD is also significant in that these data challenge the hypothesis that genetic variants that increase risk of RA confer protection against AD. Functional studies of variants exhibiting novel associations with AD reveal that the lipid-associated SNP rs3846662 modulates HMGCR exon 13 splicing differentially in different cell types. Although less clear, trends were also observed between the RA-associated rs2837960 and the expression of several BACE2 isoforms, and between the DM-associated rs7804356 and the expression of a rare SKAP2 isoform.
In conclusion, the overlap of lipid-, RA- or DM-associated SNPs with AD is modest but in several instances significant. Continued analysis of the interface between GWAS of separate conditions will likely facilitate the discovery of novel associations missed by conventional GWAS. Furthermore, the identification of functional variants associated with multiple conditions should provide insight into novel mechanisms of disease and may lead to the identification of new therapeutic targets in an era of personalized genomic medicine.
|
33 |
Gene-Environment Interaction and Extension to Empirical Hierarchical Bayes Models in Genome-Wide Association Studies
Viktorova, Elena 17 June 2014
No description available.
|
34 |
Using molecular QTLs to identify cell types and causal variants for complex traits
Schwartzentruber, Jeremy Andrew January 2018
Genetic associations have been discovered for many human complex traits, and yet for most associated loci the causal variants and molecular mechanisms remain unknown. Studies mapping quantitative trait loci (QTLs) for molecular phenotypes, such as gene expression, RNA splicing, and chromatin accessibility, provide rich data that can link variant effects in specific cell types with complex traits. These genetic effects can also now be modeled in vitro by differentiating human induced pluripotent stem cells (iPSCs) into specific cell types, including inaccessible cell types such as those of the brain. In this thesis, I explore a range of approaches for using QTLs to identify causal variants and to link these with molecular functions and complex traits. In Chapter 2, I describe QTL mapping in 123 sensory neuronal cell lines differentiated from human iPSCs. I observed that gene expression was highly variable across iPSC-derived neuronal cultures in specific gene categories, and that a portion of this variability was explained by commonly used iPSC culture conditions, which influenced differentiation efficiency. A number of QTLs overlapped with common disease associations; however, using simulations I showed that identifying causal regulatory variants with a recall-by-genotype approach in iPSC-derived neurons is likely to require large sample sizes, even for variants with moderately large effect sizes. In Chapter 3, I developed a computational model that uses publicly available gene expression QTL data, along with molecular annotations, to generate cell type-specific probability of regulatory function (PRF) scores for each variant. I found that predictive power was improved when the model was modified to use the quantitative value of annotations. PRF scores outperformed other genome-wide scores, including CADD and GWAVA, in identifying likely causal eQTL variants.
In Chapter 4, I used PRF scores to identify relevant cell types and to fine map potential causal variants using summary association statistics in six complex traits. By examining individual loci in detail, I showed how the enrichments contributing to a high PRF score are transparent, which can help to distinguish plausible causal variant predictions from model misspecification.
|
35 |
Études d’association pangénomique appliquées à la recherche de nouveaux facteurs de risque génétique de la maladie d’Alzheimer / Genome-wide association studies applied to the discovery of new genetic risk factors of Alzheimer's disease
Chouraki, Vincent 20 June 2013
Dementia is a syndrome caused by several brain diseases that progressively impair cognitive functions, and it occurs more frequently in the elderly. The growing number of patients with dementia due to the ageing of the general population, together with the high cost of care, makes dementia a major public health issue.
Alzheimer’s disease (AD) is the most common form of dementia. It is usually diagnosed after 65 years of age and has a strong genetic component. Early-onset familial forms exist and are mainly caused by mutations in the amyloid-β protein precursor, presenilin 1 and presenilin 2 genes. However, the vast majority of cases result from the complex interaction of environmental factors with susceptibility genes.
Using a candidate gene approach, numerous genes associated with AD risk were identified, but due to technical and methodological problems, only the apolipoprotein E (APOE) gene was robustly replicated. Genome-wide association studies (GWAS) aim to identify frequent genetic variants associated with disease risk in a hypothesis-free manner. Starting in 2009, several consortia performing this type of analysis in the field of AD robustly identified four new genes associated with AD risk: CLU, PICALM, CR1 and BIN1. However, these genes together explain only a small proportion of the total genetic variance of AD, and the search for new susceptibility genes remains an important goal for AD research.
In this work, we first tried to replicate the results of the top genes reported using the candidate gene approach, using GWAS data from the European Alzheimer’s Disease Initiative (EADI). Most of these genes showed weak levels of association. Using GWAS, we were then able to identify 19 genes associated with AD risk besides APOE, including 11 that had not been reported by previous studies, first through an informal collaboration between consortia, then under the name of the International Genomics of Alzheimer’s Disease Project (IGAP).
Assuming that endophenotypes related to AD would be relevant for the discovery of genetic variants involved in the early pathophysiology of AD, we then performed a GWAS of plasma amyloid-β (Aβ) concentrations. This study showed suggestive associations between the CTXN3 gene on chromosome 5 and Aβ1−42 plasma levels.
To sum up, using GWAS enabled us to identify new genes associated with AD risk. These genes point to interesting new research hypotheses and, hopefully, to a better understanding of AD pathophysiology and the development of effective drugs.
|
36 |
Développement de méthodes statistiques pour l'identification de gènes d'intérêt en présence d'apparentement et de dominance, application à la génétique du maïs / Development of Statistical Methods for the Identification of Interesting Genes with Relatedness and Dominance, Application to the Maize Genetic
Laporte, Fabien 13 March 2018
Gene detection is a first step towards understanding the impact of an individual's genetic information on its phenotype. During my PhD, I studied statistical methods for performing genome-wide association studies, with maize hybrids as an application case. First, I studied the inference of relatedness coefficients between individuals from biallelic marker data. This estimation is based on a parametric mixture model. I studied the identifiability of this model in the general case and also in the specific case of a mating design in which the observed individuals are obtained by crossing lines, which is representative of the mating designs classically used in plant genetics. I then studied parameter inference in mixed models with several variance components, and in particular the performance of algorithms for testing the effects of very large numbers of markers. To this end, I compared existing software and optimized a Min-Max algorithm. Finally, the relevance of the methods developed was illustrated for QTL detection through a genome-wide association analysis of a panel of maize hybrids.
|
37 |
Contribution des variations structurales de type insertions/délétions à l'adaptation, la variation des caractères et les performances hybrides chez le maïs / Contribution of insertions/deletions-type structural variations to adaptation, phenotypic variation and hybrid performances in maize
Mabire, Clément 23 April 2019
In recent decades, the rapid development of genome sequencing has made it possible to identify structural variations in many species. In maize, thousands of large insertions and deletions (InDels), from a few bp to hundreds of kbp, were discovered by comparing the B73 reference genome with many other resequenced genomes. These InDel sequences can carry genes and therefore be involved in phenotypic variation by changing the gene composition between individuals, but their effect on phenotype has not been well studied. The aim of this thesis was to study the contribution of InDels to adaptation, phenotypic variation and hybrid performance in maize.
We developed an Affymetrix® Axiom® genotyping array able to genotype 105,927 InDels ranging from 35 bp to 129.7 kbp in size. The sequences of 79,969 of these InDels were absent from the B73 reference genome and were discovered by assembling three genomes (F2, C103 and PH207). We selected 61,492 polymorphic InDels to genotype a panel of 362 maize inbred lines representing a broad range of diversity, in order to study the contribution of InDels to genetic diversity, adaptation and trait variation. We also assembled one million SNPs from two genotyping arrays and genotyping by sequencing to study the complementarity between InDels and SNPs. Genetic structure and relatedness between inbred lines estimated with SNPs or with InDels were highly similar, suggesting that most InDels followed the same evolutionary trajectory as SNPs. 51% of InDels were not in high linkage disequilibrium (LD > 0.8) with any nearby SNP, suggesting that the effect of these InDels is not well captured at this SNP density. Thanks to InDels, we detected 13 new quantitative trait loci (QTLs) among the 294 QTLs identified for 23 traits by genome-wide association studies (GWAS). Similarly, 56 of the 188 regions under selection between tropical, dent and flint maize lines were identified with InDels, an enrichment of genomic regions under selection detected with InDels compared to SNPs. These regions include genes involved in tolerance to biotic and abiotic stress and/or in adaptive traits such as flowering time. Accordingly, the highest number of associated InDels was found for flowering time. These results suggest that InDels are often involved in adaptation and stress tolerance.
In order to study the effect of InDels on hybrid performance, we analyzed a panel of 287 hybrids derived from crosses among 210 temperate maize inbred lines from the previous panel. We decomposed the variance of female flowering (FF), plant height (PH) and grain yield (GY) by distinguishing additive and dominance genetic effects. We observed the highest dominance and genotype-by-environment effects for GY and the lowest for FF. We performed GWAS on this panel by testing the additive and dominance effects of 51,844 InDels and 469,267 SNPs on these three traits in four different environment combinations. We identified 78 and 133 QTLs with an additive and a dominance effect, respectively, including 6 and 11 QTLs discovered only with InDels. 83% of all QTLs were found in only one environment combination. One QTL for GY detected with InDels was located in a large cluster of InDels on chromosome 6 and colocalized with a QTL previously identified with SNPs as having a strong effect on GY under heat conditions. Finally, we used InDel and/or SNP genotyping to predict hybrid performance. Whereas including a dominance effect in genomic prediction models increased predictive ability (PA) for GY by 1.5 to 5.6%, including InDel genotyping did not increase PA.
|
38 |
Childhood Obesity: A Systems Medicine Approach
Stone, William L., Schetzina, Karen, Stuart, Charles 01 June 2016
Childhood obesity and its sequelae are a major public health problem both in the USA and globally. This review will focus on a systems medicine approach to obesity. Systems medicine is an integrative approach utilizing the vast amount of data garnered from "omics" technology and integrating these data with conventional pathophysiology as well as diverse environmental factors such as diet, exercise, community dynamics and the intestinal microbiome. Omics technology includes genomics, epigenomics, metagenomics, metabolomics and proteomics. In addition to unraveling etiology, the goals of a systems medicine approach are to provide actionable and evidence-based clinical approaches. In the case of childhood obesity, an additional goal is characterizing measurable risk factors/biomarkers for obesity at the earliest possible age and devising age-appropriate optimal intervention strategies. It is also important to establish the age at which interventions could be critical. As discussed below, it is possible that some of the pathophysiological and epigenetic changes resulting from childhood obesity could become increasingly irreversible the longer the obesity remains untreated.
|
39 |
Computational Methods for Solving Next Generation Sequencing Challenges
Aldwairi, Tamer Ali 13 December 2014
In this study we build solutions to three common challenges in the field of bioinformatics by utilizing statistical methods and developing computational approaches. First, we address a common problem in genome-wide association studies, which is linking genotype features within organisms of the same species to their phenotype characteristics. We specifically studied FHA domain genes in Arabidopsis thaliana distributed within Eurasian regions by clustering those plants that share similar genotype characteristics and comparing the clusters to the regions from which the plants were taken. Second, we developed a tool for calculating transposable element density within different regions of a genome. The tool is built to utilize the information provided by other transposable element annotation tools and to provide the user with a number of options for calculating the density for various genomic elements such as genes, piRNA and miRNA, or for the whole genome. It also provides a detailed calculation of densities for each family and subfamily of the transposable elements. Finally, we address the problem of mapping multi-reads in the genome and their effects on gene expression. To accomplish this, we implemented methods to determine the statistical significance of expression values within the genes utilizing both a unique and a multi-read weighting scheme. We believe this approach provides a much more accurate measure of gene expression than existing methods such as discarding multi-reads completely or assigning them randomly to a set of best assignments, while also providing a better estimation of the proper mapping locations of ambiguous reads. Overall, the solutions we built in these studies provide researchers with tools and approaches that aid in solving some of the common challenges that arise in the analysis of high-throughput sequence data.
|
40 |
Enrichment of inflammatory bowel disease and colorectal cancer risk variants in colon expression quantitative trait loci
Hulur, Imge, Gamazon, Eric R., Skol, Andrew D., Xicola, Rosa M., Llor, Xavier, Onel, Kenan, Ellis, Nathan A., Kupfer, Sonia S. January 2015
BACKGROUND: Genome-wide association studies (GWAS) have identified single nucleotide polymorphisms (SNPs) associated with diseases of the colon including inflammatory bowel diseases (IBD) and colorectal cancer (CRC). However, the functional role of many of these SNPs is largely unknown and tissue-specific resources are lacking. Expression quantitative trait loci (eQTL) mapping identifies target genes of disease-associated SNPs. This study provides a comprehensive eQTL map of distal colonic samples obtained from 40 healthy African Americans and demonstrates their relevance for GWAS of colonic diseases. RESULTS: 8.4 million imputed SNPs were tested for their associations with 16,252 expression probes representing 12,363 unique genes. 1,941 significant cis-eQTL, corresponding to 122 independent signals, were identified at a false discovery rate (FDR) of 0.01. Overall, among colon cis-eQTL, there was significant enrichment for GWAS variants for IBD (Crohn's disease [CD] and ulcerative colitis [UC]) and CRC as well as type 2 diabetes and body mass index. ERAP2, ADCY3, INPP5E, UBA7, SFMBT1, NXPE1 and REXO2 were identified as target genes for IBD-associated variants. The CRC-associated eQTL rs3802842 was associated with the expression of C11orf93 (COLCA2). Enrichment of colon eQTL near transcription start sites and for active histone marks was demonstrated, and eQTL with high population differentiation were identified. CONCLUSIONS: Through the comprehensive study of eQTL in the human colon, this study identified novel target genes for IBD- and CRC-associated genetic variants. Moreover, bioinformatic characterization of colon eQTL provides a tissue-specific tool to improve understanding of biological differences in diseases between different ethnic groups.
|