61 |
Dissecting heterogeneity in GWAS meta-analysis
Magosi, Lerato Elaine January 2017
Statistical heterogeneity refers to differences among the results of studies combined in a meta-analysis beyond those expected by chance. On the one hand, excessive heterogeneity can diminish power to discover genetic signals; on the other, moderate heterogeneity can reveal important biological differences among studies. Given its double-edged nature, this thesis dissects heterogeneity in genetic association meta-analyses from three vantage points. First, a novel multi-variant statistic, M, is proposed to detect genome-wide (systematic) heterogeneity patterns in genetic association meta-analyses. This was motivated by the limited availability of appropriate methodology to measure the impact of heterogeneity across genetic signals, since traditional metrics (Q, I<sup>2</sup> and T<sup>2</sup>) measure heterogeneity at individual variants. Second, given that meta-analyses comprising small numbers of studies typically report imprecise summary effect estimates, GWAS-derived empirical heterogeneity priors are used to improve the precision of estimates of average genetic effects and heterogeneity in smaller meta-analyses (e.g. ≤ 10 studies). Third, a critical evaluation of the Han-Eskin random-effects model shows how it can identify small-effect heterogeneous loci overlooked by traditional fixed- and random-effects methods. This work draws attention to the existence of genome-wide heterogeneity patterns, which revealed systematic differences among the ascertainment criteria of participating studies in a meta-analysis of coronary artery disease (CAD) risk. Furthermore, simulation studies with the Han-Eskin random-effects model revealed inflated genetic signals at small-effect loci when heterogeneity levels were high. However, the model did reveal an additional CAD risk variant overlooked by traditional meta-analysis methods.
We therefore recommend a holistic approach to exploring heterogeneity in meta-analyses, one that assesses heterogeneity of genetic effects both at individual variants with traditional statistics and across multiple genetic signals with the M statistic. Furthermore, it is critically important to review forest plots for small-effect loci identified using the Han-Eskin random-effects model amidst moderate-to-high heterogeneity (I<sup>2</sup> ≥ 40%).
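For readers unfamiliar with the traditional single-variant metrics named in this abstract, the following is a minimal sketch of how Cochran's Q, I<sup>2</sup> and the between-study variance T<sup>2</sup> (often written τ<sup>2</sup>) are commonly computed from per-study effect estimates and standard errors. It assumes the DerSimonian-Laird moment estimator for the between-study variance, which the abstract does not specify; the function name `heterogeneity_stats` is hypothetical, and the M statistic proposed in the thesis is not reproduced here.

```python
def heterogeneity_stats(betas, ses):
    """Return (Q, I2, tau2) for one variant across studies.

    betas: per-study effect estimates; ses: their standard errors.
    I2 is returned as a fraction (0.4 corresponds to 40%).
    tau2 uses the DerSimonian-Laird estimator (an assumption of this sketch).
    """
    if len(betas) != len(ses) or len(betas) < 2:
        raise ValueError("need at least two studies with matching inputs")
    weights = [1.0 / se ** 2 for se in ses]  # inverse-variance weights
    sum_w = sum(weights)
    # Fixed-effect (inverse-variance weighted) pooled estimate
    pooled = sum(w * b for w, b in zip(weights, betas)) / sum_w
    # Cochran's Q: weighted squared deviations from the pooled estimate
    q = sum(w * (b - pooled) ** 2 for w, b in zip(weights, betas))
    df = len(betas) - 1
    # I2: share of total variation attributed to heterogeneity, floored at 0
    i2 = max(0.0, (q - df) / q) if q > 0 else 0.0
    # DerSimonian-Laird between-study variance, floored at 0
    c = sum_w - sum(w ** 2 for w in weights) / sum_w
    tau2 = max(0.0, (q - df) / c)
    return q, i2, tau2


q, i2, tau2 = heterogeneity_stats([0.1, 0.3], [0.1, 0.1])
print(q, i2, tau2)
```

With the I<sup>2</sup> ≥ 40% guideline from the abstract, a variant would be flagged for forest-plot review when the returned `i2` is at least 0.4.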
|
62 |
Um método para seleção de atributos em dados genômicos / A method for attribute selection in genomic data
Oliveira, Fabrízzio Condé de 26 November 2015
Previous issue date: 2015-11-26 / CAPES - Coordenação de Aperfeiçoamento de Pessoal de Nível Superior / Genome-wide association studies aim to discover SNP-type molecular
markers associated directly or indirectly with a specific phenotype related to one or more characteristics of an individual, or even a disease. The SNP may be the causative mutation itself or may be correlated with it through common inheritance. To identify the causal or promoter region of the phenotype, which is unknown a priori, thousands or millions of SNPs are genotyped in samples composed of hundreds or thousands of individuals. This raises the challenge of selecting the most informative SNPs in a genotype data set where the number of attributes is usually much higher than the number of individuals. In addition, attributes may be highly correlated, and there may be interactions between pairs, trios or higher-order combinations of SNPs. The methods most commonly applied in genome-wide association studies use the p-value of each SNP as a filter to select the most significant SNPs: regression-based hypothesis tests for continuous phenotypes, and the chi-square or similar tests in classification for discrete phenotypes. However, this class of methods captures only SNPs with additive effects, because the relationship assumed is linear. In an attempt to overcome the limitations of established procedures, this work proposes a new SNP selection method, named SNP Markers Selector (SMS), based on Machine Learning and Computational Intelligence techniques. The model divides the SNP selection problem into three distinct phases: the first evaluates the relevance of the markers; the second defines the set of relevant markers to be considered by means of a cut strategy based on a marker relevance threshold; and the third refines the cut, usually to reduce the number of false-positive markers. In the SMS, these three steps were implemented using Random Forests, Support Vector Machines and Genetic Algorithms, respectively. The SMS aims to create a workflow that maximizes the selection potential of the model through complementary steps. In this way, the SMS is expected to better capture additive and/or non-additive effects with moderate interaction between pairs and trios of SNPs, or even higher-order interactions with minimally detectable effects. The SMS can be applied to both regression problems (continuous phenotype) and classification problems (discrete phenotype). Numerical experiments were performed to evaluate the potential of the strategy, with the method applied to seven simulated data sets and to one real data set, in which the predicted milk production capacity of dairy cows was measured as a continuous phenotype. The proposed method was also compared with p-value-based methods and with the Bayesian Lasso and showed, in general, competitive results in terms of true-positive SNPs on simulated data with additive effects together with interactions between pairs and trios of SNPs. In the real data, based on 56,947 SNPs and a single milk production phenotype, the method identified 245 QTLs associated with milk production and composition and 90 candidate genes associated with mastitis, milk production and composition; these QTLs and genes had been identified by previous studies using other selection methods. Thus, the method proved competitive with the comparison methods in complex scenarios with simulated or real data, which indicates its potential for genome-wide association studies in humans, animals and plants.
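The p-value filter that this abstract describes as the usual baseline for discrete phenotypes can be sketched as follows. This is an illustration only, not the SMS method: it assumes a case/control phenotype, a 2x3 genotype count table per SNP with all three genotypes observed (so the test has 2 degrees of freedom), and a Bonferroni-corrected threshold, none of which are stated in the abstract. The function names and SNP identifiers are hypothetical.

```python
import math


def genotype_chi2(case_counts, control_counts):
    """Pearson chi-square test on a 2x3 genotype table.

    case_counts / control_counts: counts of the three genotypes
    (e.g. AA, Aa, aa) in cases and controls. Returns (statistic, p_value).
    """
    table = [list(case_counts), list(control_counts)]
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i in range(2):
        for j in range(3):
            expected = row_totals[i] * col_totals[j] / n
            if expected > 0:
                stat += (table[i][j] - expected) ** 2 / expected
    # With 2 degrees of freedom the chi-square survival function is exp(-x/2)
    return stat, math.exp(-stat / 2.0)


def pvalue_filter(tables, alpha=0.05):
    """Keep SNPs whose p-value is below a Bonferroni-corrected threshold."""
    threshold = alpha / len(tables)
    selected = []
    for snp_id, (cases, controls) in tables.items():
        _, p_value = genotype_chi2(cases, controls)
        if p_value < threshold:
            selected.append(snp_id)
    return selected


tables = {
    "snp_a": ([10, 20, 30], [30, 20, 10]),  # genotype counts differ by status
    "snp_b": ([20, 20, 20], [20, 20, 20]),  # identical counts, no association
}
print(pvalue_filter(tables))
```

Because each SNP is tested on its own, a filter of this kind cannot see effects that appear only through interactions between SNPs, which is the limitation the abstract raises.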
|
63 |
Deciphering causal genetic determinants of red blood cell traits
Lessard, Samuel 04 1900
Genome-wide association studies (GWAS) have revealed several genetic variants associated with complex phenotypes. This is the case for red blood cell (RBC) traits, which are particularly amenable to GWAS as they are routinely and accurately measured. Understanding RBC trait variation is important given their significance as clinical markers and modifiers of disease severity. Notably, increased fetal hemoglobin (HbF) production in sickle cell disease (SCD) patients is associated with a higher life expectancy and decreased morbidity. Nonetheless, most variants identified through GWAS fall in non-coding regions of the human genome, increasing the difficulty of identifying causal genes.
The main goal of this project was to identify and characterize genes influencing complex traits, and in particular RBC phenotypes. First, I developed an approach to identify and test potential gene knockouts affecting anthropometric traits in a large sample from the general population, which did not yield significant associations. Then, I characterized the DNA methylome and transcriptome of erythroblasts differentiated ex vivo from hematopoietic progenitor stem cells (HPSC), and identified several genes potentially implicated in fetal and adult-stage erythroid programs. I also identified microRNAs (miRNA) that show specific developmental expression patterns and that are enriched in inversely expressed targets. Finally, I mapped expression quantitative trait loci (eQTL) in erythroblasts and identified erythroid-specific eQTLs for ATP2B4, the main calcium ATPase of RBCs. These genetic variants are associated with RBC traits and malaria susceptibility, and overlap an erythroid-specific enhancer of ATP2B4. Deletion of this regulatory element using CRISPR/Cas9 in human erythroid cells strongly reduced ATP2B4 expression and increased intracellular calcium levels.
In conclusion, large and comprehensive genotyping datasets will be necessary to test the role of rare gene knockouts in complex phenotypes. The transcriptomes and DNA methylomes of erythroblasts show substantial differences that correlate with developmental stage and may be implicated in HbF production. These results also suggest a strong involvement of erythroid enhancers and miRNAs in developmental stage specificity. Finally, the characterization of the erythroid-specific enhancer of ATP2B4 suggests that integrating epigenomic, transcriptomic and gene editing experiments can be a powerful approach to characterizing non-coding genetic variants. These results implicate ATP2B4 in erythroid cell hydration, which is associated with malaria susceptibility and SCD severity, suggesting that therapies targeting this gene could impact diseases affecting millions of individuals worldwide.
|
64 |
Functional Analysis of the TRIB1 Locus in Coronary Artery Disease
Douvris, Adrianna January 2011
The TRIB1 locus (8q24.13) is a novel locus associated with plasma triglycerides (TGs) and coronary artery disease (CAD) risk. Trib1 is a regulator of MAPK activity and has been shown to regulate hepatic lipogenesis and VLDL production in mice. However, the functional relationship between common SNPs at the TRIB1 locus and plasma lipid traits is unknown; TRIB1 has not been identified as an eQTL. This cluster of SNPs falls within an intergenic region 25 kb to 50 kb downstream of the TRIB1 coding region. By phylogenetic footprinting analysis and DNA genotyping, we identified an evolutionarily conserved region (CNS1) within the risk locus that harbours two common SNPs in tight LD with GWAS risk SNPs and significantly associated with CAD. We investigated the regulatory function of CNS1 by luciferase reporter assays in HepG2 cells and demonstrated that this region has promoter activity. In addition, the rs2001844 risk allele significantly reduces luciferase activity, suggesting that altered expression of the EST-based gene may be associated with plasma TGs. We identified an EST within the risk locus directly downstream of CNS1. We performed 5'/3' RACE using HepG2 RNA, identified multiple variants of this EST-based gene, and confirmed its transcription start site within CNS1. We hypothesize that this EST is a long noncoding RNA because of its low abundance, poor conservation, and the absence of a significant ORF. Over-expression of a short variant implicates it in the regulation of target gene transcription, although the mechanism of action remains unknown. We conclude that the risk locus at 8q24.13 harbours a novel EST-based gene that may explain the relationship between GWAS SNPs at this locus and plasma lipid traits.
|
65 |
Stratégies d'analyses multi-marqueurs pour identifier des gènes et des interactions gène-gène impliqués dans le mélanome cutané / Multi-Marker Analytical Strategies to Identify Genes and Gene-Gene Interactions Associated with Cutaneous Melanoma
Brossard, Myriam 14 December 2015
Cutaneous melanoma is a skin cancer that develops from melanocytes. It is the 11th most common cancer in France. Mortality due to melanoma remains high when it is diagnosed at a late stage.
This cancer results from many genetic and environmental factors and from interactions between these factors. The genetic susceptibility to melanoma covers a broad spectrum of genetic variation, from rare mutations conferring high risk to common variants conferring low risk. My thesis focuses on identifying common, low-risk variants associated with melanoma occurrence and prognosis. To date, genome-wide association studies (GWAS) of melanoma have identified common variants with relatively modest effects, which explain only part of the genetic component of this cancer. Functional variants at the identified loci are mostly unknown. GWAS have mainly relied on single-marker analyses, which may be underpowered to detect variants with small individual effects or variants that interact with each other. The main objective of this thesis was to propose multi-marker analysis strategies to identify novel genes involved in melanoma and to characterize potentially functional variants in chromosomal regions found to be associated with melanoma. To identify new genes associated with melanoma risk and with a prognostic factor for this cancer (Breslow thickness), we proposed a multi-marker analysis strategy that integrates pathway analysis based on the GSEA (Gene Set Enrichment Analysis) method and gene-gene interaction analysis within melanoma-associated pathways. These analyses were conducted in two studies: the French MELARISK study and the North American MD Anderson Cancer Center (MDACC) study, with a total of 2,980 cases and 3,823 controls. We identified gene-gene interactions between the TERF1 and AFAP1L2 genes for melanoma risk and between the CDC42 and SCIN genes for Breslow thickness. These genes are biologically relevant because of their roles in telomere biology for the former gene pair and in actin dynamics for the latter pair.
To identify potentially functional variants at loci identified by GWAS, we proposed a fine-mapping strategy based mainly on a penalized regression approach (the HyperLasso method) applied to all variants of the region under study. By studying the 16q24 region, which harbors the MC1R gene whose functional variants are known, we showed that this strategy was able to identify those variants among the many variants associated with melanoma in this region. We contributed to the identification of five novel regions associated with melanoma through a worldwide meta-analysis of melanoma GWAS (43,000 subjects) and conducted fine mapping of all melanoma-associated loci using the strategy we proposed and validated in the 16q24 region. The multi-marker strategies proposed in this work made it possible to identify new biologically relevant genes associated with melanoma risk and with a major melanoma prognostic factor, and to characterize potentially functional genetic variants within regions identified by GWAS.
|
66 |
Optimizing Body Mass Index Targets Using Genetics and Biomarkers
Khan, Irfan January 2021
Introduction/Background: Guidelines from the World Health Organization currently recommend targeting a body mass index (BMI) between 18.5 and 24.9 kg/m<sup>2</sup> based on the lowest risk of mortality observed in epidemiological studies. However, these recommendations are based on population observations and do not take into account potential inter-individual differences. We hypothesized that genetic and non-genetic differences in adiposity, anthropometric, and metabolic measures result in inter-individual variation in the optimal BMI. Methods: Genetic variants associated with BMI as well as related adiposity, anthropometric, and metabolic phenotypes (e.g. triglycerides (TG)) were combined into polygenic risk scores (PRS), cumulative risk scores derived from the weighted contributions of each variant. 387,692 participants in the UK Biobank were split by quantiles of PRS or of clinical biomarkers such as C-reactive protein (CRP) and alanine aminotransferase (ALT). The BMI linked with the lowest risk of all-cause and cause-specific mortality outcomes (“nadir value”) was then compared across quantiles (“Cox meta-regression model”). Our results were replicated using the non-linear Mendelian randomization (NLMR) model to assess causality. Results: The nadir value for the BMI–all-cause mortality relationship differed across percentiles of BMI PRS, suggesting inter-individual variation in optimal BMI based on genetics (p = 0.005). There was a difference of 1.90 kg/m<sup>2</sup> in predicted optimal BMI between individuals in the top and bottom 5% of the BMI PRS distribution. Individuals with above- and below-median TG (p = 1.29 × 10<sup>-4</sup>), CRP (p = 7.92 × 10<sup>-5</sup>), and ALT (p = 2.70 × 10<sup>-8</sup>) levels differed in nadir for this relationship. There was no difference in the computed nadir between the Cox meta-regression and NLMR models (p = 0.102). Conclusions: The impact of BMI on mortality is heterogeneous due to individual differences in genetics and clinical biomarker levels.
Although we cannot confirm that our results are causal, genetics and clinical biomarkers have potential use for making more tailored BMI recommendations for patients. / Thesis / Master of Science (MSc) / The World Health Organization (WHO) recommends targeting a body mass index (BMI) between 18.5 - 24.9 kg/m<sup>2</sup> for optimal health. However, this recommendation does not take into account individual differences in genetics or biology. Our project aimed to determine whether the optimal BMI, or the BMI associated with the lowest risk of mortality, varies due to genetic or biological variation. Analyses were conducted across 387,692 individuals. We divided participants into groups according to genetic risk for obesity or clinical biomarker profile. Our results show that the optimal BMI varies according to genetic or biomarker profile. WHO recommendations do not account for this variation, as the optimal BMI can fall under the normal 18.5 - 24.9 kg/m<sup>2</sup> or overweight 25.0 - 29.9 kg/m<sup>2</sup> WHO BMI categories depending on individual genetic or biomarker profile. Thus, there is potential for using genetic and/or biomarker profiles to make more precise BMI recommendations for patients.
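The two data-preparation steps mentioned in the Methods, building a polygenic risk score as a weighted sum over variants and splitting participants into quantile groups, can be sketched as below. This is a minimal illustration under stated assumptions: allele dosages are coded 0, 1 or 2, the weights are per-variant effect sizes from a prior GWAS, and groups are formed by rank. The function names and the toy numbers are hypothetical and are not taken from the thesis; the Cox meta-regression and NLMR analyses are not reproduced.

```python
def polygenic_risk_score(dosages, weights):
    """Weighted sum of effect-allele dosages for one individual.

    dosages: effect-allele counts per variant (0, 1 or 2).
    weights: per-variant effect sizes, in the same variant order.
    """
    if len(dosages) != len(weights):
        raise ValueError("dosages and weights must have the same length")
    return sum(d * w for d, w in zip(dosages, weights))


def quantile_groups(scores, n_groups):
    """Assign each score to a rank-based group numbered 0 .. n_groups - 1."""
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    groups = [0] * len(scores)
    for rank, idx in enumerate(order):
        groups[idx] = rank * n_groups // len(scores)
    return groups


# Toy example: four individuals, three variants (made-up weights)
weights = [0.2, -0.1, 0.5]
genotypes = [[2, 1, 0], [0, 1, 0], [1, 0, 2], [1, 2, 1]]
scores = [polygenic_risk_score(g, weights) for g in genotypes]
print(scores, quantile_groups(scores, 2))
```

Setting `n_groups=2` gives the above/below-median split used for the biomarkers; larger values give finer quantile groups.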
|
67 |
Multi-omics approaches to sickle cell disease heterogeneity
Ilboudo, Yann 10 1900
Sickle cell disease is a monogenic disorder caused by a point mutation in the beta-globin gene. The complications related to the disease are characterized by a broad spectrum of distinct genetic, epigenetic, transcriptional, and metabolomic states. Integrative approaches based on high-throughput technologies are crucial to understanding the mechanisms behind these complications and uncovering therapeutic interventions. In this thesis, I integrate various omics datasets and apply statistical methods to derive new hypotheses and analyze data.
In the first two studies, I combine genome-wide association study results for fetal hemoglobin (HbF) and dehydrated dense red blood cells (DRBC) with gene expression, chromatin interaction, disease-relevant databases, and expert-curated drug targets. This integrative approach revealed three novel loci on chromosome 10 (BICC1), chromosome 19 (KLF1) and chromosome 22 (CECR2) as key modulators of HbF. For DRBC, four drug targets (BCL6, LRRC32, KCNJ14, and LETM1) were identified as potential severity modifiers.
In the third study, I integrated metabolomics with genomics, using Mendelian randomization to establish a potential causal relationship between L-glutamine and painful crises. Additionally, we identified 66 biomarkers for 6 SCD-related complications and estimated glomerular filtration rate (eGFR). Finally, in the last study I applied a clustering framework to metabolites, which I then combined with genotypes. I found specific metabolomic changes highlighting families of metabolites involved in renal and liver dysfunction and confirming the role of a class of fatty acids in red blood cell sickling. This work highlights the importance of multi-omics approaches to unearth new biology and study human diseases.
|
68 |
Algorithms to Integrate Omics Data for Personalized Medicine
Ayati, Marzieh 31 August 2018
No description available.
|
69 |
Investigation des variants génétiques dans la dysfonction endothéliale et le risque de maladies cardiovasculaires.
Codina-Fauteux, Valérie-Anne 08 1900
No description available.
|
70 |
Development and application of new statistical methods for the analysis of multiple phenotypes to investigate genetic associations with cardiometabolic traits
Konigorski, Stefan 27 April 2018
The biotechnological developments of recent years allow an increasingly detailed investigation of genetic and molecular markers with multiple complex traits. However, existing statistical methods often do not provide valid inference for these complex analyses.
The first aim of this thesis is to develop two new statistical methods for association studies of genetic markers with multiple phenotypes, to implement them efficiently and robustly, and to evaluate them in comparison with existing statistical methods. The first approach, C-JAMP (Copula-based Joint Analysis of Multiple Phenotypes), makes it possible to investigate the association of genetic variants with multiple traits in a joint copula model. The second approach, CIEE (Causal Inference using Estimating Equations), makes it possible to estimate and test direct genetic effects.
In this thesis, C-JAMP is evaluated for association studies of rare genetic variants with quantitative traits, and CIEE for association studies of common genetic variants with quantitative traits and event times. The results of extensive simulation studies show that both methods yield unbiased and efficient parameter estimators and can increase the statistical power of association tests compared with existing methods, which themselves often do not provide valid inference.
For the second aim of this thesis, to identify new genetic and transcriptomic markers for cardiometabolic traits, two studies with genome- and transcriptome-wide data are analyzed with C-JAMP and CIEE. The analyses identify several new candidate markers and genes for blood pressure and obesity. This underscores the value of developing, evaluating, and implementing new statistical methods. R packages are available for both methods, enabling their application in future studies. / In recent years, biotechnological advances have made it possible to investigate associations of genetic and molecular markers with multiple complex phenotypes in much greater depth. However, available statistical methods often do not yield valid inference for the analysis of such complex datasets.
The first aim of this thesis is to develop two novel statistical methods for association analyses of genetic markers with multiple phenotypes, to implement them in a computationally efficient and robust manner so that they can be used for large-scale analyses, and to evaluate them against existing statistical approaches under realistic scenarios. The first approach, the copula-based joint analysis of multiple phenotypes (C-JAMP) method, allows genetic associations with multiple traits to be investigated in a joint copula model and is evaluated for association analyses of rare genetic variants with quantitative traits. The second approach, the causal inference using estimating equations (CIEE) method, allows direct genetic effects to be estimated and tested in directed acyclic graphs and is evaluated for association analyses of common genetic variants with quantitative and time-to-event traits.
The results of extensive simulation studies show that both approaches yield unbiased and efficient parameter estimators and can improve the power of association tests in comparison to existing approaches, which yield invalid inference in many scenarios.
For the second aim of this thesis, to identify novel genetic and transcriptomic markers associated with cardiometabolic traits, C-JAMP and CIEE are applied in two large-scale studies including genome- and transcriptome-wide data. The analyses identify several novel candidate markers and genes, which highlights the merit of developing, evaluating, and implementing novel statistical approaches. R packages are available for both methods and enable their application in future studies.
|