31 |
Network-based methods to identify mechanisms of action in disease and drug perturbation profiles using high-throughput genomic data
Pham, Lisa M. 24 June 2024
In the past decade it has become increasingly clear that a biological response is rarely caused by a single gene or protein. Rather, it is a result of a myriad of biological factors, constituting a systematic network of biological variables that span multiple granularities of biology from gene transcription to cell metabolism. Therefore it has become a significant challenge in the field of bioinformatics to integrate different levels of biology and to think of biological problems from a network perspective. In my thesis, I will discuss three projects that address this challenge.
First, I will introduce two novel methods that integrate quantitative and qualitative biological data in a network approach. My aim in chapters two and three is to combine high-throughput data with biological databases to identify the causal mechanisms of action (MoA), in the form of canonical biological pathways, underlying the data for a given phenotype. In the second chapter, I will introduce an algorithm called Latent Pathway Identification Analysis (LPIA). This algorithm looks for statistically significant evidence of dysregulation in a network of pathways constructed in a manner that explicitly links pathways through their common function in the cell.
In chapter three, I will introduce a new method that focuses on the identification of perturbed pathways from high-throughput gene expression data, which we approach as a task in statistical modeling and inference. We develop a two-level statistical model, where (i) the first level captures the relationship between high-throughput gene expression and biological pathways, and (ii) the second level models the behavior within an underlying network of pathways induced by an unknown perturbation.
In the fourth chapter, I will focus on the integration of high-throughput data on two distinct levels of biology to elucidate associations and causal relationships among genotype, gene expression and glycemic traits relevant to type 2 diabetes. I use the Framingham Heart Study as well as its extension, the SABRe initiative, to identify genes whose expression may be causally linked to fasting glucose.
|
32 |
Identification of genes associated with intramuscular fat deposition and composition in Nellore breed / Identificação de genes associados à deposição e composição da gordura intramuscular em bovinos da raça Nelore
Cesar, Aline Silva Mello 03 July 2014
The amount and composition of intramuscular fat (IMF) influence the sensory characteristics and nutritional value of beef, as well as human health. The amount and composition of fatty acids in beef vary with breed, nutrition, sex, age and carcass finishing level. Fat deposition and composition are determined by many genes that participate directly or indirectly in adipogenesis and lipid metabolism. Selecting animals whose fat amount and composition suit the consumer is difficult because these traits are costly to measure, moderately heritable and polygenic, and because the genes involved are largely unknown. In the last decade, major advances in bovine genomics have resulted in the complete sequencing of the genome and the development of high-density SNP chips. Together with technological improvements, this progress has allowed the identification of genes responsible for important quantitative traits in cattle. This study aimed to identify and characterize genes associated with the deposition and composition of intramuscular fat in Nellore cattle. A genome-wide association study (GWAS) was performed to identify genomic regions associated with the traits of interest and positional candidate genes. For the differential expression study, total RNA sequencing (RNA-Seq) was used to profile the transcriptome of the Longissimus dorsi (LD) muscle. A total of 386 Nellore steers were evaluated for total lipid content and fatty acid profile of the LD muscle and genotyped with a high-density SNP chip (Illumina SNP800 BeadChip). A subset of 14 animals, seven from each extreme of the genomic estimated breeding values (GEBV), was used for the RNA-Seq analysis. Twenty-five genomic regions (1 Mb windows) that explained >= 1% of the genetic variance were associated with the deposition and composition of intramuscular fat. These regions were identified on chromosomes 2, 3, 6, 7, 8, 9, 10, 11, 12, 17, 26 and 27; many of them have not previously been reported in other breeds, and important genes were identified within them. The genomic regions and genes presented here should contribute to a better understanding of the genetic control of fat deposition and composition in beef cattle, and can be applied in breeding programs for animals that produce high-quality beef with a fat profile that is healthy for human consumers.
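The window-based summary in this abstract (1 Mb regions explaining >= 1% of the genetic variance) can be illustrated with a minimal sketch. It assumes that a per-SNP contribution to genetic variance is already available from the GWAS model; the abstract does not say how those contributions were estimated, so the input format, the function names and the use of non-overlapping windows are all hypothetical choices made for illustration.

```python
from collections import defaultdict

WINDOW_BP = 1_000_000  # 1 Mb windows, the size quoted in the abstract


def window_variance_shares(snps):
    """Sum per-SNP variance contributions into non-overlapping 1 Mb windows.

    snps: iterable of (chromosome, position_bp, variance_contribution).
    The variance contributions are assumed inputs, not something this
    sketch estimates. Returns {(chromosome, window_index): share_of_total}.
    """
    totals = defaultdict(float)
    for chrom, pos, var in snps:
        totals[(chrom, pos // WINDOW_BP)] += var
    grand_total = sum(totals.values())
    if grand_total == 0:
        return {}
    return {win: value / grand_total for win, value in totals.items()}


def flag_windows(snps, threshold=0.01):
    """Return the windows whose share of genetic variance is >= threshold."""
    shares = window_variance_shares(snps)
    return sorted(win for win, share in shares.items() if share >= threshold)


# Toy data: two SNPs in the first window of chromosome 2, one in the second
# window of chromosome 2, and one in the first window of chromosome 3.
toy_snps = [(2, 100, 5.0), (2, 500_000, 5.0), (2, 1_500_000, 0.05), (3, 10, 89.95)]
flagged = flag_windows(toy_snps)
```

With the toy data, the second window on chromosome 2 carries far less than 1% of the total and is not flagged, while the other two windows are.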
|
33 |
Mapeamento associativo para múltiplos ambientes e múltiplos locos, visando tolerância à seca em milho / Multi-environment-multi-locus association mapping for drought tolerance in maize
Anoni, Carina de Oliveira 08 April 2016
Drought is one of the most important abiotic stresses in maize and causes significant reductions in grain yield. The genetic architecture of drought tolerance is complex, and a better understanding of this trait is required. Association mapping studies are useful because they explore the genetic variation of quantitative traits while accounting for information on genotypes, environments and genotype-by-environment (G × E) interactions. Accounting for G × E effects in association mapping models makes it possible to identify genomic regions associated with specific environments and conditions. The main goal of this study was to detect associations related to drought tolerance in maize using a multi-environment, multi-locus association mapping model, which distinguished associations with environment-specific effects from those with main effects and association-by-environment interaction (QEI) effects. The association panel comprised 190 inbred lines, classified into heterotic groups according to grain type, and was genotyped with ∼500K SNPs. Two inbred lines (L228-3 and L3) were used as common testers, and the resulting testcross hybrids were evaluated in two locations (Janaúba-MG and Teresina-PI), in two years (2010 and 2011) and under two treatment conditions (well-watered and water-stressed). Six traits were evaluated: grain yield, anthesis-silking interval, female and male flowering time, plant height and ear height. Two mapping groups were considered, defined by the tester used. SNPs were used both to test associations along the maize genome and to account for population structure and relatedness among individuals. The mapping model detected a total of 179 associations, with the largest number related to the flowering traits. Most associations (168) showed significant QEI, and the size and magnitude of these effects differed according to the environment evaluated. Female flowering time was the only trait that showed no associations with stable effects across the environments studied. Associations detected at nearby genomic positions suggest possible pleiotropic effects. Some associations were co-located in regions of the maize genome related to drought tolerance, and some of these involved factors belonging to metabolic pathways of interest. This study provides useful information for understanding the genetic basis of drought tolerance in maize under the specific environments evaluated.
|
35 |
Identificação de novos genes e SNPs relacionados ao mal de Parkinson e doenças relacionadas através de GWAS [Identification of new genes and SNPs related to Parkinson's disease and related disorders through GWAS]
CARVALHO, Rebecca Cristina Linhares de 25 February 2015
Previous issue date: 2015-02-25 / FACEPE
Parkinsonism is a neurological syndrome in which the neurons that normally produce dopamine deteriorate, causing progressive loss of movement control (e.g. bradykinesia, muscle rigidity, tremor at rest and impaired postural reflexes). A condition with similar symptoms is Scans Without Evidence of Dopaminergic Deficits (SWEDDs), in which patients show no evidence of dopamine deficit. The causes of primary parkinsonism, such as Parkinson's disease (PD), as well as of SWEDDs, are not completely known. Genome-wide association studies (GWASs) have contributed substantially to the understanding of the genetic architecture of complex diseases, including consistent and important contributions to PD. GWASs performed in the past on PD data produced results that strongly influenced later developments.
In this work, we study genetic factors that may contribute to a better understanding of the occurrence of PD and SWEDDs. To that end, we used the case-control association analysis tools of the PLINK toolset to run GWASs and identify SNPs associated with PD and SWEDDs. To establish the significance level of our results, we chose to use only real data provided by the Parkinson's Progression Markers Initiative (PPMI). PPMI is an international consortium designed to identify biomarkers of PD progression, both to improve understanding of the etiology of the disease and to provide crucial tools that increase the probability of success in the design of new therapeutic trials for PD.
In the association analysis, carried out with genotype data from three groups of individuals (healthy individuals, individuals with PD and individuals with SWEDDs), we recovered SNPs that show a strong link with PD and SWEDDs, some of them already associated in the scientific literature with PD or other degenerative diseases. We also found about 60 SNPs that are not reported in the literature and that show evidence of being strongly related to the propensity for PD or SWEDDs. These results are promising targets for future genomic studies and may contribute to the understanding of the occurrence of PD and SWEDDs. Interestingly, although SWEDDs is clinically linked to PD by a series of common symptoms, the top-ranked SNPs recovered from the PD dataset were not part of the SWEDDs dataset, and vice versa, which suggests that these two sets of markers could be explored more carefully in genomic studies as common SNPs of interest for the two diseases.
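The case-control association testing mentioned in this abstract can be illustrated for a single SNP with a minimal sketch of a 1-degree-of-freedom allelic chi-square test on a 2x2 table of allele counts. This is a generic textbook test written in plain Python, not PLINK itself and not necessarily the exact test the author ran; the function name and the example counts are hypothetical.

```python
import math


def allelic_chi_square(case_counts, control_counts):
    """Allelic 1-df chi-square test for one SNP.

    case_counts, control_counts: (minor_allele_count, major_allele_count).
    Returns (chi_square, p_value, odds_ratio).
    """
    a, b = case_counts
    c, d = control_counts
    n = a + b + c + d
    margins = (a + b, c + d, a + c, b + d)
    if 0 in margins:
        raise ValueError("every row and column of the 2x2 table must be non-empty")
    # Pearson chi-square for a 2x2 table, without continuity correction.
    chi2 = n * (a * d - b * c) ** 2 / math.prod(margins)
    # Survival function of a chi-square with 1 df: erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    odds_ratio = (a * d) / (b * c) if b * c else math.inf
    return chi2, p_value, odds_ratio


# Hypothetical allele counts: 60 minor / 40 major alleles in cases,
# 40 minor / 60 major alleles in controls.
chi2, p_value, odds_ratio = allelic_chi_square((60, 40), (40, 60))
```

In a genome-wide scan the same test is repeated for every SNP, so the resulting p-values have to be judged against a multiple-testing threshold rather than the usual 0.05.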
|
36 |
Population structure and genome-wide association in the malaria vectors Anopheles gambiae and Anopheles coluzzii / Structure de population et large génome association dans les vecteurs de malaria Anopheles gambiae et Anopheles coluzzii
Redmond, Seth 23 February 2015
Despite successes in the use of insecticides to control malaria, transmission continues in much of sub-Saharan Africa. The search for novel methods of control (in particular genetic modification of vector populations), or for better implementation of the currently available methods, will require greater knowledge both of the population structure of the mosquito and of the immune processes that are important in the wild. The mapping of novel immune genes via genome-wide association studies (GWAS) is itself predicated on a firm understanding of the population structure.
My thesis will include a detailed description of the mosquito innate immune system based on current research and comparative genomics; this will illustrate the major pathways that might be employed in the anti-malarial response, and some potential uncharacterised genes that might be implicated in any GWAS. It will also include a summary of what is known about the mosquito's population structure, in particular the gambiae / coluzzii speciation event and the implication of chromosomal inversions in the speciation process.
I will present the development of a novel approach to the identification of chromosomal inversions, which uses machine-learning techniques to call inversion karyotypes directly from high-throughput sequence data, leading to calls of unprecedented accuracy.
I will also present the first truly genome-wide association study to have been performed in the mosquito. Colonies were founded from wild mosquitoes, with the founders controlled by strata, including restriction on the basis of subspecies and chromosomal inversion. A two-stage mapping design was then devised in which loss of heterozygosity is used to identify broad regions in phenotype pools, before fine-resolution mapping by Sequenom genotyping in individuals. This was used to identify a novel locus with a phenotypic effect on infection prevalence in the wild.
Finally, I will describe how these techniques and findings could be important in the future application of genetic control in the wild.
|
37 |
Genomweite Untersuchung einer sorbischen Kohorte zur Identifikation neuer mit dem Lipidstoffwechsel assoziierter Polymorphismen [Genome-wide study of a Sorbian cohort to identify new polymorphisms associated with lipid metabolism]
Förster, Julia 29 November 2012
As energy stores and as components of various compounds, lipids fulfil vital functions for the organism. Changes in their blood concentration are referred to as dyslipidemias. As a risk factor for cardiovascular disease in particular, these represent a significant health-economic problem, especially in industrialized nations.
In recent years, various studies have identified numerous gene loci that influence lipid metabolism. These include genes such as APOC1, CETP and LPL, which can be clearly assigned to lipid metabolism on the basis of their function. In addition, previously unknown loci have become a subject of research. Together, these loci currently explain only 25-30% of the phenotypic variation.
In the present study, genome-wide association studies (GWAS) were used to search for further gene loci potentially related to the traits HDL cholesterol, LDL cholesterol and triglycerides. To this end, a genome-wide association study (N = 839) was carried out in a self-contained population, the Sorbs of Upper Lusatia. Thirteen loci showed significant association with a P-value < 10⁻⁵. SNPs with a P-value < 0.01 in the Sorbian cohort were then selected for a meta-analysis with data from participants of the Diabetes Genetics Initiative (N ~ 2600). In this way, 21 gene loci were confirmed with a combined P-value < 10⁻⁴. Of these, 5 newly identified, previously unpublished variants were replicated in the independent Metabolic Syndrome Berlin Potsdam cohort (N = 2000). With a P-value of 1.78 × 10⁻⁷, the variant rs8135828 in the THOC5 gene showed the strongest association, for HDL cholesterol, in the combined meta-analysis of all 3 cohorts.
Further analyses are needed to understand precisely the functions of the identified SNPs and the mechanism by which they regulate lipid metabolism.
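Combining evidence for one SNP across cohorts, as in the meta-analysis step of this abstract, can be illustrated with a minimal sketch of an inverse-variance weighted fixed-effect meta-analysis. The abstract does not state which combination scheme was used, so the method, the function name and the example effect sizes are assumptions made for illustration only.

```python
import math


def fixed_effect_meta(estimates):
    """Inverse-variance weighted fixed-effect meta-analysis for one SNP.

    estimates: list of (beta, standard_error) pairs, one per cohort.
    Returns (combined_beta, combined_se, two_sided_p_value).
    """
    weights = [1.0 / se ** 2 for _, se in estimates]
    total_weight = sum(weights)
    beta = sum(w * b for w, (b, _) in zip(weights, estimates)) / total_weight
    se = math.sqrt(1.0 / total_weight)
    z = beta / se
    # Two-sided p-value under a standard normal: erfc(|z| / sqrt(2)).
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    return beta, se, p_value


# Hypothetical per-cohort effect estimates for one SNP (not from the thesis).
single_cohort = fixed_effect_meta([(0.10, 0.05)])
two_cohorts = fixed_effect_meta([(0.10, 0.05), (0.10, 0.05)])
```

When cohorts agree in direction and size of effect, the combined standard error shrinks and the combined p-value becomes smaller than that of any single cohort, which is the rationale for carrying moderately significant SNPs forward into a meta-analysis.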
|
38 |
The role of common genetic variation in model polygenic and monogenic traits
Lango Allen, Hana January 2010
The aim of this thesis is to explore the role of common genetic variation, identified through genome-wide association (GWA) studies, in human traits and diseases, using height as a model polygenic trait, type 2 diabetes as a model common polygenic disease, and maturity onset diabetes of the young (MODY) as a model monogenic disease. The wave of initial GWA studies, such as the Wellcome Trust Case-Control Consortium (WTCCC) study of seven common diseases, substantially increased the number of common variants associated with a range of different multifactorial traits and diseases. The initial excitement, however, seems to have been followed by some disappointment that the identified variants explain a relatively small proportion of the genetic variance of the studied trait, and that only a few large-effect or causal variants have been identified. Inevitably, this has led to criticism of GWA studies, mainly that the findings are of limited clinical, or indeed scientific, benefit. Using height as a model, Chapter 2 explores the utility of GWA studies in terms of identifying regions that contain relevant genes, and in answering some general questions about the genetic architecture of highly polygenic traits. Chapter 3 takes this further into a large collaborative study with the largest sample size in a GWA study to date, mainly focusing on demonstrating the biological relevance of the identified variants, even when a large number of associated regions throughout the genome is implicated by these associations. Furthermore, it shows examples of different features of the genetic architecture, such as allelic heterogeneity and pleiotropy. Chapter 4 looks at the predictive value and, therefore, clinical utility of variants found to associate with type 2 diabetes, a common multifactorial disease that is increasing in prevalence despite known environmental risk factors. This is a disease where knowledge of the genetic risk has potentially substantial clinical relevance.
Finally, Chapter 5 approaches the monogenic-polygenic disease bridge in the direction opposite to that approached in the past: most studies have investigated genes mutated in monogenic diseases as candidates for harboring common variants predisposing to related polygenic diseases. This chapter looks at the common type 2 diabetes variants as modifiers of disease onset in patients with a monogenic but clinically heterogeneous disease, maturity onset diabetes of the young (MODY).
|
39 |
The prediction of HLA genotypes from next generation sequencing and genome scan data
Farrell, John J. 22 January 2016
Genome-wide association studies have very successfully found highly significant disease associations with single nucleotide polymorphisms (SNP) in the Major Histocompatibility Complex for adverse drug reactions, autoimmune diseases and infectious diseases. However, the extensive linkage disequilibrium in the region has made it difficult to unravel the HLA alleles underlying these diseases. Here I present two methods to comprehensively predict 4-digit HLA types from the two types of experimental genome data widely available.
The Virtual SNP Imputation approach was developed for genome scan data and demonstrated a high precision and recall (96% and 97% respectively) for the prediction of HLA genotypes. A reanalysis of 6 genome-wide association studies using the HLA imputation method identified 18 significant HLA allele associations for 6 autoimmune diseases: 2 in ankylosing spondylitis, 2 in autoimmune thyroid disease, 2 in Crohn's disease, 3 in multiple sclerosis, 2 in psoriasis and 7 in rheumatoid arthritis. The EPIGEN consortium also used the Virtual SNP Imputation approach to detect a novel association of HLA-A*31:01 with adverse reactions to carbamazepine.
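As an illustration of how precision and recall figures like those above can be computed for genotype predictions, here is a minimal Python sketch. It assumes (this is not stated in the abstract) that precision is the fraction of called alleles that are correct and recall is the fraction of true alleles that were correctly called, with no-calls allowed. The function name and the toy data are hypothetical.

```python
from collections import Counter

def allele_precision_recall(truth, predicted):
    """Compute allele-level precision and recall.

    truth, predicted: dict mapping subject ID -> list of allele names.
    A predicted list may be shorter than the true one when an allele
    is left uncalled (assumed convention for this sketch).
    """
    correct = called = total = 0
    for subject, true_alleles in truth.items():
        pred = predicted.get(subject, [])
        # Multiset intersection so homozygous genotypes are counted twice.
        overlap = Counter(true_alleles) & Counter(pred)
        correct += sum(overlap.values())
        called += len(pred)
        total += len(true_alleles)
    precision = correct / called if called else 0.0
    recall = correct / total if total else 0.0
    return precision, recall

# Toy example: the second subject has one allele left uncalled.
truth = {"s1": ["A*01:01", "A*02:01"], "s2": ["A*31:01", "A*31:01"]}
predicted = {"s1": ["A*01:01", "A*02:01"], "s2": ["A*31:01"]}
print(allele_precision_recall(truth, predicted))  # (1.0, 0.75)
```

Counting at the allele level rather than the genotype level gives partial credit when only one of a subject's two alleles is predicted correctly.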
For the prediction of HLA genotypes from next generation sequencing data, I developed a novel approach using a naïve Bayes algorithm called HLA-Genotyper. The validation results covered whole genome, whole exome and RNA-Seq experimental designs in the European and Yoruba population samples available from the 1000 Genomes Project. The RNA-Seq data gave the best results with an overall precision and recall near 0.99 for Europeans and 0.98 for the Yoruba population. I then successfully used the method on targeted sequencing data to detect significant associations of idiopathic membranous nephropathy with HLA-DRB1*03:01 and HLA-DQA1*05:01 using the 1000 Genomes European subjects as controls.
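The general idea of a naïve Bayes genotype caller can be sketched as follows. This is a toy illustration only and not a description of how HLA-Genotyper works internally: it assumes that each observed base is conditionally independent given the genotype, that a read comes from either allele with equal probability, and that sequencing errors are uniform. The allele names, positions, and error rate are hypothetical.

```python
import math

def genotype_posteriors(observations, alleles, priors, error_rate=0.01):
    """Posterior probability of each candidate genotype.

    observations: list of (position, base) pairs taken from reads.
    alleles: dict allele name -> dict position -> expected base.
    priors: dict (allele1, allele2) -> prior probability of that genotype.
    """
    log_post = {}
    for (a1, a2), prior in priors.items():
        logp = math.log(prior)
        for pos, base in observations:
            # Each read is assumed to come from either allele with probability 0.5.
            p = 0.0
            for allele in (a1, a2):
                match = alleles[allele][pos] == base
                p += 0.5 * ((1 - error_rate) if match else error_rate / 3)
            # Naive independence: log-likelihoods simply add up.
            logp += math.log(p)
        log_post[(a1, a2)] = logp
    # Normalize in log space to avoid underflow.
    m = max(log_post.values())
    z = sum(math.exp(v - m) for v in log_post.values())
    return {g: math.exp(v - m) / z for g, v in log_post.items()}

# Toy example with two hypothetical alleles that differ at position 0.
alleles = {"X": {0: "A", 1: "C"}, "Y": {0: "G", 1: "C"}}
priors = {("X", "X"): 1 / 3, ("X", "Y"): 1 / 3, ("Y", "Y"): 1 / 3}
observations = [(0, "A"), (0, "G"), (0, "A"), (0, "G"), (1, "C")]
posteriors = genotype_posteriors(observations, alleles, priors)
best = max(posteriors, key=posteriors.get)
print(best)  # ('X', 'Y')
```

With a balanced mix of both bases at the distinguishing position, the heterozygous genotype receives nearly all of the posterior mass.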
Using the results reported here, researchers may now readily unravel the association of HLA alleles with many diseases from genome scans and next generation sequencing experiments without the expensive and laborious HLA typing of thousands of subjects. Both algorithms enable the analysis of diverse populations to help researchers pinpoint HLA loci with biological roles in infection, inflammation, autoimmunity, aging, mental illness and adverse drug reactions.
|
40 |
Meta-analysis strategies for heterogeneous studies in genome-wide association studies
Hong, Jaeyoung 21 June 2016
Meta-analysis is a statistical technique that combines results from multiple independent studies to make inferences about parameters of interest. Although it is popular for parameter estimation and hypothesis testing, meta-analytic approaches that incorporate heterogeneous studies have not been fully developed. For heterogeneous studies, we do not expect all of the studies to have the same true underlying effect, and using the fixed-effects model in a meta-analysis in this situation violates the assumption of homogeneity of effect size. Heterogeneity among studies can arise from multiple sources, such as differences in populations by ancestry, differences in study designs, and different impacts of environmental exposures on the effect of the variable of interest. In this thesis, we introduce an analytic strategy and statistical models for meta-analysis of potentially heterogeneous studies. First, we propose a two-stage clustering approach to account for heterogeneity in trans-ethnic meta-analysis of genome-wide association studies (GWAS). Specifically, we cluster studies using cohort-specific genetic information prior to meta-analysis, both to account for between-cluster heterogeneity and to bolster within-cluster homogeneity. An extensive simulation study shows that this approach improves power and reduces computational cost compared to existing methods for trans-ethnic meta-analysis. Next, under a meta-regression framework, we develop a likelihood ratio test (LRT) statistic to accommodate multiple random effects. We allow multiple sources of heterogeneity in terms of study characteristics and model the heterogeneities as random effects. In a simulation study, we show that the proposed LRT has similar or higher power than existing methods, especially when heterogeneity exists. We apply this new approach to meta-analyze genome-wide association data.
Lastly, we derive a score test in the same context as our proposed new LRT and show the substantial advantage of the score test in computational efficiency compared to the new LRT. The introduced strategy and methodologies can effectively and efficiently aggregate the evidence from potentially heterogeneous studies in statistical genetics and other research areas.
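To make the contrast between fixed-effects and random-effects meta-analysis concrete, the following Python sketch implements the standard inverse-variance fixed-effects estimator and the DerSimonian-Laird random-effects estimator. These are textbook baselines, not the clustering approach, LRT, or score test proposed in the thesis. The function names and example numbers are illustrative assumptions.

```python
import math

def fixed_effects(betas, ses):
    """Inverse-variance weighted estimate assuming one shared true effect."""
    weights = [1.0 / se ** 2 for se in ses]
    beta = sum(w * b for w, b in zip(weights, betas)) / sum(weights)
    return beta, math.sqrt(1.0 / sum(weights))

def dersimonian_laird(betas, ses):
    """Random-effects estimate with a moment estimator of the
    between-study variance tau^2 (DerSimonian-Laird)."""
    weights = [1.0 / se ** 2 for se in ses]
    beta_fe, _ = fixed_effects(betas, ses)
    # Cochran's Q measures the excess spread around the fixed-effects estimate.
    q = sum(w * (b - beta_fe) ** 2 for w, b in zip(weights, betas))
    k = len(betas)
    c = sum(weights) - sum(w ** 2 for w in weights) / sum(weights)
    tau2 = max(0.0, (q - (k - 1)) / c) if k > 1 and c > 0 else 0.0
    # Each study's variance is inflated by tau^2 before reweighting.
    w_star = [1.0 / (se ** 2 + tau2) for se in ses]
    beta = sum(w * b for w, b in zip(w_star, betas)) / sum(w_star)
    return beta, math.sqrt(1.0 / sum(w_star)), tau2

# Two hypothetical studies with clearly different effect estimates.
betas, ses = [0.0, 0.4], [0.1, 0.1]
print(fixed_effects(betas, ses))
print(dersimonian_laird(betas, ses))
```

In this example both models give the same pooled effect, but the random-effects standard error is much larger because the disagreement between studies is treated as real between-study variance rather than ignored.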
|