  • About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
1

Genetics of human obesity in the post-genome wide association study era

Li, Aihua January 2016
The prevalence of obesity has more than doubled worldwide since 1980, and it has become a public health focus because of its wide range of serious complications. It is believed to be a complex disorder triggered by multiple genes, environmental factors and their interactions. The total number of single nucleotide polymorphisms (SNPs) associated with adult body mass index (BMI) at the genome-wide significance level (P < 5×10⁻⁸) has recently increased to 136. However, these genome-wide association studies (GWAS) have been conducted primarily in populations of European ancestry. This thesis aims to: 1) investigate whether these BMI SNPs are also associated with BMI in other ethnicities (South Asian, East Asian, African, Latino American and Native American) using the multi-ethnic prospective EpiDREAM cohort study; 2) explore the parental and child genetic contributions to obesity-related traits in children from birth to 5 years in the FAMILY cohort; 3) examine the maternal and child genetic contribution of BMI SNPs to maternal gestational weight gain (GWG) and postpartum weight retention in the FAMILY cohort. The major findings are: 1) most BMI susceptibility genes identified in Europeans are also associated with BMI in five other ethnicities, and the effects of some SNPs and of the BMI genetic risk score (GRS) were modified by ethnicity; 2) SNPs contributing to adult BMI start to exert their effect at birth and in early childhood, and parent-of-origin effects may occur in a limited subset of obesity-predisposing SNPs; and 3) there is no association between maternal or child GRS and GWG, but there is a genetic link between pre-pregnancy BMI variation and both offspring birth weight and maternal postpartum weight retention. Taken together, these findings indicate that GWAS of specific ethnic groups, children, birth weight and GWG are necessary to look for novel variants and alternative pathways influencing the development of obesity.
/ Thesis / Doctor of Philosophy (PhD) / Obesity is a chronic disorder triggered by multiple genes, environmental factors and their interactions. Currently, most of the common genetic alterations associated with adult body mass index (BMI), called single nucleotide polymorphisms (SNPs), have been identified in populations of European ancestry. This thesis aims to: 1) investigate whether these BMI-associated SNPs are also associated with BMI in other ethnic groups; 2) explore the parental and child genetic contributions in children from birth to 5 years; 3) examine the maternal and child genetic contribution to maternal gestational weight gain (GWG) and postpartum weight retention. The major findings are: 1) BMI SNPs identified in Europeans are partially generalizable to five other ethnic groups; 2) collectively, the SNPs contributing to adult BMI start to exert their effect at birth and in early childhood; and 3) there is a genetic link between pre-pregnancy BMI and both offspring birth weight and maternal postpartum weight retention.
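The abstract refers to a BMI genetic risk score (GRS) without saying how it was built. A minimal Python sketch of one common construction follows, assuming the score is a (optionally weighted) sum of risk-allele counts; the function name, allele counts and weights are hypothetical and are not taken from the thesis.

```python
def genetic_risk_score(allele_counts, weights=None):
    """Sum risk-allele counts (0, 1 or 2 per SNP), optionally weighted per SNP."""
    if weights is None:
        # Unweighted score: every risk allele counts equally.
        weights = [1.0] * len(allele_counts)
    if len(weights) != len(allele_counts):
        raise ValueError("one weight per SNP is required")
    return sum(count * weight for count, weight in zip(allele_counts, weights))


# Hypothetical individual genotyped at four BMI-associated SNPs.
counts = [2, 0, 1, 1]
# Illustrative per-allele effect sizes, not real estimates.
betas = [0.10, 0.05, 0.02, 0.08]

unweighted = genetic_risk_score(counts)
weighted = genetic_risk_score(counts, betas)
```

An unweighted score treats all SNPs alike, while a weighted score lets SNPs with larger reported effects contribute more; which variant the thesis used is not stated in the abstract.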
2

Post-GWAS bioinformatics and functional analysis of disease susceptibility loci

Martin, Paul January 2017
Genome-wide association studies (GWAS) have been tremendously successful in identifying genetic variants associated with complex diseases, such as rheumatoid arthritis (RA). However, the majority of these associations lie outside traditional protein coding regions and do not necessarily represent the causal effect. Therefore, the challenges post-GWAS are to identify causal variants, link them to target genes and explore the functional mechanisms involved in disease. The aim of the work presented here is to use high level bioinformatics to help address these challenges. There is now an increasing amount of experimental data generated by several large consortia with the aim of characterising the non-coding regions of the human genome, which has the ability to refine and prioritise genetic associations. However, whilst being publicly available, manually mining and utilising it to full effect can be prohibitive. I developed an automated tool, ASSIMILATOR, which quickly and effectively facilitated the mining and rapid interpretation of this data, inferring the likely functional consequence of variants and informing further investigation. This was used in a large extended GWAS in RA which assessed the functional impact of associated variants at the 22q12 locus, showing evidence that they could affect gene regulation. Environmental factors, such as vitamin D, can also affect gene regulation, increasing the risk of disease but are generally not incorporated into most GWAS. Vitamin D deficiency is common in RA and can regulate genes through vitamin D response elements (VDREs). I interrogated a large, publicly available VDRE ChIP-Seq dataset using a permutation testing approach to test for VDRE enrichment in RA loci. 
This study was the first comprehensive analysis of VDREs and RA-associated variants and showed that RA loci are enriched for VDREs, suggesting an involvement of vitamin D in RA. Indeed, evidence suggests that disease-associated variants affect gene regulation through enhancer elements. These can act over large distances through physical interactions. A newly developed technique, Capture Hi-C, was used to identify regions of the genome that physically interact with associated variants for four autoimmune diseases. This study showed the complex physical interactions between genetic elements, which could be mediated by regions associated with disease. This work is pivotal in fully characterising genetic associations and determining their effect on disease. Further work has re-defined the 6q23 locus, a region associated with multiple diseases, resulting in a major re-evaluation of the likely causal gene in RA from TNFAIP3 to IL20RA, a druggable target, illustrating the huge potential of this research. Furthermore, the technique has been used to study the genetic associations unique to multiple sclerosis in the same region, showing chromatin interactions that support previously implicated genes and identify novel candidates. This could help improve our understanding and treatment of the disease. Bioinformatics is fundamental to fully exploiting new and existing datasets and has made many positive impacts on our understanding of complex disease. This empowers researchers to fully explore disease aetiology and to further the discovery of new therapies.
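The permutation testing approach mentioned for VDRE enrichment can be illustrated with a small Python sketch. It assumes a simplified setup in which each background region is flagged 1 if it overlaps a VDRE peak and 0 otherwise, and random sets of regions are drawn to build a null distribution; the function name, the data and this sampling scheme are assumptions for illustration, not the procedure used in the thesis.

```python
import random


def permutation_enrichment(observed_overlap, n_loci, background_hits,
                           n_perm=10000, seed=0):
    """Empirical P value: how often do n_loci randomly drawn background
    regions overlap the feature at least as often as observed?"""
    rng = random.Random(seed)
    at_least = 0
    for _ in range(n_perm):
        # Draw a random set of regions of the same size as the disease loci.
        sample = rng.sample(background_hits, n_loci)
        if sum(sample) >= observed_overlap:
            at_least += 1
    # Add-one correction keeps the estimate away from exactly zero.
    return (at_least + 1) / (n_perm + 1)


# Hypothetical background: 1000 regions, 50 of which overlap a VDRE peak.
background = [1] * 50 + [0] * 950
p_value = permutation_enrichment(observed_overlap=12, n_loci=100,
                                 background_hits=background, n_perm=2000)
```

A real analysis would normally match the random regions to the disease loci on properties such as size and gene density, which this sketch does not attempt.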
3

Genome-wide association study to find SNPs associated with circulating levels of the protein FGF-21

Risslén, Rebecca January 2020
Colorectal cancer (CRC) is one of the most common types of cancer globally. In Sweden, over 6000 individuals are diagnosed with CRC every year, making it the fourth most common form of cancer in the country. The symptoms of the disease occur late in its development, so diagnosis is often delayed, which has a negative effect on mortality. Once an individual starts to experience symptoms, a colonoscopy is performed to examine the colon and establish a diagnosis. However, colonoscopy is burdensome for the individual and costly for the health care system. Therefore, a complementary risk screening method is needed to help identify high-risk individuals. Two separate studies have shown that individuals who develop CRC also have increased levels of the protein fibroblast growth factor 21 (FGF-21). Thus, there is an interest in potentially using FGF-21 as a risk screening marker in a blood test for identifying individuals at high risk of colorectal cancer. However, it is not known whether FGF-21 is part of the causal pathway leading to CRC development or only a marker of increased risk. Therefore, more work is needed to better understand the role of FGF-21 in CRC. This study represents the first step in determining whether FGF-21 has any causal role in CRC. To do this, I have tried to identify single genetic variants (so-called SNPs) in human DNA that are associated with circulating levels of FGF-21 by conducting a genome-wide association study (GWAS). The genome and protein data used in the GWAS originated from 131 individuals participating in the Västerbotten Intervention Programme. Preliminary results showed no significant SNPs among the study subjects when correcting for multiple tests at a significance level of 5%. Although there were no significant findings, I did find several indications of potential associations, and the small size of the dataset might explain why they did not reach significance.
The analytical pipeline I have created as part of this project will be used in a larger dataset where it will be possible to both verify potential associations from this study and hopefully identify other interesting SNPs. Any confirmed findings will in the future be used in a Mendelian Randomization study where the association between having SNPs that increase your levels of FGF-21 and the risk of CRC will be assessed. If such an association could be confirmed it would indicate that FGF-21 plays a causal role in CRC development.
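The abstract mentions correcting for multiple tests at a 5% significance level but does not name the correction. As a minimal sketch, assuming a Bonferroni correction (an assumption, not a detail from the thesis), the per-SNP threshold is the overall level divided by the number of tests. The SNP names and P values below are hypothetical.

```python
def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold that keeps the family-wise error at alpha."""
    if n_tests < 1:
        raise ValueError("n_tests must be at least 1")
    return alpha / n_tests


def significant_snps(p_values, alpha=0.05):
    """Return the SNP identifiers whose P value passes the Bonferroni threshold."""
    threshold = bonferroni_threshold(alpha, len(p_values))
    return [snp for snp, p in p_values.items() if p < threshold]


# Hypothetical association results for three SNPs.
p_values = {"snp_a": 1e-9, "snp_b": 3e-4, "snp_c": 0.2}
hits = significant_snps(p_values)

# With one million tests, the 5% level becomes the familiar 5e-8 threshold.
genome_wide = bonferroni_threshold(0.05, 1_000_000)
```

With only 131 individuals, few true associations would be expected to clear a threshold this strict, which is consistent with the abstract's remark about the small dataset.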
4

Associação genômica ampla para caracteres relacionados à eficiência no uso de nitrogênio em linhagens de milho tropical / Genome-wide association for characters related to nitrogen use efficiency in tropical maize lines

Morosini, Julia Silva 10 January 2017
The expansion of the locations and growing seasons of maize means that a significant portion of the cereal's production takes place under abiotic stress conditions. Much of this soil and climate scenario concerns second-season maize (milho safrinha), sown between January and March and currently responsible for 65% of total production. Among the different types of abiotic stress, nitrogen deficiency is common and critical in Brazilian soils. It is therefore necessary to develop genotypes that use nitrogen more efficiently, bringing economic and environmental benefits. Morphophysiological traits, such as root system measurements and photosynthetic rate analysis, may assist in selecting superior genotypes for these conditions. In this context, the aim of this work was to identify genomic regions of tropical maize associated with root length, chlorophyll fluorescence, and the plant response index to nitrogen stress. To this end, 64 tropical maize inbred lines contrasting for nitrogen use efficiency were evaluated under low and ideal soil nitrogen availability at two cultivation sites in the region of Piracicaba-SP, in the 2014/15 and 2015/16 seasons. Total root length, chlorophyll fluorescence (the photosynthetic efficiency of photosystem II), and the Low Nitrogen Tolerance Index (LNTI) were considered. The lines were genotyped with the Affymetrix® Axiom® Maize Genotyping Array of 616,201 SNP markers. The quality of the genomic information was controlled by Minor Allele Frequency and Call Rate procedures and by the elimination of heterozygous loci. Genotypic values were predicted using REML/BLUP mixed model equations. The molecular marker and phenotypic data were analyzed by genome-wide association study (GWAS). Root length was greater under low nitrogen availability. In total, seven significant markers were identified: four for LNTI, two for root length under ideal nitrogen availability and one for root length under low nitrogen availability. Among the main biological processes identified through functional annotation are the control and regulation of transcription, detected for all evaluated traits, and the synthesis of Guanosine Monophosphate Synthetase (GMP), an enzyme directly involved in the provision and recycling of nitrogen. Significant markers also coincided in chromosomal region with previously reported QTLs potentially related to nitrogen use efficiency. In conclusion, GWAS is efficient in detecting markers associated with the traits of interest, in this case highlighting cellular processes and functions related to the different processes of nitrogen synthesis and recycling.
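The marker quality control by Minor Allele Frequency and Call Rate described above can be sketched in Python as follows. The abstract gives no thresholds for this study, so the defaults of 0.05 and 0.95 are assumptions, as are the function names and the genotype coding (0, 1 or 2 copies of one allele, with None for a missing call).

```python
def call_rate(genotypes):
    """Fraction of samples with a non-missing genotype call."""
    return sum(g is not None for g in genotypes) / len(genotypes)


def minor_allele_frequency(genotypes):
    """Frequency of the rarer allele among the called genotypes."""
    called = [g for g in genotypes if g is not None]
    if not called:
        return 0.0
    freq = sum(called) / (2 * len(called))
    return min(freq, 1 - freq)


def passes_qc(genotypes, min_maf=0.05, min_call_rate=0.95):
    """Keep a marker only if it meets both (assumed) thresholds."""
    return (call_rate(genotypes) >= min_call_rate
            and minor_allele_frequency(genotypes) >= min_maf)


# Hypothetical markers scored on inbred lines.
marker_common = [0, 0, 2, 2, 0, 2, 0, 2, 0, 2]
marker_rare = [0] * 19 + [1]
marker_missing = [0, 2, None, None, 2, 0, 2, 0, 2, 0]

kept = [passes_qc(m) for m in (marker_common, marker_rare, marker_missing)]
```

The elimination of heterozygous loci mentioned in the abstract is a separate step that this sketch does not implement.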
5

Mineração de dados de anemia falciforme e priapismo / Sickle cell disease and priapism data mining

Ozahata, Mina Cintho 02 July 2019
Advances in technology have produced large biological datasets of genome sequences, gene and protein expression, RNA and protein structure, images, electronic questionnaires and laboratory test results. To extract interpretable information and knowledge from these large raw datasets, data mining techniques have been applied to the study of a wide range of biological processes, such as the prediction of genes, gene function, phenotypes, regulatory modules, molecular interactions, protein function and protein structure. Each dataset has its own characteristics and demands different statistical methodologies and pattern recognition algorithms, such as Random Forests, Neural Networks, Deep Learning, Hidden Markov Models, Support Vector Machines, K-means and Principal Component Analysis. The choice of algorithm depends on the data type, how the data were generated, their characteristics and the goal of the study. The goal of this work was therefore to explore pattern recognition and statistical techniques on a biological dataset of sickle cell disease patients, in order to extract information and knowledge about the biological systems, processes and mechanisms associated with the disease. A diverse dataset was analyzed, containing data from medical records, patient interviews, laboratory tests and single nucleotide polymorphisms. The dataset requires a variety of analysis approaches to explore and reveal its hidden structure. In an initial investigation, pattern recognition algorithms were applied to clinical data from sickle cell patients to obtain clusters of similar patients. The PCAMix, PAM and TwoStep clustering algorithms generated homogeneous clusters of patients that display different clinical characteristics and different levels of disease severity. The results show that age, bilirubin levels, transfusion history, vaso-occlusive pain episodes, acute chest syndrome, infarctive stroke, hemorrhagic stroke, ischemic attack, leg ulcers, moyamoya, ferritin, reticulocyte count, retinopathy, seizures and transfusional hemosiderosis are important for defining homogeneous patient clusters with distinct levels of sickle cell severity. Additionally, patients with a history of priapism, a sickle cell related complication, were studied. The goal of this analysis was to characterize patients with a history of priapism clinically and to investigate genetic factors that modify the risk of the disease. Priapism occurred more frequently among patients with the HbSS genotype and was associated with older age and the occurrence of pulmonary hypertension and avascular necrosis. Two novel SNPs were associated with priapism, and there was evidence of replication of a previously reported association of TGFBR3 with priapism risk.
6

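Genomweite Untersuchung einer sorbischen Kohorte zur Identifikation neuer mit dem Lipidstoffwechsel assoziierter Polymorphismen / Genome-wide study of a Sorbian cohort to identify novel polymorphisms associated with lipid metabolism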

Förster, Julia 09 January 2013
As energy stores and as components of various compounds, lipids fulfil functions that are vital to the organism. Changes in their blood concentrations are termed dyslipidemias. These are a significant health-economic problem, particularly in industrialized nations, above all as a risk factor for cardiovascular disease. In recent years, various studies have identified numerous gene loci that influence lipid metabolism. Among them are genes such as APOC1, CETP and LPL, which can be clearly assigned to lipid metabolism on the basis of their function. In addition, previously unknown loci became a subject of research. Together, these loci currently explain only 25-30% of the phenotypic variation. The present study used genome-wide association studies (GWAS) to search for further gene loci potentially related to the traits HDL cholesterol, LDL cholesterol and triglycerides. To this end, a genome-wide association study (N = 839) was carried out in a self-contained population, the Sorbs of Upper Lusatia. Thirteen significantly associated loci with a P value < 10⁻⁵ were found. SNPs with a P value < 0.01 in the Sorbian cohort were then selected for a meta-analysis with data from participants of the Diabetes Genetics Initiative (N ≈ 2600). In this way, 21 gene loci with a combined P value < 10⁻⁴ were confirmed. Of these, 5 newly identified, previously unpublished variants were replicated in the independent Metabolisches Syndrom Berlin Potsdam cohort (N = 2000). With a P value of 1.78×10⁻⁷, the variant rs8135828 in the THOC5 gene showed the strongest association, for HDL cholesterol, in the combined meta-analysis of all 3 cohorts. Further analyses are needed to understand precisely the functions of the identified SNPs and the mechanisms by which they regulate lipid metabolism.
7

In vitro and field based evaluation for grain mold resistance and its impact on quality traits in sorghum [Sorghum bicolor (L.) Moench]

Tomar, Sandeep Singh January 1900
Master of Science / Department of Agronomy / Ramasamy Perumal / Tesfaye Tesso / Grain mold (GM) is an important biotic constraint limiting the yield and market value of sorghum grain. It results in kernel discoloration and deterioration. Such kernels have reduced seed viability and low food and feed quality. Breeding for grain mold resistance is challenging because of the complex nature of host-pathogen-environment interactions. This complex task could be made simpler by utilizing molecular markers, which may help to find genomic regions associated with grain mold resistance. In this study, three sets of field and laboratory based experiments were performed to help identify the grain mold pathogens responsible for kernel deterioration in the studied environment and to search for genotypes with better kernel quality and grain mold resistance. In the first part of the study, 44 grain mold resistant sorghum genotypes developed and released by Texas A&M AgriLife Research were screened in vitro. This study was aimed at identifying sources of resistance to grain mold infection through laboratory screening. The results revealed that genotypes Tx3371, Tx3373, Tx3374, Tx3376, Tx3407, Tx3400, and Tx3402 had a high level of resistance and were identified as potential sources of grain mold resistance, as each showed minimal fungal infection and higher grain quality traits. The second experiment was performed to optimize a surface sterilization protocol for the extraction of fungal pathogens from the kernel surface (pericarp) and to study the effect of bleach percentage and exposure time on pathogen extraction. Seven treatments using sterilized double distilled water (0% bleach (v/v)) and different bleach (NaOCl) concentrations (2.5, 5, 7.5, 10, 12.5 and 15%) were used with exposure times of 2.5, 5, 7.5 and 10 min.
Optimized surface sterilization in the range of 7.5 to 15% bleach (v/v) for 7.5 to 10 min resulted in the least contamination and the fewest fungal genera isolated from the surface of the kernel. The third study was aimed at characterizing genotypes (the sorghum association panel) for the grain mold pathogen F. thapsinum, using a genome-wide association (GWA) tool to find genomic regions associated with grain mold resistance. We studied the effect of different agronomic and panicle architecture traits on grain mold incidence and severity. The effects of grain mold on kernel quality traits were also studied. We report two loci associated with grain mold resistance. Based on first-year field screening results, 46 genotypes with grain mold ratings of 1-5 (1 = < 1% of panicle kernels molded; 5 = > 50% of panicle kernels molded) were selected for a detailed study aimed at understanding grain mold x fungal pathogen interactions in relation to physical and chemical kernel traits. A seed germination test, vigor index, and tetrazolium viability test were performed to study the effect of grain mold infection on kernel viability and vigor. Alternaria, Fusarium thapsinum, F. verticillioides and F. proliferatum were the main fungi isolated from bisected kernels. Based on two years of screening, SC623, SC67, SC621, SC947 and SC1494 were the most resistant on both the PGMR and TGMR ratings, while SC370, SC833, SC1484, and SC1077 showed the most susceptible reaction, and this was consistent in the individual location analyses. SC309, SC213, SC833, SC971 and SC1047 are the genotypes carrying the identified loci for grain mold resistance.
8

Understanding the genetic basis of complex polygenic traits through Bayesian model selection of multiple genetic models and network modeling of family-based genetic data

Bae, Harold Taehyun 12 March 2016
The global aim of this dissertation is to develop advanced statistical modeling to understand the genetic basis of complex polygenic traits. In order to achieve this goal, this dissertation focuses on the development of (i) a novel methodology to detect genetic variants with different inheritance patterns formulated as a Bayesian model selection problem, (ii) integration of genetic data and non-genetic data to dissect the genotype-phenotype associations using Bayesian networks with family-based data, and (iii) an efficient technique to model the family-based data in the Bayesian framework. In the first part of my dissertation, I present a coherent Bayesian framework for selection of the most likely model from the five genetic models (genotypic, additive, dominant, co-dominant, and recessive) used in genetic association studies. The approach uses a polynomial parameterization of genetic data to simultaneously fit the five models and save computations. I provide a closed-form expression of the marginal likelihood for normally distributed data, and evaluate the performance of the proposed method and existing methods through simulated and real genome-wide data sets. The second part of this dissertation presents an integrative analytic approach that utilizes Bayesian networks to represent the complex probabilistic dependency structure among many variables from family-based data. I propose a parameterization that extends mixed effects regression models to Bayesian networks by using random effects as additional nodes of the networks to model the between-subjects correlations. I also present results of simulation studies to compare different model selection metrics for mixed models that can be used for learning BNs from correlated data and application of this methodology to real data from a large family-based study. In the third part of this dissertation, I describe an efficient way to account for family structure in Bayesian inference Using Gibbs Sampling (BUGS). 
In linear mixed models, a random effects vector has a variance-covariance matrix whose dimension is as large as the sample size. However, a direct handling of this multivariate normal distribution is not computationally feasible in BUGS. Therefore, I propose a decomposition of this multivariate normal distribution into univariate normal distributions using singular value decomposition, and implementation in BUGS is presented.
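The decomposition described above can be written out as a short derivation. The notation is an assumption for illustration, since the abstract gives no symbols: $u$ is the random effects vector, $K$ a known $n \times n$ relationship matrix and $\sigma_u^2$ a variance component.

```latex
% Assumed model: u ~ N(0, sigma_u^2 K), with K symmetric positive semi-definite.
\[
  u \sim N\!\left(0,\ \sigma_u^2 K\right), \qquad K = U D U^{\top},
\]
% For such K the singular value decomposition has this form, with U orthogonal
% and D = diag(d_1, ..., d_n), d_i >= 0.
\[
  z_i \overset{\text{iid}}{\sim} N\!\left(0,\ \sigma_u^2\right), \quad i = 1, \dots, n,
  \qquad u = U D^{1/2} z .
\]
% Check that the transformed vector has the required covariance:
\[
  \operatorname{Var}(u)
  = U D^{1/2} \left(\sigma_u^2 I\right) D^{1/2} U^{\top}
  = \sigma_u^2\, U D U^{\top}
  = \sigma_u^2 K .
\]
```

Under these assumptions only $n$ independent univariate normal nodes $z_i$ need to be declared in the model, and $u$ is obtained as a fixed linear transformation of them.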
9

Estudo de seleção genômica para características de produção e qualidade do leite de búfalas / Genomic selection studies for production and quality traits of milk buffaloes

Barros, Camila da Costa [UNESP] 21 July 2017
The aim of this study was to compare different Bayesian methods of genomic prediction for milk yield (MY) and milk fat (%F) and protein (%P) percentages in dairy buffaloes in Brazil, and to perform a genome-wide association study to identify chromosomal regions and genes possibly related to these traits, using information from genotyped and non-genotyped individuals. Phenotypes were available for 3,355 animals, and the pedigree file contained 15,495 animals, of which 322 were genotyped with a 90K SNP panel (Axiom® Buffalo Genotyping Array). The following quality-control criteria were applied to SNPs: MAF < 0.05, call rate < 0.95 and Hardy-Weinberg equilibrium p-value < 10⁻⁶. For samples, a call rate < 0.90 was used. Four Bayesian models were used for genomic prediction: Bayes A (BA), Bayes B (BB), Bayes C (BC) and Bayes LASSO (BL). Phenotypes corrected for fixed effects (Y*) were used as the response variable. The predictive ability of the models was evaluated by leave-one-out cross-validation. Prediction accuracy was calculated as Pearson's correlation between the genomic estimated breeding value (GEBV) and the response variable (Y*) for each model and trait. For the genome-wide association study, an iterative process was used to derive SNP weights as a function of the squared SNP effects and the allele frequencies (ssGWAS). In general, all Bayesian models showed similar prediction accuracies, ranging from 0.41 to 0.42, 0.38 to 0.39 and 0.39 to 0.40 for MY, %F and %P, respectively. Therefore, BA, BB, BC and BL can all be used to predict SNP effects, with practically the same prediction accuracy. The ten SNPs with the largest effects explained 7.48%, 9.94% and 6.56% of the genetic variance for MY, %F and %P, respectively. The ssGWAS results revealed chromosomal regions and genes that may be related to the analyzed traits. These regions and genes may contribute to a better understanding of their influence on milk yield and on fat and protein percentages in buffalo milk. / Fundação de Amparo à Pesquisa do Estado de São Paulo (FAPESP) / FAPESP: 2013/24427-3 / FAPESP: 2015/18614-0
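The SNP and sample quality-control thresholds quoted in the abstract above can be illustrated with a minimal Python sketch. It assumes the thresholds are exclusion cut-offs (a SNP is dropped when MAF < 0.05, call rate < 0.95 or the Hardy-Weinberg p-value < 10⁻⁶), that genotypes are coded as allele counts 0/1/2 with `None` for a missing call, and that Hardy-Weinberg equilibrium is checked with a 1-df chi-square goodness-of-fit test. The function names are hypothetical and this is not the pipeline used in the thesis.

```python
import math

def call_rate(genotypes):
    """Fraction of non-missing calls (None marks a missing genotype)."""
    return sum(g is not None for g in genotypes) / len(genotypes)

def maf(genotypes):
    """Minor allele frequency from genotypes coded as allele counts 0/1/2."""
    called = [g for g in genotypes if g is not None]
    if not called:
        return 0.0
    p = sum(called) / (2 * len(called))
    return min(p, 1 - p)

def hwe_pvalue(genotypes):
    """Chi-square (1 df) goodness-of-fit p-value for Hardy-Weinberg equilibrium."""
    called = [g for g in genotypes if g is not None]
    n = len(called)
    if n == 0:
        return 1.0
    p = sum(called) / (2 * n)
    q = 1 - p
    if p == 0 or q == 0:
        return 1.0  # monomorphic SNP: the test is uninformative
    observed = [called.count(0), called.count(1), called.count(2)]
    expected = [n * q * q, 2 * n * p * q, n * p * p]
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # Survival function of a chi-square with 1 df: erfc(sqrt(x / 2))
    return math.erfc(math.sqrt(chi2 / 2))

def snp_passes_qc(genotypes, min_maf=0.05, min_call_rate=0.95, min_hwe_p=1e-6):
    """Keep a SNP only if it clears all three thresholds (assumed exclusion rule)."""
    return (call_rate(genotypes) >= min_call_rate
            and maf(genotypes) >= min_maf
            and hwe_pvalue(genotypes) >= min_hwe_p)

def sample_passes_qc(sample_genotypes, min_call_rate=0.90):
    """Keep a sample only if enough of its SNPs were called."""
    return call_rate(sample_genotypes) >= min_call_rate
```

For example, `snp_passes_qc([0] * 25 + [1] * 50 + [2] * 25)` keeps a SNP in exact Hardy-Weinberg proportions, while a SNP with no heterozygotes at all (`[0] * 50 + [2] * 50`) is rejected by the equilibrium test.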
10

An Association Study Revealed Substantial Effects of Dominance, Epistasis and Substance Dependence Co-Morbidity on Alcohol Dependence Symptom Count

Chen, Gang, Zhang, Futao, Xue, Wenda, Wu, Ruyan, Xu, Haiming, Wang, Kesheng, Zhu, Jun 01 November 2017 (has links)
Alcohol dependence is a complex disease involving polygenes, environment and their interactions. Inadequate consideration of these interactions may have hampered the progress of genome-wide association studies of alcohol dependence. Using the dataset of the Study of Addiction: Genetics and Environment, with 3838 subjects, we conducted a genome-wide association study of alcohol dependence symptom count (ADSC) with a full genetic model considering additive, dominance and epistasis effects and their interactions with ethnicity, as well as conditions of co-morbid substance dependence. Twenty quantitative trait single nucleotide polymorphisms (QTSs) showed highly significant associations with ADSC, including four previously reported genes (ADH1C, PKNOX2, CPE and KCNB2) and the reported intergenic rs1363605, supporting the overall validity of the analysis. Two QTSs within or near ADH1C showed very strong association in a dominance inheritance mode and increased the phenotype value of ADSC when the effect of co-morbid opiate or marijuana dependence was controlled. Highly significant associations were also identified for variants within four novel genes (RGS6, FMN1, NRM and BPTF), two non-coding RNAs and two epistasis loci. QTS rs7616413, located near PTPRG encoding a protein tyrosine phosphatase receptor, interacted with rs10090742 within ANGPT1 encoding a protein tyrosine phosphatase in an additive × additive or dominance × additive manner. The detected QTSs contributed about 20 percent of total heritability, of which dominance and epistasis effects accounted for over 50 percent. These results demonstrate that perturbations arising from gene–gene interaction and conditions of co-morbidity substantially influence the genetic architecture of complex traits.
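To make the terms "additive", "dominance" and "additive × additive or dominance × additive" concrete, the sketch below builds the genotype covariates for a pair of SNPs under one conventional coding. The coding is an assumption, not taken from the abstract: genotypes are allele counts 0/1/2, the additive covariate is -1/0/1, the dominance covariate is 1 for heterozygotes and 0 otherwise, and each epistasis covariate is the product of two single-locus covariates. The function names are hypothetical, and the interactions with ethnicity and co-morbidity that the study also models are omitted.

```python
def additive(g):
    """Additive covariate: -1, 0, 1 for genotypes with 0, 1, 2 copies of the allele."""
    if g not in (0, 1, 2):
        raise ValueError("genotype must be 0, 1 or 2")
    return g - 1

def dominance(g):
    """Dominance covariate: 1 for heterozygotes, 0 for either homozygote."""
    if g not in (0, 1, 2):
        raise ValueError("genotype must be 0, 1 or 2")
    return 1 if g == 1 else 0

def design_row(g1, g2):
    """Main-effect and pairwise epistasis covariates for two SNP genotypes."""
    a1, d1 = additive(g1), dominance(g1)
    a2, d2 = additive(g2), dominance(g2)
    return {
        "a1": a1, "d1": d1,   # main effects of SNP 1
        "a2": a2, "d2": d2,   # main effects of SNP 2
        "aa": a1 * a2,        # additive x additive
        "ad": a1 * d2,        # additive x dominance
        "da": d1 * a2,        # dominance x additive
        "dd": d1 * d2,        # dominance x dominance
    }
```

A subject heterozygous at the first SNP and homozygous for the reference allele at the second, `design_row(1, 0)`, contributes only through `d1`, `a2` and the dominance × additive term `da`.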
