  • About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
1

Imputing Genotypes Using Regularized Generalized Linear Regression Models

Griesman, Joshua 14 June 2012
As genomic sequencing technologies continue to advance, researchers are furthering their understanding of the relationships between genetic variants and expressed traits (Hirschhorn and Daly, 2005). However, missing data can significantly limit the power of a genetic study. Here, the use of a regularized generalized linear model, denoted GLMNET, is proposed to impute missing genotypes. The method aims to address certain limitations of earlier regression approaches to genotype imputation, particularly multicollinearity among predictors. The performance of the GLMNET-based method is compared to that of the phase-based method fastPHASE. Two simulation settings were evaluated: a sparse-missing model and a small-panel expansion model. The sparse-missing model simulated a scenario where SNPs were missing at random across the genome. In the small-panel expansion model, a set of test individuals was genotyped at only a small subset of the SNPs of the large panel. Each imputation method was tested on two data sets: Canadian Holstein cattle data and human HapMap CEU data. Although the proposed method performed with high accuracy (>90% in all simulations), fastPHASE performed with higher accuracy (>94%). However, the new method, which was coded in R, imputed genotypes with better time efficiency than fastPHASE, and this could be further improved by optimizing in a compiled language.
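The regression idea in this abstract can be illustrated with a small sketch. It is not the thesis's R implementation: it assumes a plain linear model with an elastic-net penalty fitted by coordinate descent (the penalty family that glmnet uses), genotypes coded as 0/1/2 allele counts, and hypothetical function names and toy data. The two predictor SNPs are deliberately identical to mimic the multicollinearity the abstract mentions; the ridge part of the penalty (alpha < 1, lam > 0) keeps the fit well defined in that case.

```python
def soft_threshold(z, gamma):
    """Shrink z toward zero by gamma (the lasso part of the penalty)."""
    if z > gamma:
        return z - gamma
    if z < -gamma:
        return z + gamma
    return 0.0


def fit_elastic_net(X, y, lam=0.05, alpha=0.5, n_iter=200):
    """Coordinate descent for (1/2n)*RSS + lam*(alpha*L1 + (1-alpha)/2*L2).

    Assumes lam > 0 and alpha < 1 so the denominator below is never zero.
    """
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    intercept = sum(y) / n
    for _ in range(n_iter):
        for j in range(p):
            rho = 0.0   # covariance of SNP j with the partial residual
            norm = 0.0  # sum of squares of SNP j
            for i in range(n):
                partial = intercept + sum(
                    beta[k] * X[i][k] for k in range(p) if k != j
                )
                rho += X[i][j] * (y[i] - partial)
                norm += X[i][j] ** 2
            beta[j] = soft_threshold(rho / n, lam * alpha) / (
                norm / n + lam * (1 - alpha)
            )
        # The intercept is unpenalized: set it to the mean residual.
        intercept = sum(
            y[i] - sum(beta[k] * X[i][k] for k in range(p)) for i in range(n)
        ) / n
    return intercept, beta


def impute_genotype(intercept, beta, flanking):
    """Predict a dosage from flanking SNPs and round it to 0, 1 or 2."""
    dosage = intercept + sum(b * g for b, g in zip(beta, flanking))
    return min(2, max(0, round(dosage)))


# Toy data (hypothetical): two perfectly collinear flanking SNPs.
X = [[0, 0], [1, 1], [2, 2], [0, 0], [1, 1], [2, 2]]
y = [0, 1, 2, 0, 1, 2]  # genotypes at the SNP to be imputed
intercept, beta = fit_elastic_net(X, y)
imputed = impute_genotype(intercept, beta, [2, 2])
```

With collinear predictors the penalty spreads the weight across both SNPs instead of leaving the solution undetermined, which is the motivation the abstract gives for regularization.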
2

Assessing the Impact of Genotype Imputation on Meta-analysis of Genetic Association Studies

Omondi, Emmanuel 28 July 2014
In this thesis, we study how a meta-analysis of genetic association studies is influenced by the degree of genotype imputation uncertainty in the combined studies and by the size of the meta-analysis. We consider the fixed effect meta-analysis model to evaluate the accuracy and efficiency of imputation-based meta-analysis results under different levels of imputation accuracy. We also examine the impact of genotype imputation on the between-study heterogeneity and type I error in the random effects meta-analysis model. Simulation results reaffirm that meta-analysis boosts the power to detect genetic associations compared to individual study results. However, the power deteriorates with increasing uncertainty in imputed genotypes. Genotype imputation affects a random effects meta-analysis in a non-obvious way, as estimation of between-study heterogeneity and interpretation of association results depend heavily on the number of studies combined. We propose an adjusted fixed effect meta-analysis approach for adding imputation-based studies to a meta-analysis of existing typed studies in a controlled way to improve precision and reliability. The proposed method should help in designing an effective meta-analysis study.
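For readers unfamiliar with the fixed effect model named above, the following is a minimal sketch of the standard inverse-variance pooling and of Cochran's Q, the usual starting point for between-study heterogeneity. It is not the adjusted approach proposed in the thesis; the function names and the numbers are hypothetical.

```python
import math


def fixed_effect_meta(betas, ses):
    """Inverse-variance weighted pooled effect and its standard error."""
    weights = [1.0 / se ** 2 for se in ses]
    total = sum(weights)
    pooled = sum(w * b for w, b in zip(weights, betas)) / total
    pooled_se = math.sqrt(1.0 / total)
    return pooled, pooled_se


def cochran_q(betas, ses):
    """Cochran's Q: weighted squared deviations from the pooled effect."""
    pooled, _ = fixed_effect_meta(betas, ses)
    return sum((b - pooled) ** 2 / se ** 2 for b, se in zip(betas, ses))


# Hypothetical per-study effect estimates and standard errors.
# The second study has the larger standard error, as one might expect
# when its genotypes were imputed with more uncertainty.
betas = [0.2, 0.4]
ses = [0.1, 0.2]
pooled, pooled_se = fixed_effect_meta(betas, ses)
q = cochran_q(betas, ses)
```

Because each weight is the inverse of the squared standard error, a study whose estimate is noisier contributes less to the pooled effect, which is one way imputation uncertainty can feed into the combined result.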
3

Genome-wide Genotype Imputation-Aspects of Quality, Performance and Practical Implementation

Roshyara, Nab Raj 06 August 2020
Finding a relation between a particular phenotype and genotype is one of the central themes in medical genetics. Single-nucleotide polymorphisms (SNPs) are easily assessable markers allowing genome-wide association (GWA) studies and meta-analyses. Hundreds of such analyses have been performed in recent decades. Even though several tools for such analyses are available, an efficient SNP-data transformation tool was still needed. We developed a data management tool, fcGENE, which allows easy transformation of genetic data into the different formats required by different GWA tools. Genotype imputation, a common technique in GWA studies, allows us to study the relationship of a phenotype with markers that are partially missing and even with completely untyped markers. Moreover, this technique helps us to infer both common and rare variants that are not directly typed. We studied different aspects of the imputation process, focusing especially on its accuracy. More specifically, our focus lay on the impact of pre-imputation filtering on the accuracy of imputation results. To measure imputation accuracy, we defined two new statistical scores, which allow direct comparison between imputed and true genotypes. This direct comparison showed that strict quality filtering of SNPs prior to imputation may be detrimental. We further studied the impact of differently selected reference panels from publicly available projects, such as HapMap and the 1000 Genomes Project, on imputation quality. More specifically, we analysed the relationship between the genetic distance of the reference panel and the resulting imputation quality. For this purpose, we considered different summary statistics of population differentiation (e.g. Reich's, Nei's and other modified scores) between the study data set and the reference panel used in the imputation process.
In the third analysis, we compared two basic approaches to using reference panels in the imputation process: (1) use of the genetically best-matched reference panel, and (2) use of an admixed reference panel, which combines individual reference panels from all available populations and lets the software itself select the optimal references, either piece-wise or as complete sequences of SNPs, for each individual separately. We analysed in detail the performance of different imputation software and the accuracy of the imputation process in both cases. We found that the current trend of using software with an admixed reference panel in all cases is not always the best strategy. Phasing of study data sets with an external reference panel prior to imputation is also common, especially for the imputation of large data sets. We studied the performance of different imputation frameworks with and without pre-phasing. It turned out that pre-phasing clearly reduces the imputation quality for medium-sized data sets.

Table of Contents
List of Tables
List of Figures
1 Overview of the Thesis
  1.1 Abstract
  1.2 Outlines
2 Introduction
  2.1 Basics of genetics
    2.1.1 Phenotype, genotype and haplotype
    2.1.2 Hardy-Weinberg law
    2.1.3 Linkage disequilibrium
    2.1.4 Genome-wide association analysis
  2.2 Phasing of Genotypes
  2.3 Genotype imputation
    2.3.1 Tools for Imputing genotype data
    2.3.2 Reference panels
3 Results
  3.1 Detailed Abstracts
    3.1.1 First Research Paper
    3.1.2 Second Research Paper
    3.1.3 Third Research Paper
    3.1.4 Fourth Research Paper
  3.2 Discussion and Conclusion
4 Published Articles
  4.1 First Research Paper
    4.1.1 Supplementary Information
  4.2 Second Research Paper
    4.2.1 Supplementary Information
  4.3 Third Research Paper
    4.3.1 Supplementary Information
  4.4 Fourth Research Paper
    4.4.1 Supplementary Information
5 Summary of the Thesis
6 Bibliography
7 Own Publications
8 Statement of Own Contribution
  8.1 First Research Paper
  8.2 Second Research Paper
  8.3 Third Research Paper
  8.4 Fourth Research Paper
9 Declaration of Independent Authorship
10 Acknowledgements
11 Curriculum Vitae
4

Studies on genomic prediction for carcass traits in Japanese Black cattle / 黒毛和種の枝肉形質を対象としたゲノミック予測に関する研究

Ogawa, Shinichiro 23 March 2017
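Kyoto University / 0048 / New system, course-based doctorate / Doctor of Agricultural Science / Kō No. 20427 / Agricultural Doctorate No. 2212 / 新制||農||1048 (University Library) / 学位論文||H29||N5048 (Faculty of Agriculture Library) / Division of Applied Biosciences, Graduate School of Agriculture, Kyoto University / (Chief examiner) Associate Professor Yukio Taniguchi, Professor Hiroshi Imai, Professor Hiroyuki Hirooka / Conferred under Article 4, Paragraph 1 of the Degree Regulations / Doctor of Agricultural Science / Kyoto University / DFAM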
5

Novel Statistical Methods for Multiple-variant Genetic Association Studies with Related Individuals

Guan, Ting 09 July 2018
Genetic association studies usually include related individuals. Meanwhile, high-throughput sequencing technologies produce data on multiple genetic variants. Due to linkage disequilibrium (LD) and familial relatedness, the genotype data from such studies often carry complex correlations. Moreover, missing genotype values usually lead to loss of power in genetic association tests. In addition, repeated measurements of phenotypes and dynamic covariates from longitudinal studies bring both opportunities and challenges for the discovery of disease-related genetic factors. This dissertation focuses on developing novel statistical methods to address some of the challenging questions that remain in genetic association studies for these reasons. Many methods have been proposed to detect disease-related genetic regions (e.g., genes, pathways). However, with multiple-variant data from a sample with relatedness, it is critical to account for the complex genotypic correlations when assessing genetic contribution. Recognizing the limitations of existing methods, the first work of this dissertation proposes the Adaptive-weight Burden Test (ABT), a score test between a quantitative trait and genotype data with complex correlations. ABT achieves higher power by adopting data-driven weights, which make good use of the LD and relatedness. Because the null distribution has been derived, the computational simplicity of ABT makes it a good fit for genome-wide association studies. Genotype missingness commonly arises due to limitations in genotyping technologies. Imputation of missing genotype values usually improves the quality of the data used in the subsequent association test and thus increases power. Complex correlations, though troublesome, provide an opportunity for proper handling of genotypic missingness. 
In the second part of this dissertation, a genotype imputation method is developed, which can impute the missingness in multiple genetic variants via the LD and the relatedness. The popularity of longitudinal studies in genetics and genomics calls for methods deliberately designed for repeated measurements. Therefore, a multiple-variant genetic association test for a longitudinal trait on samples with relatedness is developed, which treats the longitudinal measurements as observations of functions and thus takes into account the time factor properly. / PHD
6

Applications of the Illumina BovineSNP50 BeadChip in Genetic Improvement of Beef Cattle

Lu, Duc 12 November 2012
The release of the Illumina BovineSNP50 BeadChip in late 2007 drew the attention of cattle breeders around the world to breeding programs that leverage the association of these single nucleotide polymorphisms (SNP) with economically important quantitative trait loci (QTL). In that context, this project studied applications of the SNP panel in beef cattle. Analysis of linkage disequilibrium (LD) in Angus, Charolais, and crossbred animals revealed the pattern of LD within each breed group, as well as the persistence of LD phase between pairs of the breed groups. This is important for genomic selection, where SNP are trained in one population and used to predict breeding values for animals in another population. Detection of chromosome regions potentially carrying QTL or causative mutations affecting the phenotypic variation in economically important traits is presented at the individual SNP and haplotype levels. There were 269 SNP associated (P<0.001) with birth weight (BWT), weaning weight (WWT), average daily gain (ADG), dry matter intake (DMI), mid-test metabolic weight (MMWT), and residual feed intake (RFI). They explained 1.64% to 8.06% of the phenotypic variation in these traits. There were 520 SNP associated (P<0.001) with carcass quality traits, namely hot carcass weight, back fat thickness, ribeye area, marbling scores, lean yield grade by Beef Improvement Federation, steak tenderness, and six rib dissection traits. These SNP explained 1.90% to 5.89% of the phenotypic variance of the traits. Many of the significant SNP were located on chromosome 6. Six haplotypes were found to be associated (P<0.05) with ADG, DMI, and RFI. For genomic selection to be adopted in beef cattle, higher-density SNP panels should be made available at low genotyping cost. However, the cost of genotyping animals with a high-density SNP chip is still high, so genotype imputation has come into practice. 
The last chapter of this thesis compared two approaches presently used in genotype imputation and investigated factors affecting imputation accuracy, as well as the impact of imputation accuracy on genomic estimated breeding values (GEBV). It showed that the highest possible accuracy of GEBV is attainable with sufficiently large groups of reference animals. / Ontario Ministry of Agriculture, Food and Rural Affairs. Ontario Cattlemen’s Association. Ontario Farm Innovation Program. Agriculture and Agri-Food Canada’s Growing Forward Program. Agriculture Adaptation Council. Ontario Research and Development Program. MITACS Accelerate. Beef Improvement Opportunities.
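The LD analysis mentioned in this abstract rests on the pairwise correlation between alleles at two SNPs. As a hedged illustration (not the thesis's own pipeline), the sketch below computes the signed correlation r and r² from phased haplotypes coded as 0/1; the function name and the toy haplotypes are hypothetical. The sign of r is what matters when asking whether LD phase persists between two breed groups, since r² alone cannot show that the same alleles are paired in both.

```python
import math


def ld_r(hap_a, hap_b):
    """Signed LD correlation r between two SNPs from phased 0/1 haplotypes."""
    n = len(hap_a)
    p_a = sum(hap_a) / n
    p_b = sum(hap_b) / n
    # Frequency of haplotypes carrying allele 1 at both SNPs.
    p_ab = sum(1 for a, b in zip(hap_a, hap_b) if a == 1 and b == 1) / n
    d = p_ab - p_a * p_b  # disequilibrium coefficient D
    denom = math.sqrt(p_a * (1 - p_a) * p_b * (1 - p_b))
    if denom == 0:
        raise ValueError("r is undefined when a SNP is monomorphic")
    return d / denom


# Toy haplotypes (hypothetical) for one SNP pair in two breed groups.
breed_1 = ([1, 1, 0, 0], [1, 1, 0, 0])  # alleles in coupling phase
breed_2 = ([1, 1, 0, 0], [0, 0, 1, 1])  # same SNPs, opposite phase

r_1 = ld_r(*breed_1)
r_2 = ld_r(*breed_2)
r2_1, r2_2 = r_1 ** 2, r_2 ** 2
```

In this toy case both groups show the same r², yet the signs of r differ, so a marker effect trained in one group would point the wrong way in the other.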
7

Impact of pre-imputation SNP-filtering on genotype imputation results

Roshyara, Nab Raj, Kirsten, Holger, Horn, Katrin, Ahnert, Peter, Scholz, Markus 10 September 2014
Background: Imputation of partially missing or unobserved genotypes is an indispensable tool for SNP data analyses. However, research into and understanding of the impact of initial SNP-data quality control on imputation results are still limited. In this paper, we aim to evaluate the effect of different strategies of pre-imputation quality filtering on the performance of the widely used imputation algorithms MaCH and IMPUTE. Results: We considered three scenarios: imputation of partially missing genotypes with an external reference panel, imputation of partially missing genotypes without an external reference panel, and imputation of completely un-typed SNPs using an external reference panel. We first created various datasets by applying different SNP quality filters and masking certain percentages of randomly selected high-quality SNPs. We imputed these SNPs and compared the results between the different filtering scenarios by using established and newly proposed measures of imputation quality. While the established measures assess the certainty of imputation results, our newly proposed measures focus on the agreement with true genotypes. These measures showed that pre-imputation SNP-filtering might be detrimental to imputation quality. Moreover, the strongest drivers of imputation quality were in general the burden of missingness and the number of SNPs used for imputation. We also found that using a reference panel always improves imputation quality of partially missing genotypes. MaCH performed slightly better than IMPUTE2 in most of our scenarios. Again, these results were more pronounced when using our newly defined measures of imputation quality. Conclusion: Even moderate filtering has a detrimental effect on imputation quality. Therefore, little or no SNP filtering prior to imputation appears to be the best strategy for imputing small to moderately sized datasets. 
Our results also showed that for these datasets, MaCH performs slightly better than IMPUTE2 in most scenarios at the cost of increased computing time.
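The mask-then-compare evaluation described in this abstract can be sketched as follows. This is only an illustration under stated assumptions: the paper's newly proposed quality measures are not given in the abstract, so a plain concordance rate stands in for them, and a most-common-genotype fill stands in for MaCH or IMPUTE. All function names and data are hypothetical.

```python
import random


def mask_genotypes(genotypes, fraction, seed=0):
    """Hide a random fraction of genotypes; return the masked copy and indices."""
    rng = random.Random(seed)
    n_masked = round(fraction * len(genotypes))
    masked_idx = sorted(rng.sample(range(len(genotypes)), n_masked))
    masked = list(genotypes)
    for i in masked_idx:
        masked[i] = None
    return masked, masked_idx


def impute_most_common(masked):
    """Placeholder imputation: fill gaps with the most common observed genotype."""
    observed = [g for g in masked if g is not None]
    fill = max(set(observed), key=observed.count)
    return [fill if g is None else g for g in masked]


def concordance(true, imputed, masked_idx):
    """Share of masked positions where the imputed genotype equals the truth."""
    if not masked_idx:
        raise ValueError("no masked positions to evaluate")
    hits = sum(1 for i in masked_idx if true[i] == imputed[i])
    return hits / len(masked_idx)


# Toy genotypes (0/1/2 allele counts) at one SNP across ten individuals.
true_genotypes = [0, 1, 2, 1, 0, 1, 1, 2, 0, 1]
masked, masked_idx = mask_genotypes(true_genotypes, fraction=0.3, seed=1)
imputed = impute_most_common(masked)
score = concordance(true_genotypes, imputed, masked_idx)
```

Scoring only the masked positions keeps the measure focused on agreement with the true genotypes rather than on the certainty reported by the imputation software.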
8

Two Optimization Problems in Genetics : Multi-dimensional QTL Analysis and Haplotype Inference

Nettelblad, Carl January 2012
The existence of new technologies, implemented in efficient platforms and workflows, has made massive genotyping available to all fields of biology and medicine. Genetic analyses are no longer dominated by experimental work in laboratories, but rather by the interpretation of the resulting data. When billions of data points representing thousands of individuals are available, efficient computational tools are required. The focus of this thesis is on developing models, methods and implementations for such tools. The first theme of the thesis is multi-dimensional scans for quantitative trait loci (QTL) in experimental crosses. By mating individuals from different lines, it is possible to gather data that can be used to pinpoint the genetic variation that influences specific traits to specific genome loci. However, it is natural to expect multiple genes influencing a single trait to interact. The thesis discusses model structure and model selection, giving new insight into the conditions under which orthogonal models can be devised. The thesis also presents a new optimization method for efficiently and accurately locating QTL, and for performing the permuted data searches needed for significance testing. This method has been implemented in a software package that can seamlessly perform the searches on grid computing infrastructures. The other theme in the thesis is the development of adapted optimization schemes for using hidden Markov models in tracing allele inheritance pathways, and specifically inferring haplotypes. The advances presented form the basis for more accurate and non-biased line origin probabilities in experimental crosses, especially multi-generational ones. We show that the new tools are able to reconstruct haplotypes and even genotypes in founder individuals and offspring alike, based on only unordered offspring genotypes. 
The tools can also handle larger populations than competing methods, resolving inheritance pathways and phase in much larger and more complex populations. Finally, the methods presented are also applicable to datasets where individual relationships are not known, which is frequently the case in human genetics studies. One immediate application for this would be improved accuracy for imputation of SNP markers within genome-wide association studies (GWAS). / eSSENCE
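To make the hidden Markov model idea concrete, here is a minimal Viterbi sketch for tracing line origin along one offspring haplotype. It is a textbook simplification, not the optimization scheme developed in the thesis: it assumes the founder alleles are known, a single recombination probability `rec` between adjacent markers, a symmetric genotyping error rate `err`, and at least two founder lines. All names and values are hypothetical.

```python
import math


def viterbi_line_origin(obs, founder_alleles, rec=0.01, err=0.01):
    """Most likely founder line at each marker for one observed haplotype.

    obs: observed alleles (0/1) along the chromosome.
    founder_alleles: one allele list per founder line, same length as obs.
    """
    n_states = len(founder_alleles)

    def log_emit(state, marker):
        match = obs[marker] == founder_alleles[state][marker]
        return math.log(1 - err) if match else math.log(err)

    def log_trans(prev, cur):
        if prev == cur:
            return math.log(1 - rec)
        return math.log(rec / (n_states - 1))

    # Uniform prior over founder lines at the first marker.
    score = [math.log(1.0 / n_states) + log_emit(s, 0) for s in range(n_states)]
    back = []
    for m in range(1, len(obs)):
        new_score, pointers = [], []
        for cur in range(n_states):
            best = max(range(n_states), key=lambda s: score[s] + log_trans(s, cur))
            new_score.append(score[best] + log_trans(best, cur) + log_emit(cur, m))
            pointers.append(best)
        score = new_score
        back.append(pointers)

    # Trace back from the best final state.
    state = max(range(n_states), key=lambda s: score[s])
    path = [state]
    for pointers in reversed(back):
        state = pointers[state]
        path.append(state)
    path.reverse()
    return path


# Two founder lines that differ at every marker (toy example).
founders = [[0] * 6, [1] * 6]
recombinant = viterbi_line_origin([0, 0, 0, 1, 1, 1], founders)
with_error = viterbi_line_origin([0, 0, 1, 0, 0, 0], founders)
```

With these parameters a run of three consistent alleles is explained by one recombination, while a single discordant allele is cheaper to explain as a genotyping error than as two recombinations, so the inferred origin does not switch.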
9

Impact of pre-imputation SNP-filtering on genotype imputation results

Roshyara, Nab Raj, Kirsten, Holger, Horn, Katrin, Ahnert, Peter, Scholz, Markus January 2014
Background: Imputation of partially missing or unobserved genotypes is an indispensable tool for SNP data analyses. However, research into and understanding of the impact of initial SNP-data quality control on imputation results are still limited. In this paper, we aim to evaluate the effect of different strategies of pre-imputation quality filtering on the performance of the widely used imputation algorithms MaCH and IMPUTE. Results: We considered three scenarios: imputation of partially missing genotypes with an external reference panel, imputation of partially missing genotypes without an external reference panel, and imputation of completely un-typed SNPs using an external reference panel. We first created various datasets by applying different SNP quality filters and masking certain percentages of randomly selected high-quality SNPs. We imputed these SNPs and compared the results between the different filtering scenarios by using established and newly proposed measures of imputation quality. While the established measures assess the certainty of imputation results, our newly proposed measures focus on the agreement with true genotypes. These measures showed that pre-imputation SNP-filtering might be detrimental to imputation quality. Moreover, the strongest drivers of imputation quality were in general the burden of missingness and the number of SNPs used for imputation. We also found that using a reference panel always improves imputation quality of partially missing genotypes. MaCH performed slightly better than IMPUTE2 in most of our scenarios. Again, these results were more pronounced when using our newly defined measures of imputation quality. Conclusion: Even moderate filtering has a detrimental effect on imputation quality. Therefore, little or no SNP filtering prior to imputation appears to be the best strategy for imputing small to moderately sized datasets. 
Our results also showed that for these datasets, MaCH performs slightly better than IMPUTE2 in most scenarios at the cost of increased computing time.
