  • About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
1

Statistical methods for genotype microarray data on large cohorts of individuals

O'Connell, Jared Michael January 2014
Genotype microarrays assay hundreds of thousands of genetic variants on an individual's genome. The availability of this high-throughput genotyping capability has transformed the field of genetics over the past decade by enabling thousands of individuals to be rapidly assayed. This has led to the discovery of hundreds of genetic variants that are associated with disease and other phenotypes in genome-wide association studies (GWAS). These data have also brought with them a number of new statistical and computational challenges. This thesis deals with two primary analysis problems involving microarray data: genotype calling and haplotype inference. Genotype calling involves converting the noisy bivariate fluorescent signals generated by microarray data into genotype values for each genetic variant and individual. Poor quality genotype calling can lead to false positives and loss of power in GWAS, so this is an important task. We introduce a new genotype calling method that is highly accurate and has the novel capability of fusing microarray data with next-generation sequencing data for greater accuracy and fewer missing values. Our new method compares favourably to other available genotype calling software. Haplotype inference (or phasing) involves deconvolving these genotypes into the two inherited parental chromosomes for an individual. The development of phasing methods has been a fertile field for statistical genetics research for well over ten years. Depending on the demography of a cohort, different phasing methods may be more appropriate than others. We review the popular offerings and introduce a new approach that tries to unify two distinct problems: the phasing of extended pedigrees and the phasing of unrelated individuals. We conduct an extensive comparison of phasing methods on real and simulated data. Finally, we demonstrate some preliminary results on extending methodology to sample sizes in the tens of thousands.
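To make the phasing problem in the abstract above concrete, the following is a minimal sketch, not the thesis's method. It assumes biallelic sites coded as alternate-allele counts (0, 1 or 2) and simply enumerates every pair of haplotypes that could produce a given unphased genotype. The function name and the coding are illustrative assumptions.

```python
from itertools import product

def compatible_haplotype_pairs(genotype):
    """Enumerate unordered haplotype pairs consistent with an unphased genotype.

    genotype: sequence of alternate-allele counts per site (0, 1 or 2).
    Illustrative coding only; not taken from the thesis.
    """
    het_sites = [i for i, g in enumerate(genotype) if g == 1]
    # Homozygous sites are fixed on both haplotypes: 0 -> allele 0, 2 -> allele 1.
    base = [g // 2 for g in genotype]
    if not het_sites:
        return [(tuple(base), tuple(base))]
    pairs = []
    # Pin the first heterozygous site so (h1, h2) and (h2, h1) are not both listed.
    for bits in product((0, 1), repeat=len(het_sites) - 1):
        h1, h2 = list(base), list(base)
        for site, bit in zip(het_sites, (0,) + bits):
            h1[site] = bit
            h2[site] = 1 - bit
        pairs.append((tuple(h1), tuple(h2)))
    return pairs
```

A genotype with h heterozygous sites has 2^(h-1) compatible pairs, so exhaustive enumeration is only feasible for very short regions. Practical phasing methods instead weigh the candidates using information shared across a cohort or pedigree.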
2

Genotipagem de poliplóides: um modelo de urnas e bolas / Polyploid genotyping: an urn model

Faria Junior, Silvio Rodrigues de 30 May 2012
Since the beginnings of agriculture and livestock farming, humans have selected individuals with desirable characteristics for breeding, to increase the proportion of new individuals with those qualities. With knowledge of the structure of DNA and the advent of genetic engineering, the identification and characterization of species and individuals can draw on new technologies to help develop new varieties of plants and animals for various purposes. These technologies involve increasingly refined biochemical and physical procedures that produce increasingly precise measurements; one example is the set of techniques that use mass spectrometry to compare single nucleotide polymorphisms (SNPs). Polyploidy, the presence of more than two chromosomes in the same homology group, is common in plants. Determining the ploidy level is essential for correct genotyping and, consequently, for greater efficiency in the study and genetic improvement of plants. In this work we characterize polyploidy with probabilistic urn-and-ball models, proposing an efficient and appropriate simulation method as well as a simple technique for inferring ploidy levels and classifying biallelic samples that takes advantage of the geometric characteristics of the problem. Simulated data and real data from a sugarcane experiment were analysed with different measures of separation between clusters and under different experimental conditions. For the real data, descriptive graphical methods show the correctness and consistency of the proposed method, which can be generalized to the genotyping of multiallelic polyploid loci. We close by comparing our results with the SuperMASSA approach [Serang2012], which brought excellent results to the problem. All code, written in R, is made available with the text.
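As a rough illustration of the urn-and-ball idea, the sketch below treats a biallelic genotype at ploidy m as an urn holding k balls of allele A out of m, draws reads from it with replacement, and assigns an observed allele ratio to the nearest dose k/m. This is an assumed simplification written in Python for consistency with the other examples on this page. It is not the R code distributed with the thesis, and the function names are hypothetical.

```python
import random

def simulate_allele_ratio(ploidy, dose, n_draws, rng):
    """Draw n_draws balls with replacement from an urn with `dose` A-balls out of `ploidy`.

    Returns the observed fraction of A-balls (a binomial urn model, assumed here).
    """
    p = dose / ploidy
    hits = sum(1 for _ in range(n_draws) if rng.random() < p)
    return hits / n_draws

def call_dose(ratio, ploidy):
    """Return the dose k in 0..ploidy whose expected ratio k/ploidy is nearest to `ratio`."""
    return min(range(ploidy + 1), key=lambda k: abs(ratio - k / ploidy))

if __name__ == "__main__":
    rng = random.Random(1)
    observed = simulate_allele_ratio(ploidy=6, dose=2, n_draws=500, rng=rng)
    print(observed, call_dose(observed, ploidy=6))
```

Nearest-dose calling only works once the ploidy is known. Choosing among candidate ploidy levels needs an extra criterion, because a finer dose grid always fits the observed ratios at least as closely as a coarser grid it contains.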
4

Interrogation of Nucleic Acids by Parallel Threading

Pettersson, Erik January 2007
Advancements in biotechnology are expanding the scientific horizon, and a promising era of personalized medicine for improved health is envisioned. The amount of genetic data is growing at an ever-escalating pace owing to novel technologies that allow massively parallel sequencing and whole-genome genotyping, supported by advances in computer science and information technology. As the amount of information stored in databases throughout the world grows and our knowledge deepens, genetic signatures of significant importance are discovered. Such a set, surfacing in the data mining process, may include causative or marker single nucleotide polymorphisms (SNPs), revealing predisposition to disease, or gene expression signatures, profiling a pathological state. When targeting a reduced set of signatures in a large number of samples for diagnostic or fine-mapping purposes, efficient interrogation and scoring require appropriate preparations. These needs are met by miniaturized and parallelized platforms that allow low sample and template consumption. This doctoral thesis describes an attempt to tackle some of these challenges through the design and implementation of a novel assay denoted Trinucleotide Threading (TnT). The method permits multiplex amplification of a medium-sized set of specific loci and was adapted to genotyping, gene expression profiling and digital allelotyping. Utilizing a reduced number of nucleotides permits specific amplification of targeted loci while preventing the generation of spurious amplification products. This method was applied to genotype 96 individuals for 75 SNPs. In addition, the accuracy of genotyping from minute amounts of genomic DNA was confirmed. This procedure was performed using a robotic workstation running custom-made scripts, and a software tool was implemented to facilitate the assay design. 
Furthermore, a statistical model was derived from the molecular principles of the genotyping assay and an Expectation-Maximization algorithm was chosen to automatically call the generated genotypes. The TnT approach was also adapted to profiling signature gene sets for the Swedish Human Protein Atlas Program. Here, 18 protein epitope signature tags (PrESTs) were targeted in eight different cell lines employed in the program, and the results demonstrated high concordance rates with real-time PCR approaches. Finally, an assay for digital estimation of allele frequencies in large cohorts was set up by combining the TnT approach with a second-generation sequencing system. Allelotyping was performed by targeting 147 polymorphic loci in a genomic pool of 462 individuals. Subsequent interrogation was carried out on a state-of-the-art massively parallelized Pyrosequencing instrument. The experiment generated more than 200,000 reads and, with bioinformatic support, clonally amplified fragments and the corresponding sequence reads were converted to a precise set of allele frequencies.
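The digital allelotyping step described above turns read counts into allele frequencies. The sketch below shows the basic arithmetic under simplifying assumptions: each read is taken as one independent draw from the pool, and amplification bias and sequencing error are ignored. The function names are hypothetical and this is not the bioinformatic pipeline used in the thesis.

```python
from collections import Counter
from math import sqrt

def allele_frequencies(read_alleles):
    """Estimate allele frequencies at one locus from the alleles seen in pooled reads."""
    counts = Counter(read_alleles)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no reads at this locus")
    return {allele: n / total for allele, n in counts.items()}

def binomial_se(freq, n_reads):
    """Standard error of a frequency estimate, treating reads as independent binomial draws."""
    return sqrt(freq * (1 - freq) / n_reads)

if __name__ == "__main__":
    reads = ["A"] * 30 + ["G"] * 10  # made-up counts for illustration
    freqs = allele_frequencies(reads)
    for allele, f in freqs.items():
        print(allele, f, binomial_se(f, len(reads)))
```

The standard error here reflects read sampling only. In a pooled design the finite number of individuals in the pool adds further variance that this sketch does not model.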
