• Refine Query
  • Source
  • Publication year
  • to
  • Language
  • 2
  • Tagged with
  • 2
  • 2
  • 2
  • 2
  • 2
  • 2
  • 2
  • 2
  • 2
  • About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
1

Bayesian Hidden Markov Models for finding DNA Copy Number Changes from SNP Genotyping Arrays

Kowgier, Matthew 31 August 2012 (has links)
DNA copy number variations (CNVs), which involve the deletion or duplication of subchromosomal segments of the genome, have become a focus of genetics research. This dissertation develops Bayesian HMMs for finding CNVs from single nucleotide polymorphism (SNP) arrays. A Bayesian framework to reconstruct the DNA copy number sequence from the observed sequence of SNP array measurements is proposed. A Markov chain Monte Carlo (MCMC) algorithm, with a forward-backward stochastic algorithm for sampling DNA copy number sequences, is developed for estimating model parameters. Numerous versions of Bayesian HMMs are explored, including a discrete-time model and different models for the instantaneous transition rates of change among copy number states of a continuous-time HMM. The most general model proposed makes no restrictions and assumes the rate of transition depends on the current state, whereas the nested model fixes some of these rates by assuming that the rate of transition is independent of the current state. Each model is assessed using a subset of the HapMap data. More general parameterizations of the transition intensity matrix of the continuous-time Markov process produced more accurate inference with respect to the length of CNV regions. The observed SNP array measurements are assumed to be stochastic with distribution determined by the underlying DNA copy number. Copy-number-specific distributions, including a non-symmetric distribution for the 0-copy state (homozygous deletions) and mixture distributions for 2-copy state (normal), are developed and shown to be more appropriate than existing implementations which lead to biologically implausible results. Compared to existing HMMs for SNP array data, this approach is more flexible in that model parameters are estimated from the data rather than set to a priori values. Measures of uncertainty, computed as simulation-based probabilities, can be determined for putative CNVs detected by the HMM. Finally, the dissertation concludes with a discussion of future work, with special attention given to model extensions for multiple sample analysis and family trio data.
2

Bayesian Hidden Markov Models for finding DNA Copy Number Changes from SNP Genotyping Arrays

Kowgier, Matthew 31 August 2012 (has links)
DNA copy number variations (CNVs), which involve the deletion or duplication of subchromosomal segments of the genome, have become a focus of genetics research. This dissertation develops Bayesian HMMs for finding CNVs from single nucleotide polymorphism (SNP) arrays. A Bayesian framework to reconstruct the DNA copy number sequence from the observed sequence of SNP array measurements is proposed. A Markov chain Monte Carlo (MCMC) algorithm, with a forward-backward stochastic algorithm for sampling DNA copy number sequences, is developed for estimating model parameters. Numerous versions of Bayesian HMMs are explored, including a discrete-time model and different models for the instantaneous transition rates of change among copy number states of a continuous-time HMM. The most general model proposed makes no restrictions and assumes the rate of transition depends on the current state, whereas the nested model fixes some of these rates by assuming that the rate of transition is independent of the current state. Each model is assessed using a subset of the HapMap data. More general parameterizations of the transition intensity matrix of the continuous-time Markov process produced more accurate inference with respect to the length of CNV regions. The observed SNP array measurements are assumed to be stochastic with distribution determined by the underlying DNA copy number. Copy-number-specific distributions, including a non-symmetric distribution for the 0-copy state (homozygous deletions) and mixture distributions for 2-copy state (normal), are developed and shown to be more appropriate than existing implementations which lead to biologically implausible results. Compared to existing HMMs for SNP array data, this approach is more flexible in that model parameters are estimated from the data rather than set to a priori values. Measures of uncertainty, computed as simulation-based probabilities, can be determined for putative CNVs detected by the HMM. Finally, the dissertation concludes with a discussion of future work, with special attention given to model extensions for multiple sample analysis and family trio data.

Page generated in 0.0289 seconds