1

Poisson multiscale methods for high-throughput sequencing data

Xing, Zhengrong 21 December 2016
In this dissertation, we focus on the problem of analyzing data from high-throughput sequencing experiments. With the emergence of more capable hardware and more efficient software, these sequencing data provide information at an unprecedented resolution. However, statistical methods developed for such data rarely tackle the data at such high resolutions, and often make approximations that only hold under certain conditions.

We propose a model-based approach to dealing with such data, starting from a single sample. By taking into account the inherent structure present in such data, our model can accurately capture important genomic regions. We also present the model in a way that makes it easily extensible to more complicated and biologically interesting scenarios.

Building upon the single-sample model, we then turn to the statistical question of detecting differences between multiple samples. Such questions often arise in the context of expression data, where much emphasis has been put on the problem of detecting differential expression between two groups. By extending the framework for a single sample to incorporate additional group covariates, our model provides a systematic approach to estimating and testing for such differences. We then apply our method to several empirical datasets, and discuss the potential for further applications to other biological tasks.

We also seek to address a different statistical question, where the goal is to perform exploratory analysis to uncover hidden structure within the data. We incorporate the single-sample framework into a commonly used clustering scheme, and show that our enhanced clustering approach is superior to the original clustering approach in many ways. We then apply our clustering method to a few empirical datasets and discuss our findings.

Finally, we apply the shrinkage procedure used within the single-sample model to tackle a completely different statistical issue: nonparametric regression with heteroskedastic Gaussian noise. We propose an algorithm that accurately recovers both the mean and variance functions given a single set of observations, and demonstrate its advantages over state-of-the-art methods through extensive simulation studies.
2

Bayesian lasso: An extension for genome-wide association study

Joo, LiJin 24 March 2017
In genome-wide association studies (GWAS), variable selection has been used for prioritizing candidate single-nucleotide polymorphisms (SNPs). To relate densely located SNPs to a complex trait, we need a method that is robust under various genetic architectures, yet sensitive enough to detect the marginal difference between null and non-null factors. For this problem, the ordinary Lasso produced too many false positives, and the Bayesian Lasso fitted by Gibbs samplers became too conservative when the selection criterion was based on posterior credible sets. My proposals to improve the Bayesian Lasso have two aspects: the use of stochastic approximation, variational Bayes, to increase computational efficiency, and the use of a Dirichlet-Laplace prior to better separate small effects from nulls. Both the double exponential prior of the Bayesian Lasso and the Dirichlet-Laplace prior have a global-local mixture representation, and variational Bayes can effectively handle the hierarchies of a model because of this mixture representation. In the analysis of simulated and real sequencing data, the proposed methods showed meaningful improvements in both efficiency and accuracy.
