  • About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
131

New Methods of Variable Selection and Inference on High Dimensional Data

Ren, Sheng January 2017 (has links)
No description available.
132

Essays on High-dimensional Nonparametric Smoothing and Its Applications to Asset Pricing

Wu, Chaojiang 25 October 2013 (has links)
No description available.
133

A Robust Adaptive Autonomous Approach to Optimal Experimental Design

GU, Hairong January 2016 (has links)
No description available.
134

Models for heterogeneous variable selection

Gilbride, Timothy J. 19 May 2004 (has links)
No description available.
135

Bayesian and Semi-Bayesian regression applied to manufacturing wooden products

Tseng, Shih-Hsien 08 January 2008 (has links)
No description available.
136

Topics in Sparse Inverse Problems and Electron Paramagnetic Resonance Imaging

Som, Subhojit 27 October 2010 (has links)
No description available.
137

Analysis of Sparse Sufficient Dimension Reduction Models

Withanage, Yeshan 16 September 2022 (has links)
No description available.
138

STATISTICAL METHODS FOR VARIABLE SELECTION IN THE CONTEXT OF HIGH-DIMENSIONAL DATA: LASSO AND EXTENSIONS

Yang, Xiao Di 10 1900 (has links)
With the advance of technology, the collection and storage of data have become routine, and huge amounts of data are increasingly produced from biological experiments. The advent of DNA microarray technologies has enabled scientists to measure the expression of tens of thousands of genes simultaneously, and single nucleotide polymorphisms (SNPs) are being used in genetic association studies with a wide range of phenotypes, for example, complex diseases. Such high-dimensional problems are becoming more and more common. The "large p, small n" problem, in which there are more variables than samples, is currently a challenge that many statisticians face. Penalized variable selection is an effective way to deal with the "large p, small n" problem. In particular, the Lasso (least absolute shrinkage and selection operator) proposed by Tibshirani has become a standard method for this type of problem. The Lasso works well when covariates can be treated individually, but not when the covariates are grouped. The elastic net, group lasso, group MCP, and group bridge are extensions of the Lasso. Group lasso enforces sparsity at the group level rather than at the level of the individual covariates, while group bridge and group MCP produce sparse solutions both at the group level and at the level of the individual covariates within a group. Our simulation study shows that the group lasso forces complete grouping, group MCP encourages grouping only to a slight extent, and group bridge is somewhere in between. If one expects the proportion of nonzero group members to be greater than one-half, group lasso may be a good choice; otherwise group MCP would be preferred. If one expects this proportion to be close to one-half, one may wish to use group bridge. A real data analysis is also conducted on genetic variation (SNP) data to find associations between SNPs and West Nile disease. / Master of Science (MSc)
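The group-level sparsity that distinguishes the group lasso from the ordinary Lasso can be illustrated with a minimal proximal-gradient (ISTA) solver. This is a generic sketch, not the thesis's implementation: the `group_lasso` function, its parameters, and the simulated data are all hypothetical.

```python
import numpy as np

def group_lasso(X, y, groups, lam, n_iter=500):
    """Proximal gradient descent for
    (1/2n)||y - Xw||^2 + lam * sum_g sqrt(p_g) * ||w_g||_2."""
    n, p = X.shape
    groups = np.asarray(groups)
    # step size 1/L, where L is the Lipschitz constant of the smooth part
    step = n / (np.linalg.norm(X, 2) ** 2)
    w = np.zeros(p)
    for _ in range(n_iter):
        z = w - step * (X.T @ (X @ w - y)) / n   # gradient step
        for g in np.unique(groups):
            idx = np.where(groups == g)[0]
            norm = np.linalg.norm(z[idx])
            thresh = lam * step * np.sqrt(len(idx))
            # group soft-thresholding: the whole group is kept or zeroed
            w[idx] = 0.0 if norm <= thresh else (1 - thresh / norm) * z[idx]
    return w
```

On data where only the first of two groups carries signal, the second group's coefficients are set exactly to zero as a block, which is the "complete grouping" behavior discussed above.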
139

Variable Selection and Supervised Dimension Reduction for Large-Scale Genomic Data with Censored Survival Outcomes

Spirko, Lauren Nicole January 2017 (has links)
One of the major goals in large-scale genomic studies is to identify genes with a prognostic impact on time-to-event outcomes, providing insight into the disease process. With the rapid development of high-throughput genomic technologies over the past two decades, the scientific community can monitor the expression levels of thousands of genes and proteins, resulting in enormous data sets in which the number of genomic variables (covariates) far exceeds the number of subjects. It is also typical for such data sets to have a high proportion of censored observations. Methods based on univariate Cox regression are often used to select genes related to survival outcomes. However, the Cox model assumes proportional hazards (PH), which is unlikely to hold for every gene. When applied to genes exhibiting some form of non-proportional hazards (NPH), these methods can lead to under- or over-estimation of the effects. In this thesis, we develop methods that will directly address t / Statistics
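The univariate Cox screening mentioned above can be sketched as a one-covariate Newton-Raphson fit of the Cox partial likelihood. This is a minimal illustration under the assumption of untied event times (Breslow handling of ties is omitted); the function name and the simulation in the usage note are not from the thesis.

```python
import numpy as np

def univariate_cox(x, time, event, n_iter=25):
    """Estimate beta in a single-covariate Cox model by Newton-Raphson
    on the partial likelihood. Assumes no tied event times."""
    order = np.argsort(time)
    x = np.asarray(x, dtype=float)[order]
    event = np.asarray(event, dtype=float)[order]
    beta = 0.0
    for _ in range(n_iter):
        exp_bx = np.exp(beta * x)
        # suffix sums over the risk set {j : t_j >= t_i} (times sorted)
        s0 = np.cumsum(exp_bx[::-1])[::-1]
        s1 = np.cumsum((x * exp_bx)[::-1])[::-1]
        s2 = np.cumsum((x ** 2 * exp_bx)[::-1])[::-1]
        mu = s1 / s0                                  # risk-set mean of x
        grad = np.sum(event * (x - mu))               # score
        hess = np.sum(event * (s2 / s0 - mu ** 2))    # observed information
        beta += grad / hess                            # Newton step
    return beta
```

In a screening setting one would fit this model once per gene and rank genes by the resulting Wald or score statistic; the PH assumption embedded in the model is exactly what the thesis's methods aim to relax.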
140

Bayesian Modeling of Complex High-Dimensional Data

Huo, Shuning 07 December 2020 (has links)
With the rapid development of modern high-throughput technologies, scientists can now collect high-dimensional complex data in different forms, such as medical images and genomics measurements. However, acquiring more data does not automatically lead to better knowledge discovery; one needs efficient and reliable analytical tools to extract useful information from complex datasets. The main objective of this dissertation is to develop innovative Bayesian methodologies that enable effective and efficient knowledge discovery from complex high-dimensional data. It contains two parts: the development of computationally efficient functional mixed models and the modeling of data heterogeneity via Dirichlet Diffusion Trees. The first part tackles the computational bottleneck in Bayesian functional mixed models. We propose a computational framework called the variational functional mixed model (VFMM), which facilitates efficient data compression and high-performance computing in basis space. We also propose a new multiple-testing procedure in basis space that can be used to detect significant local regions. The effectiveness of the proposed model is demonstrated on two datasets: a mass spectrometry dataset from a cancer study and a neuroimaging dataset from an Alzheimer's disease study. The second part models data heterogeneity using Dirichlet Diffusion Trees. We propose a Bayesian latent tree model that incorporates subject covariates to characterize heterogeneity and uncover the latent tree structure underlying the data. This model may reveal hierarchical evolution processes through branch structures and estimate systematic differences between groups of samples. We demonstrate the effectiveness of the model through a simulation study and real brain tumor data.
/ Doctor of Philosophy / With the rapid development of modern high-throughput technologies, scientists can now collect high-dimensional data in different forms, such as engineering signals, medical images, and genomics measurements. However, acquiring such data does not automatically lead to efficient knowledge discovery. The main objective of this dissertation is to develop novel Bayesian methods to extract useful knowledge from complex high-dimensional data. It has two parts: the development of an ultra-fast functional mixed model and the modeling of data heterogeneity via Dirichlet Diffusion Trees. The first part develops approximate Bayesian methods in functional mixed models to estimate parameters and detect significant regions; two datasets, a mass spectrometry dataset from a cancer study and a neuroimaging dataset from an Alzheimer's disease study, demonstrate the effectiveness of the proposed method. The second part models data heterogeneity via Dirichlet Diffusion Trees, which helps uncover underlying hierarchical tree structures and estimate systematic differences between groups of samples. We demonstrate the effectiveness of the method through brain tumor imaging data.
