Global ETD Search

1	Consistent bi-level variable selection via composite group bridge penalized regression
Seetharaman, Indu, January 1900
Master of Science / Department of Statistics / Kun Chen

We study composite group bridge penalized regression methods for bi-level variable selection in high-dimensional linear regression models with a diverging number of predictors. The proposed method combines the ideas of bridge regression (Huang et al., 2008a) and group bridge regression (Huang et al., 2009) to achieve variable selection consistency at both the individual and group levels simultaneously; that is, the important groups and the important individual variables within each group can both be correctly identified with probability approaching one as the sample size increases to infinity. The method takes full advantage of the prior grouping information, and the established bi-level oracle properties ensure that the method is immune to possible group misidentification. A related adaptive group bridge estimator, which uses adaptive penalization to improve bi-level selection, is also investigated. Simulation studies show that the proposed methods outperform many existing methods.

Keywords: Bi-level variable selection; High-dimensional data; Oracle property; Penalized regression; Sparse models; Statistics (0463)
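To make the penalties named in this abstract concrete, here is a minimal Python sketch. The bridge and group bridge forms follow the usual definitions from the cited Huang et al. papers. The composite form, which nests an inner bridge on individual coefficients inside an outer bridge on groups, is an assumption made for illustration and is not taken from the thesis. All function and parameter names (`lam`, `gamma`, `gamma_outer`, `gamma_inner`, `weights`) are hypothetical.

```python
def bridge_penalty(beta, lam, gamma):
    """Bridge penalty: lam * sum_j |beta_j|^gamma, typically with 0 < gamma < 1."""
    return lam * sum(abs(b) ** gamma for b in beta)


def group_bridge_penalty(beta, groups, lam, gamma, weights=None):
    """Group bridge penalty: lam * sum_k c_k * (sum_{j in A_k} |beta_j|)^gamma.

    `groups` is a list of index lists A_k; `weights` holds the c_k (default 1).
    """
    if weights is None:
        weights = [1.0] * len(groups)
    return lam * sum(
        c * sum(abs(beta[j]) for j in g) ** gamma
        for c, g in zip(weights, groups)
    )


def composite_bridge_penalty(beta, groups, lam, gamma_outer, gamma_inner):
    """ASSUMED composite form (illustrative only, not from the thesis):
    lam * sum_k (sum_{j in A_k} |beta_j|^gamma_inner)^gamma_outer.
    """
    return lam * sum(
        sum(abs(beta[j]) ** gamma_inner for j in g) ** gamma_outer
        for g in groups
    )


# Example: two groups of two coefficients, only one coefficient is non-zero.
beta = [0.0, 0.0, 4.0, 0.0]
groups = [[0, 1], [2, 3]]
print(bridge_penalty(beta, lam=1.0, gamma=0.5))
print(group_bridge_penalty(beta, groups, lam=1.0, gamma=0.5))
print(composite_bridge_penalty(beta, groups, lam=1.0, gamma_outer=0.5, gamma_inner=0.5))
```

Under this assumed composite form, setting `gamma_inner=1` recovers the unweighted group bridge penalty, while `gamma_inner < 1` also penalizes individual coefficients inside each group in a non-convex way, which is the intuition behind selecting at both levels.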
2	High-dimensional inference of ordinal data with medical applications
Jiao, Feiran, 01 May 2016

Ordinal response variables abound in scientific and quantitative analyses. Their outcomes comprise a few categorical values that admit a natural ordering, so they are often represented by non-negative integers, for instance, pain score (0-10) or disease severity (0-4) in medical research. Ordinal variables differ from rational variables in that their values delineate qualitative rather than quantitative differences. In this thesis, we develop new statistical methods for variable selection in a high-dimensional cumulative link regression model with an ordinal response. Our study is partly motivated by the need to explore the association structure between disease phenotype and high-dimensional medical covariates. The cumulative link regression model specifies that the ordinal response of interest results from an order-preserving quantization of a latent continuous variable that has a linear regression relationship with a set of covariates. Commonly used error distributions in the latent regression include the normal distribution, the logistic distribution, the Cauchy distribution and the standard Gumbel distribution (minimum). The cumulative link model with normal (logistic, Gumbel) errors is also known as the ordered probit (logit, complementary log-log) model. While the likelihood function has a closed-form expression for these error distributions, its strong nonlinearity can cause direct optimization of the likelihood to fail. To mitigate this problem and to facilitate extension to penalized likelihood estimation, we propose specific minorization-maximization (MM) algorithms for maximum likelihood estimation of a cumulative link model for each of the four error distributions. Penalized ordinal regression models are needed when variable selection has to be performed.

In some applications, covariates can be grouped in a meaningful way, but some groups may be mixed in that they contain both relevant and irrelevant variables, i.e., variables whose coefficients are non-zero and zero, respectively. It is therefore pertinent to develop a consistent method for simultaneously selecting the relevant groups and the relevant variables within each selected group, which constitutes the so-called bi-level selection problem. We propose a penalized maximum likelihood approach with a composite bridge penalty to solve the bi-level selection problem in a cumulative link model. An MM algorithm, specific to each of the four error distributions, is developed to implement the proposed method. The proposed approach is shown to enjoy a number of desirable theoretical properties, including bi-level selection consistency and oracle properties, under suitable regularity conditions. Simulations demonstrate that the proposed method has good empirical performance. We illustrate the proposed methods with several real medical applications.

Keywords: bi-level variable selection; composite bridge penalty; cumulative link model; lung image data; MM algorithm; penalized maximum likelihood
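The cumulative link model described in this abstract can be illustrated with a short Python sketch. It assumes the common parameterization P(Y <= k | x) = F(theta_k - x'beta), where F is the error CDF and theta_1 < theta_2 < ... are the cut points that quantize the latent variable. The thesis may use a different sign or indexing convention, and the names `category_probs`, `thresholds`, and `cdf` are hypothetical. Only the logistic (ordered logit) case is shown; it is not the MM algorithm from the thesis.

```python
import math


def logistic_cdf(z):
    """CDF of the standard logistic distribution (ordered logit case)."""
    return 1.0 / (1.0 + math.exp(-z))


def category_probs(x, beta, thresholds, cdf=logistic_cdf):
    """Return P(Y = k | x) for k = 0, ..., len(thresholds).

    Assumes P(Y <= k | x) = cdf(thresholds[k] - x'beta), with `thresholds`
    sorted in increasing order (an assumed parameterization).
    """
    eta = sum(xj * bj for xj, bj in zip(x, beta))  # linear predictor x'beta
    # Cumulative probabilities, with P(Y <= last category) = 1.
    cumulative = [cdf(t - eta) for t in thresholds] + [1.0]
    probs = []
    previous = 0.0
    for c in cumulative:
        probs.append(c - previous)  # P(Y = k) = P(Y <= k) - P(Y <= k - 1)
        previous = c
    return probs


# Example: two covariates, three cut points, hence four ordered categories.
probs = category_probs(x=[1.0, 2.0], beta=[0.5, -0.25], thresholds=[-1.0, 0.0, 1.0])
print(probs)
print(sum(probs))
```

Passing a different `cdf` (for example a normal CDF) would give the ordered probit variant under the same assumed parameterization. The log-likelihood is the sum of the logs of these category probabilities over observations, which is the quantity a penalized approach would augment with a penalty term.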
