Variable selection in high dimensional semi-varying coefficient models

With the development of computing and sampling technologies, high dimensionality has become an important characteristic of commonly used scientific data, such as data from bioinformatics, information engineering, and the social sciences. The varying coefficient model is a flexible and powerful statistical model for exploring dynamic patterns in many scientific areas. It is a natural extension of classical parametric models with good interpretability, and it is becoming increasingly popular in data analysis. The main objective of this thesis is to apply the varying coefficient model to the analysis of high-dimensional data, and to investigate the properties of regularization methods for high-dimensional varying coefficient models.

We first discuss how to apply local polynomial smoothing and the smoothly clipped absolute deviation (SCAD) penalized method to estimate varying coefficient models when the dimension of the model diverges with the sample size. Based on the nonconcave penalized method and local polynomial smoothing, we propose a regularization method that selects significant variables and estimates the corresponding coefficient functions simultaneously. Importantly, the proposed method can at the same time identify which coefficients are constant rather than varying, which is what makes the model semi-varying. We investigate the asymptotic properties of the proposed method and show that it enjoys the so-called "oracle property."

We then apply the nonparametric independence screening (NIS) method to varying coefficient models with ultra-high-dimensional data. Based on marginal varying coefficient model estimation, we establish the sure screening property of the proposed screening method under some regularity conditions. Combined with the proposed regularization method, this allows us to handle high-dimensional or ultra-high-dimensional data systematically within varying coefficient models.

The nonconcave penalized method is a very effective variable selection method. However, maximizing such a penalized likelihood function is computationally challenging, because the objective function is nondifferentiable and nonconcave. The local linear approximation (LLA) and local quadratic approximation (LQA) are two popular algorithms for such optimization problems. In this thesis, we revisit both algorithms. We investigate the convergence rate of LLA and show that the rate is linear. We also study the statistical properties of the one-step estimator based on LLA under a generalized statistical model with a diverging number of dimensions. Finally, we propose a modified version of LQA to overcome its drawback in high-dimensional models: the proposed method avoids computing the inverse of the Hessian matrix in the modified Newton-Raphson algorithm based on LQA. The proposed methods are examined through numerical studies and a real case study in Chapter 5.
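For readers unfamiliar with the model class, a semi-varying coefficient model has the standard form sketched below. The notation is ours rather than the thesis's, but the structure, with some coefficients functional and some constant, is exactly what the abstract's constancy identification targets.

```latex
% Semi-varying coefficient model: the first d coefficients vary with
% a scalar index variable U (e.g., time); the remaining p - d are constants.
\[
  Y = \sum_{j=1}^{d} a_j(U)\, X_j \;+\; \sum_{j=d+1}^{p} \beta_j\, X_j \;+\; \varepsilon
\]
% The a_j(\cdot) are unknown smooth functions estimated by local polynomial
% smoothing; variable selection must decide both which X_j enter the model
% and which coefficients are genuinely varying rather than constant.
```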
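As a concrete reference for the penalty and algorithms named above, here is a minimal Python sketch of the SCAD penalty of Fan and Li (2001) and its derivative, which is the weight LLA uses when it linearizes the penalty. This is an illustration, not code from the thesis; the function names and the conventional choice a = 3.7 are our assumptions.

```python
import numpy as np

def scad_penalty(beta, lam, a=3.7):
    """SCAD penalty (Fan and Li, 2001), evaluated elementwise:
    lam*|b|                                  if |b| <= lam
    (2*a*lam*|b| - b^2 - lam^2)/(2*(a-1))    if lam < |b| <= a*lam
    lam^2*(a+1)/2                            if |b| > a*lam
    """
    b = np.abs(beta)
    small = b <= lam
    mid = (b > lam) & (b <= a * lam)
    return np.where(
        small, lam * b,
        np.where(mid,
                 (2 * a * lam * b - b ** 2 - lam ** 2) / (2 * (a - 1)),
                 lam ** 2 * (a + 1) / 2))

def scad_derivative(beta, lam, a=3.7):
    """Derivative p'(|b|): the weight in LLA's tangent-line surrogate
    p(|b0|) + p'(|b0|) * (|b| - |b0|), which turns each LLA iteration
    into a weighted-lasso problem."""
    b = np.abs(beta)
    return np.where(b <= lam, lam,
                    np.maximum(a * lam - b, 0.0) / (a - 1))

if __name__ == "__main__":
    beta = np.array([0.05, 0.5, 2.0])
    print(scad_penalty(beta, lam=0.3))     # small coefficients penalized linearly
    print(scad_derivative(beta, lam=0.3))  # zero weight on the large coefficient
```

Because the derivative vanishes for |b| >= a*lam, the LLA surrogate leaves large coefficients effectively unpenalized, which is the mechanism behind the oracle property. LQA instead uses a quadratic surrogate whose Newton-Raphson step involves a Hessian inverse that becomes costly as the dimension diverges; this is the drawback the thesis's modified LQA is designed to avoid.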

Identifier: oai:union.ndltd.org:hkbu.edu.hk/oai:repository.hkbu.edu.hk:etd_oa-1010
Date: 06 September 2013
Creators: Chen, Chi
Publisher: HKBU Institutional Repository
Source Sets: Hong Kong Baptist University
Language: English
Detected Language: English
Type: text
Format: application/pdf
Source: Open Access Theses and Dissertations
Rights: The author retains all rights to this work. The author has signed an agreement granting HKBU a non-exclusive license to archive and distribute their thesis.