Global ETD Search

1	INFORMATIONAL INDEX AND ITS APPLICATIONS IN HIGH DIMENSIONAL DATA Yuan, Qingcong 01 January 2017 (has links) We introduce a new class of measures for testing independence between two random vectors, which uses expected difference of conditional and marginal characteristic functions. By choosing a particular weight function in the class, we propose a new index for measuring independence and study its property. Two empirical versions are developed, their properties, asymptotics, connection with existing measures and applications are discussed. Implementation and Monte Carlo results are also presented. We propose a two-stage sufficient variable selections method based on the new index to deal with large p small n data. The method does not require model specification and especially focuses on categorical response. Our approach always improves other typical screening approaches which only use marginal relation. Numerical studies are provided to demonstrate the advantages of the method. We introduce a novel approach to sufficient dimension reduction problems using the new measure. The proposed method requires very mild conditions on the predictors, estimates the central subspace effectively and is especially useful when response is categorical. It keeps the model-free advantage without estimating link function. Under regularity conditions, root-n consistency and asymptotic normality are established. The proposed method is very competitive and robust comparing to existing dimension reduction methods through simulations results. Categorical variable Distance Independence Sufficient variable selection Sufficient dimension reduction Applied Statistics Biostatistics Multivariate Analysis Statistical Methodology Statistical Theory
2	TRANSFORMS IN SUFFICIENT DIMENSION REDUCTION AND THEIR APPLICATIONS IN HIGH DIMENSIONAL DATA Weng, Jiaying 01 January 2019 (has links) The big data era poses great challenges as well as opportunities for researchers to develop efficient statistical approaches to analyze massive data. Sufficient dimension reduction is such an important tool in modern data analysis and has received extensive attention in both academia and industry. In this dissertation, we introduce inverse regression estimators using Fourier transforms, which is superior to the existing SDR methods in two folds, (1) it avoids the slicing of the response variable, (2) it can be readily extended to solve the high dimensional data problem. For the ultra-high dimensional problem, we investigate both eigenvalue decomposition and minimum discrepancy approaches to achieve optimal solutions and also develop a novel and efficient optimization algorithm to obtain the sparse estimates. We derive asymptotic properties of the proposed estimators and demonstrate its efficiency gains compared to the traditional estimators. The oracle properties of the sparse estimates are derived. Simulation studies and real data examples are used to illustrate the effectiveness of the proposed methods. Wavelet transform is another tool that effectively detects information from time-localization of high frequency. Parallel to our proposed Fourier transform methods, we also develop a wavelet transform version approach and derive the asymptotic properties of the resulting estimators. Central subspace Fourier transform Predictors hypothesis tests Sufficient dimension reduction Sufficient variable selection Wavelet transform Multivariate Analysis Statistical Methodology Statistical Models
3	A NEW INDEPENDENCE MEASURE AND ITS APPLICATIONS IN HIGH DIMENSIONAL DATA ANALYSIS Ke, Chenlu 01 January 2019 (has links) This dissertation has three consecutive topics. First, we propose a novel class of independence measures for testing independence between two random vectors based on the discrepancy between the conditional and the marginal characteristic functions. If one of the variables is categorical, our asymmetric index extends the typical ANOVA to a kernel ANOVA that can test a more general hypothesis of equal distributions among groups. The index is also applicable when both variables are continuous. Second, we develop a sufficient variable selection procedure based on the new measure in a large p small n setting. Our approach incorporates marginal information between each predictor and the response as well as joint information among predictors. As a result, our method is more capable of selecting all truly active variables than marginal selection methods. Furthermore, our procedure can handle both continuous and discrete responses with mixed-type predictors. We establish the sure screening property of the proposed approach under mild conditions. Third, we focus on a model-free sufficient dimension reduction approach using the new measure. Our method does not require strong assumptions on predictors and responses. An algorithm is developed to find dimension reduction directions using sequential quadratic programming. We illustrate the advantages of our new measure and its two applications in high dimensional data analysis by numerical studies across a variety of settings. High dimensional data analysis Independence Reproducing Kernel Hilbert Space Sufficient Dimension Reduction Sufficient Variable Selection Categorical Data Analysis Multivariate Analysis Statistics and Probability

1

Page generated in 0.0922 seconds