Multivariate linear regression methods are widely used statistical tools in data analysis, and were developed when some response variables are studied simultaneously, in which our aim is to study the relationship between predictor variables and response variables through the regression coefficient matrix. The rapid improvements of information technology have brought us a large number of large-scale data, but also brought us great challenges in data processing. When dealing with high dimensional data, the classical least squares estimation is not applicable in multivariate linear regression analysis. In recent years, some approaches have been developed to deal with high-dimensional data problems, among which dimension reduction is one of the main approaches. In some literature, random projection methods were used to reduce dimension in large datasets. In Chapter 2, a new random projection method, with low-rank matrix approximation, is proposed to reduce the dimension of the parameter space in high-dimensional multivariate linear regression model. Some statistical properties of the proposed method are studied and explicit expressions are then derived for the accuracy loss of the method with Gaussian random projection and orthogonal random projection. These expressions are precise rather than being bounds up to constants.
In multivariate regression analysis, reduced rank regression is also a dimension reduction method, which has become an important tool for achieving dimension reduction goals due to its simplicity, computational efficiency and good predictive performance. In practical situations, however, the performance of the reduced rank estimator is not satisfactory when the predictor variables are highly correlated or the ratio of signal to noise is small. To overcome this problem, in Chapter 3, we incorporate matrix projections into reduced rank regression method, and then develop reduced rank regression estimators based on random projection and orthogonal projection in high-dimensional multivariate linear regression models. We also propose a consistent estimator of the rank of the coefficient matrix and achieve prediction performance bounds for the proposed estimators based on mean squared errors.
Envelope technology is also a popular method in recent years to reduce estimative and predictive variations in multivariate regression, including a class of methods to improve the efficiency without changing the traditional objectives. Variable selection is the process of selecting a subset of relevant features variables for use in model construction. The purpose of using this technology is to avoid the curse of dimensionality, simplify models to make them easier to interpret, shorten training time and reduce overfitting. In Chapter 4, we combine envelope models and a group variable selection method to propose an envelope-based sparse
reduced rank regression estimator in high-dimensional multivariate linear regression models, and then establish its consistency, asymptotic normality and oracle property.
Tensor data are in frequent use today in a variety of fields in science and engineering. Processing tensor data is a practical but challenging problem. Recently, the prevalence of tensor data has resulted in several envelope tensor versions. In Chapter 5, we incorporate envelope technique into tensor regression analysis and propose a partial tensor envelope model, which leads to a parsimonious version for tensor response regression when some predictors are of special interest, and then consistency and asymptotic normality of the coefficient estimators are proved. The proposed method achieves significant gains in efficiency compared to the standard tensor response regression model in terms of the estimation of the coefficients for the selected predictors.
Finally, in Chapter 6, we summarize the work carried out in the thesis, and then suggest some problems of further research interest. / Dissertation / Doctor of Philosophy (PhD)
Identifer | oai:union.ndltd.org:mcmaster.ca/oai:macsphere.mcmaster.ca:11375/25988 |
Date | January 2020 |
Creators | Guo, Wenxing |
Contributors | Balakrishnan, Narayanaswamy, Mathematics and Statistics |
Source Sets | McMaster University |
Language | English |
Detected Language | English |
Type | Thesis |
Page generated in 0.0021 seconds