• Refine Query
  • Source
  • Publication year
  • to
  • Language
  • 1
  • Tagged with
  • 1
  • 1
  • 1
  • 1
  • 1
  • 1
  • 1
  • 1
  • 1
  • 1
  • About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
1

Dimension Reduction and Covariance Structure for Multivariate Data, Beyond Gaussian Assumption

Maadooliat, Mehdi 2011 August 1900 (has links)
Storage and analysis of high-dimensional datasets are always challenging. Dimension reduction techniques are commonly used to reduce the complexity of the data and obtain the informative aspects of datasets. Principal Component Analysis (PCA) is one of the commonly used dimension reduction techniques. However, PCA does not work well when there are outliers or the data distribution is skewed. Gene expression index estimation is an important problem in bioinformatics. Some of the popular methods in this area are based on the PCA, and thus may not work well when there is non-Gaussian structure in the data. To address this issue, a likelihood based data transformation method with a computationally efficient algorithm is developed. Also, a new multivariate expression index is studied and the performance of the multivariate expression index is compared with the commonly used univariate expression index. As an extension of the gene expression index estimation problem, a general procedure that integrates data transformation with the PCA is developed. In particular, this general method can handle missing data and data with functional structure. It is well-known that the PCA can be obtained by the eigen decomposition of the sample covariance matrix. Another focus of this dissertation is to study the covariance (or correlation) structure under the non-Gaussian assumption. An important issue in modeling the covariance matrix is the positive definiteness constraint. The modified Cholesky decomposition of the inverse covariance matrix has been considered to address this issue in the literature. An alternative Cholesky decomposition of the covariance matrix is considered and used to construct an estimator of the covariance matrix under multivariate-t assumption. The advantage of this alternative Cholesky decomposition is the decoupling of the correlation and the variances.

Page generated in 0.1013 seconds