
High-dimensional statistical data integration

archives@tulane.edu / Modern biomedical studies often collect multiple types of high-dimensional data on a common set of objects. A representative model for the integrative analysis of multiple data types decomposes each data matrix into a low-rank common-source matrix generated by latent factors shared across all data types, a low-rank distinctive-source matrix specific to each data type, and an additive noise matrix. We propose a novel decomposition method, called the decomposition-based generalized canonical correlation analysis, which appropriately defines those matrices by imposing a desirable orthogonality constraint on the distinctive latent factors, with the aim of sufficiently capturing the common latent factors. To further delineate the common and distinctive patterns between two data types, we propose another new decomposition method, called the common and distinctive pattern analysis. This method takes into account the common and distinctive information between the coefficient matrices of the common latent factors. We develop consistent estimation approaches for both proposed decompositions under high-dimensional settings, and demonstrate their finite-sample performance via extensive simulations. We illustrate the superiority of the proposed methods over state-of-the-art methods using real-world data examples obtained from The Cancer Genome Atlas and the Human Connectome Project. / 1 / Zhe Qu

  1. tulane:106916
Identifier oai:union.ndltd.org:TULANE/oai:http://digitallibrary.tulane.edu/:tulane_106916
Date January 2019
Contributors Qu, Zhe (author), Hyman, James (Thesis advisor), School of Science & Engineering Mathematics (Degree granting institution)
Publisher Tulane University
Source Sets Tulane University
Language English
Detected Language English
Type Text
Format electronic, pages: 151
Rights 12 months, Copyright is in accordance with U.S. Copyright law.
