Global ETD Search

1	Dimension reduction methods for nonlinear association analysis with applications to omics data
Wu, Peitao
06 November 2021
With advances in high-throughput techniques, the availability of large-scale omics data has revolutionized the fields of medicine and biology and has offered a better understanding of the underlying biological mechanisms. However, the high dimensionality and the unknown association structure between different data types make statistical integration analyses challenging. In this dissertation, we develop three dimensionality reduction methods to detect nonlinear association structure using omics data. First, we propose a method for variable selection in a nonparametric additive quantile regression framework. We enforce a network regularization to incorporate information encoded by known networks. To account for nonlinear associations, we approximate the additive functional effect of each predictor with the expansion of a B-spline basis. We implement the group Lasso penalty to achieve sparsity. We define the network-constrained penalty by regulating the difference between the effect functions of any two linked genes (predictors) in the network. Simulation studies show that our proposed method performs well in identifying truly associated genes with fewer falsely associated genes than alternative approaches. Second, we develop a canonical correlation analysis (CCA)-based method, canonical distance correlation analysis (CDCA), and leverage the distance correlation to capture the overall association between two sets of variables. The CDCA allows untangling linear and nonlinear dependence structures. Third, we develop the sparse CDCA (sCDCA) method to achieve sparsity and improve result interpretability by adding penalties on the loadings from the CDCA. The sCDCA method can be applied to data with large dimensionality and small sample size.
We develop iterative majorization-minimization-based coordinate descent algorithms to compute the loadings in the CDCA and sCDCA methods. Simulation studies show that the proposed CDCA and sCDCA approaches perform better than classical CCA and sparse CCA (sCCA) in nonlinear settings and perform similarly in linear association settings. We apply the proposed methods to the Framingham Heart Study (FHS) to identify body mass index associated genes, the association structure between metabolic disorders and metabolite profiles, and a subset of metabolites and their associated type 2 diabetes (T2D)-related genes.
2023-11-05T00:00:00Z
Biostatistics; Data integration; Distance correlation; Lasso penalization; MM algorithm; Quantile regression; Second order cone programming
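The first abstract relies on distance correlation to capture both linear and nonlinear dependence. As a rough illustration of that one building block (not the dissertation's CDCA or sCDCA algorithms, which are not reproduced in this listing), the sketch below computes the standard biased sample distance correlation and compares it with the Pearson correlation on a purely quadratic relationship. The function names and the toy data are assumptions made for this example.

```python
import math


def _centered_distances(points):
    """Double-centered pairwise Euclidean distance matrix for a list of points."""
    n = len(points)
    d = [[math.dist(points[i], points[j]) for j in range(n)] for i in range(n)]
    row_means = [sum(row) / n for row in d]
    grand_mean = sum(row_means) / n
    # d is symmetric, so the column means equal the row means.
    return [
        [d[i][j] - row_means[i] - row_means[j] + grand_mean for j in range(n)]
        for i in range(n)
    ]


def _mean_product(a, b):
    """Mean of the elementwise product of two n-by-n matrices."""
    n = len(a)
    return sum(a[i][j] * b[i][j] for i in range(n) for j in range(n)) / (n * n)


def distance_correlation(x, y):
    """Biased (V-statistic) sample distance correlation.

    x and y are equal-length lists of points; each point is a tuple of floats,
    so either side may be multivariate.
    """
    if len(x) != len(y):
        raise ValueError("x and y must contain the same number of observations")
    a = _centered_distances(x)
    b = _centered_distances(y)
    dcov2 = _mean_product(a, b)
    denom = math.sqrt(_mean_product(a, a) * _mean_product(b, b))
    return math.sqrt(max(dcov2, 0.0) / denom) if denom > 0 else 0.0


def pearson(u, v):
    """Pearson correlation of two equal-length lists of floats."""
    n = len(u)
    mu, mv = sum(u) / n, sum(v) / n
    cov = sum((p - mu) * (q - mv) for p, q in zip(u, v))
    var_u = sum((p - mu) ** 2 for p in u)
    var_v = sum((q - mv) ** 2 for q in v)
    return cov / math.sqrt(var_u * var_v)


# Toy data (hypothetical): y depends on x only through x squared.
xs = [i / 50 - 1 for i in range(101)]
ys = [v * v for v in xs]

pearson_xy = pearson(xs, ys)
dcor_xy = distance_correlation([(v,) for v in xs], [(v,) for v in ys])
dcor_xx = distance_correlation([(v,) for v in xs], [(v,) for v in xs])
```

On this symmetric toy sample the Pearson correlation is essentially zero while the distance correlation is clearly positive, which is the kind of nonlinear dependence the abstract says classical CCA can miss.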
2	Analyse statistique de données fonctionnelles à structures complexes (Statistical analysis of functional data with complex structures)
Adjogou, Adjobo Folly Dzigbodi
05 1900
No description available.
Données longitudinales; Partitionnement fonctionnel; Classification non supervisée; Modèles de mélange pour classification; Analyse des données fonctionnelles; Algorithme EM; Statistique bayésienne
Longitudinal data; Functional clustering; Model-based clustering; Functional data analysis; EM algorithm; Bayesian framework; Sparse longitudinal data; Gene expression; Mixture student; PRRSV; Lasso penalization
