acase@tulane.edu / Advanced omics technologies have been generating abundant multi-ethnic multi-omics data, including DNA sequences, methylation, gene expression, and copious clinical traits. Such big data pose unprecedented challenges due to the high complexity of heterogeneous networks among biomarkers. Heteroscedasticity (i.e., dispersion heterogeneity of trait residuals) is a common phenomenon in multi-omics data mining. It can be caused by interactions such as gene×gene and gene×environment, by linkage disequilibrium (LD) between marker loci, and by pleiotropic traits. It is especially prevalent in multi-omics data from admixed individuals because of broad admixture LD and gene×ancestry interactions. It can also be induced by background confounders, e.g., population structure, cryptic relatedness, polygenic effects, and correlations between residuals of multiple traits. However, existing univariate and multivariate methods neglect the high-order effects of both test biomarkers and background confounders. This dissertation contributes systematic harmonious signal augmentation methods, with applications for distilling high-order information from multiethnic data ranging from DNA sequences to microarrays. In Chapter I, we propose a novel harmonious signal augmentation scheme for single-based association tests. The harmonious single-based association test (HSAT) is more powerful than existing single-based methods in both simulations and real data applications. In Chapter II, we put forth harmonious gene-based association tests (HGAT) to incorporate high-order effects. Within a gene, the importance of a test variant is measured by the signal of marker-wise high-order effects. Leveraging high-order effects of genetic variants proved to improve power for identifying susceptibility genes. 
In extensive simulations under published designs, the proposed method properly controlled type I error rates and appeared strikingly more powerful than existing prominent gene-based sequence association methods. We apply the HGAT methods to both homogeneous and admixed populations. Chapter III has two parts. The first part introduces the integration of informative mean and variance effects to identify differentially expressed (DE) genes. The second part illustrates the harmonious integration of mean and high-order effects to identify DE genes. In summary, this dissertation demonstrates the tremendous potential of explicitly distilling informative higher-order effects in big multiethnic multi-level data mining and offers paradigm applications for integrating high-order information resources while effectively calibrating major heteroscedastic confounders. / 1 / Weiwei Ouyang
Identifier | oai:union.ndltd.org:TULANE/oai:http://digitallibrary.tulane.edu/:tulane_75490 |
Date | January 2017 |
Contributors | Ouyang, Weiwei (author), Qin, Huaizhen (Thesis advisor), School of Public Health & Tropical Medicine Biostatistics and Bioinformatics (Degree granting institution) |
Publisher | Tulane University |
Source Sets | Tulane University |
Language | English |
Detected Language | English |
Format | electronic, 167 |
Rights | No embargo, Copyright is in accordance with U.S. Copyright law. |