1 |
Tissue-dependent analysis of common and rare genetic variants for Alzheimer's disease using multi-omics data
Patel, Devanshi 21 January 2021
Alzheimer’s disease (AD) is a complex neurodegenerative disease characterized by progressive memory loss and caused by a combination of genetic, environmental, and lifestyle factors. AD susceptibility is highly heritable, at 58-79%, but only about one third of the AD genetic component is accounted for by common variants discovered through genome-wide association studies (GWAS). Rare variants may contribute to some of the unexplained heritability of AD and have been shown to contribute to large gene expression changes across tissues, but conventional analytical approaches are challenged by low statistical power, even with large sample sizes. Recent expression quantitative trait locus (eQTL) studies have demonstrated that changes in gene expression could play a key role in the pathogenesis of AD. However, regulation of gene expression has been shown to be context-specific (e.g., dependent on tissue and cell type), motivating a context-dependent approach to achieve more precise and statistically significant associations. To address these issues, I applied a strategy to identify new AD risk or protective rare variants by examining mutations occurring only in cases or only in controls, observing that different mutations in the same gene or variable dose of a mutation may result in distinct dementias. I also evaluated the impact of rare variation on expression at the gene and gene pathway levels in blood and brain tissue, further strengthening the rare variant findings with functional evidence and finding evidence for a large immune and inflammatory component to AD. Lastly, I identified cell-type-specific eQTLs in blood and brain tissue to explain underlying genetic associations of common variants in AD, and also discovered additional evidence for the role of myeloid cells in AD risk and potential novel blood and brain AD biomarkers.
Collectively, these findings further explain the genetic basis of AD risk and provide insight about mechanisms leading to this disorder. / 2022-01-21T00:00:00Z
|
2 |
Multi-omics Data Integration for Identifying Disease Specific Biological Pathways
Lu, Yingzhou 05 June 2018
Pathway analysis is an important task for gaining novel insights into the molecular architecture of many complex diseases. With the advancement of new sequencing technologies, a large amount of quantitative gene expression data has been continuously acquired. The emergence of omics data sets such as proteomics has facilitated the investigation of disease-relevant pathways.
Although much work has previously been done to explore single-omics data, little work has been reported using multi-omics data integration, mainly due to methodological and technological limitations. While a single-omics data set can provide useful information about the underlying biological processes, multi-omics data integration would give a much more comprehensive view of the cause-effect processes responsible for diseases and their subtypes.
This project investigates the combination of miRNAseq, proteomics, and RNAseq data on seven types of muscular dystrophy and a control group. These unique multi-omics data sets give us the opportunity to identify the most relevant disease-specific biological pathways. We first perform the t-test and the OVEPUG test separately to define the differentially expressed genes in the protein and mRNA data sets. In multi-omics data sets, miRNAs also play a significant role in muscle development by regulating their target genes in the mRNA data set. To exploit the relationship between miRNA and gene expression, we consult the commonly used gene library TargetScan to collect all miRNA-mRNA and miRNA-protein co-expression pairs. Next, using statistical measures such as Pearson's correlation coefficient or the t-test, we measure the biologically expected correlation of each gene with its upstream miRNAs and identify the miRNA-mRNA and miRNA-protein pairs that show negative correlation. Furthermore, we identify and assess the most relevant disease-specific pathways by inputting the differentially expressed genes and the negatively correlated genes into the gene-set libraries separately, and further characterize these prioritized marker subsets using IPA (Ingenuity Pathway Analysis) or KEGG. We then use Fisher's method to combine the p-values derived from the separate gene sets into a joint significance test assessing common pathway relevance. In conclusion, we find all negatively correlated miRNA-mRNA and miRNA-protein pairs and identify several pathophysiological pathways related to muscular dystrophies by gene set enrichment analysis.
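The p-value combination step mentioned above can be illustrated with a minimal sketch of Fisher's method. This is not the thesis's own code: the function name `fisher_combined_pvalue` and the example p-values are hypothetical, and the sketch assumes the input p-values are independent under the null hypothesis. The statistic X = -2 Σ ln(p_i) follows a chi-square distribution with 2k degrees of freedom, whose upper tail has a closed form for even degrees of freedom, so only the standard library is needed.

```python
import math


def fisher_combined_pvalue(pvalues):
    """Combine independent p-values with Fisher's method.

    Hypothetical helper for illustration only. The statistic
    X = -2 * sum(ln p_i) is chi-square with 2k degrees of freedom
    under the null, where k is the number of p-values.
    """
    if not pvalues:
        raise ValueError("at least one p-value is required")
    if any(p <= 0.0 or p > 1.0 for p in pvalues):
        raise ValueError("p-values must lie in the interval (0, 1]")

    k = len(pvalues)
    half_stat = -sum(math.log(p) for p in pvalues)  # this is X / 2

    # Upper tail of chi-square with 2k degrees of freedom:
    # exp(-X/2) * sum_{i=0}^{k-1} (X/2)^i / i!
    term = 1.0
    total = 1.0
    for i in range(1, k):
        term *= half_stat / i
        total += term
    return math.exp(-half_stat) * total


# Made-up p-values for one pathway from three gene sets (assumption).
combined = fisher_combined_pvalue([0.04, 0.20, 0.01])
```

A single p-value is returned unchanged, which is a quick sanity check on the formula. When the gene sets overlap heavily, the independence assumption is doubtful and the combined p-value is likely to be too small.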
This novel multi-omics data integration study and subsequent pathway identification will shed new light on pathophysiological processes in muscular dystrophies, improve our understanding of the molecular pathophysiology of muscle disorders, and, in the long term, support the prevention and treatment of disease and improve people's health. / Master of Science / Identification of biological pathways plays a central role in understanding both human health and disease. A biological pathway is a series of information processing steps via interactions among molecules in a cell that partially determines the phenotype of the cell. Specifically, identifying disease-specific pathways will guide focused studies on complex diseases, thus potentially improving the prevention and treatment of diseases.
To identify disease-specific pathways, it is crucial to develop computational methods and statistical tests that can integrate multi-omics (multiple omes such as genome, proteome, etc.) data. Compared to single-omics data, multi-omics data will help in gaining a more comprehensive understanding of the molecular architecture of disease processes.
In this thesis, we propose a novel data analytics pipeline for multi-omics data integration. We test our method on real proteomics data sets from muscular dystrophy subtypes and identify several biologically plausible pathways related to muscular dystrophies.
|
3 |
Robust methods for multivariate analysis of correlated genetics and genomics data
Song, Zeyuan 11 February 2025
2024 / This dissertation focuses on the development of advanced multivariate methods for the analysis of genetics and genomics data with multiple sources of correlation. The dissertation describes three novel topics: (1) a method to learn partial correlation networks, also known as Gaussian Graphical Models, to analyze multi-omics data, (2) a sparse network method to reduce network complexity, and (3) a Genome-Wide Association Study pipeline to analyze genome-wide genotype data in longitudinal and familial settings. In the first part of my dissertation, I propose a cluster-based Bootstrap algorithm for learning Gaussian Graphical Models from correlated data. Extensive simulations in family-based studies show that the Bootstrap algorithm effectively controls Type I errors without compromising statistical power compared to alternative solutions. Additionally, the algorithm is applied to learn the partial correlation networks of 47 Polygenic Risk Scores generated from genome-wide genotype data in the Long Life Family Study to unveil the complex relationships among these Polygenic Risk Scores. The second part of the dissertation extends the Bootstrap algorithm to learn sparse Gaussian Graphical Models in correlated data. Simulation studies show that this extended Bootstrap algorithm maintains control over Type I errors. As the value of the tuning parameter is varied, the dynamic changes of the networks reveal their contraction and dissection as edges with small partial correlations are systematically removed. The application of this method in real data analysis identifies meaningful clusters in the dynamic changes of the Polygenic Risk Scores and lipids networks. In the third part, I developed a Nextflow Genome-Wide Association Study pipeline, providing a fully automated analysis tool for managing, analyzing, and visualizing genome-wide genotype data for continuous and binary traits with correlated genetics data.
Applying this pipeline to investigate processing speed in the Long Life Family Study leads to the identification of 17 rare protective Single Nucleotide Polymorphisms located in/near Retinoic Acid Receptor Beta and Thyroid Hormone Receptor Beta genes on chromosome 3. These findings shed light on potential mechanisms supporting the preservation of processing speed in aging individuals. / 2026-02-11T00:00:00Z
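The family-level resampling idea behind a cluster-based bootstrap, as described in this abstract, can be illustrated with a small sketch. This is not the dissertation's algorithm: it handles only three variables, uses a first-order partial correlation in place of a full Gaussian Graphical Model, and reports a simple percentile interval. All names (`pearson`, `partial_corr`, `cluster_bootstrap_ci`) and the row layout `(family_id, x, y, z)` are assumptions made for illustration.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def partial_corr(x, y, z):
    """First-order partial correlation of x and y given z."""
    rxy, rxz, ryz = pearson(x, y), pearson(x, z), pearson(y, z)
    return (rxy - rxz * ryz) / math.sqrt((1 - rxz ** 2) * (1 - ryz ** 2))


def cluster_bootstrap_ci(rows, n_boot=1000, alpha=0.05, seed=0):
    """Percentile interval for the partial correlation of x and y given z.

    rows is a list of (family_id, x, y, z) tuples (hypothetical layout).
    Whole families are resampled with replacement, so the correlation
    among relatives is preserved inside every bootstrap replicate.
    """
    rng = random.Random(seed)
    families = {}
    for fam, x, y, z in rows:
        families.setdefault(fam, []).append((x, y, z))
    ids = list(families)

    stats = []
    for _ in range(n_boot):
        drawn = rng.choices(ids, k=len(ids))  # resample clusters, not rows
        sample = [obs for fam in drawn for obs in families[fam]]
        xs, ys, zs = zip(*sample)
        stats.append(partial_corr(xs, ys, zs))

    stats.sort()
    lo_idx = int((alpha / 2) * n_boot)
    hi_idx = max(lo_idx, int((1 - alpha / 2) * n_boot) - 1)
    return stats[lo_idx], stats[hi_idx]
```

An edge between x and y would be kept in this toy setting when the interval excludes zero. Resampling individuals instead of families would treat relatives as independent and tend to produce intervals that are too narrow, which is the problem a cluster-level bootstrap is meant to avoid.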
|
4 |
Beyond hairballs: depicting complexity of a kinase-phosphatase network in the budding yeast
Abd-Rabbo, Diala 01 1900
No description available.
|