1 |
Multi-omics integration for biomarker discovery and unsupervised subject clusterization. A novel computational method
Fiorentino, Giuseppe 08 November 2023
The advent of the high throughput era has resulted in rapid growth in the availability of large biological datasets. These massive datasets are organized in public or private repositories, encompassing not only DNA but also multiple biomolecules that represent different layers of biological information. The examination and quantification of one such layer are commonly known as "omics," which include the genome, proteome, transcriptome, and metabolome. Currently, it has become commonplace to conduct association analyses between a single omics and a specific phenotype. This practice has significantly enhanced our comprehension of both biological mechanisms and disease, particularly Mendelian disorders. However, the study of a single omics often fails to capture the entirety of variations
within a multi-layered mechanism, as well as the interplay between different biological layers, thus not accurately characterizing changes in complex disorders and regulatory systems. Hence, the integration of information from multiple omics has emerged as the prevailing approach, leading to the development of computational tools for
conducting multi-omics analyses. These tools are essential for further unraveling the underlying causes of complex diseases. However, the landscape of multi-omics analysis software is highly diverse, offering researchers a wide range of options in terms of purposes, data types, integration methods, and development techniques. This diversity provides tailored pipelines that cater to specific research needs. Yet, it also poses challenges, as the multitude of software options often lacks standardized practices and protocols. Consequently, a universally accepted gold standard is absent, impeding result reproducibility and comparability across different research efforts. To address this issue, we have developed MOUSSE, a novel modular omics-generic pipeline for unsupervised data integration. The distinguishing feature of our tool is its use of rank-based subject-specific signatures as input to derive a subject similarity network from each omics. This network maintains the informative content of the input data while reducing its size and allows for a graph-based integration of multiple omics. Using the resulting integrated network, the pipeline clusters the subjects and allows researchers to identify biomarkers for each cluster. One aspect that sets MOUSSE apart from other techniques is that it requires almost no data preprocessing, making it more robust to noise in the data and more suitable for novel and not yet fully characterized data types. We tested our tool by analyzing ten publicly available benchmark datasets for different types of cancer. Each dataset contained data from three separate omics, namely transcriptome, methylome and miRNAome. The aim of our analysis was twofold. First, we wanted to demonstrate that MOUSSE was able to identify the different phenotypes of cancers as clusters; second, we aimed to demonstrate that the pipeline was also able to identify biomarkers for each cancer type or progression.
Moreover, we compared MOUSSE clustering performance against ten multi-omics tools tested on the same data, achieving the highest median classification score. Finally, we performed an additional analysis on the biomarkers selected by the pipeline for a selected number of cancer phenotypes, showing that MOUSSE was able to identify the markers underlying disease progression and differential survival rate between cancer phenotypes. Collectively, these results showed that MOUSSE clustering and biomarker identification can be reliable even when the disease is changing. We also successfully compiled and implemented MOUSSE as an R package. To enhance the pipeline, we incorporated an additional omics dataset. This integration
allowed us to optimize the selection of subject-specific signatures and introduced the capability of iteratively running the tool. This means that users can refine their clustering results while reducing the number of candidates, thereby enhancing the overall effectiveness of the software.
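The steps named in this abstract (rank-based signatures, a per-omics subject similarity network, graph-based integration, clustering) can be illustrated with a minimal Python sketch. This is not the MOUSSE implementation, which the abstract describes as an R package. The function names, the top-k signature rule, the Jaccard similarity, the edge-weight averaging, the threshold-based clustering, and the toy data are all assumptions made for illustration only.

```python
from itertools import combinations


def top_k_signature(profile, k):
    """Names of the k highest-valued features for one subject (assumed rule)."""
    ranked = sorted(profile, key=profile.get, reverse=True)
    return frozenset(ranked[:k])


def similarity_network(omics, k):
    """Jaccard similarity between subject signatures within one omics layer."""
    sigs = {s: top_k_signature(p, k) for s, p in omics.items()}
    net = {}
    for a, b in combinations(sorted(sigs), 2):
        union = sigs[a] | sigs[b]
        net[(a, b)] = len(sigs[a] & sigs[b]) / len(union) if union else 0.0
    return net


def integrate(networks):
    """Average edge weights across layers (assumed integration rule)."""
    return {e: sum(n[e] for n in networks) / len(networks) for e in networks[0]}


def cluster(network, threshold):
    """Connected components of the graph that keeps edges >= threshold."""
    subjects = sorted({s for edge in network for s in edge})
    parent = {s: s for s in subjects}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path compression
            x = parent[x]
        return x

    for (a, b), weight in network.items():
        if weight >= threshold:
            parent[find(a)] = find(b)
    groups = {}
    for s in subjects:
        groups.setdefault(find(s), []).append(s)
    return sorted(groups.values())


# Hypothetical toy data: four subjects, two omics layers.
expression = {
    "s1": {"g1": 9, "g2": 8, "g3": 1, "g4": 0},
    "s2": {"g1": 7, "g2": 9, "g3": 2, "g4": 1},
    "s3": {"g1": 0, "g2": 1, "g3": 8, "g4": 9},
    "s4": {"g1": 1, "g2": 0, "g3": 9, "g4": 7},
}
methylation = {
    "s1": {"c1": 0.9, "c2": 0.8, "c3": 0.1, "c4": 0.2},
    "s2": {"c1": 0.8, "c2": 0.9, "c3": 0.2, "c4": 0.1},
    "s3": {"c1": 0.1, "c2": 0.2, "c3": 0.9, "c4": 0.8},
    "s4": {"c1": 0.2, "c2": 0.1, "c3": 0.7, "c4": 0.9},
}

layers = [similarity_network(layer, k=2) for layer in (expression, methylation)]
integrated = integrate(layers)
clusters = cluster(integrated, threshold=0.5)
print(clusters)
```

Because the signatures depend only on the within-subject ranking of features, the sketch needs no normalization step, which is in the spirit of the abstract's claim that little preprocessing is required.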
|
2 |
Identification of new functional resistance genes against P. infestans in Solanaceae species
Van Weymers, Pauline S. M. January 2016
Pests and pathogens represent a serious and continuing threat to potato and tomato production worldwide. In this thesis, I have developed a new NB-LRR probe library accounting for the recently improved annotations of both potato and tomato (Jupe et al., 2013 and Andolfo et al., 2014). The probe library was successfully used to map a late blight resistance in the diploid potato population B3C1HP. Using bulked-segregant resistance gene enrichment and sequencing (RenSeq) analysis in this population, which segregated 1:1 for the phenotype, the resistance was mapped to the lower end of chromosome 9. Furthermore, I developed a novel diagnostic tool, dRenSeq, to screen existing germplasm collections for the presence or absence of known, already characterised disease resistance genes, in order to prioritise novel resistances for research and breeding. dRenSeq was applied successfully to a set of S. okadae accessions as a proof of concept. The tomato late blight resistance gene Rpi-Ph3 was another focal point in this work, and the use of RenSeq was envisaged to identify Rpi-Ph3. However, another team published the gene (Zhang et al., 2014) and efforts were redirected towards the development of PCR markers to aid marker-assisted selection in breeding programs and to identify the cognate avirulence gene, Avr-Ph3. In addition, the new probe library was assessed in silico to evaluate whether it would have enabled the identification of Rpi-Ph3 and homologous sequences. The identification of Avr-Ph3 was established through a large effector screen in an association panel of tomato accessions, co-infiltrations with Rpi-Ph3 in the model Solanaceae plant Nicotiana benthamiana and pathogen assays. The effector screen required the prior establishment of a robust transient expression system in tomato.
|
3 |
Deciphering Gene Regulatory Mechanisms Through Multi-omics Integration
Chen, Duojiao 09 1900
Indiana University-Purdue University Indianapolis (IUPUI) / Complex biological systems are composed of many regulatory components, which can be measured with the advent of genomics technology. Each molecular assay is normally designed to interrogate one aspect of the cell state. However, a comprehensive understanding of the regulatory mechanism requires characterization at multiple levels such as the genome, epigenome, and transcriptome. Integration of multi-omics data is urgently needed for understanding the global regulatory mechanism of gene expression. In recent years, single-cell technology has offered unprecedented resolution for a deeper characterization of cellular diversity and states. High-quality single-cell suspensions from tissue biopsies are required for single-cell sequencing experiments. Tissue biopsies need to be processed as soon as they are collected to avoid gene expression changes and RNA degradation. Although cryopreservation is a feasible solution to preserve freshly isolated samples, its effect on transcriptome profiles still needs to be investigated. Investigation of multi-omics data at the single-cell level can provide new insights into the biological process. In addition to the common method of integrating multi-omics data, it is now also possible to profile the transcriptome and epigenome simultaneously at single-cell resolution, enhancing the power to discover new gene regulatory interactions. In this dissertation, we integrated bulk RNA-seq with ATAC-seq and several additional assays and revealed the complex mechanisms of ER–E2 interaction with nucleosomes. We compared fresh and frozen multiple myeloma single-cell RNA sequencing data and concluded that cryopreservation is a feasible protocol for preserving cells. We also analyzed the single-cell multiome data for mesenchymal stem cells.
With the unified landscape from simultaneously profiling gene expression and chromatin accessibility, we discovered distinct osteogenic differentiation potential of mesenchymal stem cells and different associations with bone disease-related traits. We gained a deeper insight into the underlying gene regulatory mechanisms with this frontier single-cell multiome sequencing technique.
|
4 |
Revisiting the Neuroprotective Role of 17B-Estradiol (E2): A Multi-Omics Based Analysis of the Rat Brain and Serum
Zaman, Khadiza 08 1900
The ovarian hormone 17β-estradiol (E2) is one of the central regulators of the female reproductive system. E2 is also a pleiotropic regulator since it can exert non-reproductive roles on other organ systems. E2 is neuroprotective: it maintains the body's energy homeostasis, participates in various repair mechanisms and is required for neural development. However, there is substantial evidence suggesting that there might be a molecular reprogramming of E2's action when it is supplied exogenously after E2 deprivation. Though the length of E2 deprivation and age have been linked to this phenomenon, the molecular components and how they activate this reprogramming are still elusive. Our main goal was to perform global proteomics and metabolomics studies to identify the molecular components and their interaction networks that are altered in the brain and serum after a short-term E2 treatment following ovariectomy (OVX) in Sprague Dawley rats. One of the strengths of our global study is that it gave us extensive information on the brain proteome itself by identifying a large number of proteins in different brain sections. By analyzing the differentially expressed proteins, our proteomics study revealed 49 different networks to be altered in 7 sections of the brain. Most of the perturbed networks were involved in cell metabolism, neural development, protein synthesis, cellular trafficking and degradation, and several stress response signaling pathways. We assessed the neuroenergetic status of the brain based on E2's response to various energy-generating pathways, including glycolysis, TCA cycle, and oxidative phosphorylation, and several signaling pathways. All energetics pathways were shown to be downregulated under E2 treatment, which suggests that E2 exerts its neuroprotective role by restoring energy homeostasis in the OVX rat model by regulating complex signaling and metabolic networks.
Our second focus was to determine the metabolite response (amino acids and lipids) after E2 treatment in the brain and serum by employing a targeted metabolomics study. We found that in the rat brain cortex there was significant upregulation of a large number of amino acids, suggesting an alternative route of metabolism. Another possible explanation is that E2 replacement replenished the amino acid pool in the tissue. Pathway enrichment analysis revealed upregulation of several pathways, including amino sugar metabolism, purine metabolism, and glutathione metabolism. By combining proteomics and metabolomics in two different biological matrices we were able to gather a vast array of information on how E2 replacement after E2 deprivation can confer neuroprotection. Our findings will help to create a foundation of basic science to be used for developing potentially effective hormone therapies.
|
5 |
A data collection programme for improving healthcare in UK human spaceflight ventures
Cope, H., Deane, C.S., Szewczyk, N.J., Etheridge, T., Williams, P.M., Willis, Craig R.G. 16 August 2023
Yes / Over the next decade the number of humans venturing beyond Earth is projected to rapidly increase in both quantity and
diversity. Humans will regularly fly to the International Space Station until it is decommissioned by 2031, will return to the
Moon by 2025 via the Artemis programme, and will fly to space via commercial ventures. Spaceflight presents a hazardous
environment for human health. To understand spaceflight-associated health risks further and to increase safety via advanced
healthcare approaches, including personalised medicine, more data must be collected. Importantly, this data must be derived
from a diverse cohort of participants and a range of mission formats. We propose that the UK should start to consider all
citizens venturing into space as potential participants from which health and biological data could be consensually collected.
Importantly, we believe that this routine data collection programme should adopt a similar strategy to the UK National Health
Service and the UK Biobank, by including "omics" data for scientific and healthcare purposes. We consider how such a
world-leading programme, kick-started via a pilot study, might be realised through appropriate policy design, including which
measures to collect, when to collect them, and unique ethical considerations pertaining to the spacefaring population. / H.C. is supported by the Horizon Centre for Doctoral Training at the University of Nottingham (UKRI grant no. EP/S023305/1).
|
6 |
Caenorhabditis elegans in microgravity: An omics perspective
Scott, A., Willis, Craig R.G., Muratani, M., Higashitani, A., Etheridge, T., Szewczyk, N.J., Deane, C.S. 16 August 2023
Yes / The application of omics to study Caenorhabditis elegans (C. elegans) in the context of spaceflight is increasing, illuminating the wide-ranging biological impacts of spaceflight on physiology. In this review, we highlight the application of omics, including transcriptomics, genomics, proteomics, multi-omics, and integrated omics in the study of spaceflown C. elegans, and discuss the impact, use, and future direction of this branch of research. We highlight the variety of molecular alterations that occur in response to spaceflight, most notably changes in metabolic and neuromuscular gene regulation. These transcriptional features are reproducible and evident across many spaceflown species (e.g., mice and astronauts), supporting the use of C. elegans as a model organism to study spaceflight physiology with translational capital. Integrating tissue-specific, spatial, and multi-omics approaches, which quantitatively link molecular responses to phenotypic adaptations, will facilitate the identification of candidate regulatory molecules for therapeutic intervention and thus represents the next frontier in C. elegans space omics research.
|
7 |
BioBridge: Bringing Data Exploration to Biologists
Boyd, Joseph 01 May 2014
Since the completion of the Human Genome Project in 2003, biologists have become exceptionally good at producing data. Indeed, biological data has experienced a sustained exponential growth rate, putting effective and thorough analysis beyond the reach of many biologists. This thesis presents BioBridge, an interactive visualization tool developed to bring intuitive data exploration to biologists. BioBridge is designed to work on omics-style tabular data in general and thus has broad applicability.
This work describes the design and evaluation of BioBridge's Entity View primary visualization as well as the accompanying user interface. The Entity View visualization arranges glyphs representing biological entities (e.g. genes, proteins, metabolites) along with related text mining results to provide biological context. Throughout development the goal has been to maximize accessibility and usability for biologists who are not computationally inclined. Evaluations were done with three informal case studies, one of a metabolome dataset and two of microarray datasets.
BioBridge is a proof of concept that there is an underexploited niche in the data analysis ecosystem for tools that prioritize accessibility and usability. The use case studies, while anecdotal, are very encouraging. These studies indicate that BioBridge is well suited for the task of data exploration. With further development, BioBridge could become more flexible and usable as additional use case datasets are explored and more feedback is gathered.
|
8 |
Network-based approaches for multi-omic data integration
Xiao, Hui January 2019
The advent of advanced high-throughput biological technologies provides opportunities to measure the whole genome at different molecular levels in biological systems, which produces different types of omic data such as genome, epigenome, transcriptome, translatome, proteome, metabolome and interactome. Biological systems are highly dynamic and complex mechanisms which involve not only within-level functionality but also between-level regulation. In order to uncover the complexity of biological systems, it is desirable to integrate multi-omic data to transform the multiple-level data into biological knowledge about the underlying mechanisms. Due to the heterogeneity and high dimensionality of multi-omic data, it is necessary to develop effective and efficient methods for multi-omic data integration. This thesis aims to develop efficient approaches for multi-omic data integration using machine learning methods and network theory. We assume that a biological system can be represented by a network with nodes denoting molecules and edges indicating functional links between molecules, in which multi-omic data can be integrated as attributes of nodes and edges. We propose four network-based approaches for multi-omic data integration using machine learning methods. Firstly, we propose an approach for gene module detection by integrating multi-condition transcriptome data and interactome data using a network overlapping module detection method. We apply the approach to study the transcriptome data of human pre-implantation embryos across multiple development stages, and identify several stage-specific dynamic functional modules and genes which provide interesting biological insights. We evaluate the reproducibility of the modules by comparing with some other widely used methods and show that the intra-module genes overlap significantly between the different methods.
Secondly, we propose an approach for gene module detection by integrating transcriptome, translatome, and interactome data using a multilayer network. We apply the approach to study the ribosome profiling data of mTOR-perturbed human prostate cancer cells and mine several translation efficiency regulated modules associated with mTOR perturbation. We develop an R package, TERM, implementing the proposed approach, which offers a useful tool for the research field. Next, we propose an approach for feature selection by integrating transcriptome and interactome data using network-constrained regression. We develop a more efficient network-constrained regression method, eGBL. We evaluate its performance in terms of variable selection and prediction, and show that eGBL outperforms the other related regression methods. Applying it to the transcriptome data of human blastocysts, we select several genes of interest associated with time-lapse parameters. Finally, we propose an approach for classification by integrating epigenome and transcriptome data using neural networks. We introduce a superlayer neural network (SNN) model which learns DNA methylation and gene expression data in parallel in superlayers but with cross-connections allowing crosstalk between them. We evaluate its performance on human breast cancer classification. The SNN provides superior performance and outperforms several other common machine learning methods. The approaches proposed in this thesis offer effective and efficient solutions for integration of heterogeneous high-dimensional datasets, which can be easily applied to other datasets with similar structures. They are therefore applicable to many fields including but not limited to Bioinformatics and Computer Science.
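The general idea behind network-constrained regression can be illustrated with a short Python sketch. It shows one common generic formulation, a graph Laplacian smoothness penalty that discourages linked genes from receiving very different coefficients. It is not the eGBL method, whose details the abstract does not give. The function names, the gene names, the edge weights, and the toy sample are hypothetical.

```python
def laplacian_penalty(beta, edges):
    """Sum of w * (beta_u - beta_v)^2 over weighted edges (u, v, w)."""
    return sum(w * (beta[u] - beta[v]) ** 2 for u, v, w in edges)


def penalized_loss(samples, beta, edges, lam):
    """Mean squared error plus lam times the network smoothness penalty."""
    squared_errors = []
    for features, target in samples:
        prediction = sum(beta[g] * value for g, value in features.items())
        squared_errors.append((target - prediction) ** 2)
    mse = sum(squared_errors) / len(squared_errors)
    return mse + lam * laplacian_penalty(beta, edges)


# Hypothetical interactome: g1-g2 strongly linked, g2-g3 weakly linked.
edges = [("g1", "g2", 1.0), ("g2", "g3", 0.5)]

smooth_beta = {"g1": 1.0, "g2": 1.0, "g3": 1.0}   # linked genes agree
rough_beta = {"g1": 1.0, "g2": 1.0, "g3": -1.0}   # g3 disagrees with g2

# One toy sample that both coefficient vectors fit exactly.
samples = [({"g1": 1.0, "g2": 0.0, "g3": 0.0}, 1.0)]

print(laplacian_penalty(smooth_beta, edges))
print(laplacian_penalty(rough_beta, edges))
print(penalized_loss(samples, rough_beta, edges, lam=0.1))
```

Both coefficient vectors fit the toy sample equally well, so only the penalty term separates them. This is how the network steers variable selection towards connected genes in this kind of formulation.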
|
9 |
Metabolic diversity and synthesis of medium chain length polyhydroxyalkanoates by Pseudomonas putida LS46 cultured with biodiesel-derived by-products
Fu, Jilagamazhi 06 November 2015
The metabolism and physiology of Pseudomonas putida strain LS46 were investigated using biodiesel-derived waste streams as potential low-cost substrates for production of medium chain length polyhydroxyalkanoates (mcl-PHA). Proteomic and transcriptomic analyses were used to correlate specific gene and gene product expression patterns with differences in phenotypes of mcl-PHA biosynthesis by P. putida LS46. Growth and mcl-PHA content of P. putida LS46 were similar in cultures containing biodiesel-derived waste glycerol versus pure glycerol, and mcl-PHA synthesis occurred during stationary phase after nitrogen concentrations in the medium were exhausted. Waste glycerol cultures contained elevated concentrations of heavy metal ions, such as copper, which induced significant changes in gene expression levels related to heavy metal resistance. Several membrane-bound proteins, such as CusABC efflux and CopAB, were identified and putatively play a role in regulating cellular copper concentrations. Cultures containing waste free fatty acids synthesized mcl-PHA throughout the exponential growth phase. Protein expression levels of two mcl-PHA synthases were suppressed during exponential phase growth in waste glycerol cultures, putatively via post-transcriptional regulation. Culture-specific expression of monomer-supplying proteins (PhaJ1 and PhaG) and sets of fatty acid oxidation enzymes was observed, and may have contributed to differences in the composition of polymers synthesized by P. putida LS46 cultured on the two substrates. Expression levels of the majority of mcl-PHA biosynthesis pathway genes were stable during active polymer synthesis in waste glycerol cultures. However, variations in protein expression levels, and in some cases their corresponding mRNAs, were observed in a number of other metabolic pathways, such as glycerol transportation, partial glycolysis, pyruvate metabolism, the TCA cycle, and fatty acid biosynthesis.
These data suggest potential regulatory points that may determine carbon flux during mcl-PHA biosynthesis. Evaluation of identified genetic targets in P. putida LS46 that putatively influence mcl-PHA biosynthesis and monomer composition merits further study. / February 2016
|
10 |
Exploring Archaeal Communities And Genomes Across Five Deep-Sea Brine Lakes Of The Red Sea With A Focus On Methanogens
Guan, Yue 15 December 2015
The deep-sea hypersaline lakes in the Red Sea are among the most challenging,
extreme, and unusual environments on the planet Earth. Despite their harshness
to life, they are inhabited by diverse and novel members of prokaryotes.
Methanogenesis was proposed as one of the main metabolic pathways that drive
microbial colonization in similar habitats. However, not much is known about the
identities of the methane-producing microbes in the Red Sea, let alone the way in
which they could adapt to such polyextreme environments. Combining a range of
microbial community assessment, cultivation and omics (genomics,
transcriptomics, and single amplified genomics) approaches, this dissertation
seeks to fill these gaps in our knowledge by studying archaeal composition,
particularly methanogens, their genomic capacities and transcriptomic
characteristics in order to elucidate their diversity, function, and adaptation to the
deep-sea brines of the Red Sea. Although typical methanogens are not abundant
in the samples collected from brine pool habitats of the Red Sea, the pilot
cultivation experiment has revealed novel halophilic methanogenic species of the
domain Archaea. Their physiological traits as well as their genomic and
transcriptomic features unveil an interesting genetic and functional adaptive
capacity that allows them to thrive in the unique deep-sea hypersaline
environments in the Red Sea.
|