  • About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
1

HAPPI: A Bioinformatics Database Platform Enabling Network Biology Studies

Mamidipalli, SudhaRani 29 June 2006
Submitted to the faculty of the informatics Graduate Program in partial fulfillment of the requirements for the degree Master of Science in Bioinformatics in the School of Informatics, Indiana University, May 2006 / The publication of the draft human genome consisting of 30,000 genes is merely the beginning of genome biology. A new way to understand the complexity and richness of molecular and cellular function of proteins in biological processes is through understanding of biological networks. These networks include protein-protein interaction networks, gene regulatory networks, and metabolic networks. In this thesis, we focus on human protein-protein interaction networks using informatics techniques. First, we performed a thorough literature survey to document different experimental methods to detect and collect protein interactions, current public databases that store these interactions, and computational software to predict, validate and interpret protein networks. Then, we developed the Human Annotated Protein-Protein Interaction (HAPPI) database to manage a wealth of integrated information related to protein functions, protein-protein functional links, and protein-protein interactions. Approximately 12900 proteins from Swissprot, 57900 proteins from Trembl, 52186 protein-domains from Swisspfam, 4084 gene-pathways from KEGG, 2403190 interactions from STRING and 51207 interactions from OPHID public databases were integrated into a single relational database platform using Oracle 10g on an IU Supercomputing grid. We further assigned a confidence score to each protein interaction pair to help assess the quality and reliability of protein-protein interactions. We hosted the database on the Discovery Informatics and Computing web site, which is now publicly accessible.
The HAPPI database differs from other protein interaction databases in the following aspects: 1) It focuses on human protein interactions and contains approximately 860000 high-confidence protein interaction records, making it one of the most complete and reliable sources of human protein interactions today; 2) It includes thorough protein domain, gene and pathway information for interacting proteins, therefore providing a whole view of protein functional information; 3) It contains a consistent ranking score that can be used to gauge the confidence of protein interactions. To show the benefits of the HAPPI database, we performed a case study using the insulin signaling pathway in collaboration with a biology team on campus. We began by taking two sets of proteins that were previously well studied as separate processes, set A and set B. We queried these proteins against the HAPPI database, and derived high-confidence protein interaction data sets annotated with known KEGG pathways. We then organized these protein interactions on a network diagram. The end result shows many novel hub proteins that connect set A or B proteins. Some hub proteins are even novel members outside of any annotated pathway, making them interesting targets to validate in subsequent biological studies.
2

Developing machine learning tools to understand transcriptional regulation in plants

Song, Qi 09 September 2019
Abiotic stresses constitute a major category of stresses that negatively impact plant growth and development. It is important to understand how plants cope with environmental stresses and reprogram gene responses, which in turn confer stress tolerance. Recent advances in genomic technologies have led to the generation of much genomic data for the model plant, Arabidopsis. To understand gene responses activated by specific external stress signals, these large-scale data sets need to be analyzed to generate new insight into gene functions in stress responses. This poses new computational challenges of mining gene associations and reconstructing regulatory interactions from large-scale data sets. In this dissertation, several computational tools were developed to address these challenges. In Chapter 2, ConSReg was developed to infer condition-specific regulatory interactions and prioritize transcription factors (TFs) that are likely to play condition-specific regulatory roles. A comprehensive investigation was performed to optimize the performance of ConSReg, and a systematic recovery of nitrogen response TFs was performed to evaluate ConSReg. In Chapter 3, CoReg was developed to infer co-regulation between genes, using only regulatory networks as input. CoReg was compared to other computational methods and the results showed that CoReg outperformed other methods. CoReg was further applied to identify modules in the regulatory network generated from DAP-seq (DNA affinity purification sequencing). Using a large expression dataset generated under many abiotic stress treatments, many regulatory modules with common regulatory edges were found to be highly co-expressed, suggesting that target modules are structurally stable modules under abiotic stress conditions. In Chapter 4, exploratory analysis was performed to classify cell types for Arabidopsis root single cell RNA-seq data.
This is a first step towards construction of a cell-type-specific regulatory network for Arabidopsis root cells, which is important for improving current understanding of stress response. / Doctor of Philosophy / Abiotic stresses constitute a major category of stresses that negatively impact plant growth and development. It is important to understand how plants cope with environmental stresses and reprogram gene responses, which in turn confer stress tolerance to plants. Genomics technology has been used in the past decade to generate gene expression data under different abiotic stresses for the model plant, Arabidopsis. Recent new genomic technologies, such as DAP-seq, have generated large-scale regulatory maps that provide information regarding which gene has the potential to regulate other genes in the genome. However, this technology does not provide context-specific interactions. It is unknown which transcription factor can regulate which gene under a specific abiotic stress condition. To address this challenge, several computational tools were developed to identify regulatory interactions and co-regulating genes for stress response. In addition, using single cell RNA-seq data generated from the model plant organism Arabidopsis, preliminary analysis was performed to build a model that classifies Arabidopsis root cell types. This analysis is the first step towards the ultimate goal of constructing a cell-type-specific regulatory network for Arabidopsis, which is important for improving current understanding of stress response in plants.
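The abstract above says that co-regulation was inferred using only a regulatory network as input. As a rough illustration of that general idea, and not of CoReg's actual algorithm (which the abstract does not describe), the sketch below scores each pair of target genes by the Jaccard similarity of their regulator sets. The edge list, gene names, and function names are all hypothetical placeholders.

```python
from itertools import combinations

# Hypothetical TF -> target edges; names are placeholders, not data from the thesis.
edges = [
    ("TF1", "geneA"), ("TF1", "geneB"), ("TF1", "geneC"),
    ("TF2", "geneA"), ("TF2", "geneB"),
    ("TF3", "geneC"),
]

def regulators_by_target(edges):
    """Map each target gene to the set of TFs that regulate it."""
    regs = {}
    for tf, target in edges:
        regs.setdefault(target, set()).add(tf)
    return regs

def jaccard(x, y):
    """Jaccard similarity of two sets (0.0 when both are empty)."""
    union = x | y
    return len(x & y) / len(union) if union else 0.0

def coregulation_scores(edges):
    """Score every pair of targets by the overlap of their regulator sets."""
    regs = regulators_by_target(edges)
    return {
        (g1, g2): jaccard(regs[g1], regs[g2])
        for g1, g2 in combinations(sorted(regs), 2)
    }

scores = coregulation_scores(edges)
for pair, score in sorted(scores.items()):
    print(pair, round(score, 3))
```

Pairs that share all of their regulators score 1.0, and pairs with no regulator in common score 0.0. A threshold on this score would be one simple, assumed way to group genes into candidate co-regulated modules.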
3

Inference of gene regulatory networks for Mus musculus by incorporating network motifs from yeast.

Weishaupt, Holger January 2007
Recently, particular interest has been drawn to the inference of gene regulatory networks from microarray gene expression data. Despite major improvements in data-based methods, network reconstruction from expression data alone still presents a computationally complex (NP-hard) problem. This work incorporates additional information, namely regulatory motifs from yeast, when inferring a gene regulatory network for mouse genes. The hypothesis was put forward that regulatory patterns analogous to these motifs are present in the set of mouse genes and can be identified by comparing yeast and mouse genes in terms of sequence similarity or Gene Ontology (The Gene Ontology Consortium 2000) annotations. In order to examine this hypothesis, small permutations of genes with high similarity to such yeast gene regulatory motifs were first tested against simple data-driven regulatory networks by means of consistency with the expression data. Secondly, using the best scored interactions provided by these permutations, networks were then inferred for the whole set of mouse genes. The results showed that individual permutations of genes with a high similarity to a given yeast motif did not perform better than low scored motifs, and that complete networks inferred from regulatory interactions provided by permutations showed neither a noticeable improvement over the corresponding data-driven network nor a high consistency with the expression data. It was therefore found that the hypothesis failed, i.e. neither the use of sequence similarity nor the search for identical functional annotations between mouse and yeast genes allowed the identification of sets of genes that showed a high consistency with the expression data or would have allowed for an improved gene regulatory network inference.
4

Transcriptional co-regulation of microRNAs and protein-coding genes

Webber, Aaron January 2013
This thesis was presented by Aaron Webber on the 4th December 2013 for the degree of Doctor of Philosophy from the University of Manchester. The title of this thesis is ‘Transcriptional co-regulation of microRNAs and protein-coding genes’. The thesis relates to gene expression regulation within humans and closely related primate species. We have investigated the binding site distributions from publicly available ChIP-seq data of 117 transcription regulatory factors (TRFs) within the human genome. These were mapped to cis-regulatory regions of two major classes of genes, ~20,000 genes encoding proteins and ~1500 genes encoding microRNAs. MicroRNAs are short 20-24 nt noncoding RNAs which bind complementary regions within target mRNAs to repress translation. The complete collection of ChIP-seq binding site data is related to genomic associations between protein-coding and microRNA genes, and to the expression patterns and functions of both gene types across human tissues. We show that microRNA genes are associated with highly regulated protein-coding gene regions, and show rigorously that transcriptional regulation is greater than expected, given properties of these protein-coding genes. We find enrichment in developmental proteins among protein-coding genes hosting microRNA sequences. Novel subclasses of microRNAs are identified that lie outside of protein-coding genes yet may still be expressed from a shared promoter region with their protein-coding neighbours. We show that such microRNAs are more likely to form regulatory feedback loops with the transcriptional regulators lying in the upstream protein-coding promoter region. We show that when a microRNA and a TRF regulate one another, the TRF is more likely to sometimes function as a repressor. As in many studies, the data show that microRNAs lying downstream of particular TRFs target significantly many genes in common with these TRFs.
We then demonstrate that the prevalence of such TRF/microRNA regulatory partnerships relates directly to the variation in mRNA expression across human tissues, with the least variable mRNAs having the most significant enrichment in such partnerships. This result is connected to theory describing the buffering of gene expression variation by microRNAs. Taken together, our study has demonstrated significant novel linkages between the transcriptional TRF and post-transcriptional microRNA-mediated regulatory layers. We finally consider transcriptional regulators alone, by mapping these to genes clustered on the basis of their expression patterns through time, within the context of CD4+ T cells from African green monkeys and Rhesus macaques infected with Simian immunodeficiency virus (SIV). African green monkeys maintain a functioning immune system despite never clearing the virus, while in rhesus macaques, the immune system becomes chronically stimulated leading to pathogenesis. Gene expression clusters were identified characterizing the natural and pathogenic host systems. We map transcriptional regulators to these expression clusters and demonstrate significant yet unexpected co-binding by two heterodimers (STAT1:STAT2 and BATF:IRF4) over key viral response genes. From 34 structural families of TRFs, we demonstrate that bZIPs, STATs and IRFs are the most frequently perturbed upon SIV infection. Our work therefore contributes to the characterization of both natural and pathogenic SIV infections, with longer term implications for HIV therapeutics.
5

The Potential Power of Dynamics in Epistasis Analysis

Awdeh, Aseel January 2015
Inferring regulatory relationships between genes, including the direction and the nature of influence between them, is the foremost problem in the field of genetics. One classical approach to this problem is epistasis analysis. Broadly speaking, epistasis analysis infers the regulatory relationships between a pair of genes in a genetic pathway by considering the patterns of change in an observable trait resulting from single and double deletion of genes. More specifically, a “surprising” situation occurs when the phenotype of a double mutant has a similar, aggravating or alleviating effect compared to the phenotype resulting from the single deletion of either one of the genes. As useful as this broad approach has been, there are limits to its ability to discriminate alternative pathway structures, meaning it is not always possible to infer the relationship between the genes. Here, we explore the possibility of dynamic epistasis analysis. In addition to performing genetic perturbations, we drive a genetic pathway with a dynamic, time-varying upstream signal, where the phenotypic consequence is measured at each time step. We explore the theoretical power of dynamic epistasis analysis by conducting an identifiability analysis of Boolean models of genetic pathways, comparing static and dynamic approaches. We also explore the identifiability of individual links in the pathway. Through these evaluations, we quantify how helpful the addition of dynamics is. We believe that a dynamic input in addition to epistasis analysis is a powerful tool to discriminate between different networks. Our primary findings show that the use of a dynamic input signal alone, without genetic perturbations, appears to be very weak in comparison with the more traditional genetic approaches based on the deletion of genes. However, the combination of dynamical input with genetic perturbations is far more powerful than the classical epistasis analysis approach. 
In all cases, we find that even relatively simple input dynamics with gene deletions greatly increases the power of epistasis analysis to discriminate alternative network structures and to confidently identify individual links in a network. Our positive results show the potential value of dynamics in epistasis analysis.
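To make the comparison of static and dynamic epistasis analysis concrete, here is a minimal toy sketch. It is not the thesis's model family or its identifiability procedure: the eight Boolean pathways, the single-pulse input, and all names below are assumptions chosen for illustration. Gene A is always driven by the upstream signal S, gene B is driven either by S ("parallel") or by A ("chain"), and the phenotype is a Boolean readout of A and B. The code counts how many groups of models can be told apart by (a) steady-state phenotypes of single and double deletions, (b) a wild-type time series alone, and (c) time series for every genotype.

```python
from itertools import product

# Hypothetical toy family (an assumption, not taken from the thesis).
WIRINGS = ("parallel", "chain")  # B driven by S, or by A
READOUTS = {
    "A": lambda a, b: a,
    "B": lambda a, b: b,
    "AND": lambda a, b: a & b,
    "OR": lambda a, b: a | b,
}
# Wild type, single deletions, double deletion.
GENOTYPES = [frozenset(), frozenset("A"), frozenset("B"), frozenset("AB")]

def simulate(wiring, readout, signal, deleted=frozenset()):
    """Phenotype time series under synchronous Boolean updates."""
    a = b = 0
    series = []
    for s in signal:
        series.append(READOUTS[readout](a, b))
        # A deleted gene is clamped to 0.
        next_a = 0 if "A" in deleted else s
        next_b = 0 if "B" in deleted else (s if wiring == "parallel" else a)
        a, b = next_a, next_b
    series.append(READOUTS[readout](a, b))
    return tuple(series)

def static_signature(wiring, readout):
    """Classical epistasis: final phenotype per genotype, constant signal."""
    return tuple(simulate(wiring, readout, [1] * 5, g)[-1] for g in GENOTYPES)

def dynamic_signature(wiring, readout, genotypes):
    """Phenotype time series per genotype in response to a single pulse."""
    pulse = [1, 0, 0, 0]
    return tuple(simulate(wiring, readout, pulse, g) for g in genotypes)

def count_classes(signature):
    """Number of groups of models that the given observations can separate."""
    return len({signature(w, r) for w, r in product(WIRINGS, READOUTS)})

n_static = count_classes(static_signature)
n_dynamic_only = count_classes(lambda w, r: dynamic_signature(w, r, GENOTYPES[:1]))
n_combined = count_classes(lambda w, r: dynamic_signature(w, r, GENOTYPES))
print(n_static, n_dynamic_only, n_combined)  # prints "4 4 7"
```

In this toy family, deletions alone and the dynamic input alone each separate the eight models into four groups, while combining them separates seven. The two models that stay indistinguishable are the ones whose phenotype reads out A only, so B never influences what is observed. The numbers are specific to this hand-picked family and pulse and say nothing quantitative about the thesis's own results.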
7

Comparisons of statistical modeling for constructing gene regulatory networks

Chen, Xiaohui 11 1900
Genetic regulatory networks are of great importance in terms of scientific interest and practical medical importance. Since a number of high-throughput measurement devices are available, such as microarrays and sequencing techniques, regulatory networks have been intensively studied over the last decade. Based on these high-throughput data sets, statistical interpretations of these billions of bits are crucial for biologists to extract meaningful results. In this thesis, we compare a variety of existing regression models and apply them to construct regulatory networks which span transcription factors and microRNAs. We also propose an extended algorithm to address the local optimum issue in finding the maximum a posteriori estimator. An E. coli mRNA expression microarray data set with known bona fide interactions is used to evaluate our models and we show that our regression networks with a properly chosen prior can perform comparably to the state-of-the-art regulatory network construction algorithm. Finally, we apply our models to a p53-related data set, NCI-60 data. By further incorporating available prior structural information from sequencing data, we identify several significantly enriched interactions with cell proliferation function. In both data sets, we select specific examples to show that many regulatory interactions can be confirmed by previous studies or functional enrichment analysis. Through comparing statistical models, we conclude from the project that combining different models with over-representation analysis and prior structural information can improve the quality of prediction and facilitate biological interpretation. Keywords: regulatory network, variable selection, penalized maximum likelihood estimation, optimization, functional enrichment analysis.
9

Transcriptional Network Analysis During Early Differentiation Reveals a Role for Polycomb-like 2 in Mouse Embryonic Stem Cell Commitment

Walker, Emily 11 January 2012
We used mouse embryonic stem cells (ESCs) as a model to study the mechanisms that regulate stem cell fate. Using gene expression analysis during a time course of differentiation, we identified 281 candidate regulators of ESC fate. To integrate these candidate regulators into the known ESC transcriptional network, we incorporated promoter occupancy data for OCT4, NANOG and SOX2. We used shRNA knockdown studies followed by a high-content fluorescence imaging assay to test the requirement of our predicted regulators in maintaining self-renewal. We further integrated promoter occupancy data for Polycomb group (PcG) proteins, EED and PHC1 to identify 43 transcriptional networks in which we predict that OCT4 and NANOG co-operate with EED and PHC1 to influence the expression of multiple developmental regulators. Next, we turned our focus to the PcG protein PCL2 which we identified as being bound by both OCT4 and NANOG and down-regulated during differentiation. PcG proteins are conserved epigenetic transcriptional repressors that control numerous developmental gene expression programs. Using multiple biochemical strategies, we demonstrated that PCL2 associates with Polycomb Repressive Complex 2 (PRC2) in mouse ESCs, a complex that exerts its effect on gene expression through H3K27me3. Although PCL2 was not required for global histone methylation, it was required at specific target regions to maintain proper levels of H3K27me3. Knockdown of Pcl2 in ESCs resulted in heightened self-renewal characteristics and defects in differentiation. Integration of global gene expression and promoter occupancy analyses allowed us to identify PCL2 and PRC2 transcriptional targets and draft regulatory networks. We describe the role of PCL2 in both modulating transcription of ESC self-renewal genes in undifferentiated ESCs as well as developmental regulators during early commitment and differentiation.