71 |
Inférence de réseaux causaux à partir de données interventionnelles / Causal network inference from intervention data
Monneret, Gilles (15 February 2018)
This thesis uses current transcriptomic data to infer gene regulatory networks. Such data are often complex and, in particular, may include intervention data. Causality theory makes it possible to exploit these interventions to obtain acyclic causal networks. I question the notion of acyclicity and then, building on this theory, propose several algorithms and improvements to current techniques that make use of this particular type of data.
|
72 |
Integration of TP53, DREAM, MMB-FOXM1 and RB-E2F target gene analyses identifies cell cycle gene regulatory networks
Fischer, Martin; Grossmann, Patrick; Padi, Megha; DeCaprio, James A. (January 2016)
Cell cycle (CC) and TP53 regulatory networks are frequently deregulated in cancer. While numerous genome-wide studies of TP53 and CC-regulated genes have been performed, significant variation between studies has made it difficult to assess regulation of any given gene of interest. To overcome the limitation of individual studies, we developed a meta-analysis approach to identify high-confidence target genes that reflect their frequency of identification in independent datasets. Gene regulatory networks were generated by comparing differential expression of TP53 and CC-regulated genes with chromatin immunoprecipitation studies for TP53, RB1, E2F, DREAM, B-MYB, FOXM1 and MuvB. RNA-seq data from p21-null cells revealed that gene downregulation by TP53 generally requires p21 (CDKN1A). Genes downregulated by TP53 were also identified as CC genes bound by the DREAM complex. The transcription factors RB, E2F1 and E2F7 bind to a subset of DREAM target genes that function in G1/S of the CC, while B-MYB, FOXM1 and MuvB control G2/M gene expression. Our approach yields high-confidence ranked target gene maps for TP53, DREAM, MMB-FOXM1 and RB-E2F and enables prediction and distinction of CC regulation. A web-based atlas at www.targetgenereg.org enables assessing the regulation of any human gene of interest.
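The idea of ranking target genes by how often independent datasets identify them can be illustrated with a minimal sketch. This is not the authors' pipeline; the function name `rank_targets` and the placeholder gene labels are assumptions made purely for illustration.

```python
from collections import Counter

def rank_targets(datasets):
    """Rank genes by the number of independent datasets that report them."""
    counts = Counter()
    for genes in datasets:
        counts.update(set(genes))  # each dataset votes at most once per gene
    # Sort by descending count, then alphabetically so ties have a stable order
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))

# Hypothetical toy input: three made-up datasets with placeholder gene labels
toy_datasets = [
    ["GENE_A", "GENE_B", "GENE_C"],
    ["GENE_A", "GENE_B"],
    ["GENE_A", "GENE_D"],
]
ranking = rank_targets(toy_datasets)
```

A gene reported by every dataset ends up at the top of `ranking`, which is the sense in which a frequency-based score expresses confidence.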
|
73 |
Revealing the Structure and Evolution of a Fruit Fly Gene Regulatory Network by Varied Genetic Approaches
Hughes, Jesse T. (January 2021)
No description available.
|
74 |
MAMMALIAN TESTIS-DETERMINING FACTOR SRY HAS EVOLVED TO THE EDGE OF AMBIGUITY
Chen, Yen-Shan (23 August 2013)
No description available.
|
75 |
Estimating Gene Regulatory Activity using Mathematical Optimization
Trescher, Saskia (28 September 2020)
Gene regulation is one of the most important cellular processes and is closely linked to pathogenesis. Regulatory mechanisms can be studied with many experimental methods, yet integrating the resulting heterogeneous, large, and noisy data sets into comprehensive models requires rigorous computational methods. A prominent class of methods models genome-wide gene expression as sets of (linear) equations over the activity and relationships of transcription factors (TFs), genes and other factors, and optimizes parameters to fit the measured expression intensities. Despite their common root in mathematical optimization, these methods differ widely in the types of experimental data they integrate, the background knowledge their application requires, the granularity of their regulatory model, the concrete paradigm used to solve the optimization problem, and the data sets used for evaluation.
We review five recent methods of this class and compare them qualitatively and quantitatively in a unified framework. The overlap between their results is very low, though sometimes statistically significant. This poor overall performance cannot be attributed to the sample size or to the specific regulatory network provided as background knowledge. We suggest that one reason for this deficiency might be the simplistic model of cellular processes in the presented methods, in which TF self-regulation and feedback loops are not represented. We propose a new method for estimating transcriptional activity, named Floræ, that focuses on the consideration of feedback loops, and we evaluate its results. Using Floræ, we are able to improve the identification of knockout and knockdown TFs in synthetic data sets. Our results and the proposed method extend the knowledge about gene regulatory activity and are a step towards identifying the causes and mechanisms of regulatory (dys)functions, supporting the development of medical biomarkers and therapies.
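The class of methods described above, which fits TF activities through linear equations, can be sketched as an ordinary least-squares problem. This is a simplified illustration, not Floræ or any of the five reviewed methods: it assumes a known gene-by-TF weight matrix (`network`), ignores feedback loops, and uses made-up numbers.

```python
import numpy as np

def estimate_tf_activity(network, expression):
    """Solve network @ activity ~= expression for activity in the least-squares sense.

    network:    (genes x TFs) regulatory weights, assumed known beforehand
    expression: (genes x samples) measured expression values
    returns:    (TFs x samples) estimated TF activities
    """
    activity, *_ = np.linalg.lstsq(network, expression, rcond=None)
    return activity

# Hypothetical toy data: 4 genes, 2 TFs, 3 samples
network = np.array([[1.0, 0.0],
                    [0.5, -1.0],
                    [0.0, 2.0],
                    [1.0, 1.0]])
true_activity = np.array([[1.0, 0.2, -0.5],
                          [0.3, 1.5, 0.0]])
expression = network @ true_activity  # noise-free synthetic measurements

estimated = estimate_tf_activity(network, expression)
```

With noise-free data and a full-rank network matrix the activities are recovered exactly; real data would add noise, and methods in this class typically add constraints or regularization on top of such a fit.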
|
76 |
Integrative Modeling and Analysis of High-throughput Biological Data
Chen, Li (21 January 2011)
Computational biology is an interdisciplinary field that focuses on developing mathematical models and algorithms to interpret biological data so as to understand biological problems. With the current development of high-throughput technology, different types of biological data can be measured on a large scale, which calls for more sophisticated computational methods to analyze and interpret the data. In this dissertation, we propose novel methods to integrate, model and analyze multiple types of biological data, including microarray gene expression data, protein-DNA interaction data and protein-protein interaction data. These methods will help improve our understanding of biological systems.
First, we propose a knowledge-guided multi-scale independent component analysis (ICA) method for biomarker identification on time course microarray data. Guided by a knowledge gene pool related to a specific disease under study, the method can determine disease relevant biological components from ICA modes and then identify biologically meaningful markers related to the specific disease. We have applied the proposed method to yeast cell cycle microarray data and Rsf-1-induced ovarian cancer microarray data. The results show that our knowledge-guided ICA approach can extract biologically meaningful regulatory modes and outperform several baseline methods for biomarker identification.
Second, we propose a novel method for transcriptional regulatory network identification by integrating gene expression data and protein-DNA binding data. The approach is built upon a multi-level analysis strategy designed for suppressing false positive predictions. With this strategy, a regulatory module becomes increasingly significant as more relevant gene sets are formed at finer levels. At each level, a two-stage support vector regression (SVR) method is utilized to reduce false positive predictions by integrating binding motif information and gene expression data; a significance analysis procedure is followed to assess the significance of each regulatory module. The resulting performance on simulation data and yeast cell cycle data shows that the multi-level SVR approach outperforms other existing methods in the identification of both regulators and their target genes. We have further applied the proposed method to breast cancer cell line data to identify condition-specific regulatory modules associated with estrogen treatment. Experimental results show that our method can identify biologically meaningful regulatory modules related to estrogen signaling and action in breast cancer.
Third, we propose a bootstrapping Markov Random Field (MRF)-based method for subnetwork identification on microarray data by incorporating protein-protein interaction data. Methodologically, an MRF-based network score is first derived by considering the dependency among genes to increase the chance of selecting hub genes. A modified simulated annealing search algorithm is then utilized to find the optimal/suboptimal subnetworks with maximal network score. A bootstrapping scheme is finally implemented to generate confident subnetworks. Experimentally, we have compared the proposed method with other existing methods, and the resulting performance on simulation data shows that the bootstrapping MRF-based method outperforms other methods in identifying ground-truth subnetworks and hub genes. We have then applied our method to breast cancer data to identify significant subnetworks associated with drug resistance. The identified subnetworks not only show good reproducibility across different data sets, but also indicate several pathways and biological functions potentially associated with the development of breast cancer and drug resistance. In addition, we propose to develop network-constrained support vector machines (SVM) for cancer classification and prediction, by taking into account the network structure to construct classification hyperplanes. The simulation study demonstrates the effectiveness of our proposed method. The study on the real microarray data sets shows that our network-constrained SVM, together with the bootstrapping MRF-based subnetwork identification approach, can achieve better classification performance compared with conventional biomarker selection approaches and SVMs.
We believe that the research presented in this dissertation not only provides novel and effective methods to model and analyze different types of biological data, but also shows, through extensive experiments on several real microarray data sets, the potential to improve the understanding of biological mechanisms related to cancers by generating novel hypotheses for further study. / Ph. D.
|
77 |
A multi-objective GP-PSO hybrid algorithm for gene regulatory network modeling
Cai, Xinye (January 1900)
Doctor of Philosophy / Department of Electrical and Computer Engineering / Sanjoy Das
Stochastic algorithms are widely used in various modeling and optimization problems. Evolutionary algorithms are one class of population-based stochastic approaches that are inspired by Darwinian evolutionary theory. A population of candidate solutions is initialized at the first generation of the algorithm. Two variation operators, crossover and mutation, which mimic the real-world evolutionary process, are applied to the population to produce new solutions from old ones. Selection based on the concept of survival of the fittest is used to preserve parent solutions for the next generation. Examples of such algorithms include the genetic algorithm (GA) and genetic programming (GP). Other stochastic algorithms are inspired by animal behavior, such as particle swarm optimization (PSO), which imitates the cooperation of a flock of birds. In addition, stochastic algorithms are able to address multi-objective optimization problems by using the concept of dominance. Accordingly, a set of solutions that do not dominate each other is obtained, instead of just one best solution.
This thesis proposes a multi-objective GP-PSO hybrid algorithm to recover gene regulatory network models that take environmental data as stimulus input. The algorithm infers a model based on both phenotypic and gene expression data. The proposed approach is able to simultaneously infer network structures and estimate their associated parameters, instead of doing one or the other iteratively as other algorithms need to. In addition, a non-dominated sorting approach and an adaptive histogram method based on the hypergrid strategy are adopted to address ‘convergence’ and ‘diversity’ issues in multi-objective optimization.
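The concept of dominance that underlies non-dominated sorting can be sketched in a few lines. This is a generic illustration under the assumption that all objectives are minimized; it is not the thesis's GP-PSO implementation, and the function names and candidate values are hypothetical.

```python
def dominates(a, b):
    """True if a is no worse than b in every objective and strictly better in at least one."""
    no_worse = all(x <= y for x, y in zip(a, b))
    strictly_better = any(x < y for x, y in zip(a, b))
    return no_worse and strictly_better

def non_dominated(points):
    """Return the points that no other point dominates (the first Pareto front)."""
    return [p for p in points if not any(dominates(q, p) for q in points)]

# Hypothetical objective vectors (e.g. model error, model complexity), both minimized
candidates = [(1.0, 5.0), (2.0, 3.0), (3.0, 4.0), (4.0, 1.0)]
front = non_dominated(candidates)
```

Here `(3.0, 4.0)` is dropped because `(2.0, 3.0)` is better in both objectives, while the remaining three points trade one objective against the other and therefore form the set of mutually non-dominated solutions.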
Gene network models obtained from the proposed algorithm are compared, visually and numerically, to a synthetic network that mimics key features of the Arabidopsis flowering control system. Data predicted by the models are compared to synthetic data to verify that they closely approximate the available phenotypic and gene expression data. At the end of this thesis, a novel breeding strategy, termed network assisted selection (NAS), is proposed as an extension of our hybrid approach and as an application of the obtained models to plant breeding. Breeding simulations based on network assisted selection are compared to one common breeding strategy, marker assisted selection. The results show that NAS is better in terms of both breeding speed and final phenotypic level.
|
78 |
Architektura regulační sítě metabolismu / The architecture of regulatory network of metabolism
Geryk, Jan (January 2013)
This thesis focuses on the modularity of the metabolic network and, foremost, on the architecture of the regulatory network formed by direct regulatory interactions between metabolites and enzymes. My first work concerns the modularity measure, a quantitative measure of network modularity commonly used for module identification. It has been shown that algorithms using this measure can produce modules that are composed of two clearly pronounced sub-modules. The maximum module size for which there is a risk that the module is composed of two sub-modules is called the resolution limit of the modularity measure. In my first work I generalize this resolution limit. The generalized version provides insight into the origin of the resolution limit in the null model used by the modularity measure. Moreover, it shows that the risk of overlooking sub-modular structures applies to bigger modules than stated in the original publication. The second work addresses the question of how the modular structure of the E. coli metabolic network changes when regulatory interactions are added. I find that the modularity of the modular core of the network slightly increases after the addition of regulatory edges. This increase is significant with respect to a randomized ensemble of regulatory networks. Identified modules...
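For readers unfamiliar with the modularity measure, the sketch below computes the standard Newman-Girvan modularity of a partition of an undirected, unweighted graph. It assumes that this standard definition is the one meant in the abstract; the toy graph and function name are hypothetical, and the thesis's generalized resolution limit is not reproduced here.

```python
def modularity(edges, communities):
    """Newman-Girvan modularity Q of a partition of an undirected, unweighted graph.

    Q = sum over communities of (l_c / m) - (d_c / (2 * m)) ** 2, where m is the
    number of edges, l_c the number of edges inside community c, and d_c the
    total degree of its nodes. The second term is the null-model expectation.
    """
    m = len(edges)
    label = {node: i for i, nodes in enumerate(communities) for node in nodes}
    internal = [0] * len(communities)
    degree = [0] * len(communities)
    for u, v in edges:
        degree[label[u]] += 1
        degree[label[v]] += 1
        if label[u] == label[v]:
            internal[label[u]] += 1
    return sum(l / m - (d / (2 * m)) ** 2 for l, d in zip(internal, degree))

# Hypothetical toy graph: two triangles joined by a single bridging edge
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
q_split = modularity(edges, [{0, 1, 2}, {3, 4, 5}])
q_merged = modularity(edges, [{0, 1, 2, 3, 4, 5}])
```

In this small graph the split into two triangles scores higher than the merged partition, as expected. The resolution limit concerns larger networks, where the null-model term can make merging two genuine modules the higher-scoring choice.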
|
79 |
Trouver les gènes manquants dans des réseaux géniques / Finding missing genes in genetic regulatory networks
Wang, Woei-Fuh (13 December 2011)
With the development of high-throughput technologies, the investigation of the topology and functioning of genetic regulatory networks has become an important research topic in recent years. Most studies concentrate on reconstructing the local architecture of genetic regulatory networks and on determining the corresponding interaction parameters. The preferred data sources are time-series expression data. However, one or more important members of the regulatory network will inevitably remain unknown. The absence of important members of the genetic circuit leads to incorrectly inferred network topologies and control mechanisms. In this thesis we propose a method to infer the existence, expression pattern and connections of these "missing genes". To make the problem tractable, we make further simplifying assumptions. We assume that the interactions within the network are described by Hill functions, and we then approximate these functions by power-law functions. We show that this simplification still captures the dynamic regulatory behavior of the network. By taking logarithms, the genetic control system can be converted into a linear model; in other words, genetic regulatory networks can be analyzed with linear approaches. In logarithmic space, we propose a procedure for extracting the expression profile of a missing gene within an otherwise defined genetic regulatory network. The algorithm also determines the regulatory connections of this missing gene to the rest of the regulatory network. The inference algorithm is based on Factor Analysis (FA), a well-developed multivariate statistical approach used to investigate unknown, underlying features of an ensemble of data, in our case the promoter activities and intracellular concentrations of the known genes. We also explore a second blind source separation method, Independent Component Analysis (ICA), which is also commonly used to estimate hidden signals. Once the expression profile of the missing gene has been derived, we investigate possible connections of this gene to the remaining network. Because an exhaustive search is too costly for large regulatory networks, we propose a search-space reduction algorithm that lowers the number of computations required. The method is applied to artificial genetic regulatory networks as well as to a real biological network studied in the laboratory: the acs regulatory network of Escherichia coli.
In these applications we confirm that power-law functions are a good approximation of Hill functions. The algorithm is robust to experimental noise. Factor Analysis predicts the expression profiles of missing genes with an accuracy of 80% in small artificial genetic regulatory networks and 60% in large ones. Independent Component Analysis is less powerful than Factor Analysis in extracting the expression profiles of missing components in both small and large artificial networks. Both methods suggest that a single missing gene is sufficient to explain the observed expression profiles of Acs, Fis and Crp, and they find very similar expression profiles for it. Moreover, this profile is the same in three experimental contexts: the wild-type strain, the Δcya strain in which adenylate cyclase has been deleted, and the Δcya strain supplemented with cAMP in the growth medium. Because the profile does not change across the three strains, we conclude that the missing gene is independent of cAMP. The two methods predict different structures for the complete network: Factor Analysis suggests that fis is regulated by the missing gene, while Independent Component Analysis suggests that crp is controlled by the missing gene.
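The step from Hill functions to a linear model can be illustrated numerically. The abstract does not state the thesis's exact approximation scheme, so this sketch assumes the low-concentration regime (x much smaller than K), where an activating Hill function x^n / (K^n + x^n) behaves like the power law (x / K)^n; the parameter values are arbitrary.

```python
import numpy as np

def hill(x, K, n):
    """Activating Hill function with half-saturation constant K and Hill coefficient n."""
    return x**n / (K**n + x**n)

K, n = 10.0, 2.0               # hypothetical parameters
x = np.logspace(-2, -1, 50)    # concentrations well below K

# In log space a power law is a straight line: log h = n * log x - n * log K
slope, intercept = np.polyfit(np.log(x), np.log(hill(x, K, n)), 1)
```

The fitted `slope` is close to the Hill coefficient `n` and the `intercept` close to `-n * log(K)`, which is why taking logarithms turns the power-law model into a linear system that factor analysis or ICA can work on. Near saturation (x comparable to or larger than K) the local slope shrinks, so a single power law is only a regional approximation.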
|
80 |
Propriétés du réseau de gènes contrôlant l'organisation du primordium de racine latérale chez Arabidopsis thaliana / Gene regulatory network for lateral root formation in Arabidopsis thaliana
Trinh, Duy Chi (22 March 2019)
Post-embryonic lateral root organogenesis plays an essential role in defining plant root system architecture, and therefore plant growth and fitness. The aim of this thesis is to elucidate the gene regulatory network that controls lateral root development and de novo root meristem formation during root branching in the model plant Arabidopsis thaliana, by combining a systems-biology analysis of lateral root primordium transcriptome dynamics with the functional characterization of genes possibly involved in regulating lateral root organogenesis.
The first part of the thesis deals with the identification of the target genes of PUCHI, an AP2/EREBP transcription factor involved in controlling cell proliferation and differentiation during lateral root formation. We showed that loss of PUCHI function leads to defects in lateral root initiation and in primordium growth and organization. We found that several genes coding for proteins of the very long chain fatty acid (VLCFA) biosynthesis machinery are transiently induced in a PUCHI-dependent manner during lateral root development. Moreover, a mutant perturbed in VLCFA biosynthesis (kcs1-5) displays lateral root development defects similar to those of puchi-1. In addition, roots of the puchi-1 loss-of-function mutant show enhanced and continuous callus formation on auxin-rich callus induction medium, consistent with the recently reported role of VLCFAs in organizing separated callus proliferation on this inductive medium. Thus, our results show that PUCHI positively regulates the expression of VLCFA biosynthesis genes during lateral root development, and they further support the hypothesis that lateral root and callus formation share common genetic regulatory mechanisms.
The second part of the thesis addresses the identification of key regulators of root meristem organization in the developing lateral root primordium. Material enabling the tracking of meristem cell identity establishment in developing primordia by live confocal microscopy was generated. A gene network inference algorithm was run to predict potential regulatory relationships between genes of interest over the time course of lateral root development. It identified potential regulators of quiescent center formation, a key step in the functional organization of the lateral root primordium into a new root apical meristem. The functional characterization of some of these candidate genes was initiated.
Altogether, this work contributes to deciphering the genetic regulation of lateral root formation in Arabidopsis thaliana.
|