1 |
High quality genome-scale metabolic network reconstruction of Mycobacterium tuberculosis and comparison with human metabolic network : application for drug targets identification
Kalapanulak, Saowalak, January 2009
Mycobacterium tuberculosis (Mtb), a pathogenic bacterium, is the causative agent in the vast majority of human tuberculosis (TB) cases. Nearly one-third of the world's population has been affected by TB, and two million deaths result from the disease annually. Because long-term multidrug treatment is costly and multidrug-resistant Mtb strains are on the rise, faster-acting drugs and more effective vaccines are urgently needed. Several metabolic pathways of Mtb are attractive sources of novel drug targets against TB. Hence, a high quality genome-scale metabolic network of Mtb (HQMtb) was reconstructed to investigate its whole metabolism and to search for new drug targets. The HQMtb metabolic network was constructed using an unbiased approach, by extracting gene annotation information from various databases and consolidating the data with information from the literature. The HQMtb consists of 686 genes, 607 intracellular reactions, 734 metabolites and 471 E.C. numbers, 27 of which are incomplete. The HQMtb was compared with two recently published Mtb metabolic models, GSMN-TB by Beste et al. and iNJ661 by Jamshidi and Palsson. Because different reconstruction methods were used, the three models have different characteristics. A total of 68 new genes and 80 new E.C. numbers were found only in the HQMtb, yielding approximately 52 new metabolic reactions located in various metabolic pathways, for example steroid biosynthesis, fatty acid metabolism, and the TCA cycle. Through a comparison of HQMtb with a previously published human metabolic network (EHMN) in terms of protein signatures, 42 Mtb metabolic genes were proposed as new drug targets based on two criteria: (a) their protein functional sites do not match any human protein functional sites; (b) they are essential genes. Interestingly, 13 of them are found in a list of currently validated drug targets. Among all proposed drug targets, Rv0189c, Rv3001c and Rv3607c are of particular interest for laboratory testing because they were also proposed as drug target candidates by two research groups using different methods.
|
2 |
Graph-based modeling and evolutionary analysis of microbial metabolism
Zhou, Wanding, 16 September 2013
Microbial organisms are responsible for most of the metabolic innovations on Earth. Understanding microbial metabolism helps shed light on questions that are central to biology, biomedicine, energy and the environment. Graph-based modeling is a powerful tool that has been used extensively for elucidating the organising principles of microbial metabolism and the underlying evolutionary forces that act upon it. Nevertheless, a variety of graph-theoretic representations and techniques have been applied to metabolic networks, which makes the modeling ad hoc and has produced conflicting conclusions depending on the representation used.
The contribution of this dissertation is two-fold. In the first half, I revisit the modeling aspect of metabolic networks and present novel techniques for their representation and analysis. In particular, I explore the limitations of standard graph representations and the utility of a more appropriate model, the hypergraph, for capturing metabolic network properties. Further, I address the task of metabolic pathway inference and the need to account for chemical symmetries and alternative tracings in this crucial task.
In the second part of the dissertation, I focus on two evolutionary questions. First, I investigate the evolutionary underpinnings of the formation of communities in metabolic networks, a phenomenon that has been reported in the literature and implicated in an organism's adaptation to its environment. I find that the metabolome size better explains the observed community structures. Second, I correlate evolution at the genome level with emergent properties at the metabolic network level. In particular, I quantify the various evolutionary events (e.g., gene duplication, loss, transfer, fusion, and fission) in a group of proteobacteria, and analyze their role in shaping the metabolic networks and determining organismal fitness.
As metabolism gains an increasingly prominent role in biomedical, energy, and environmental research, understanding how to model this process and how it came about during evolution becomes more crucial. My dissertation provides important insights in both directions.
|
3 |
Pairwise rational kernels applied to metabolic network predictions
Roche Lima, Abiel, 06 April 2015
Metabolic networks are represented by the set of metabolic pathways. Metabolic pathways are series of chemical reactions in which the product of one reaction serves as the input to another. Many pathways remain incompletely characterized, and in some of them not all enzyme components have been identified. One of the major challenges of computational biology is to obtain better models of metabolic pathways. Existing models depend on the annotation of the genes, so errors accumulate when pathways are predicted from incorrectly annotated genes.
Pairwise kernel frameworks have been used in supervised learning approaches, e.g., Pairwise Support Vector Machines (SVMs), to predict relationships between pairs of entities. Rational kernels are based on transducers that manipulate sequence data, computing similarity measures between sequences or automata. Rational kernels take advantage of the compact representation and fast algorithms of weighted finite-state transducers. They have been used effectively in problems that handle large amounts of sequence information, such as protein essentiality, natural language processing and machine translation.
We propose a new framework, Pairwise Rational Kernels (PRKs), to manipulate pairs of sequence data, as pairwise combinations of rational kernels. We develop experiments using SVM with PRKs applied to metabolic pathway predictions in order to validate our methods. As a result, we obtain faster execution times with PRKs than other kernels, while maintaining accurate predictions. Because raw sequence data can be used, the predictor model avoids the errors introduced by incorrect gene annotations.
We also obtain a new type of Pairwise Rational Kernel based on automaton and transducer operations. In this case, we define new operations over two pairs of automata to obtain new rational kernels. We also develop experiments to validate these new PRKs for predicting metabolic networks. As a result, we obtain the best execution times when we compare them with other kernels and the previous PRKs.
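The idea of building a kernel on pairs of sequences out of a kernel on single sequences can be sketched as follows. This is only an illustration under stated assumptions: an n-gram count kernel stands in for a transducer-based rational kernel, the symmetric tensor-product combination is one common pairwise construction and is not necessarily the one used in the thesis, and all function names are hypothetical.

```python
from collections import Counter


def ngram_counts(seq, n=3):
    """Count the overlapping n-grams of a sequence."""
    return Counter(seq[i:i + n] for i in range(len(seq) - n + 1))


def ngram_kernel(x, y, n=3):
    """Base kernel (assumed stand-in): dot product of n-gram counts."""
    cx, cy = ngram_counts(x, n), ngram_counts(y, n)
    return sum(count * cy[gram] for gram, count in cx.items())


def pairwise_kernel(pair1, pair2, n=3):
    """Symmetric tensor-product kernel between two pairs of sequences.

    K((a, b), (c, d)) = k(a, c) * k(b, d) + k(a, d) * k(b, c),
    so the order of the two sequences inside a pair does not matter.
    """
    (a, b), (c, d) = pair1, pair2
    return (ngram_kernel(a, c, n) * ngram_kernel(b, d, n)
            + ngram_kernel(a, d, n) * ngram_kernel(b, c, n))
```

A Gram matrix built from `pairwise_kernel` over all training pairs could then be passed to any SVM implementation that accepts precomputed kernels.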
|
4 |
Structure de réseaux biologiques : rôle des nœuds internes vis à vis de la production de composés / Structure of biological networks : role of internal nodes in the production of compounds
Laniau, Julie, 23 October 2017
In this thesis we study metabolic networks and, in particular, their modelling as a weighted directed bipartite graph. This representation makes it possible to study the production of target metabolic elements, grouped into a biomass, from components of the organism's growth medium. We focus on the role of metabolites internal to the network and on the notion of their essentiality for biomass production. We refine the definition of essentiality for flux-based studies (metabolites essential for biomass producibility and metabolites essential for biomass efficiency) and extend it to topological studies (metabolites essential for biomass sustainability). We rely on the formalism of Flux Balance Analysis and its derivatives on the one hand, and of network expansion on the other, to define an essential metabolite (ME, or crossroad), which allowed us to develop a Python package (Conquests) that searches for crossroads in a metabolic network. We applied it to six metabolic networks, four from model species (iJO1360, iAF1260 and iJR904 of E. coli, and Synechocystis) and two from more specific species (A. ferrooxidans and T. lutea). We also defined the concept of a cluster of ME-sustainability, tied to the biomass components for which its members are required. We applied it to the six networks above and to 3600 degraded versions of the E. coli iJR904 network, each reconstructed with three gap-filling methods (Gapfill, Fastgapfill and Meneco), in order to compare these methods. These studies highlight the importance of internal metabolites in the production of target compounds.
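The topological side of this abstract can be illustrated with a small network-expansion sketch. It is not the Conquests package and does not reproduce its API: the function names, the toy network and the exact test for essentiality (remove every reaction touching a metabolite, then check whether the targets are still reachable from the seeds) are assumptions made for illustration.

```python
def scope(reactions, seeds):
    """Network expansion: metabolites reachable from the seed set.

    reactions maps a reaction id to (substrates, products), both sets.
    A reaction fires once all of its substrates are available.
    """
    available = set(seeds)
    changed = True
    while changed:
        changed = False
        for substrates, products in reactions.values():
            if substrates <= available and not products <= available:
                available |= products
                changed = True
    return available


def without_metabolite(reactions, metabolite):
    """Drop every reaction that consumes or produces the metabolite."""
    return {
        rid: (subs, prods)
        for rid, (subs, prods) in reactions.items()
        if metabolite not in subs and metabolite not in prods
    }


def topologically_essential(reactions, seeds, targets):
    """Internal metabolites whose removal makes some target unreachable."""
    targets = set(targets)
    internal = scope(reactions, seeds) - set(seeds) - targets
    return {
        m for m in internal
        if not targets <= scope(without_metabolite(reactions, m), seeds)
    }


# Hypothetical toy network: two routes to C, one route from C to biomass.
toy = {
    "r1": ({"glc"}, {"A"}),
    "r2": ({"glc"}, {"B"}),
    "r3": ({"A"}, {"C"}),
    "r4": ({"B"}, {"C"}),
    "r5": ({"C"}, {"biomass"}),
}
essential = topologically_essential(toy, {"glc"}, {"biomass"})
```

In the toy network, A and B are interchangeable routes, so only C comes out as a crossroad. A flux-based variant would replace `scope` with a Flux Balance Analysis call and needs a linear programming solver.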
|
5 |
Système de recommandation basé sur les réseaux pour l'interprétation de résultats de métabolomique / Metabolic network based recommender system for metabolic result interpretation
Frainay, Clément, 26 June 2017
Metabolomics allows large-scale studies of the metabolic profile of an individual, which is representative of its physiological state. Comparing these profiles yields metabolic markers that characterise a given condition. Metabolomics therefore has great potential for diagnosis, for understanding the mechanisms behind metabolic dysregulations and, to a certain extent, for identifying therapeutic targets. However, to raise new hypotheses, these applications need to place metabolomics results in the context of global knowledge of metabolism. This contextualisation can rely on metabolic networks, which gather all the biochemical transformations that an organism can perform. The major bottleneck is that no single metabolomic approach can currently monitor all metabolites, which leads to a partial view of the metabolome. Furthermore, in human health experiments, metabolomics is usually performed on biofluid samples rather than tissues. Consequently, these approaches observe the footprints left by the impacted mechanisms rather than the mechanisms themselves.
This thesis proposes a new approach to overcome these limitations by suggesting relevant metabolites that could fill the gaps in a metabolomics signature. The method is inspired by recommender systems used for online activities, and more specifically by the recommendation of users to follow on social networks. It was applied to the metabolic signature of patients with hepatic encephalopathy. It highlighted relevant metabolites whose link to the disease is supported by the literature, led to a better understanding of the impaired mechanisms and, as a result, to the proposal of new hypothetical scenarios. It also improved and enriched the original signature by guiding deeper investigation of the raw data, which led to the addition of missed compounds. The characterisation of models and data, together with the technical developments presented in this thesis, also provides a generic framework and guidelines for the topological analysis of metabolic networks.
|
6 |
Análise de redes metabólicas em Saccharomyces cerevisiae. / Metabolic network analysis of Saccharomyces cerevisiae.
Gombert, Andreas Karoly, 17 May 2001
Metabolic Network Analysis was applied to the reference strain CEN.PK113-7D of Saccharomyces cerevisiae, as well as to some mutants disrupted in genes that code for regulatory proteins involved in the glucose repression cascade. All strains were cultivated under aerobic conditions, using minimal medium with [1-13C]glucose as the limiting substrate. Cells were harvested under balanced growth conditions and submitted to hydrolysis and derivatization, and the resulting sample was injected into a gas chromatograph coupled to a mass spectrometer for analysis of the labeling pattern in some fragments of intracellular metabolites. These data were used to identify the activity of some pathways in the central metabolism of S. cerevisiae. Furthermore, using the data together with a stoichiometric model, it was possible to estimate the fluxes in the central metabolism of the reference strain and of the mutant strains. First, the methodology was validated for batch and continuous cultivations. Standard deviations were calculated for the measurement of the fractional labeling in each of the detected fragments. In the reference strain, the Krebs cycle was observed to operate in a cyclic manner in respiratory cells and in a non-cyclic manner under respiro-fermentative metabolism. A greater part of the glucose consumed by the cells enters the pentose phosphate pathway in the former case than in the latter. Evidence was also found for glycine biosynthesis via threonine aldolase and for malic enzyme activity. The absence of the Mig1 and Mig2 proteins does not alter the growth, ethanol formation and labeling pattern of intracellular metabolites in S. cerevisiae. In contrast, the absence of Hxk2, Reg1, or Grr1 relieves glucose repression, as shown by increased respiratory activity.
|
7 |
Pathway Pioneer: Heterogenous Server Architecture for Scientific Visualization and Pathway Search in Metabolic Network Using Informed Search
Oswal, Vipul Kantilal, 01 August 2014
There is a huge demand for analysis and visualization of biological models. PathwayPioneer is a web-based tool to analyze and visually represent complex biological models. PathwayPioneer generates the initial layout of the model and allows users to customize it. It is developed using .NET technologies (C#) and hosted on an Internet Information Services (IIS) server. On the back end, it interacts with the Python-based COBRApy library for biological calculations such as Flux Balance Analysis (FBA). We have developed a parallel processing architecture to accommodate the processing of large models and to enable message-based communication between the .NET web server and the Python engine. We evaluated the performance of our online system by loading the website with multiple concurrent dummy users performing different time-intensive operations in parallel.
Given two metabolites of interest, millions of pathways can be found between them even in a small metabolic network. A depth-first or breadth-first search retrieves all possible pathways and therefore requires huge computational time and resources. In Pathway Search using Informed Method, we have implemented, compared, and analyzed three different informed search techniques (Selected Subsystem, Selected Compartment, and Dynamic Search) and the traditional exhaustive search technique. We found that the Dynamic approach performs exceedingly well with respect to time and the total number of pathways searched. During our implementation we developed an SBML parser which outperforms the commercial libSBML parser in C#.
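The contrast between exhaustive and restricted pathway search can be sketched as below. This is not PathwayPioneer's implementation: the function name, the `allowed` parameter (a stand-in for a selected compartment or subsystem) and the metabolite identifiers are hypothetical, and the Dynamic Search technique is not reproduced.

```python
def simple_paths(graph, source, target, allowed=None):
    """Yield simple paths from source to target by depth-first search.

    graph maps a metabolite to the metabolites it can be converted into.
    allowed, if given, is the set of metabolites that may appear as
    intermediates; everything outside it is pruned before expansion.
    """
    stack = [(source, [source])]
    while stack:
        node, path = stack.pop()
        if node == target:
            yield path
            continue
        for nxt in graph.get(node, ()):
            if nxt in path:
                continue  # keep paths simple: no metabolite visited twice
            if allowed is not None and nxt != target and nxt not in allowed:
                continue  # prune intermediates outside the selection
            stack.append((nxt, path + [nxt]))


# Hypothetical toy network with a cytosolic (_c) and a mitochondrial (_m) route.
toy_graph = {
    "glc_c": ["g6p_c", "glc_m"],
    "g6p_c": ["pyr_c"],
    "glc_m": ["pyr_m"],
    "pyr_m": ["pyr_c"],
}
every_path = list(simple_paths(toy_graph, "glc_c", "pyr_c"))
cytosol = {m for m in ["glc_c", "g6p_c", "pyr_c"]}
cytosol_only = list(simple_paths(toy_graph, "glc_c", "pyr_c", allowed=cytosol))
```

Pruning happens before a node is pushed on the stack, so the restricted search never explores the excluded part of the network, which is where the savings in time and memory come from.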
|
8 |
Metabolic design of dynamic bioreaction models
Provost, Agnès, 06 November 2006
This thesis is concerned with the derivation of bioprocess models intended for engineering purposes. In contrast with other techniques, the methodology used to derive a macroscopic model is based on available intracellular information. This information is extracted from the metabolic network describing the intracellular metabolism. The aspects of metabolic regulation are modeled by representing the metabolism of cultured cells with several metabolic networks.
Here we present a systematic methodology for deriving macroscopic models when such metabolic networks are known. A separate model is derived for each "phase" of the culture. Each of these models relies upon a set of macroscopic bioreactions that summarizes the information contained in the corresponding metabolic network.
Such a set of macroscopic bioreactions is obtained by translating the set of Elementary Flux Modes, which are well-known tools in the Systems Biology community. Elementary Flux Modes are described in the theory of Convex Analysis. They represent pathways across metabolic networks. Once the set of Elementary Flux Modes is computed and translated into macroscopic bioreactions, a general model can be obtained for the type of culture under investigation. However, depending on the size and complexity of the metabolic network, such a model could contain hundreds, or even thousands, of bioreactions. Since the kinetics of each such bioreaction involve at least one parameter that must be identified, reducing the general model to a more manageable size is desirable.
Convex Analysis provides further results that allow for the selection of a macroscopic bioreaction subset. This selection is based on the data collected from the available experiments. The selected bioreactions then allow for the construction of a model for the experiments at hand.
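The translation of a flux mode into a macroscopic bioreaction can be sketched as follows. The sketch assumes the usual split of the stoichiometric matrix into internal and external metabolites: a mode balances every internal metabolite, and its net effect on the external metabolites is read as one overall reaction. The function names and the toy network are hypothetical, and the code neither enumerates Elementary Flux Modes nor performs the data-based selection of a subset.

```python
def mat_vec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(a * v for a, v in zip(row, vector)) for row in matrix]


def macroscopic_reaction(n_int, n_ext, ext_names, mode, tol=1e-9):
    """Translate one flux mode into (substrates, products) dictionaries.

    n_int: stoichiometry of internal metabolites (rows) per reaction.
    n_ext: stoichiometry of external metabolites (rows) per reaction.
    mode:  non-negative flux vector with n_int @ mode == 0.
    """
    if any(v < -tol for v in mode):
        raise ValueError("fluxes of irreversible reactions must be non-negative")
    if any(abs(x) > tol for x in mat_vec(n_int, mode)):
        raise ValueError("mode does not balance the internal metabolites")
    net = mat_vec(n_ext, mode)
    substrates = {m: -c for m, c in zip(ext_names, net) if c < -tol}
    products = {m: c for m, c in zip(ext_names, net) if c > tol}
    return substrates, products


# Hypothetical toy network:
#   v1: Glc -> G6P,  v2: G6P -> 2 Pyr,  v3: Pyr -> Lac
# Internal metabolites: G6P, Pyr. External metabolites: Glc, Lac.
n_int = [[1, -1, 0],
         [0, 2, -1]]
n_ext = [[-1, 0, 0],
         [0, 0, 1]]
substrates, products = macroscopic_reaction(n_int, n_ext, ["Glc", "Lac"], [1, 1, 2])
```

For the toy mode, the intracellular steps cancel and the result reads as the single macroscopic bioreaction Glc -> 2 Lac.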
|
9 |
Protein-protein interactions and metabolic pathways reconstruction of Caenorhabditis elegans
Akhavan Mahdavi, Mahmood, 08 June 2007
Metabolic networks are the collections of all cellular activities taking place in a living cell and all the relationships among biological elements of the cell, including genes, proteins, enzymes, metabolites, and reactions. They provide a better understanding of the cellular mechanisms and phenotypic characteristics of the studied organism. To reconstruct a metabolic network, the interactions among genes and their molecular attributes, along with their functions, must be known. Using this information, proteins are distributed among pathways, which are sub-networks of a greater metabolic network. Proteins that carry out various steps of a biological process operate in the same pathway.
The metabolic network of Caenorhabditis elegans was reconstructed based on current genomic information obtained from the KEGG database and commonly found in SWISS-PROT and WormBase. Assuming that proteins operating in a pathway are interacting proteins, a map of the currently available protein-protein interactions of the studied organism was assembled. This map contains all known protein-protein interactions collected from various sources to date. The topology of the reconstructed network was briefly studied, and the role of key enzymes in the interconnectivity of the network was analysed. The analysis showed that the shortest metabolic paths represent the most probable routes taken by the organism when endogenous sources of nutrients are available to it. Nonetheless, there are alternate paths that allow the organism to survive under extraneous variations.
Signature content information of proteins was utilized to reveal protein interactions, on the notion that when two proteins share signature(s) in their primary structures, they are more likely to interact. The signature content of proteins was used to measure the extent of similarity between pairs of proteins based on a binary similarity score. Pairs of proteins with a binary similarity score greater than a threshold corresponding to a 95% confidence level were predicted to be interacting proteins. The reliability of the predicted pairs was statistically analyzed. The sensitivity and specificity analysis showed that the proposed approach outperformed the maximum likelihood estimation (MLE) approach, with a 22% increase in the area under the receiver operating characteristic (ROC) curve when both were applied to the same datasets. When proteins containing one and two known signatures were removed from the protein dataset, the area under the curve (AUC) increased from 0.549 to 0.584 and 0.655, respectively. The increase in AUC indicates that proteins with one or two known signatures do not provide sufficient information to predict robust protein-protein interactions. Moreover, it demonstrates that when proteins with more known signatures are used in signature profiling methods, the overlap with experimental findings increases, resulting in a higher true positive rate and eventually a greater AUC.
Despite the accuracy of the protein-protein interaction methods proposed here and elsewhere, they often predict true positive interactions along with numerous false positive interactions. A global algorithm was also proposed to reduce the number of false positive predicted protein interacting pairs. This algorithm relies on gene ontology (GO) annotations of the proteins involved in predicted interactions. A dataset of experimentally confirmed protein pair interactions and their GO annotations was used as a training set to train keywords that were able to recover both their source interactions (training set) and predicted interactions in other datasets (test sets). These keywords, along with the cellular component annotation of proteins, were employed to set a pair of rules to be satisfied by any predicted pair of interacting proteins. When this algorithm was applied to four predicted datasets obtained using phylogenetic profiles, gene expression patterns, chance co-occurrence distribution coefficient, and maximum likelihood estimation for S. cerevisiae and C. elegans, the true positive fractions of the datasets improved by 2-fold to 10-fold, depending on the computational method used to create the dataset and the information available on the organism of interest.
The predicted protein-protein interactions were incorporated into the previously reconstructed metabolic network of C. elegans, resulting in 1024 new interactions among 94 metabolic pathways. In each of the 1024 new interactions, one unknown protein interacted with a known partner found in the reconstructed metabolic network. Unknown proteins were characterized based on the involvement of their known partners. Based on the binary similarity scores, the function of an uncharacterized protein in an interacting pair was defined according to its known counterpart, whose function was already specified. Incorporating the new predicted interactions produced an expanded version of the metabolic network, with a 27% increase in the number of known proteins involved in metabolism. The connectivity of proteins in the protein-protein interaction map changed from 42 to 34 due to the increase in the number of characterized proteins in the network.
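The signature-sharing step described in this abstract can be sketched as follows. The abstract does not give the formula of its binary similarity score, so the Jaccard coefficient on signature sets is used here purely as an assumed stand-in; the threshold value, the protein names and the signature identifiers are likewise hypothetical.

```python
from itertools import combinations


def jaccard(a, b):
    """Share of signatures common to both proteins (assumed score)."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def predict_interactions(signatures, threshold):
    """Protein pairs whose signature similarity exceeds the threshold.

    signatures maps a protein id to its set of signature identifiers.
    """
    return sorted(
        (p, q)
        for p, q in combinations(sorted(signatures), 2)
        if jaccard(signatures[p], signatures[q]) > threshold
    )


# Hypothetical proteins and signature identifiers.
toy_signatures = {
    "P1": {"sigA", "sigB", "sigC"},
    "P2": {"sigA", "sigB"},
    "P3": {"sigZ"},
}
predicted = predict_interactions(toy_signatures, threshold=0.5)
```

In the thesis the cut-off is tied to a 95% confidence level rather than chosen by hand, so in practice the threshold would be estimated from the distribution of scores.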
|
10 |
Genome-scale Metabolic Network Reconstruction and Constraint-based Flux Balance Analysis of Toxoplasma gondii
Song, Carl Yulun, 27 November 2012
The increasing prevalence of apicomplexan parasites such as Plasmodium, Toxoplasma, and Cryptosporidium represents a significant global healthcare burden. Treatment options are increasingly limited due to the emergence of new resistant strains. We postulate that parasites have evolved distinct metabolic strategies that are critical for growth and survival during human infections and are therefore susceptible to drug targeting using a systematic approach. I developed iCS306, a fully characterized metabolic network reconstruction of the model organism Toxoplasma gondii, via extensive curation of available genomic and biochemical data. Using available microarray data, metabolic constraints for six different clinical strains of Toxoplasma were modeled. I conducted various in silico experiments using flux balance analysis in order to identify essential metabolic processes and to illustrate the differences in metabolic behaviour across Toxoplasma strains. The results suggest probable explanations for the underlying mechanisms that account for the similarities and differences among strains of Toxoplasma and among species of Apicomplexa.
|