41 |
Exploring the network’s world: From omics-driven machine learning workflow for drug target identification to quantification of signaling model diversity
Dalpedri, Beatrice 30 October 2024
The drug discovery process is challenging, time-consuming, and costly, and drug target identification is an essential step in developing effective therapies. Drug repurposing offers a strategy for identifying new uses for existing drugs, aiming to simplify the process. Machine learning models and network analysis methods have shown promise in both drug target identification and repurposing, providing powerful tools for analyzing complex biological data. This thesis explores applications of neural networks and multilayer biological networks to drug repurposing and to network inference problems in signaling pathways. A novel machine learning and network-based workflow is presented for identifying drug targets for cystinosis, a rare disease that causes progressive kidney disease and currently lacks effective therapies to prevent kidney failure. This approach recapitulates the disease mechanisms in the context of renal tubular physiology and identifies candidate drug targets for further validation using a cross-species workflow and disease-relevant screening technologies. Although machine learning approaches have shown promise, they often lack the mechanistic understanding needed for robust drug target identification and repurposing strategies. Mechanistic models provide crucial insights into the underlying biological mechanisms, complementing machine learning techniques. However, inferring mechanistic signaling networks from omics data is challenging because of non-identifiability: multiple valid solutions are consistent with the data. The focus then shifts to quantifying signaling model diversity through solver-agnostic solution sampling with CORNETO, an ongoing effort that aims to unify network inference problems via constrained optimization.
Mechanistic signaling networks can be inferred from omics data and prior knowledge by using combinatorial optimization and mathematical solvers to find the optimal network. However, this problem is, in general, non-identifiable, and several solutions may be equally valid. Ignoring these alternative solutions gives an incomplete picture of the hypothesis space of consistent mechanistic signaling networks. To alleviate this issue, algorithms that explore the space of alternative solutions and conduct sensitivity analysis on the optimal solution are implemented and presented. These algorithms are applied to data from pancreatic cancer cell lines treated with kinase inhibitors, in order to study cellular responses to drug perturbations by inferring mechanistic signaling networks from omics data.
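The non-identifiability described in this abstract can be illustrated with a toy example. The sketch below is not CORNETO and does not use its API: it replaces the mathematical solver with brute-force enumeration, and the signed network, node names, and observation are all hypothetical. It lists every smallest sign-consistent subnetwork that explains one measurement and then reports how often each edge appears across those equally optimal solutions.

```python
from itertools import combinations

# Hypothetical signed prior-knowledge network: (source, target, sign).
EDGES = [
    ("drug", "A", -1),
    ("drug", "B", -1),
    ("A", "TF", +1),
    ("B", "TF", +1),
    ("A", "C", +1),
    ("C", "TF", -1),
]


def propagate(edges, start, value):
    """Push a sign from `start` through `edges`; return None on a sign conflict."""
    state = {start: value}
    changed = True
    while changed:
        changed = False
        for src, dst, sign in edges:
            if src not in state:
                continue
            new = state[src] * sign
            if dst not in state:
                state[dst] = new
                changed = True
            elif state[dst] != new:
                return None
    return state


def optimal_networks(edges, start, start_value, observed):
    """Return all smallest edge subsets whose propagated signs match `observed`."""
    for size in range(1, len(edges) + 1):
        found = []
        for subset in combinations(edges, size):
            state = propagate(subset, start, start_value)
            if state is not None and all(state.get(n) == v for n, v in observed.items()):
                found.append(subset)
        if found:
            return found  # every subset of this size is equally optimal
    return []


# The drug is present (+1) and the transcription factor is observed to go down (-1).
solutions = optimal_networks(EDGES, "drug", +1, {"TF": -1})

# Fraction of optimal solutions that contain each edge.
support = {e: sum(e in s for s in solutions) / len(solutions) for e in EDGES}

for edge, share in support.items():
    print(edge, share)
```

Two subnetworks of equal size explain the observation here (one through A, one through B), so reporting only one of them would hide half of the hypothesis space. Edge support across the solution set is one simple way to summarize that diversity.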
|
42 |
Redes complexas de expressão gênica: síntese, identificação, análise e aplicações / Gene expression complex networks: synthesis, identification, analysis and applications
Lopes, Fabricio Martins 21 February 2011
Thanks to recent advances in molecular biology and biochemistry, together with an ever-increasing amount of experimental data, the functional state of thousands of genes can now be measured simultaneously with methods such as DNA microarrays, SAGE and, more recently, RNA-Seq, generating a massive volume of biological data. Large-scale mapping of gene transcription levels is motivated by the proposition that the functional state of an organism is largely determined by the expression of its genes. However, the main challenge is the small number of samples (experiments) relative to the very high dimensionality (genes). It is therefore necessary to develop new computational and statistical techniques that reduce the intrinsic estimation error incurred when only a small number of high-dimensional samples is available. In this context, an important research focus is the modeling and identification of gene regulatory networks (GRNs) from expression data. The main objective of this research is to infer how genes are regulated, providing knowledge about the molecular interactions and metabolic activities of an organism. Such knowledge is fundamental for many applications, such as disease treatment, therapeutic intervention strategies and drug design, as well as for planning new high-throughput experiments. In this direction, this work presents the following contributions: (1) feature selection software; (2) a new approach for generating artificial gene networks (AGNs); (3) a criterion function based on Tsallis entropy; (4) alternative search strategies for GRN inference: SFFS-MR and SFFS-BA; (5) a biological investigation of the GRNs involved in thiamine biosynthesis, using Arabidopsis thaliana as a model plant.
The feature selection software is an open-source, multiplatform graphical environment for bioinformatics problems, which provides several feature selection algorithms, criterion functions and graphical visualization tools. In particular, it implements a GRN inference method based on feature selection. Although several methods have been proposed in the literature for the modeling and identification of GRNs, an important problem remains open: how can the networks identified by these computational methods be validated? This work presents a new approach for validating such algorithms, considering three main aspects: (a) a model for generating artificial gene networks (AGNs), based on theoretical models of complex networks, which are used to simulate temporal gene expression profiles; (b) a computational method for identifying gene networks from temporal expression data; and (c) validation of the identified networks by comparison with the original AGN. The AGN model made it possible to analyze and investigate the characteristics of GRN inference methods, leading to a comparative study of four inference methods available in the literature.
The evaluation of the inference methods led to the development of new methodologies for this task: (a) a criterion function based on Tsallis entropy, intended to infer gene inter-relationships with better precision; (b) an alternative search strategy for GRN inference, called SFFS-MR, which tries to exploit a local property of the regulatory interdependencies of genes known as intrinsically multivariate prediction; and (c) an interactive, floating search strategy based on scale-free network topology, a global property of GRNs that is treated as a priori information, intended to provide a method better suited to this class of problems and thereby achieve results with better precision. A further objective of this work is to apply the developed methodology to biological data, in particular to identify GRNs related to specific functions of Arabidopsis thaliana. The experimental results obtained by applying the proposed methodologies indicate that the respective performance gains were significant and suited to the problems for which the methodologies were proposed.
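To make the Tsallis-entropy criterion function more concrete, the sketch below uses the standard definition S_q(p) = (1 - sum_i p_i^q) / (q - 1), which reduces to Shannon entropy as q approaches 1. Scoring a candidate predictor set by the mean conditional entropy of the target gene is an assumption about how such a criterion could be applied, not a reproduction of the thesis software, and the binary expression data are made up.

```python
import math
from collections import Counter, defaultdict


def tsallis_entropy(probs, q):
    """Tsallis entropy of a discrete distribution; Shannon entropy (nats) at q = 1."""
    probs = [p for p in probs if p > 0]
    if abs(q - 1.0) < 1e-12:
        return -sum(p * math.log(p) for p in probs)
    return (1.0 - sum(p ** q for p in probs)) / (q - 1.0)


def mean_conditional_entropy(predictors, target, q):
    """Average Tsallis entropy of `target` within each observed predictor state."""
    groups = defaultdict(Counter)
    for x, y in zip(predictors, target):
        groups[x][y] += 1
    n = len(target)
    total = 0.0
    for counts in groups.values():
        m = sum(counts.values())
        total += (m / n) * tsallis_entropy([c / m for c in counts.values()], q)
    return total


# Hypothetical binarized expression profiles: the target behaves like g1 AND g2.
g1 = [0, 0, 1, 1]
g2 = [0, 1, 0, 1]
target = [0, 0, 0, 1]

score_single = mean_conditional_entropy(g1, target, q=2.0)
score_pair = mean_conditional_entropy(list(zip(g1, g2)), target, q=2.0)

print(score_single, score_pair)
```

A lower score means the predictors leave less uncertainty about the target. In this toy case the pair (g1, g2) determines the target completely while g1 alone does not, which is the kind of multivariate dependence a feature selection search would try to detect.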
|
43 |
Automation of a reactor for enzymatic hydrolysis of sugar cane bagasse: Computational intelligence-based adaptive control
Furlong, Vitor Badiale 20 March 2015
No funding received.
The continuous growth in demand for liquid fuels, alongside the decrease of fossil oil reserves, which is unavoidable in the long term, drives the search for new energy sources. A possible alternative is bioethanol produced from renewable resources such as sugarcane bagasse. Two thirds of the cultivated sugarcane biomass consists of bagasse and leaves, which are not fermentable in the current first-generation (1G) process. Techniques capable of using the carbohydrates in this material have therefore attracted great interest, and the production of second-generation (2G) ethanol is one possible alternative. 2G ethanol requires two additional operations: a pretreatment stage and a hydrolysis stage. For the hydrolysis, the dominant technical solution has been the use of enzymatic complexes to hydrolyze the lignocellulosic substrate. To ensure the feasibility of the process, a high final glucose concentration after enzymatic hydrolysis is desirable, which in turn requires a high solids consistency in the reactor. However, a high solids load causes a series of operational difficulties within the reactor, and this is a crucial bottleneck of the 2G process. A possible solution is a fed-batch process, with enzyme and substrate feeding profiles that enhance process yield and productivity. The main objective of this work was to implement and test a system that infers online the concentrations of fermentable carbohydrates in the reactive system and optimizes the feeding strategy of substrate and/or enzymatic complex according to a model-based control strategy. Batch and fed-batch experiments were conducted to test the adherence of four simplified kinetic models. The model with the best adherence to the experimental data (a modified Michaelis-Menten model with product inhibition) was used to generate data for training an Artificial Neural Network (ANN) as a soft sensor that predicts glucose concentrations. In future studies, this ANN may be used in a closed-loop control strategy. A feeding profile optimizer was implemented, based on the optimal control approach. The ANN was capable of inferring the product concentration from the available data with good adherence (coefficient of determination of 0.972). The optimization algorithm generated profiles that increased a process performance index while keeping the reactor within operational levels, reaching glucose concentrations close to those of the sugarcane juice used in current first-generation technology (between 156.0 g.L⁻¹ and 168.3 g.L⁻¹). However, rough estimates for scaling up the reactor to industrial dimensions indicate that this conventional reactor design must be replaced by a two-stage reactor to minimize the volume of liquid to be stirred.
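As a rough illustration of the kind of kinetic model mentioned in this abstract, the sketch below integrates a Michaelis-Menten rate law with product inhibition for a simple batch run. The abstract does not give the equations or parameters, so the competitive-inhibition form, the unit yield of product from substrate, and every numeric value here are assumptions chosen only for illustration.

```python
def rate(s, p, vmax, km, ki):
    """Michaelis-Menten rate with competitive product inhibition (assumed form)."""
    return vmax * s / (km * (1.0 + p / ki) + s)


def simulate_batch(s0, hours, vmax=10.0, km=20.0, ki=5.0, dt=0.01):
    """Explicit Euler integration of substrate s and product p, both in g/L."""
    s, p = s0, 0.0
    steps = int(round(hours / dt))
    for _ in range(steps):
        v = rate(s, p, vmax, km, ki)
        converted = min(v * dt, s)  # never consume more substrate than is left
        s -= converted
        p += converted              # assumes 1 g of product per g of substrate
    return s, p


# Hypothetical run: 100 g/L of substrate over 24 h, with and without strong inhibition.
s_end, p_end = simulate_batch(100.0, 24.0)
_, p_weak_inhibition = simulate_batch(100.0, 24.0, ki=1e9)

print(round(p_end, 1), round(p_weak_inhibition, 1))
```

Simulated trajectories like these are the sort of data a kinetic model can supply for training a soft sensor. A fed-batch version would add feed terms for substrate and enzyme to the same balance equations.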
|
45 |
Comprehensive Characterization of the Transcriptional Signaling of Human Parturition through Integrative Analysis of Myometrial Tissues and Cell Lines
Stanfield, Zachary 28 August 2019
No description available.
|
46 |
Necessary and Sufficient Informativity Conditions for Robust Network Reconstruction Using Dynamical Structure Functions
Chetty, Vasu Nephi 03 December 2012
Dynamical structure functions were developed as a partial structure representation of linear time-invariant systems to be used in the reconstruction of biological networks. Dynamical structure functions contain more information about structure than a system's transfer function, while requiring less a priori information for reconstruction than the complete computational structure associated with the state space realization. Early sufficient conditions for network reconstruction with dynamical structure functions severely restricted the possible applications of the reconstruction process to networks where each input independently controls a measured state. The first contribution of this thesis is to extend the previously established sufficient conditions to incorporate both necessary and sufficient conditions for reconstruction. These new conditions allow for the reconstruction of a larger number of networks, even networks where independent control of measured states is not possible. The second contribution of this thesis is to extend the robust reconstruction algorithm to all reconstructible networks. This extension is important because it allows for the reconstruction of networks from real data, where noise is present in the measurements of the system. The third contribution of this thesis is a Matlab toolbox that implements the robust reconstruction algorithm discussed above. The Matlab toolbox takes in input-output data from simulations or real-life perturbation experiments and returns the proposed Boolean structure of the network. The final contribution of this thesis is to increase the applicability of dynamical structure functions to more than just biological networks by applying our reconstruction method to wireless communication networks. 
The reconstruction of wireless networks produces a dynamic interference map that can be used to improve network performance or interpret changes of link rates in terms of changes in network structure, enabling novel anomaly detection and security schemes.
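For readers unfamiliar with the representation, the relations below sketch how a dynamical structure function is commonly defined for a linear time-invariant system in which only part of the state is measured. The notation (measured states y, hidden states z, the partitioned matrices A and B, and W, V, D, Q, P, G) is assumed here for illustration and does not come from the abstract.

```latex
\begin{aligned}
\begin{bmatrix} \dot{y} \\ \dot{z} \end{bmatrix}
  &= \begin{bmatrix} A_{11} & A_{12} \\ A_{21} & A_{22} \end{bmatrix}
     \begin{bmatrix} y \\ z \end{bmatrix}
   + \begin{bmatrix} B_{1} \\ B_{2} \end{bmatrix} u \\
W(s) &= A_{11} + A_{12}\,(sI - A_{22})^{-1} A_{21}, \qquad
V(s) = B_{1} + A_{12}\,(sI - A_{22})^{-1} B_{2} \\
D(s) &= \operatorname{diag}\bigl(W(s)\bigr) \\
Q(s) &= (sI - D)^{-1}\,(W - D), \qquad
P(s) = (sI - D)^{-1}\,V \\
Y(s) &= Q(s)\,Y(s) + P(s)\,U(s)
\;\Longrightarrow\;
G(s) = \bigl(I - Q(s)\bigr)^{-1} P(s)
\end{aligned}
```

Q has a zero diagonal and describes how measured states depend on one another, while P describes how inputs act on them. Many different pairs (Q, P) produce the same transfer function G, which is why the transfer function alone does not determine network structure and why conditions on the experiments are needed before reconstruction is possible.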
|
47 |
Inferring Topology of Networks With Hidden Dynamic Variables
Schmidt, Raoul, Haehne, Hauke, Hillmann, Laura, Casadiego, Jose, Witthaut, Dirk, Schäfer, Benjamin, Timme, Marc 04 June 2024
Inferring the network topology from the dynamics of interacting units constitutes a topical challenge that drives research on its theory and applications across physics, mathematics, biology, and engineering. Most current inference methods rely on time series data recorded from all dynamical variables in the system. In applications, often only some of these time series are accessible, while other units, or some variables of all units, are hidden, i.e. inaccessible or unobserved. For instance, in AC power grids, frequency measurements are often easily available, whereas determining the phase relations among the oscillatory units requires much more effort. Here, we propose a network inference method that reconstructs the full network topology even if all units exhibit hidden variables. We illustrate the approach with a basic AC power grid model with two variables per node, the local phase angle and the local instantaneous frequency. Based solely on frequency measurements, we infer the underlying network topology as well as the relative phases that are inaccessible to measurement. The presented method may be extended to systems with more complex coupling functions and additional parameters, such as losses in power grid models. These results may thus contribute towards developing and applying novel network inference approaches in engineering, biology and beyond.
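The setting can be made concrete with a small forward simulation. The abstract does not state the model equations, so the swing-type dynamics below (phase derivative equals frequency; frequency derivative equals injected power minus damping plus sine coupling to neighbours) are an assumption, as are the three-node line topology and all parameter values. The sketch only generates the kind of frequency time series an inference method would take as input; it does not implement the inference method itself.

```python
import math


def simulate_grid(power, coupling, damping=1.0, dt=0.01, steps=5000):
    """Explicit Euler integration of an assumed swing-type model.

    Returns the final phases (the hidden variables) and the recorded
    frequency time series (the observable variables).
    """
    n = len(power)
    theta = [0.0] * n
    omega = [0.0] * n
    freq_series = []
    for _ in range(steps):
        accel = [
            power[i]
            - damping * omega[i]
            + sum(coupling[i][j] * math.sin(theta[j] - theta[i]) for j in range(n))
            for i in range(n)
        ]
        theta = [theta[i] + dt * omega[i] for i in range(n)]
        omega = [omega[i] + dt * accel[i] for i in range(n)]
        freq_series.append(omega[:])
    return theta, freq_series


# Hypothetical grid: one generator feeding two consumers along a line 0 - 1 - 2.
POWER = [1.0, -0.5, -0.5]
COUPLING = [
    [0.0, 2.0, 0.0],
    [2.0, 0.0, 2.0],
    [0.0, 2.0, 0.0],
]

theta, freqs = simulate_grid(POWER, COUPLING)
print(freqs[-1])
print(theta[0] - theta[1])
```

In this toy model the zero entries of the coupling matrix are the absent lines that a topology inference method would need to recover, and the phase differences are the quantities that remain hidden when only the frequency series is recorded.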
|
48 |
Systematic inference of regulatory networks that drive cytokine-stimulus integration by T cells
Pellet, Elsa Marie 03 January 2020
Cell-fate decisions are governed by the integration of multiple stimuli. The differentiation of T helper (Th) cells is a well-studied example: upon encountering their cognate antigen, mature Th cells differentiate into a specialised subtype that depends on the polarising cytokines present in their environment, and they begin to express a specific master transcription factor. The most common Th cell subtypes are T-bet-expressing Th1 cells and GATA-3-expressing Th2 cells. Recent discoveries concerning the plasticity of Th cell subtypes, together with the existence of stable T-bet+GATA-3+ hybrid Th1/2 phenotypes, have motivated a detailed study of the differentiation process that uses complex cytokine signals as inputs and does not assume the hitherto prevailing paradigm of single master transcription factor expression.
Here, we developed a data-based approach for inferring the molecular network underlying the differentiation of T-bet- and/or GATA-3-expressing lymphocytes. We performed systematic titrations of the polarising cytokines IFN-γ, IL-12 and IL-4 during primary differentiation of Th cells and quantified signal transduction as well as target-gene expression. The size and complexity of the dataset made a systematic analysis necessary to identify the mechanisms involved. To extract the network topology, we used linear regression analysis, which retrieved known regulatory mechanisms and predicted numerous novel ones. This network topology was then used to develop a mechanistic mathematical model of cytokine signal integration.
This approach inferred a highly connected regulatory network. Previously undescribed functions of STAT proteins mediating network rewiring during differentiation were predicted. Selected new interactions were confirmed by experiments using gene-deficient cells. Importantly, while mutual-inhibition motifs are often considered canonical digital switches, the inferred Th-cell network acts as a rheostat, generating a continuum of differentiated states along the Th1-Th2 axis. This work explains the observed Th1-Th2 cell fate continuum mechanistically and provides a quantitative framework for the data-based inference of cellular signal integration networks.
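The regression step mentioned in this abstract can be illustrated with a minimal sketch. It assumes that each measured target is modelled as a linear function of the titrated inputs and that a coefficient above a chosen magnitude counts as a candidate edge. The function name `infer_edges`, the threshold and the synthetic data are hypothetical and chosen for illustration only; the thesis's actual regression design is not described here.

```python
import numpy as np


def infer_edges(inputs, outputs, threshold=0.1):
    """Fit outputs ~ intercept + inputs @ weights by ordinary least squares.

    inputs:  array of shape (n_samples, n_regulators)
    outputs: array of shape (n_samples, n_targets)
    Returns the weight matrix (n_regulators, n_targets) and a boolean
    matrix marking coefficients whose magnitude exceeds the threshold.
    """
    inputs = np.asarray(inputs, dtype=float)
    outputs = np.asarray(outputs, dtype=float)
    # Prepend a column of ones so that an intercept is estimated per target.
    design = np.column_stack([np.ones(len(inputs)), inputs])
    coef, *_ = np.linalg.lstsq(design, outputs, rcond=None)
    weights = coef[1:]  # drop the intercept row
    return weights, np.abs(weights) > threshold


# Synthetic toy data (not experimental measurements): 3 inputs, 2 targets.
rng = np.random.default_rng(0)
true_weights = np.array([[1.0, -0.8],
                         [0.5, 0.0],
                         [0.0, 0.9]])
doses = rng.uniform(0.0, 1.0, size=(50, 3))
readouts = doses @ true_weights + 0.01 * rng.normal(size=(50, 2))

weights, edges = infer_edges(doses, readouts)
print(edges)
```

In this toy setting, zero entries of `true_weights` correspond to absent edges. With real data, the threshold would normally be replaced by a statistical criterion such as a significance test or a regularised fit.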
|
49 |
Network Inference from Perturbation Data: Robustness, Identifiability and Experimental Design. Groß, Torsten 29 January 2021
'Omics' technologies provide extensive quantifications of the components of biological systems but rarely characterize the interactions between them. To fill this gap, various network reconstruction methods have been developed over the past twenty years. Using perturbation data, these methods can deduce functional mechanisms in gene regulation, signal transduction, intra-cellular communication and many other cellular processes. Nevertheless, this reverse engineering problem remains essentially unsolved because inferred networks are often based on inapt assumptions and lack both interpretability and a rigorous description of identifiability.
To overcome these shortcomings, this thesis first presents a novel inference method based on a simple response logic. The underlying assumptions are so mild that the approach is suitable for a wide range of applications, while also outperforming existing methods on standard benchmark data sets. For the MAPK and PI3K signalling pathways in an adenocarcinoma cell line, it derived plausible network hypotheses that explain the distinct sensitivities of PI3K mutants to targeted inhibitors.
Second, an intuitive maximum-flow problem is shown to describe the identifiability of network interactions. This analytical result makes it possible to devise identifiable effective network models in underdetermined settings and to optimize the design of costly perturbation experiments. In a benchmark on a database of human pathways, full network identifiability is obtained with fewer than a third of the perturbations needed in random experimental designs.
Finally, the thesis presents mathematical advances within Modular Response Analysis (MRA), a popular framework for quantifying network interaction strengths. It is shown that MRA can be approximated as an analytically solvable total least squares problem. This insight drastically reduces computational complexity, which makes it possible to model much bigger networks and to handle novel large-scale perturbation data.
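To indicate why a total least squares formulation is analytically solvable, the sketch below shows the textbook closed-form solution via the singular value decomposition. It is a generic NumPy illustration on synthetic data, not the thesis's MRA formulation; the function name `total_least_squares` and the toy matrices are assumptions made for this example.

```python
import numpy as np


def total_least_squares(A, b):
    """Solve A @ x ~ b when both A and b are treated as noisy.

    The solution is read off the right singular vector of the augmented
    matrix [A | b] that belongs to its smallest singular value.
    """
    A = np.asarray(A, dtype=float)
    b = np.asarray(b, dtype=float).reshape(-1, 1)
    n = A.shape[1]
    # Rows of vt are right singular vectors, ordered by decreasing
    # singular value, so the last row belongs to the smallest one.
    _, _, vt = np.linalg.svd(np.hstack([A, b]), full_matrices=True)
    v = vt[-1]
    if np.isclose(v[n], 0.0):
        raise ValueError("no generic total least squares solution exists")
    return -v[:n] / v[n]


# Synthetic, noise-free toy problem: the estimate should match x_true.
rng = np.random.default_rng(1)
A = rng.normal(size=(20, 3))
x_true = np.array([1.0, -2.0, 0.5])
x_est = total_least_squares(A, A @ x_true)
print(x_est)
```

The cost is a single SVD of the augmented data matrix rather than an iterative nonlinear fit, which is the general reason a total least squares approximation can reduce numerical effort.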
|