  • About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
11

Bayesian networks for high-dimensional data with complex mean structure.

Kasza, Jessica Eleonore January 2010 (has links)
In a microarray experiment, it is expected that there will be correlations between the expression levels of different genes under study. These correlation structures are of great interest from both biological and statistical points of view. From a biological perspective, the identification of correlation structures can lead to an understanding of genetic pathways involving several genes, while the statistical interest, and the emphasis of this thesis, lies in the development of statistical methods to identify such structures. However, the data arising from microarray studies is typically very high-dimensional, with an order of magnitude more genes being considered than there are samples of each gene. This leads to difficulties in the estimation of the dependence structure of all genes under study. Graphical models and Bayesian networks are often used in these situations, providing flexible frameworks in which dependence structures for high-dimensional data sets can be considered. The current methods for the estimation of dependence structures for high-dimensional data sets typically assume the presence of independent and identically distributed samples of gene expression values. However, often the data available will have a complex mean structure and additional components of variance. Given such data, the application of methods that assume independent and identically distributed samples may result in incorrect biological conclusions being drawn. In this thesis, methods for the estimation of Bayesian networks for gene expression data sets that contain additional complexities are developed and implemented. The focus is on the development of score metrics that take account of these complexities for use in conjunction with score-based methods for the estimation of Bayesian networks, in particular the High-dimensional Bayesian Covariance Selection algorithm. 
The necessary theory relating to Gaussian graphical models and Bayesian networks is reviewed, as are the methods currently available for the estimation of dependence structures for high-dimensional data sets consisting of independent and identically distributed samples. Score metrics for the estimation of Bayesian networks when data sets are not independent and identically distributed are then developed and explored, and the utility and necessity of these metrics are demonstrated. Finally, the developed metrics are applied to a data set consisting of samples of grape genes taken from several different vineyards. / Thesis (Ph.D.) -- University of Adelaide, School of Mathematical Sciences, 2010
12

Halobacterium salinarum NRC-1: rede de regulação gênica e sua análise probabilística / Halobacterium salinarum NRC-1: genetic regulatory network and its probabilistic analysis.

Crocetti, Guilherme Martins 08 May 2018 (has links)
Este trabalho teve como objetivo principal modelar a Rede de Regulação Gênica do organismo modelo Halobacterium salinarum NRC-1, estabelecendo interações entre as entidades da rede por intermédio de experimentos inéditos de interação física: ChIP-*, RIP-* e dRNA-seq. Em contraponto com as abordagens clássicas de construção de redes, que estimam interações através de medições de expressão gênica, este trabalho as estabeleceu exclusivamente de interações físicas, permitindo que a estrutura final seja uma representação mais fiel ao fenômeno físico de regulação gênica, baseando-se nos fundamentos da Biologia Sistêmica. Em vista da abundância de dados públicos de expressão gênica para o organismo e do objetivo primário, um objetivo secundário foi traçado: identificar, computacionalmente, genes de fato controlados pelas interações fornecidas pela nova rede. Para isso, a estrutura estabelecida foi transformada numa Rede Bayesiana, e a identificação de genes foi efetuada através da análise de suas Tabelas de Probabilidade Condicionais. Finalmente, como os resultados obtidos para o objetivo secundário foram desfavoráveis à utilização de Redes Bayesianas, os resultados efetivos deste trabalho foram a criação de uma nova Rede de Regulação Gênica para a H. salinarum e uma análise em torno da efetividade de Redes Bayesianas neste contexto. / The main goal of this work was to model the gene regulatory network of the model organism Halobacterium salinarum NRC-1, establishing new interactions between network entities through previously unpublished physical interaction experiments: ChIP-*, RIP-* and dRNA-seq. In contrast to classical approaches to network construction, which estimate interactions from gene expression data, this work established them exclusively from physical interactions. The final structure is therefore a more faithful representation of the physical phenomenon of gene regulation, built on the principles of systems biology.
Considering the abundance of publicly available gene expression data for the organism and the primary goal, a secondary objective was set: to identify computationally the genes actually controlled by the interactions in the new network. To this end, the established structure was transformed into a Bayesian network, and genes were identified through the analysis of their conditional probability tables. Lastly, as the results for the secondary objective were unfavorable to the use of Bayesian networks, the effective results of this thesis were the creation of a new gene regulatory network for H. salinarum and an analysis of the effectiveness of Bayesian networks in this context.
13

Aplicações de computação bioinspirada em bioinformatica : investigando o papel dos genes e suas interações / Applications of bioinspired computing in bioinformatics : analyzing the role of genes and their interactions

Bezerra, George Barreto Pereira 31 July 2006 (has links)
Orientador: Fernando Jose Von Zuben / Dissertação (mestrado) - Universidade Estadual de Campinas, Faculdade de Engenharia Eletrica e de Computação / Resumo: Esta dissertação trata das redes gênicas, o mecanismo de controle da ativação dos genes nas células, sob três perspectivas computacionais diferentes. Inicialmente, sob uma ótica de engenharia, é elaborada uma ferramenta de inferência de redes gênicas, capaz de reconstruir a estrutura estática dessas redes a partir de um conjunto de dados experimentais. O método proposto para essa tarefa de identificação de sistemas é especialmente projetado para conjunto de dados reduzidos, um cenário bastante comum quando se trata de dados de expressão gênica. Numa segunda etapa, é proposto um modelo computacional das redes gênicas, em que as reações bioquímicas que ocorrem na célula são vistas como equações não-lineares arranjadas numa estrutura conexionista. Desta vez, ao invés de inferir redes existentes, esse modelo é utilizado em conjunto com uma abordagem evolutiva para sintetizar redes gênicas artificiais capazes de realizar tarefas dinâmicas, em específico, para solucionar um problema clássico de robótica evolutiva. Embora o modelo seja empregado como técnica de resolução de problemas, o objetivo agora é mais no sentido científico, isto é, as redes gênicas artificiais evoluídas são analisadas como modelos que podem ajudar a compreender propriedades observadas nos sistemas naturais. Finalmente, a terceira etapa consiste numa abordagem conceitual.
O propósito principal é tentar compor um novo cenário para o estudo das redes gênicas, reunindo conceitos e dados empíricos de outras áreas da ciência moderna, como a neurociência e a sinergética, e investigando as implicações de uma nova ótica para o processamento de informação celular. O objetivo aqui é voltado para a compreensão dos mecanismos de processamento de informação em organismos vivos / Abstract: This dissertation deals with genetic networks, the mechanism that controls gene activity in cells, from three different computational perspectives. Initially, as an engineering approach, a computational tool for the inference of genetic networks is proposed, which is able to recover the static structure of these networks from experimental datasets. This system identification method is especially designed for small datasets, a common scenario when dealing with gene expression data. In the second step, a computational model for genetic networks is proposed, in which the biochemical reactions that occur inside the cell are treated as nonlinear equations arranged in a connectionist structure. Rather than inferring networks from data, this model is used together with an evolutionary algorithm to synthesize artificial genetic networks that are able to solve dynamic tasks, in particular a classic problem in evolutionary robotics. Although the model is used as a problem-solving technique, the objective here is primarily scientific, i.e., the evolved artificial genetic networks are viewed as an opportunity to study properties observed in natural systems. Finally, the third step comprises a conceptual approach, in which ideas from other fields of modern science, like neuroscience and synergetics, are put together to compose a new scenario for the study of information processing in genetic networks / Mestrado / Engenharia de Computação / Mestre em Engenharia Elétrica
14

Halobacterium salinarum NRC-1: rede de regulação gênica e sua análise probabilística / Halobacterium salinarum NRC-1: genetic regulatory network and its probabilistic analysis.

Guilherme Martins Crocetti 08 May 2018 (has links)
Este trabalho teve como objetivo principal modelar a Rede de Regulação Gênica do organismo modelo Halobacterium salinarum NRC-1, estabelecendo interações entre as entidades da rede por intermédio de experimentos inéditos de interação física: ChIP-*, RIP-* e dRNA-seq. Em contraponto com as abordagens clássicas de construção de redes, que estimam interações através de medições de expressão gênica, este trabalho as estabeleceu exclusivamente de interações físicas, permitindo que a estrutura final seja uma representação mais fiel ao fenômeno físico de regulação gênica, baseando-se nos fundamentos da Biologia Sistêmica. Em vista da abundância de dados públicos de expressão gênica para o organismo e do objetivo primário, um objetivo secundário foi traçado: identificar, computacionalmente, genes de fato controlados pelas interações fornecidas pela nova rede. Para isso, a estrutura estabelecida foi transformada numa Rede Bayesiana, e a identificação de genes foi efetuada através da análise de suas Tabelas de Probabilidade Condicionais. Finalmente, como os resultados obtidos para o objetivo secundário foram desfavoráveis à utilização de Redes Bayesianas, os resultados efetivos deste trabalho foram a criação de uma nova Rede de Regulação Gênica para a H. salinarum e uma análise em torno da efetividade de Redes Bayesianas neste contexto. / The main goal of this work was to model the gene regulatory network of the model organism Halobacterium salinarum NRC-1, establishing new interactions between network entities through previously unpublished physical interaction experiments: ChIP-*, RIP-* and dRNA-seq. In contrast to classical approaches to network construction, which estimate interactions from gene expression data, this work established them exclusively from physical interactions. The final structure is therefore a more faithful representation of the physical phenomenon of gene regulation, built on the principles of systems biology.
Considering the abundance of publicly available gene expression data for the organism and the primary goal, a secondary objective was set: to identify computationally the genes actually controlled by the interactions in the new network. To this end, the established structure was transformed into a Bayesian network, and genes were identified through the analysis of their conditional probability tables. Lastly, as the results for the secondary objective were unfavorable to the use of Bayesian networks, the effective results of this thesis were the creation of a new gene regulatory network for H. salinarum and an analysis of the effectiveness of Bayesian networks in this context.
15

Inferring Genetic Regulatory Networks Using Cost-based Abduction and Its Relation to Bayesian Inference

Andrews, Emad Abdel-Thalooth 16 July 2014 (has links)
Inferring Genetic Regulatory Networks (GRN) from multiple data sources is a fundamental problem in computational biology. Computational models for GRN range from simple Boolean networks to stochastic differential equations. To successfully model GRN, a computational method has to be scalable and capable of integrating different biological data sources effectively and homogeneously. In this thesis, we introduce a novel method to model GRN using Cost-Based Abduction (CBA) and study the relation between CBA and Bayesian inference. CBA is an important AI formalism for reasoning under uncertainty that can integrate different biological data sources effectively. We use three different yeast genome data sources (protein-DNA, protein-protein, and knock-out data) to build a skeleton (unannotated) graph which acts as a theory to build a CBA system. The Least Cost Proof (LCP) for the CBA system fully annotates the skeleton graph to represent the learned GRN. Our results show that CBA is a promising tool in computational biology in general and in GRN modeling in particular, because CBA knowledge representation can intrinsically implement the AND/OR logic in GRN while enforcing cis-regulatory logic constraints effectively, allowing the method to operate on a genome-wide scale. Besides allowing us to successfully learn yeast pathways such as the pheromone pathway, our method is scalable enough to analyze the full yeast genome in a single CBA instance, without sub-networking. The scalability of our method comes from the fact that our CBA model size grows quadratically, rather than exponentially, with respect to data size and path length. We also introduce a new algorithm to convert CBA into an equivalent binary linear program that computes the exact LCP for the CBA system, thus reaching the optimal solution.
Our work establishes a framework to solve Bayesian networks using integer linear programming and high order recurrent neural networks through CBA as an intermediate representation.
16

Méthodes numériques et formelles pour l'ingénierie des réseaux biologiques : traitement de l'information par des populations d'oscillateurs. Approches par contraintes et Taxonomie des réseaux biologiques / Numerical and formal methods for biological networks engineering : Computing by populations of oscillators, constraint-based approaches and taxonomy of biological networks

Ben Amor, Mohamed Hedi 11 July 2012 (has links)
Cette thèse concerne l'ingénierie des systèmes complexes à partir d'une dynamique souhaitée. En particulier, nous nous intéressons aux populations d'oscillateurs et aux réseaux de régulation génétique. Dans une première partie, nous nous fondons sur une hypothèse, introduite en neurosciences, qui souligne le rôle de la synchronisation neuronale dans le traitement de l'information cognitive. Nous proposons de l'utiliser sur un plan plus large pour étudier le traitement de l'information par des populations d'oscillateurs. Nous discutons des isochrons de quelques oscillateurs classés selon leurs symétries dans l'espace des états. Cela nous permet d'avoir un critère qualitatif pour choisir un oscillateur. Par la suite, nous définissons des procédures d'impression, de lecture et de réorganisation de l'information sur une population d'oscillateurs. En perspective, nous proposons un système à couches d'oscillateurs de Wilson-Cowan. Ce système juxtapose convenablement synchronisation et désynchronisation à travers l'utilisation de deux formes de couplage: un couplage continu et un couplage par pulsation. Nous finissons en proposant une application de ce système: la détection de contours dans une image. En deuxième partie, nous proposons d'utiliser une approche par contraintes pour identifier des réseaux de régulation génétique à partir de connaissances partielles sur leur dynamique et leur structure. Le formalisme que nous utilisons est connu sous le nom de réseaux d'automates booléens à seuil ou réseaux Hopfield-semblables. Nous appliquons cette méthode, afin de déterminer le réseau de régulation de la morphogenèse florale d'Arabidopsis thaliana. Nous montrons l'absence d'unicité des solutions dans l'ensemble des modèles valides (ici, 532 modèles). Nous montrons le potentiel de cette approche dans la détermination et la classification de modèles de réseaux de régulation génétique. 
L'ensemble de ces travaux mène à un certain nombre d'applications, en particulier dans le développement de nouvelles méthodes de stockage de l'information et dans le design de systèmes de calcul non conventionnel. / This thesis concerns the engineering of complex systems from a desired dynamics. In particular, we are interested in populations of oscillators and genetic regulatory networks. In the first part, we start from a hypothesis introduced in neuroscience, which highlights the role of neural synchronization in cognitive information processing. We propose to use this hypothesis in a broader setting to investigate information processing by populations of oscillators. We discuss the isochrons of a few oscillators classified according to their symmetries in state space, which gives us a qualitative criterion for choosing an oscillator. We then define procedures for imprinting, reading and reorganizing information on a population of oscillators. As a perspective, we propose a system of lattices of Wilson-Cowan oscillators organized in several interconnected layers. This system suitably combines synchronization and desynchronization by using two types of coupling: pulsed and continuous. At the end of this part, we propose an application of this system: detecting the edges in an image. In the second part, we propose a constraint-based approach to determine the structure of genetic regulatory networks starting from incomplete knowledge of their structure and their dynamics. The formalism we use is known as threshold Boolean automata networks or Hopfield-like networks. As a proof of concept, we apply this method to determine the regulatory network of Arabidopsis thaliana flower morphogenesis. We obtain 532 valid models rather than a unique solution and then classify them using structural robustness criteria. In this way, we show the potential of this approach for determining and classifying threshold Boolean automata networks such as genetic regulatory networks or neural networks.
This work leads to many applications, in particular the development of new methods for processing information and the design of unconventional computing systems.
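As an illustration of the threshold Boolean automata formalism named in the abstract above, the following minimal sketch shows one common form of the update rule and a brute-force search for fixed points. The weight matrix, thresholds, strict-inequality convention and function names are all hypothetical choices made for this example; they are not taken from the thesis, and the thesis's constraint-based identification method is not reproduced here.

```python
from itertools import product

# Hypothetical 2-node network: each node activates itself and inhibits the other.
WEIGHTS = [[1, -1],
           [-1, 1]]
THRESHOLDS = [0, 0]

def step(state):
    """Synchronous update: node i is on iff its weighted input exceeds its threshold."""
    return tuple(
        1 if sum(w * x for w, x in zip(row, state)) > theta else 0
        for row, theta in zip(WEIGHTS, THRESHOLDS)
    )

def fixed_points():
    """Enumerate all states that the update rule maps to themselves."""
    n = len(WEIGHTS)
    return [s for s in product((0, 1), repeat=n) if step(s) == s]

if __name__ == "__main__":
    print(fixed_points())
```

Exhaustive enumeration is only feasible for very small networks, since the number of states doubles with every node added.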
17

Modelling genetic regulatory networks: a new model for circadian rhythms in Drosophila and investigation of genetic noise in a viral infection process

Xie, Zhi January 2007 (has links)
In spite of remarkable progress in molecular biology, our understanding of the dynamics and functions of intra- and inter-cellular biological networks has been hampered by their complexity. Kinetics modelling, an important type of mathematical modelling, provides a rigorous and reliable way to reveal the complexity of biological networks. In this thesis, two genetic regulatory networks have been investigated via kinetic models. In the first part of the study, a model is developed to represent the transcriptional regulatory network essential for the circadian rhythms in Drosophila. The model incorporates the transcriptional feedback loops revealed so far in the network of the circadian clock (PER/TIM and VRI/PDP1 loops). Conventional Hill functions are not used to describe the regulation of genes, instead the explicit reactions of binding and unbinding processes of transcription factors to promoters are modelled. The model is described by a set of ordinary differential equations and the parameters are estimated from the in vitro experimental data of the clocks’ components. The simulation results show that the model reproduces sustained circadian oscillations in mRNA and protein concentrations that are in agreement with experimental observations. It also simulates the entrainment by light-dark cycles, the disappearance of the rhythmicity in constant light and the shape of phase response curves resembling that of experimental results. The model is robust over a wide range of parameter variations. In addition, the simulated E-box mutation, perS and perL mutants are similar to that observed in the experiments. The deficiency between the simulated mRNA levels and experimental observations in per01, tim01 and clkJrk mutants suggests some differences in the model from reality. Finally, a possible function of VRI/PDP1 loops is proposed to increase the robustness of the clock. 
In the second part of the study, the sources of intrinsic noise and the influence of extrinsic noise are investigated on an intracellular viral infection system. The contribution of the intrinsic noise from each reaction is measured by means of a special form of stochastic differential equation, the chemical Langevin equation. The intrinsic noise of the system is the linear sum of the noise in each of the reactions. The intrinsic noise arises mainly from the degradation of mRNA and the transcription processes. Then, the effects of extrinsic noise are studied by means of a general form of stochastic differential equation. It is found that the noise of the viral components grows logarithmically with increasing noise intensities. The system is most susceptible to noise in the virus assembly process. A high level of noise in this process can even inhibit the replication of the viruses. In summary, the success of this thesis demonstrates the usefulness of models for interpreting experimental data, developing hypotheses, as well as for understanding the design principles of genetic regulatory networks.
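For reference, the chemical Langevin equation mentioned in the abstract above is commonly written in the following standard form. The symbols are the usual textbook ones and are not taken from the thesis: $X_i$ is the copy number of species $i$, $a_j$ is the propensity of reaction $j$, $\nu_{ji}$ is the change in species $i$ caused by one firing of reaction $j$, $M$ is the number of reactions, and the $W_j$ are independent Wiener processes.

```latex
dX_i(t) = \sum_{j=1}^{M} \nu_{ji}\, a_j\big(X(t)\big)\, dt
        + \sum_{j=1}^{M} \nu_{ji} \sqrt{a_j\big(X(t)\big)}\, dW_j(t)
```

Each reaction contributes its own independent noise term in the second sum, which is one way to read the abstract's statement that the intrinsic noise of the system is a linear sum over the reactions.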
18

Structural and parametric identification of bacterial regulatory networks / Identification structurelle et paramétrique des réseaux de régulation bactériens

Stefan, Diana 30 June 2014 (has links)
Les technologies expérimentales à haut débit produisent de grandes quantités de données sur les niveaux d'expression des gènes dans les bactéries à l'état d'équilibre ou lors des transitions de croissance. Un défi important dans l'interprétation biologique de ces données consiste à en déduire la topologie du réseau de régulation ainsi que les fonctions de régulation quantitatives des gènes. Un grand nombre de méthodes d'inférence a été proposé dans la littérature. Ces méthodes ont été utilisées avec succès dans une variété d'applications, bien que plusieurs problèmes persistent. Nous nous intéressons ici à l'amélioration de deux aspects des méthodes d'inférence. Premièrement, les données transcriptomiques reflètent l'abondance de l'ARNm, tandis que, le plus souvent, les composants régulateurs sont les protéines codées par les ARNm. Bien que les concentrations de l'ARNm et de protéines soient raisonnablement corrélées à l'état stationnaire, cette corrélation devient beaucoup moins évidente dans les données temporelles acquises lors des transitions de croissance à cause des demi-vies très différentes des protéines et des ARNm. Deuxièmement, la dynamique de l'expression génique n'est pas uniquement contrôlée par des facteurs de transcription et d'autres régulateurs spécifiques, mais aussi par des effets physiologiques globaux qui modifient l'activité de tous les gènes. Par exemple, les concentrations de l'ARN polymérase (libre) et les concentrations des ribosomes (libres) varient fortement avec le taux de croissance.
Nous devons donc tenir compte de ces effets lors de la reconstruction d'un réseau de régulation à partir de données d'expression génique. Nous proposons ici une approche expérimentale et computationnelle combinée pour répondre à ces deux problèmes fondamentaux dans l'inférence de modèles quantitatifs de promoteurs bactériens à partir des données temporelles d'expression génique. Nous nous intéressons au cas où la dynamique de l'expression génique est mesurée in vivo et en temps réel par l'intermédiaire de gènes rapporteurs fluorescents. Notre approche d'inférence de réseaux de régulation tient compte des différences de demi-vie entre l'ARNm et les protéines et prend en compte les effets physiologiques globaux. Lorsque les demi-vies des protéines sont connues, les modèles expérimentaux utilisés pour dériver les activités des gènes à partir de données de fluorescence sont intégrés pour estimer les concentrations des protéines. L'état physiologique global de la cellule est estimé à partir de l'activité d'un promoteur de phage, dont l'expression n'est contrôlée par aucun des facteurs de transcription et ne dépend que de l'activité de la machinerie d'expression génique. Nous appliquons l'approche à un module central dans le réseau de régulation contrôlant la motilité et le système de chimiotactisme chez Escherichia coli. Ce module est composé des gènes FliA, FlgM et tar. FliA est un facteur sigma qui dirige l'ARN polymérase vers les opérons codant pour des composants de l'assemblage des flagelles. Le troisième composant du réseau, tar, code pour la protéine récepteur chimiotactique de l'aspartate, Tar, et est directement transcrit par FliA associé à l'holoenzyme ARN polymérase.
Le module FliA-FlgM est particulièrement bien adapté pour l'étude des problèmes d'inférence considérés ici, puisque le réseau a été bien étudié et les demi-vies des protéines jouent un rôle important dans son fonctionnement. Nos résultats montrent que, pour la reconstruction fiable de réseaux de régulation transcriptionnelle chez les bactéries, il est nécessaire d'inclure les effets globaux dans le modèle de réseau et d'en déduire de manière explicite les concentrations des protéines à partir des profils d'expression observés, car la demi-vie de l'ARNm et des protéines sont très différentes. Notre approche reste généralement applicable à une grande variété de problèmes d'inférence de réseaux et nous discutons les limites et les extensions possibles de la méthode. / High-throughput technologies yield large amounts of data about the steady-state levels and the dynamical changes of gene expression in bacteria. An important challenge for the biological interpretation of these data consists in deducing the topology of the underlying regulatory network as well as quantitative gene regulation functions from such data. A large number of inference methods have been proposed in the literature and have been successful in a variety of applications, although several problems remain. We focus here on improving two aspects of the inference methods. First, transcriptome data reflect the abundance of mRNA, whereas the components that regulate are most often the proteins coded by the mRNAs. Although the concentrations of mRNA and protein correlate reasonably during steady-state growth, this correlation becomes much more tenuous in time-series data acquired during growth transitions in bacteria because of the very different half-lives of proteins and mRNA. Second, the dynamics of gene expression is not only controlled by transcription factors and other specific regulators, but also by global physiological effects that modify the activity of all genes.
For example, the concentrations of (free) RNA polymerase and the concentration of ribosomes vary strongly with growth rate. We therefore have to take into account such effects when trying to reconstruct a regulatory network from gene expression data. We propose here a combined experimental and computational approach to address these two fundamental problems in the inference of quantitative models of the activity of bacterial promoters from time-series gene expression data. We focus on the case where the dynamics of gene expression is measured in vivo and in real time by means of fluorescent reporter genes. Our network reconstruction approach accounts for the differences between mRNA and protein half-lives and takes into account global physiological effects. When the half-lives of the proteins are available, the measurement models used for deriving the activities of genes from fluorescence data are integrated to yield estimates of protein concentrations. The global physiological state of the cell is estimated from the activity of a phage promoter, whose expression is not controlled by any transcription factor and depends only on the activity of the transcriptional and translational machinery. We apply the approach to a central module in the regulatory network controlling motility and the chemotaxis system in Escherichia coli. This module comprises the FliA, FlgM and tar genes. FliA is a sigma factor that directs RNA polymerase to operons coding for components of the flagellar assembly. The effect of FliA is counteracted by the antisigma factor FlgM, itself transcribed by FliA. The third component of the network, tar, codes for the aspartate chemoreceptor protein Tar and is directly transcribed by the FliA-containing RNA polymerase holoenzyme. The FliA-FlgM module is particularly well-suited for studying the inference problems considered here, since the network has been well-studied and protein half-lives play an important role in its functioning. 
We stimulated the FliA-FlgM module in a variety of wild-type and mutant strains and different growth media. The measured transcriptional response of the genes was used to systematically test the information required for the reliable inference of the regulatory interactions and quantitative predictive models of gene regulation. Our results show that for the reliable reconstruction of transcriptional regulatory networks in bacteria it is necessary to include global effects into the network model and explicitly deduce protein concentrations from the observed expression profiles. Our approach should be generally applicable to a large variety of network inference problems and we discuss limitations and possible extensions of the method.
19

Machine Learning for Exploring State Space Structure in Genetic Regulatory Networks

Thomas, Rodney H. 01 January 2018 (has links)
Genetic regulatory networks (GRN) offer a useful model for clinical biology. Specifically, such networks capture interactions among genes, proteins, and other metabolic factors. Unfortunately, it is difficult to understand and predict the behavior of networks that are of realistic size and complexity. In this dissertation, behavior refers to the trajectory of a state, through a series of state transitions over time, to an attractor in the network. This project assumes asynchronous Boolean networks, implying that a state may transition to more than one attractor. The goal of this project is to efficiently identify a network's set of attractors and to predict the likelihood with which an arbitrary state leads to each of the network’s attractors. These probabilities will be represented using a fuzzy membership vector. Predicting fuzzy membership vectors using machine learning techniques may address the intractability posed by networks of realistic size and complexity. Modeling and simulation can be used to provide the necessary training sets for machine learning methods to predict fuzzy membership vectors. The experiments comprise several GRNs, each represented by a set of output classes. These classes consist of thresholds τ and ¬τ, where τ = [τlaw,τhigh]; state s belongs to class τ if the probability of its transitioning to attractor 􀜣 belongs to the range [τlaw,τhigh]; otherwise it belongs to class ¬τ. Finally, each machine learning classifier was trained with the training sets that was previously collected. The objective is to explore methods to discover patterns for meaningful classification of states in realistically complex regulatory networks. The research design took a GRN and a machine learning method as input and produced output class < Ατ > and its negation ¬ < Ατ >. 
For each GRN, attractors were identified, data were collected by sampling each state to create fuzzy membership vectors, and machine learning methods were trained to predict whether a state is in a healthy attractor or not. For T-LGL, SVMs had the highest accuracy in predictions (between 93.6% and 96.9%) and precision (between 94.59% and 97.87%). However, naive Bayesian classifiers had the highest recall (between 94.71% and 97.78%). This study showed that all experiments were highly significant, with p-value < 0.0001. This research helps clinical biologists submit genetic states and obtain an initial result on their outcomes. For future work, this implementation could use other machine learning classifiers such as XGBoost or deep learning methods. Another suggestion is to develop methods that improve the performance of state transitions so that larger training sets can be sampled.
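The sampling step described above can be illustrated with a minimal sketch. It is not the dissertation's implementation: the two-gene network, the function names, and the restriction to fixed-point attractors are all assumptions made for illustration (asynchronous Boolean networks can also have more complex attractors, which this sketch does not detect).

```python
import random
from collections import Counter
from itertools import product

# Hypothetical toy network (not from the dissertation): two mutually inhibiting genes.
RULES = (
    lambda s: int(not s[1]),  # gene 0 is on when gene 1 is off
    lambda s: int(not s[0]),  # gene 1 is on when gene 0 is off
)

def is_fixed_point(state):
    """True if no single-node update changes the state."""
    return all(rule(state) == state[i] for i, rule in enumerate(RULES))

def async_step(state, rng):
    """Asynchronous update: re-evaluate one node chosen uniformly at random."""
    i = rng.randrange(len(RULES))
    new_state = list(state)
    new_state[i] = RULES[i](state)
    return tuple(new_state)

def run_to_attractor(state, rng, max_steps=1000):
    """Follow random asynchronous updates until a fixed point is reached."""
    for _ in range(max_steps):
        if is_fixed_point(state):
            return state
        state = async_step(state, rng)
    return None  # no fixed point reached within the step budget

def membership_vector(state, attractors, n_samples=2000, seed=0):
    """Estimate, by repeated simulation, the probability of reaching each attractor."""
    rng = random.Random(seed)
    hits = Counter(run_to_attractor(state, rng) for _ in range(n_samples))
    return [hits[a] / n_samples for a in attractors]

def in_class(prob, tau_low, tau_high):
    """Class tau if the probability lies in [tau_low, tau_high], else its negation."""
    return tau_low <= prob <= tau_high

# Enumerate all states and keep the fixed points as the attractor set.
ATTRACTORS = [s for s in product((0, 1), repeat=len(RULES)) if is_fixed_point(s)]
```

Calling `membership_vector((0, 0), ATTRACTORS)` estimates how often the state (0, 0) ends in each fixed point, and `in_class` turns one entry of that vector into a τ or ¬τ label that a classifier could then be trained on.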
20

Modelling genetic regulatory networks: a new model for circadian rhythms in Drosophila and investigation of genetic noise in a viral infection process

Xie, Zhi January 2007
In spite of remarkable progress in molecular biology, our understanding of the dynamics and functions of intra- and inter-cellular biological networks has been hampered by their complexity. Kinetic modelling, an important type of mathematical modelling, provides a rigorous and reliable way to reveal the complexity of biological networks. In this thesis, two genetic regulatory networks have been investigated via kinetic models. In the first part of the study, a model is developed to represent the transcriptional regulatory network essential for the circadian rhythms in Drosophila. The model incorporates the transcriptional feedback loops revealed so far in the network of the circadian clock (PER/TIM and VRI/PDP1 loops). Conventional Hill functions are not used to describe the regulation of genes; instead, the explicit reactions of binding and unbinding of transcription factors to promoters are modelled. The model is described by a set of ordinary differential equations, and the parameters are estimated from the in vitro experimental data of the clock's components. The simulation results show that the model reproduces sustained circadian oscillations in mRNA and protein concentrations that are in agreement with experimental observations. It also simulates the entrainment by light-dark cycles, the disappearance of the rhythmicity in constant light, and phase response curves whose shape resembles that of experimental results. The model is robust over a wide range of parameter variations. In addition, the simulated E-box mutation and the perS and perL mutants are similar to those observed in the experiments. The discrepancy between the simulated mRNA levels and experimental observations in per01, tim01 and clkJrk mutants suggests some differences between the model and reality. Finally, a possible function of the VRI/PDP1 loops is proposed: to increase the robustness of the clock.
In the second part of the study, the sources of intrinsic noise and the influence of extrinsic noise are investigated in an intracellular viral infection system. The contribution of the intrinsic noise from each reaction is measured by means of a special form of stochastic differential equation, the chemical Langevin equation. The intrinsic noise of the system is the linear sum of the noise in each of the reactions. The intrinsic noise arises mainly from the degradation of mRNA and the transcription processes. Then, the effects of extrinsic noise are studied by means of a general form of stochastic differential equation. It is found that the noise of the viral components grows logarithmically with increasing noise intensities. The system is most susceptible to noise in the virus assembly process. A high level of noise in this process can even inhibit the replication of the viruses. In summary, this thesis demonstrates the usefulness of models for interpreting experimental data, developing hypotheses, and understanding the design principles of genetic regulatory networks.
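For reference, the chemical Langevin equation is commonly written in the following standard form. The notation is an assumption and does not come from the abstract: $X_i$ is the copy number of species $i$, $a_j$ the propensity of reaction $j$, $\nu_{ij}$ the change in species $i$ caused by one firing of reaction $j$, and $W_j$ independent Wiener processes, for $M$ reactions in total.

```latex
dX_i(t) = \sum_{j=1}^{M} \nu_{ij}\, a_j\big(X(t)\big)\, dt
        + \sum_{j=1}^{M} \nu_{ij}\, \sqrt{a_j\big(X(t)\big)}\; dW_j(t)
```

Because the $W_j$ are independent, the variance of the stochastic term over a short interval $dt$ is $\sum_{j} \nu_{ij}^{2}\, a_j(X)\, dt$, one additive term per reaction. This is consistent with the statement above that the intrinsic noise is a linear sum over reactions, although how the thesis defines and measures each contribution is not given in the abstract.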
