Global ETD Search

61	Supervised Inference of Gene Regulatory Networks Sen, Malabika Ashit 09 September 2021 A gene regulatory network (GRN) records the interactions among transcription factors and their target genes. GRNs are useful to study how transcription factors (TFs) control gene expression as cells transition between states during differentiation and development. Scientists usually construct GRNs by careful examination and study of the literature. This process is slow and painstaking and does not scale to large networks. In this thesis, we study the problem of inferring GRNs automatically from gene expression data. Recent data-driven approaches to infer GRNs increasingly rely on single-cell RNA-sequencing (scRNA-seq) data. Most of these methods rely on unsupervised or association-based strategies, which cannot leverage known regulatory interactions by design. To facilitate supervised learning, we propose a novel graph convolutional neural network (GCN) based autoencoder to infer new regulatory edges from a known GRN and scRNA-seq data. As the name suggests, a GCN-based autoencoder consists of an encoder that learns a low-dimensional embedding of the nodes (genes) in the input graph (the GRN) through a series of graph convolution operations and a decoder that aims to reconstruct the original graph as accurately as possible. We investigate several GCN-based architectures to determine the ideal encoder-decoder combination for GRN reconstruction. We systematically study the performance of these and other supervised learning methods on different mouse and human scRNA-seq datasets for two types of evaluation. We demonstrate that our GCN-based approach substantially outperforms traditional machine learning approaches. / Master of Science / In multi-cellular living organisms, stem cells differentiate into multiple cell types. Proteins called transcription factors (TFs) control the activity of genes to effect these transitions. 
It is possible to represent these interactions abstractly using a gene regulatory network (GRN). In a GRN, each node is a TF or a gene and each edge connects a TF to a gene or TF that it controls. New high-throughput technologies that can measure gene expression (activity) in individual cells provide rich data that can be used to construct GRNs. In this thesis, we take advantage of recent advances in the field of machine learning to develop a new computational method for constructing GRNs. The distinguishing property of our technique is that it is supervised, i.e., it uses experimentally known interactions to infer new regulatory connections. We investigate several variations of this approach to reconstruct a GRN as close to the original network as possible. We analyze and provide a rationale for the decisions made in designing, evaluating, and choosing the characteristics of our predictor. We show that our predictor has a reconstruction accuracy that is superior to other supervised-learning approaches. Gene Regulatory Networks Network Inference Link Prediction Graph Convolutional Networks Graph Machine Learning
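The encoder-decoder structure described in the abstract above can be illustrated with a minimal sketch. It assumes a two-layer graph-convolution encoder and an inner-product decoder, which is one common choice for graph autoencoders; the abstract says several architectures were investigated and does not state which one was selected, so this is not a reproduction of the thesis's model. The toy adjacency matrix, the random feature matrix standing in for scRNA-seq-derived features, and all function names are hypothetical. The weights are random and untrained; fitting them to a reconstruction loss is omitted.

```python
import math
import random


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def normalize_adjacency(adj):
    """Return D^{-1/2} (A + I) D^{-1/2}, the usual GCN propagation matrix."""
    n = len(adj)
    a_hat = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)] for i in range(n)]
    deg = [sum(row) for row in a_hat]  # self-loops guarantee deg >= 1
    return [[a_hat[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)] for i in range(n)]


def relu(m):
    """Element-wise max(0, x)."""
    return [[max(0.0, v) for v in row] for row in m]


def encode(adj, features, w0, w1):
    """Two graph-convolution layers: Z = A_norm relu(A_norm X W0) W1."""
    a_norm = normalize_adjacency(adj)
    hidden = relu(matmul(a_norm, matmul(features, w0)))
    return matmul(a_norm, matmul(hidden, w1))


def decode(z):
    """Inner-product decoder (an assumption): score(i, j) = sigmoid(z_i . z_j)."""
    z_t = [list(col) for col in zip(*z)]
    return [[1.0 / (1.0 + math.exp(-s)) for s in row] for row in matmul(z, z_t)]


def random_matrix(rows, cols, rng):
    """Small random weights; a trained model would learn these instead."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


rng = random.Random(0)
# Hypothetical known GRN over 4 genes, treated as undirected for simplicity.
adj = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
features = random_matrix(4, 3, rng)  # stand-in for expression-derived node features
w0 = random_matrix(3, 5, rng)
w1 = random_matrix(5, 2, rng)

embedding = encode(adj, features, w0, w1)  # one 2-dimensional vector per gene
scores = decode(embedding)                 # 4 x 4 matrix of edge scores in (0, 1)
```

An inner-product decoder produces symmetric scores, so it cannot distinguish the direction of a TF-to-target edge; a decoder that models direction would be needed for a directed GRN.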
62	Algebraic Methods for Modeling Gene Regulatory Networks Murrugarra Tomairo, David M. 01 August 2012 So-called discrete models have been successfully used in engineering and computational systems biology. This thesis discusses algebraic methods for modeling and analysis of gene regulatory networks within the discrete modeling context. The first chapter gives a background for discrete models and puts in context some of the main research problems that have been pursued in this field for the last fifty years. It also outlines the content of each subsequent chapter. The second chapter focuses on the problem of inferring dynamics from the structure (topology) of the network. It also discusses the characterization of the attractor structure of a network when a particular class of functions controls the nodes of the network. Chapters 3 and 4 focus on the study of multi-state nested canalyzing functions as biologically inspired functions and the characterization of their dynamics. Chapter 5 focuses on stochastic methods, specifically on the development of a stochastic modeling framework for discrete models. Stochastic discrete modeling is an alternative to well-known mathematical formalizations such as stochastic differential equations and Gillespie algorithm simulations. Within the discrete setting, a framework that incorporates propensity probabilities for activation and degradation is presented. This approach allows a finer analysis of discrete models and provides a natural setup for cell population simulations. Finally, Chapter 6 discusses future research directions inspired by the work presented here. / Ph. D. Systems Biology Stochastic Discrete Modeling Intrinsic Noise Nested Canalyzing Functions Gene Regulatory Networks Robustness
63	Algorithms for regulatory network inference and experiment planning in systems biology Pratapa, Aditya 17 July 2020 I present novel solutions to two different classes of computational problems that arise in the study of complex cellular processes. The first problem arises in the context of planning large-scale genetic cross experiments that can be used to validate predictions of multigenic perturbations made by mathematical models. (i) I present CrossPlan, a novel methodology for systematically planning genetic crosses to make a set of target mutants from a set of source mutants. CrossPlan is based on a generic experimental workflow used in performing genetic crosses in budding yeast. CrossPlan uses an integer linear program (ILP) to maximize the number of target mutants that we can make under certain experimental constraints. I apply it to a comprehensive mathematical model of the protein regulatory network controlling cell division in budding yeast. (ii) I formulate several natural problems related to efficient synthesis of a target mutant from source mutants. These formulations capture experimentally useful notions of verifiability (e.g., the need to confirm that a mutant contains mutations in the desired genes) and permissibility (e.g., the requirement that no intermediate mutants in the synthesis be inviable). I present several polynomial-time or fixed-parameter tractable algorithms for optimal synthesis of a target mutant for special cases of the problem that arise in practice. The second problem I address is inferring gene regulatory networks (GRNs) from single-cell transcriptomic (scRNA-seq) data. These GRNs can serve as starting points to build mathematical models. (iii) I present BEELINE, a comprehensive evaluation of state-of-the-art algorithms for inferring GRNs from single-cell gene expression data. 
The evaluations from BEELINE suggest that the area under the precision-recall curve and early precision of these algorithms are moderate. Techniques that do not require pseudotime-ordered cells are generally more accurate. Based on these results, I present recommendations to end users of GRN inference methods. BEELINE will aid the development of gene regulatory network inference algorithms. (iv) Based on the insights gained from BEELINE, I propose a novel graph convolutional neural network (GCN) based supervised algorithm for GRN inference from single-cell gene expression data. This GCN-based model is considerably more accurate than existing supervised learning algorithms for GRN inference from scRNA-seq data and can infer cell-type specific regulatory networks. / Doctor of Philosophy / A small number of key molecules can completely change the cell's state, for example, a stem cell differentiating into distinct types of blood cells or a healthy cell turning cancerous. How can we uncover the important cellular events that govern complex biological behavior? One approach to answering the question has been to elucidate the mechanisms by which genes and proteins control each other in a cell. These mechanisms are typically represented in the form of a gene or protein regulatory network. The resulting networks can be modeled as a system of mathematical equations, also known as a mathematical model. The advantage of such a model is that we can computationally simulate the time courses of various molecules. Moreover, we can use the model simulations to predict the effect of perturbations such as deleting one or more genes. A biologist can perform experiments to test these predictions. Subsequently, the model can be iteratively refined by reconciling any differences between the prediction and the experiment. In this thesis I present two novel solutions aimed at dramatically reducing the time and effort required for this build-simulate-test cycle. 
The first solution I propose is in prioritizing and planning large-scale gene perturbation experiments that can be used for validating existing models. I then focus on taking advantage of the recent advances in experimental techniques that enable us to measure gene activity at single-cell resolution, known as scRNA-seq. These scRNA-seq data can be used to infer the interactions in gene regulatory networks. I perform a systematic evaluation of existing computational methods for building gene regulatory networks from scRNA-seq data. Based on the insights gained from this comprehensive evaluation, I propose novel algorithms that can take advantage of prior knowledge in building these regulatory networks. The results underscore the promise of my approach in identifying cell-type specific interactions. These context-specific interactions play a key role in building mathematical models to study complex cellular processes such as a developmental process that drives transitions from one cell type to another. network biology experiment planning gene regulatory networks deep learning single cell transcriptomics
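The abstract above names early precision as one of the evaluation measures. As a rough illustration, the sketch below assumes the usual reading of that term: the fraction of true edges among the top k predictions, where k is the number of edges in the reference network. The abstract itself does not define the measure, and the edge lists and function name here are hypothetical.

```python
def early_precision(ranked_edges, true_edges):
    """Precision among the top-k predicted edges, with k = number of true edges.

    ranked_edges: list of (regulator, target) pairs, most confident first.
    true_edges: set of (regulator, target) pairs in the reference network.
    """
    k = len(true_edges)
    top_k = ranked_edges[:k]
    if not top_k:
        raise ValueError("need at least one true edge and one predicted edge")
    hits = sum(1 for edge in top_k if edge in true_edges)
    return hits / len(top_k)


# Hypothetical predictions from some inference method, best first.
ranked = [("A", "B"), ("A", "C"), ("B", "C"), ("C", "A")]
# Hypothetical reference network with three edges, so k = 3.
truth = {("A", "B"), ("B", "C"), ("C", "D")}

ep = early_precision(ranked, truth)  # 2 of the top 3 predictions are true edges
```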
64	Exploring transcription patterns and regulatory motifs in Arabidopsis thaliana Bahirwani, Vishal January 1900 Master of Science / Department of Computing and Information Sciences / Doina Caragea / Recent work has shown that bidirectional genes (genes located on opposite strands of DNA, whose transcription start sites are not more than 1000 base pairs apart) are often co-expressed and have similar biological functions. Identification of such genes can be useful in the process of constructing gene regulatory networks. Furthermore, analysis of the intergenic regions corresponding to bidirectional genes can help to identify regulatory elements, such as transcription factor binding sites. Approximately 2500 bidirectional gene pairs have been identified in Arabidopsis thaliana and the corresponding intergenic regions have been shown to be rich in regulatory elements that are essential for the initiation of transcription. Identifying such elements is especially important, as simply searching for known transcription factor binding sites in the promoter of a gene can result in many hits that are not always important for transcription initiation. Encouraged by the findings about the presence of essential regulatory elements in the intergenic regions corresponding to bidirectional genes, in this thesis, we explore a motif-based machine learning approach to identify intergenic regulatory elements. More precisely, we consider the problem of predicting the transcription pattern for pairs of consecutive genes in Arabidopsis thaliana using motifs from AthaMap and PLACE. We use machine learning algorithms to learn models that can predict the direction of transcription for pairs of consecutive genes. To identify the most predictive motifs and, therefore, the most significant regulatory elements, we perform feature selection based on mutual information and feature abstraction based on family or sequence similarity. 
Preliminary results demonstrate the feasibility of our approach. Gene regulatory networks Machine learning Arabidopsis thaliana Motif Hierarchical agglomerative clustering Bioinformatics Bidirectional genes Computer Science (0984)
65	Avaliação de métodos de inferência de redes de regulação gênica. / Evaluation of gene regulatory networks inference methods. Fachini, Alan Rafael 17 October 2016 The representation of the gene regulation system by means of a Gene Regulatory Network (GRN) can aid the understanding of biological processes at the molecular level, elucidating the behavior of genes and leading to the discovery of disease causes and the development of new drugs. GRNs make it possible to evaluate which genes are active and how they influence the system. In recent years, many computational methods have been developed for network inference from gene expression data. This study presents a comparative analysis of GRN inference methods, reviewing the experimental models described in the current literature as applied to datasets with few samples. It also presents the use of ensembles (committees of experts) to aggregate the results of the methods in order to improve the quality of the inference. The results show that small datasets (fewer than 50 samples) do not yield good results for network inference, and that the use of ensembles improves the inference results. The results of this research can support future research based on GRNs. Bioinformatics Ensemble Gene Regulatory Networks Machine learning Network Inference
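The ensemble idea in the abstract above, aggregating the outputs of several inference methods, can be sketched as follows. The abstract does not say which aggregation rule was used; averaging each edge's rank across methods is assumed here purely for illustration, and the method scores, gene names, and function names are hypothetical.

```python
def rank_edges(scores):
    """Map each edge to its rank (1 = most confident) under one method's scores."""
    ordered = sorted(scores, key=lambda edge: (-scores[edge], edge))
    return {edge: position for position, edge in enumerate(ordered, start=1)}


def average_rank_ensemble(method_scores):
    """Order edges by their mean rank across methods (lower mean rank first).

    An edge that a method did not score receives that method's worst rank + 1.
    """
    all_edges = sorted(set().union(*(scores.keys() for scores in method_scores)))
    totals = {edge: 0.0 for edge in all_edges}
    for scores in method_scores:
        ranks = rank_edges(scores)
        worst = len(scores) + 1
        for edge in all_edges:
            totals[edge] += ranks.get(edge, worst)
    n_methods = len(method_scores)
    return sorted(all_edges, key=lambda edge: (totals[edge] / n_methods, edge))


# Hypothetical confidence scores from three inference methods.
method_1 = {("g1", "g2"): 0.9, ("g1", "g3"): 0.2, ("g2", "g3"): 0.5}
method_2 = {("g1", "g2"): 0.7, ("g2", "g3"): 0.8}
method_3 = {("g1", "g3"): 0.6, ("g1", "g2"): 0.4, ("g2", "g3"): 0.3}

consensus = average_rank_ensemble([method_1, method_2, method_3])
```

Rank averaging ignores the scale of each method's scores, which is convenient when methods report confidences in incomparable units.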
66	Confounding effects in gene expression and their impact on downstream analysis Lachmann, Alexander January 2016 The reconstruction of gene regulatory networks is one of the milestones of computational systems biology. We introduce a new implementation of ARACNe (Algorithm for the Reconstruction of Accurate Cellular Networks) to reverse engineer transcriptional regulatory networks with improved mutual information estimators and significant improvement in performance. In the context of data-driven network inference we identify two major confounding biases and introduce solutions to remove some of the discussed biases. First, we identify prevalent spatial biases in gene expression studies derived from plate-based designs. We investigate the gene expression profiles of a million samples from the LINCS dataset and find that the vast majority (96%) of the tested plates are affected by significant spatial bias. We show that our proposed method to correct these biases results in a significant improvement of similarity between biological replicates assayed in different plates. Lastly, we discuss the effect of CNV on gene expression and its confounding effect on the correlation landscape of genes in the context of cancer samples. We propose a method that removes the variance in gene expression explained by CNV and show that TF target predictions can be significantly improved. Genetic regulation Genetic regulation--Data processing Gene expression Gene expression--Data processing Bioinformatics Gene regulatory networks Genetics
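To make the idea of removing the variance in gene expression explained by CNV concrete, here is a minimal sketch. It assumes a per-gene ordinary least-squares fit of expression on copy number and keeps the residuals; the abstract does not describe the actual correction, so this is only one plausible reading. The sample values and the function name are hypothetical.

```python
def remove_cnv_effect(expression, copy_number):
    """Return residuals of the least-squares fit expression ~ a + b * copy_number.

    Both arguments are per-sample values for a single gene, in the same order.
    """
    n = len(expression)
    if n != len(copy_number) or n < 2:
        raise ValueError("need two equally long lists with at least two samples")
    mean_x = sum(copy_number) / n
    mean_y = sum(expression) / n
    sxx = sum((x - mean_x) ** 2 for x in copy_number)
    if sxx == 0:
        # Copy number is constant, so it explains no variance; just center.
        return [y - mean_y for y in expression]
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(copy_number, expression))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return [y - (intercept + slope * x) for x, y in zip(copy_number, expression)]


# Hypothetical copy-number and expression values for one gene in five samples.
cn = [1.0, 2.0, 2.0, 3.0, 4.0]
expr = [2.1, 3.9, 4.2, 6.0, 8.1]

residuals = remove_cnv_effect(expr, cn)  # expression with the linear CNV trend removed
```

The residuals have zero mean and zero linear correlation with copy number by construction, so any remaining correlation between two genes' residuals is not driven by a shared linear copy-number effect.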
67	The evolution and regulation of the chordate ParaHox cluster Garstang, Myles Grant January 2016 The ParaHox cluster is the evolutionary sister of the Hox cluster. Like the Hox cluster, the ParaHox cluster is subject to complex regulatory phenomena such as collinearity. Despite the breakup of the ParaHox cluster within many animals, intact and collinear clusters have now been discovered within the chordate phyla in amphioxus and the vertebrates, and more recently within the hemichordates and echinoderms. The archetypal ParaHox cluster of amphioxus places it in a unique position in which to examine the regulatory mechanisms controlling ParaHox gene expression within the last common ancestor of chordates, and perhaps even the wider Deuterostomia. In this thesis, the genomic and regulatory landscape of the amphioxus ParaHox cluster is characterised in detail. New genomic and transcriptomic resources are used to better characterise the B. floridae ParaHox cluster and surrounding genomic region, and conserved non-coding regions and regulatory motifs are identified across the ParaHox cluster of three species of amphioxus. In conjunction with this, the impact of retrotransposition upon the ParaHox cluster is examined, and analyses of transposable elements and the AmphiSCP1 retrogene reveal that the ParaHox cluster may be more insulated from outside influence than previously thought. Finally, the detailed analysis of a regulatory element upstream of AmphiGsx reveals conserved mechanisms regulating Gsx CNS expression within the chordates, and indicates that TCF/Lef is likely a direct regulator of AmphiGsx within the CNS. The work in this thesis makes use of new genomic and transcriptomic resources available for amphioxus to better characterise the genomic and regulatory landscape of the amphioxus ParaHox cluster, serving as a basis for the improved identification and characterisation of functional regulatory elements and conserved regulatory mechanisms. 
This work also highlights the potential of Ciona intestinalis as a 'living test tube' to allow the detailed characterisation of amphioxus ParaHox regulatory elements. 572.8
68	Avaliação de métodos de inferência de redes de regulação gênica. / Evaluation of gene regulatory networks inference methods. Alan Rafael Fachini 17 October 2016 The representation of the gene regulation system by means of a Gene Regulatory Network (GRN) can aid the understanding of biological processes at the molecular level, elucidating the behavior of genes and leading to the discovery of disease causes and the development of new drugs. GRNs make it possible to evaluate which genes are active and how they influence the system. In recent years, many computational methods have been developed for network inference from gene expression data. This study presents a comparative analysis of GRN inference methods, reviewing the experimental models described in the current literature as applied to datasets with few samples. It also presents the use of ensembles (committees of experts) to aggregate the results of the methods in order to improve the quality of the inference. The results show that small datasets (fewer than 50 samples) do not yield good results for network inference, and that the use of ensembles improves the inference results. The results of this research can support future research based on GRNs. Bioinformatics Ensemble Gene Regulatory Networks Machine learning Network Inference
69	A Systems-Level Analysis of an Epithelial to Mesenchymal Transition Saunders, Lindsay Rose January 2012 Embryonic development occurs with precisely timed morphogenetic cell movements directed by complex gene regulation. In this orchestrated series of events, some epithelial cells undergo extensive changes to become free-moving mesenchymal cells. The transformation resulting in an epithelial cell becoming mesenchymal is called an epithelial to mesenchymal transition (EMT), a dramatic cell biological change that occurs throughout development, tissue repair, and disease. Extensive in vitro research has identified many EMT regulators. However, most in vitro studies reduce the complicated phenotypic change to a binary choice between successful and failed EMT. Research utilizing models has generally been limited to a single aspect of EMT without considering the total transformation. Fully understanding EMT requires experiments that perturb the system via multiple channels and observe several individual components from the series of cellular changes, which together make a successful EMT.
In this study, we have taken a novel approach to understand how the sea urchin embryo coordinates an EMT. We use systems-level methods to describe the dynamics of EMT by directly observing phenotypic changes created by shifting transcriptional network states over the course of primary mesenchyme cell (PMC) ingression, a classic example of developmental EMT. We systematically knocked down each transcription factor in the sea urchin's PMC gene regulatory network (GRN). In the first assay, one fluorescently labeled knockdown PMC precursor was transplanted onto an unperturbed host embryo and we observed the resulting phenotype in vivo from before ingression until two hours post ingression using time-lapse fluorescence microscopy. Movies were projected for computational analyses of several phenotypic changes relevant to EMT: apical constriction, apical basal polarity, motility, and de-adhesion.
A separate assay scored each transcription factor for its requirement in basement membrane invasion during EMT. Again, each transcription factor was knocked down one by one and embryos were immuno-stained for laminin, a major component of basement membrane, and scored on the presence or absence of a laminin hole at the presumptive entry site of ingression.
The measured results of both assays were subjected to rigorous unsupervised data analyses: principal component analysis, emergent self-organizing map data mining, and hierarchical clustering. This analytical approach objectively compared the various phenotypes that resulted from each knockdown. In most cases, perturbation of any one transcription factor resulted in a unique phenotype that shared characteristics with its upstream regulators and downstream targets. For example, Erg is a known regulator of both Hex and FoxN2/3 and all three shared a motility phenotype; additionally, Hex and Erg both regulated apical constriction, but Hex also affected invasion and FoxN2/3 was the lone regulator of cell polarity. Measured phenotypic changes in conjunction with known GRN relationships were used to construct five unique subcircuits of the GRN that described how dynamic regulatory network states control five individual components of EMT: apical constriction, apical basal polarity, motility, de-adhesion, and invasion. The five subcircuits were built on top of the GRN and integrated existing fate specification control with the morphogenetic EMT control.
Early in the EMT study, we discovered that one PMC gene, Erg, was alternatively spliced. We identified 22 splice variants of Erg that are expressed during ingression. Our Erg knockdown targeted the 5'UTR, present in all spliceoforms; therefore, the knockdown uniformly perturbed all native Erg transcripts (∑Erg). Specific function was demonstrated for the two most abundant spliceoforms, Erg-0 and Erg-4, by knockdown of ∑Erg and mRNA rescue with a single spliceoform; the mRNA expression constructs contained no 5'UTR and were not affected by the knockdown. Different molecular phenotypes were observed: both spliceoforms targeted Tbr, Tel, and FoxO, but only Erg-0 targeted FoxN2/3 and only Erg-4 targeted Hex. Neither targeted Tgif, which was regulated by ∑Erg knockdown without rescue. Our results suggest the embryo employs a minimum of three unique roles in the GRN for alternative splicing of Erg.
Overall, these experiments increase the completeness and descriptive power of the GRN with two additional levels of complexity. We uncovered five sub-circuits of EMT control, which, integrated into the GRN, provide a novel view of how a complex morphogenetic movement is controlled by the embryo. We also described a new functional role for alternative splicing in the GRN where the transcriptional targets for two splice variants of Erg are unique subsets of the total set of ∑Erg targets. / Dissertation Developmental biology Cellular biology Systematic biology alternative splicing EMT epithelial to mesenchymal transition gene regulatory networks GRN sea urchin
70	Comparative Developmental Transcriptomics of Echinoderms Vaughn, Roy 01 January 2012 The gastrula stage represents the point in development at which the three primary germ layers diverge. At this point the gene regulatory networks that specify the germ layers are established and the genes that define the differentiated states of the tissues have begun to be activated. These networks have been well characterized in sea urchins, but not in other echinoderms. Embryos of the brittle star Ophiocoma wendtii share a number of developmental features with sea urchin embryos, including the ingression of mesenchyme cells that give rise to an embryonic skeleton. Notable differences are that no micromeres are formed during cleavage divisions and no pigment cells are formed during development to the pluteus larva stage. More subtle changes in timing of developmental events also occur. To explore the molecular basis for the similarities and differences between these two echinoderms, the gastrula transcriptome of Ophiocoma wendtii was sequenced and characterized. I identified brittle star transcripts that correspond to 3385 genes in existing databases, including 1863 genes shared with the sea urchin Strongylocentrotus purpuratus gastrula transcriptome. I have characterized the functional classes of genes present in the transcriptome and compared them to those found in sea urchin. I then examined which members of the germ-layer specific gene regulatory networks (GRNs) of S. purpuratus are expressed in the O. wendtii gastrula. The results indicate that there is a shared "genetic toolkit" central to the echinoderm gastrula, a key stage in embryonic development, though there are also differences that reflect changes in developmental processes. The brittle star expresses genes representing all functional classes at the gastrula stage. Brittle stars and sea urchins have comparable numbers of each class of genes, and share many of the genes expressed at the gastrula stage. 
Examination of the brittle star genes whose sea urchin orthologs are utilized in germ layer specification reveals a relatively higher level of conservation of key regulatory components compared to the overall transcriptome. I also identify genes that were either lost or whose temporal expression has diverged from that of sea urchins. Overall, the data suggest that embryonic skeleton formation in sea urchins and brittle stars represents convergent evolution by independent cooptation of a shared pathway utilized in adult skeleton formation. Transcription factors are of central importance to both development and evolution. Patterns of their expression and interactions form the gene regulatory networks which control the building of the embryonic body. Alterations in these patterns can result in the construction of altered bodies. To help increase understanding of this process, I compared the transcription factor mRNAs present in early gastrula-stage embryos of the brittle star Ophiocoma wendtii to those found in two species of sea urchins and a starfish. Brittle star homologs were found for one third of the transcription factors in the sea urchin genome and half of those that are expressed at equivalent developmental stages in sea urchins and starfish. Overall, the patterns of transcription factors found and not found in brittle star resemble those of other echinoderms, with the differences largely consistent with morphological differences. This study provides further evidence for the existence of deeply conserved developmental genetic processes, with various elements shared among echinoderms, deuterostomes, and metazoans. Brittle Star Gastrula Gene Regulatory Networks Sea Urchin Transcription Factors American Studies Arts and Humanities Developmental Biology Evolution Genetics
