Global ETD Search

31	Computational approaches for protein functions and gene association networks Yalamanchili, Hari Krishna January 2014 (has links) Entire molecular biology revolves primarily around proteins and genes (DNA and RNA). They collaborate with each other facilitating various biomolecular systems. Thus, to comprehend any biological phenomenon from very basic cell division to most complex cancer, it is fundamental to decode the functional dynamics of proteins and genes. Recently, computational approaches are being widely used to supplement traditional experimental approaches. However, each automated approach has its own advantages and limitations. In this thesis, major shortcomings of existing computational approaches are identified and alternative fast yet precise methods are proposed. First, a strong need for reliable automated protein function prediction is identified. Almost half of protein functional interpretations are enigmatic. Lack of universal functional vocabulary further elevates the problem. NRProF, a novel neural response based method is proposed for protein functional annotation. Neural response algorithm simulates human brain in classifying images; the same is applied here for classifying proteins. Considering Gene Ontology (GO) hierarchical structure as background, NRProF classifies a protein of interest to a specific GO category and thus assigns the corresponding function. Having established reliable protein functional annotations, protein and gene collaborations are studied next. Interactions amongst transcription factors (TFs) and transcription factor binding sites (TFBSs) are fundamental for gene regulation and are highly specific, even in evolution background. To explain this binding specificity a Co-Evo (co-evolutionary) relationship is hypothesized. Pearson correlation and Mutual Information (MI) metrics are used to validate the hypothesis. 
Residue level MI is used to infer specific binding residues of TFs and corresponding TFBSs, supporting a thorough understanding of gene regulatory mechanisms and aiding targeted gene therapies. After comprehending TF and TFBS associations, interplay between genes is abstracted as Gene Regulatory Networks. Several methods using expression correlations have been proposed to infer gene networks. However, most of them ignore the embedded dynamic delay induced by complex molecular interactions and other riotous cellular mechanisms involved in gene regulation. The delay is rather obvious in high frequency time series expression data. DDGni, a novel network inference strategy, is proposed by adopting a gapped Smith-Waterman algorithm. Gaps accommodate expression delays and local alignment unveils short regulatory windows, which traditional methods overlook. In addition to gene level expression data, recent studies demonstrated the merits of exon-level RNA-Seq data in profiling splice variants and constructing gene networks. However, the large number of exons versus small sample size limits their practical application. SpliceNet, a novel method based on Large Dimensional Trace, is proposed to infer isoform specific co-expression networks from exon-level RNA-Seq data. It provides a more comprehensive picture of complex diseases by inferring network rewiring between normal and diseased samples at isoform resolution. It can be applied to any exon level RNA-Seq data and exon array data. In summary, this thesis first identifies major shortcomings of existing computational approaches to functional association of proteins and genes, and develops several tools, viz. NRProF, Co-Evo, DDGni and SpliceNet. Collectively, they offer a comprehensive picture of the biomolecular system under study. / published_or_final_version / Biochemistry / Doctoral / Doctor of Philosophy Proteins - Data processing
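The gapped local alignment named in the DDGni abstract above can be illustrated with a minimal Smith-Waterman sketch. This is not the thesis's implementation: the discretisation of expression changes into U (up), D (down) and N (no change) symbols, the scoring values, and the function name `smith_waterman_score` are all assumptions made for illustration.

```python
def smith_waterman_score(a, b, match=2, mismatch=-1, gap=-1):
    """Return the best local alignment score between sequences a and b.

    Uses a linear gap penalty. The default scores are illustrative
    assumptions, not values taken from the thesis.
    """
    rows, cols = len(a) + 1, len(b) + 1
    # H[i][j] holds the best score of a local alignment ending at a[i-1], b[j-1].
    H = [[0] * cols for _ in range(rows)]
    best = 0
    for i in range(1, rows):
        for j in range(1, cols):
            diag = H[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            up = H[i - 1][j] + gap    # gap in b: skip one symbol of a
            left = H[i][j - 1] + gap  # gap in a: skip one symbol of b
            # The floor at 0 is what makes the alignment local.
            H[i][j] = max(0, diag, up, left)
            best = max(best, H[i][j])
    return best


# Hypothetical discretised profiles: the second gene shows the same
# up-up-down-down pattern, but the first has one extra "N" step in between.
regulator = "UUNDD"
target = "UUDD"
score = smith_waterman_score(regulator, target)
```

In this toy case a single gap absorbs the extra time step, so the shared pattern still aligns end to end, which is the intuition behind letting gaps stand in for regulatory delays.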
32	Application of Logic Synthesis Toward the Inference and Control of Gene Regulatory Networks Lin, Pey Chang K 16 December 2013 (has links) In the quest to understand cell behavior and cure genetic diseases such as cancer, the fundamental approach being taken is undergoing a gradual change. It is becoming more acceptable to view these diseases as an engineering problem, and systems engineering approaches are being deployed to tackle genetic diseases. In this light, we believe that logic synthesis techniques can play a very important role. Several techniques from the field of logic synthesis can be adapted to assist in the arguably huge effort of modeling cell behavior, inferring biological networks, and controlling genetic diseases. Genes interact with other genes in a Gene Regulatory Network (GRN) and can be modeled as a Boolean Network (BN) or equivalently as a Finite State Machine (FSM). As the expression of genes determines cell behavior, important problems include (i) inferring the GRN from observed gene expression data from biological measurements, and (ii) using the inferred GRN to explain how genetic diseases occur and determine the "best" therapy towards treatment of disease. We report results on the application of logic synthesis techniques that we have developed to address both these problems. In the first technique, we present Boolean Satisfiability (SAT) based approaches to infer the predictor (logical support) of each gene that regulates melanoma, using gene expression data from patients who are suffering from the disease. From the output of such a tool, biologists can construct targeted experiments to understand the logic functions that regulate a particular target gene. Our second technique builds upon the first, in which we use a logic synthesis technique, implemented using SAT, to determine gene regulating functions for predictors and gene expression data. 
This technique determines a BN (or family of BNs) to describe the GRN and is validated on a synthetic network and the p53 network. The first two techniques assume binary valued gene expression data. In the third technique, we utilize continuous (analog) expression data, and present an algorithm to infer and rank predictors using modified Zhegalkin polynomials. We demonstrate our method to rank predictors for genes in the mutated mammalian and melanoma networks. The final technique assumes that the GRN is known, and uses weighted partial Max-SAT (WPMS) towards cancer therapy. In this technique, the GRN is assumed to be known. Cancer is modeled using a stuck-at fault model, and ATPG techniques are used to characterize genes leading to cancer and select drugs to treat cancer. To steer the GRN state towards a desirable healthy state, the optimal selection of drugs is formulated using WPMS. Our techniques can be used to find a set of drugs with the least side-effects, and is demonstrated in the context of growth factor pathways for colon cancer. Genomics Logic Synthesis Boolean Satisfiability Gene Regulatory Networks
33	Integrative approaches to modelling and knowledge discovery of molecular interactions in bioinformatics Jain, Vishal January 2008 (has links) The core focus of this research lies in developing and using intelligent methods to solve biological problems and integrating the knowledge for understanding the complex gene regulatory phenomenon. We have developed an integrative framework and used it to: model molecular interactions from separate case studies on time-series gene expression microarray datasets, molecular sequences and structure data including the functional role of microRNAs; to extract knowledge; and to build reusable models for the central dogma theme. Knowledge was integrated with the use of ontology and it can be reused to facilitate new discoveries as demonstrated on one of our systems – the Brain Gene Ontology (BGO). The central dogma theme states that proteins are produced from the DNA (gene) via an intermediate transcript called RNA. Later these proteins play the role of enzymes to perform the checkpoints as a gene expression control. Also, according to the recently emerged paradigm, sometimes genes do not code for proteins but results in small molecules of microRNAs which in turn controls the gene regulation. The idea is that such a very complicated molecular biology process (central dogma) results in production of a wide variety of data that can be used by computer scientists for modelling and to enable discoveries. We have suggested that this range of data should actually be taken into account for analysis to understand the concept of gene regulation instead of just taking one source of data and applying some standard methods to reveal facts in the system biology. 
The problem is very complex and, currently, computational algorithms have not been really successful because either existing methods have certain problems or the proven results were obtained for only one domain of the central dogma of molecular biology, so there has always been a lack of knowledge integration. Proper maintenance of diverse sources of data, structures and, in particular, their adaptation to new knowledge is one of the most challenging problems, and one of the crucial tasks towards the knowledge integration vision is the efficient encoding of human knowledge in ontologies. More specifically, this work has contributed towards the development of novel computational and information science methods, and we have promoted the vision of knowledge integration by developing the Brain Gene Ontology (BGO) system. With the integrative use of several bioinformatics methods, this research has indeed resulted in modelling of knowledge that has not been revealed in systems biology so far. Many discoveries were made during this study, and some of the findings are briefly mentioned as follows: (1) in relation to leukaemia we have discovered a new gene “TCF-1” that interacts with the “telomerase” gene. (2) With respect to yeast cell cycle analysis, we hypothesize that the exoglucanase gene “exg1” is tied to “MCB cluster regulation” and a “mannosidase” to “histone linked mannoses”. A new quantitative prediction is that the time delay of the interaction between two genes seems to be approximately 30 minutes, or 0.17 cell cycles. Next, the Cdc22, Suc22 and Mrc1 genes were found to interact with each other as potential candidates in controlling ribonucleotide reductase (RNR) activity. 
(3) Upon studying the phenomenon of Long Term Potentiation (LTP) it was found that the transcription factors, responsible for regulation of gene expression, begin to be elevated as soon as 30 min after induction of LTP, and remain elevated up to 2 hours. (4) Human microRNA data investigation resulted in the successful identification of two miRNA families i.e. let-7 and mir-30. (5) When we analysed the CNS cancer data, a set of 10 genes (HMG-I(Y), NBL1, UBPY, Dynein, APC, TARBP2, hPGT, LTC4S, NTRK3, and Gps2) was found to give 85% correct prediction on drug response. (6) Upon studying the AMPA, GABRA and NMDA receptors we hypothesize that phenylalanine (F at position 269) and leucine (L at position 353) in these receptors play the role of a binding centre for their interaction with several other genes/proteins such as c-jun, mGluR3, Jerky, BDNF, FGF-2, IGF-1, GALR1, NOS and S100beta. All the developed methods that we have used to discover above mentioned findings are very generic and can be easily applied on any dataset with some constraints. We believe that this research has established the significant fact that integrative use of various computational intelligence methods is critical to reveal new aspects of the problem and finally knowledge integration is also a must. During this coursework, I have significantly published this research in reputed international journals, presented results in several conferences and also produced book chapters. Bioinformatics Gene regulatory networks (GRNs) Interaction Computational Ontology Integration
34	Consensus network inference of microarray gene expression data Mohammed, Suhaib January 2016 (has links) Genetic and protein interactions are essential to regulate cellular machinery. Their identification has become an important aim of systems biology research. In recent years, a variety of computational network inference algorithms have been employed to reconstruct gene regulatory networks from post-genomic data. However, precisely predicting these regulatory networks remains a challenge. We began our study by assessing the ability of various network inference algorithms to accurately predict gene regulatory interactions using benchmark simulated datasets. It was observed from our analysis that different algorithms have strengths and weaknesses when identifying regulatory networks, with a gene-pair interaction (edge) predicted by one algorithm not necessarily consistent with the others. An edge not predicted by most inference algorithms may be an important one, and should not be missed. The naïve consensus (intersection) method is perhaps the most conservative approach and can be used to address this concern by extracting the edges consistently predicted across all inference algorithms; however, it lacks credibility as it does not provide a quantifiable measure for edge weights. Existing quantitative consensus approaches, such as the inverse-variance weighted method (IVWM) and the Borda count election method (BCEM), have been previously implemented to derive consensus networks from diverse datasets. However, the former method was biased towards finding local solutions in the whole network, and the latter considered species diversity to build the consensus network. In this thesis we proposed a novel consensus approach, in which we used Fisher's Combined Probability Test (FCPT) to combine the statistical significance values assigned to each network edge by a number of different networking algorithms to produce a consensus network. 
We tested our method by applying it to a variety of in silico benchmark expression datasets of different dimensions and evaluated its performance against individual inference methods, Bayesian models and also existing qualitative and quantitative consensus techniques. We also applied our approach to real experimental data from the yeast (S. cerevisiae) network as this network has been comprehensively elucidated previously. Our results demonstrated that the FCPT-based consensus method outperforms single algorithms in terms of robustness and accuracy. In developing the consensus approach, we also proposed a scoring technique that quantifies biologically meaningful hierarchical modular networks. 572.8
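The Fisher's Combined Probability Test step described in the abstract above can be sketched as follows. This is a minimal illustration, not the thesis's pipeline: it assumes the per-algorithm p-values for an edge are independent, and the function names, the edge labels and the 0.05 cut-off are hypothetical. The statistic is X = -2 Σ ln p_i, which follows a chi-square distribution with 2k degrees of freedom under the null hypothesis; for even degrees of freedom the tail probability has a closed form, so only the standard library is needed.

```python
import math


def fisher_combined_pvalue(p_values):
    """Combine k independent p-values into one using Fisher's method."""
    if not p_values:
        raise ValueError("need at least one p-value")
    if any(not (0.0 < p <= 1.0) for p in p_values):
        raise ValueError("p-values must lie in (0, 1]")
    k = len(p_values)
    half_stat = -sum(math.log(p) for p in p_values)  # X / 2
    # Chi-square survival function with 2k degrees of freedom:
    # P(X > x) = exp(-x/2) * sum_{i=0}^{k-1} (x/2)^i / i!
    term = 1.0
    total = 1.0
    for i in range(1, k):
        term *= half_stat / i
        total += term
    return math.exp(-half_stat) * total


def consensus_network(edge_pvalues, alpha=0.05):
    """Keep edges whose combined p-value is at most alpha.

    edge_pvalues maps an edge to the p-values assigned by each algorithm.
    The threshold alpha is an illustrative assumption; no multiple-testing
    correction is applied in this sketch.
    """
    combined = {edge: fisher_combined_pvalue(ps) for edge, ps in edge_pvalues.items()}
    return {edge: p for edge, p in combined.items() if p <= alpha}


# Hypothetical p-values from three inference algorithms for two edges.
example = {
    ("geneA", "geneB"): [0.04, 0.10, 0.03],
    ("geneA", "geneC"): [0.60, 0.45, 0.80],
}
network = consensus_network(example)
```

Unlike a strict intersection, an edge that is only moderately supported by each algorithm can still enter the consensus when the evidence adds up, and every retained edge carries a quantitative weight.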
35	Modeling gene regulatory networks through data integration Azizi, Elham 12 March 2016 (has links) Modeling gene regulatory networks has become a problem of great interest in biology and medical research. Most common methods for learning regulatory dependencies rely on observations in the form of gene expression data. In this dissertation, computational models for gene regulation have been developed based on constrained regression by integrating comprehensive gene expression data for M. tuberculosis with genome-scale ChIP-Seq interaction data. The resulting models confirmed predictive power for expression in independent stress conditions and identified mechanisms driving hypoxic adaptation and lipid metabolism in M. tuberculosis. I then used the regulatory network model for M. tuberculosis to identify factors responding to stress conditions and drug treatments, revealing drug synergies and conditions that potentiate drug treatments. These results can guide and optimize design of drug treatments for this pathogen. I took the next step in this direction, by proposing a new probabilistic framework for learning modular structures in gene regulatory networks from gene expression and protein-DNA interaction data, combining the ideas of module networks and stochastic blockmodels. These models also capture combinatorial interactions between regulators. Comparisons with other network modeling methods that rely solely on expression data, showed the essentiality of integrating ChIP-Seq data in identifying direct regulatory links in M. tuberculosis. Moreover, this work demonstrates the theoretical advantages of integrating ChIP-Seq data for the class of widely-used module network models. The systems approach and statistical modeling presented in this dissertation can also be applied to problems in other organisms. 
A similar approach was taken to model the regulatory network controlling genes with circadian gene expression in Neurospora crassa, through integrating time-course expression data with ChIP-Seq data. The models explained combinatorial regulations leading to different phase differences in circadian rhythms. The Neurospora crassa network model also works as a tool to manipulate the phases of target genes. Bioinformatics Computational modeling Data integration Gene regulatory networks Neurospora Tuberculosis
36	Synthesising executable gene regulatory networks in haematopoiesis from single-cell gene expression data Woodhouse, Steven January 2017 (has links) A fundamental challenge in biology is to understand the complex gene regulatory networks which control tissue development in the mammalian embryo, and maintain homoeostasis in the adult. The cell fate decisions underlying these processes are ultimately made at the level of individual cells. Recent experimental advances in biology allow researchers to obtain gene expression profiles at single-cell resolution over thousands of cells at once. These single-cell measurements provide snapshots of the states of the cells that make up a tissue, instead of the population-level averages provided by conventional high-throughput experiments. The aim of this PhD was to investigate the possibility of using this new high resolution data to reconstruct mechanistic computational models of gene regulatory networks. In this thesis I introduce the idea of viewing single-cell gene expression profiles as states of an asynchronous Boolean network, and frame model inference as the problem of reconstructing a Boolean network from its state space. I then give a scalable algorithm to solve this synthesis problem. In order to achieve scalability, this algorithm works in a modular way, treating different aspects of a graph data structure separately before encoding the search for logical rules as Boolean satisfiability problems to be dispatched to a SAT solver. Together with experimental collaborators, I applied this method to understanding the process of early blood development in the embryo, which is poorly understood due to the small number of cells present at this stage. The emergence of blood from Flk1+ mesoderm was studied by single cell expression analysis of 3934 cells at four sequential developmental time points. 
A mechanistic model recapitulating blood development was reconstructed from this data set, which was consistent with known biology and the bifurcation of blood and endothelium. Several model predictions were validated experimentally, demonstrating that HoxB4 and Sox17 directly regulate the haematopoietic factor Erg, and that Sox7 blocks primitive erythroid development. A general-purpose graphical tool was then developed based on this algorithm, which can be used by biological researchers as new single-cell data sets become available. This tool can deploy computations to the cloud in order to scale up larger high-throughput data sets. The results in this thesis demonstrate that single-cell analysis of a developing organ coupled with computational approaches can reveal the gene regulatory networks that underpin organogenesis. Rapid technological advances in our ability to perform single-cell profiling suggest that my tool will be applicable to other organ systems and may inform the development of improved cellular programming strategies.
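The idea in the abstract above of treating expression profiles as states of an asynchronous Boolean network can be made concrete with a small forward simulation. This sketch only generates a state space from known rules; the thesis addresses the inverse problem (recovering rules from observed states, using a SAT solver), which is not attempted here. The two-gene rules and the function names are hypothetical.

```python
def async_successors(state, rules):
    """Return the states reachable in one asynchronous update.

    state is a tuple of booleans, one per gene; rules[i] is a function
    computing the target value of gene i from the whole state. Under the
    asynchronous scheme exactly one gene changes per step.
    """
    successors = []
    for i, rule in enumerate(rules):
        target = bool(rule(state))
        if target != state[i]:
            successors.append(state[:i] + (target,) + state[i + 1:])
    return successors


def reachable_states(start, rules):
    """Collect every state reachable from start by repeated async updates."""
    seen = {start}
    frontier = [start]
    while frontier:
        state = frontier.pop()
        for nxt in async_successors(state, rules):
            if nxt not in seen:
                seen.add(nxt)
                frontier.append(nxt)
    return seen


# Hypothetical toy network: gene 0 is repressed by gene 1,
# and gene 1 is activated by gene 0.
TOY_RULES = [
    lambda s: not s[1],  # gene 0 := NOT gene 1
    lambda s: s[0],      # gene 1 := gene 0
]
states = reachable_states((False, False), TOY_RULES)
```

In this picture, measured cells correspond to nodes of such a state graph, and edges connect states that differ in exactly one gene.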
37	Statistical modeling of oscillating biological networks for structure inference and experimental design Trejo Baños, Daniel January 2016 (has links) Oscillations lie at the core of many biological processes, from the cell cycle, to circadian oscillations and developmental processes. They are essential to enable organisms to adapt to varying conditions in environmental cycles, from day/night to seasonal. Transcriptional regulatory networks are one of the mechanisms behind these biological oscillations. One of the main problems of computational systems biology is elucidating the interaction between biological components. A common mathematical abstraction is to represent these interactions as networks whose nodes are the reactive species and the interactions are edges. There is abundant literature dealing with the reconstruction of the network structure from steady-state gene expression measurements; still, there are lots of advancements to be made because of the complex nature of biological systems. Experimental design is another obstacle to overcome; we wish to perform experiments that help us best define the network structure according to our current knowledge of the system. In the first chapters of this thesis we will focus on reconstructing the network structure of biological oscillators by explicitly leveraging the cyclical nature of the transcriptional signals. We present a method for reconstructing network interactions tailored to this special but important class of genetic circuits. The method is based on projecting the signal onto a set of oscillatory basis functions. We build a Bayesian hierarchical model within a frequency domain linear model in order to enforce sparsity and incorporate prior knowledge about the network structure. Experiments on real and simulated data show that the method can lead to substantial improvements over competing approaches if the oscillatory assumption is met, and remains competitive also in cases it is not. 
Having defined a model for gene expression in oscillatory systems, we also consider the problem of designing informative experiments for elucidating the dynamics and better identify the model. We demonstrate our approach on a benchmark scenario in plant biology, the circadian clock network of Arabidopsis thaliana, and discuss the different value of three types of commonly used experiments in terms of aiding the reconstruction of the network. Finally we provide the architecture and design of a software implementation to plug in statistical methods of gene expression inference and network reconstruction into a biological data integration platform. 572.8
38	Constructing Temporal Transcriptional Regulatory Cascades in the Context of Development and Cell Differentiation Daou, Rayan 08 May 2020 (has links) No description available. 510 Gene regulation Regulatory networks Regulatory cascades Informatik (PPN619939052)
39	The myocyte enhancer factor-2 (MEF2) family mediates complex gene regulation in skeletal and cardiac myocytes Desjardins, Cody Alan 10 August 2017 (has links) Regulation of striated muscle differentiation and development are complex processes coordinated by an array of transcription factors. MEF2 is a crucial transcription factor required for muscle differentiation, but the roles of the individual MEF2 family members, MEF2A-D, have not been extensively evaluated. Acute ablation of Mef2 expression in skeletal myoblasts revealed a required role for MEF2A activity in myoblast differentiation that was not shared with the other MEF2 factors. We hypothesized that a transcriptomic level analysis of Mef2-deficient skeletal myoblasts would reveal distinct regulatory roles for each MEF2 isoform. Comparative microarray analysis supported our hypothesis and we observed distinct gene programs preferentially-sensitive to individual MEF2 isoforms. While there was no variance in the consensus binding site associated with regulation by individual MEF2 isoforms, we did observe uniquely enriched binding sites for candidate co-regulatory proteins that mediate these complex regulatory patterns. Based on our observations in skeletal myoblasts, we performed a series of acute Mef2 knockdowns in neonatal cardiomyocytes and uncovered a requirement for MEF2A and -D, but not MEF2C in cardiomyocyte survival. Comparative microarray analysis confirmed that, similar to skeletal myoblasts, the MEF2 family regulated distinct but overlapping gene programs in cardiomyocytes. Additionally, this analysis uncovered a previously uncharacterized antagonistic regulation of a subset of cell cycle and sarcomere genes. Interestingly, Mef2a and -d knockdowns caused an upregulation of cell cycle markers and downregulation of sarcomere genes, with the opposite regulatory pattern in Mef2c knockdown. 
Further investigation of the proximal promoter region of these genes revealed enriched binding sites for transcription factors associated with key signaling pathways in the developing embryo, Hedgehog and Notch. Overexpression of constitutively active components of these signaling pathways revealed that Notch function requires the presence of MEF2A and -D, while Hedgehog does not appear to interact with these two isoforms. We have shown through our studies that MEF2, a core muscle transcription factor, takes part in complex regulatory interactions that are critical for the appropriate development of striated muscle tissues. / 2018-08-09T00:00:00Z Biology Cardiomyocyte Cell cycle Differentiation Gene regulatory networks MEF2 Transcription
40	Mathematical Model of the Cell Cycle Control and Asymmetry Development in Caulobacter crescentus Xu, Chunrui 23 June 2022 (has links) Caulobacter crescentus goes through a classic dimorphic cell division cycle to adapt to the stringent environment and reduce intraspecific competition. A Caulobacter mother cell gives rise to two progeny with distinct morphologies: a motile swarmer cell equipped with a flagellum and a sessile stalked cell equipped with a stalk. Because of its dimorphic lifestyle, Caulobacter has become a model bacterium for studying cell differentiation, signal transduction, stress response, and asymmetry development in prokaryotes. The dimorphic cell cycle of Caulobacter is driven by the elaborate spatiotemporal organization of regulatory molecules through regulation of synthesis, degradation, phosphorelay, and localization. A wealth of experimental observations about gene/protein interactions and localizations has accumulated in recent decades, and several mathematical models have been proposed to study cell cycle progression in Caulobacter. However, the specific control mechanisms of stress response and spatial asymmetry establishment are yet to be clearly elucidated, although these mechanisms are of fundamental importance to understanding the bacterial survival strategy and developing the microbial industry. Here we utilize mathematical modeling to study the regulatory network of cell cycle control in C. crescentus, focusing on the stress response and asymmetry development. First, we investigate the starvation response of Caulobacter through the connection of phosphotransferase systems (PTS) and the guanine nucleotide-based second messenger system. We have developed a mathematical model to capture the temporal dynamics of vital regulatory second messengers, c-di-GMP (cdG) and guanosine pentaphosphate or tetraphosphate (pppGpp or ppGpp), under normal and stressful conditions. 
This research suggests that the RelA-SpoT homolog enzymes have the potential to effectively influence the cell cycle in response to nutrition changes by regulating cdG and (p)ppGpp levels. We further integrate the second messenger network into a temporal cell cycle model to investigate molecular mechanisms underlying responses of Caulobacter to nutrition starvation. Our model suggests that the cdG-relevant starvation signal is essential but not sufficient to robustly arrest the cell cycle of Caulobacter. We also demonstrate that there may be unknown pathway(s) reducing CtrA under starvation conditions, which results in delayed cytokinesis in starved stalked cells. The cell cycle development of Caulobacter is determined by the periodical activation and deactivation of the master regulator CtrA. cdG is an essential component of the ClpXP protease complex, which is specifically responsible for the degradation of CtrA. We propose a mathematical model for the hierarchical assembly of ClpXP complexes, together with modeling DNA replication, transcription, and protein interactions, to characterize the Caulobacter cell cycle. Our model suggests that the ClpXP-based proteolysis system contributes to the timing and robustness of the cell cycle progression. Furthermore, we construct a spatiotemporal model with Turing-pattern mechanism to study the morphogenesis and asymmetry establishment during the cell cycle of Caulobacter. We apply reaction-diffusion equations to capture the spatial dynamics of scaffolding proteins PodJ, PopZ, and SpmX, which organize two distinct poles of Caulobacter. The spatial regulations influence the activity and distribution of key cell cycle regulators, governing the dimorphic lifestyle of Caulobacter. Our model captures major spatiotemporal experimental observations of wild-type and mutant cells. It provides predictions of novel mutant strains and explains the spatial regulatory mechanisms of bacterial cell cycle progression. 
/ Doctor of Philosophy / Cell is the basic unit of life that undergoes a process called 'cell cycle' consisting of DNA replication and cell division to exhibit various functions, abilities, and behaviors. The cell cycle is well organized by complex regulations in time and space that determine when and where changes take place. The regulations behind cell cycle development play important roles for living organisms but are not fully understood. In this dissertation, we utilize mathematical models and focus on a model bacterium, Caulobacter crescentus, to capture characteristics of cell cycle and study the underlying regulations. Caulobacter is widely distributed in freshwater, including environments with poor nutrients. It divides asymmetrically, generating a pair of daughter cells with different appearances and replicative potentials. Therefore, Caulobacter population has the flexibility to save energy by halting DNA replication and to reduce the competition with siblings by settling into different places. We utilize the nature of the asymmetrical division of Caulobacter to quantitatively investigate the control mechanisms of cell cycle development, including how cells detect and respond to external cues and develop different organelles at specific times and locations. Mathematical model Stress response Asymmetrical cell cycle Protein regulatory networks
