11 |
Métodos estatísticos para a análise de dados de cDNA microarray em um ambiente computacional integrado / Statistical methods for cDNA microarray data analysis in an integrated computational environment
Gustavo Henrique Esteves 23 March 2007
Large-scale gene expression analysis is of fundamental importance to molecular biology today because it allows the expression levels of thousands of genes to be measured simultaneously, which makes studies in systems biology feasible. Among the main experimental techniques available for this purpose, microarray technology has been widely used. This procedure for measuring gene expression is quite complex and the data obtained are frequently observational, which complicates statistical modelling. There is no standard protocol for generating and evaluating these data, so it is necessary to look for analysis procedures suited to each case. The main mathematical and statistical methods applied to the analysis of these data should therefore be available in an organized, coherent and simple form in a computational environment that confers robustness, reliability and reproducibility on the analyses. One way to guarantee these characteristics is to represent (and document) all the algorithms used as a directed acyclic graph that describes the whole set of transformations, or operations, applied sequentially to the dataset. Following this philosophy, an environment was implemented in this work that incorporates several procedures available in the current literature, as well as others that were improved or proposed in this thesis. Among the existing analysis methods that were incorporated are those for cluster analysis, the search for differentially expressed genes and classifiers, the construction of relevance networks and the functional classification of gene groups. In addition, the method for constructing relevance networks was revised and improved, and a statistical model for the functional classification of gene regulatory networks was proposed and implemented. These last two methods arose from biological problems for which no adequate analysis procedures existed in the literature. Finally, two datasets are presented that were analysed using several of the tools available in this computational environment.
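The abstract above mentions relevance networks without describing how they are built. As a rough illustration only, the sketch below uses the common textbook reading of the term: link two genes when the absolute Pearson correlation of their expression profiles reaches a threshold. The function names, the threshold value and the toy data are assumptions made for this example; the revised method proposed in the thesis is not reproduced here.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def relevance_network(expression, threshold=0.9):
    """Return edges (gene_a, gene_b, r) whose |r| reaches the threshold."""
    edges = []
    for a, b in combinations(sorted(expression), 2):
        r = pearson(expression[a], expression[b])
        if abs(r) >= threshold:
            edges.append((a, b, round(r, 3)))
    return edges


# Hypothetical toy data: four genes measured in five samples.
toy = {
    "geneA": [1.0, 2.0, 3.0, 4.0, 5.0],
    "geneB": [2.1, 3.9, 6.2, 8.0, 9.9],  # rises with geneA
    "geneC": [5.0, 4.1, 2.9, 2.0, 1.1],  # falls as geneA rises
    "geneD": [3.0, 1.0, 4.0, 1.0, 5.0],  # no clear relation to the others
}
edges = relevance_network(toy, threshold=0.9)
```

With these toy values, geneA, geneB and geneC end up linked to one another while geneD stays isolated. In practice the threshold would be chosen from a permutation test or a similar null model rather than fixed by hand.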
|
12 |
Bacteriophage technologies and their application to synthetic gene networks
Krom, Russell-John 03 November 2015
Synthetic biology, a field that sits between biology and engineering, has come into its own in the last decade. The decreasing cost of DNA synthesis has led to the creation of larger and more complex synthetic gene networks, engineered with functional goals rather than as simple demonstrations. While many methods have been developed to reduce the time required to produce complex networks, none focuses on the considerable tuning needed to turn structurally correct networks into functional gene networks. To this end, we created a Plug-and-Play synthetic gene network assembly that emphasizes character-driven iteration for producing functional synthetic gene networks. This platform enables post-construction modification and easy tuning of networks through its ability to swap individual parts. To demonstrate this system, we constructed a functional bistable genetic toggle and transformed it into two functionally distinct synthetic networks.
Once these networks have been created and tuned at the bench, they next must be delivered to bacteria in their target environment. While this is easy for industrial applications, delivering synthetic networks as medical therapeutics has a host of problems, such as competing microbes, the host immune system, and harsh microenvironments. Therefore, we employed bacteriophage technologies to deliver functional synthetic gene networks to specific bacterial strains in various microenvironments.
We first sought to deliver functional genetic networks to bacteria present in the gut microbiome. This allows these bacteria to be functionalized so that they can eventually sense disease states and secrete therapeutics. As a proof of concept, a simple circuit was created using the Plug-and-Play platform and tested before being moved into the replicative form plasmid of the M13 bacteriophage. Bacteriophage particles carrying this network were used to infect gut bacteria of mice. Infection and functionality of the synthetic network were monitored by screening fecal samples. Next, we employed phagemid technologies to deliver high-copy plasmids expressing antibacterial networks to target bacteria. This allows for sustained expression of antibacterial genes that cause non-lytic bacterial death without reliance upon traditional small-molecule antibiotics. Phagemid particles carrying our antibacterial networks were then tested against wild-type and antibiotic-resistant bacteria in vitro and in vivo.
|
13 |
Noise and Robustness downstream of a morphogen gradient: Quantitative approach by imaging transcription dynamics in living embryos
Perez Romero, Carmina Angelica January 2019
This thesis was done in collaboration with Sorbonne University as part of a double degree Cotutelle. / During development, cell differentiation frequently occurs upon signaling from concentration or activity gradients of molecules called morphogens. These molecules control in a dose-dependent manner the expression of sets of target genes that determine cell identity. A simple paradigm to study morphogens is the Bicoid gradient, which determines antero-posterior patterning in fruit fly embryos. The Bicoid transcription factor allows the rapid step-like expression of its major target gene hunchback, expressed only in the anterior half of the embryo. The general goal of my thesis was to understand how the information contained in the Bicoid morphogen gradient is rapidly interpreted to provide the precise expression pattern of its target.
Using the MS2 system to fluorescently tag specific RNA in living embryos, we were able to show that the ongoing transcription process at the hunchback promoter is bursty and likely functions according to a two-state model. At each nuclear interphase, transcription is first observed in the anterior and rapidly spreads towards the posterior, as expected for a Bicoid dose-dependent activation process. Surprisingly, it takes only 3 minutes from the first hints of transcription at the anterior to reach steady state, with a sharp expression border set in the middle of the embryo. Using modeling that takes these very fast dynamics into account, we show that the presence of only 6 Bicoid binding sites in the promoter (the known number of sites in the hunchback promoter) is not sufficient to explain the establishment of a sharp expression border in such a short time. Thus, either more Bicoid binding sites or inputs from other transcription factors could help reconcile the model with the data. To better understand the role of transcription factors other than Bicoid in this process, I used a two-pronged strategy involving synthetic MS2 reporters combined with the analysis of the hunchback MS2 reporter in various mutant backgrounds. I show that the pioneer factor Zelda and the Hunchback protein itself are also critical for hunchback expression, with maternal Hunchback acting at nuclear cycles 11-12 and zygotic Hunchback acting later, at nuclear cycles 13-14. The synthetic reporter approach indicates that, in contrast to Hunchback and Caudal, Bicoid is able to activate transcription on its own when bound to the promoter. However, the presence of 6 Bicoid binding sites only leads to stochastic activation of the target loci. Interestingly, the binding of Hunchback to the Bicoid-dependent promoter reduces this stochasticity, while Caudal might act as a posterior repressor gradient. 
Comparison of these experimental data with theoretical models is ongoing and should give a better understanding of the role of transcription factors other than Bicoid in hunchback expression at the mechanistic level. / Thesis / Doctor of Philosophy (PhD) / Have you ever wondered how a single cell can become a full-grown organism? Well, it starts when an egg and a sperm fuse together. As time passes, this single cell divides over and over again until an organism is formed. During this developmental process, the cells somehow know exactly where they are and what they need to become so that they form the organism. However, we don’t fully understand this process, and this is what we hope to answer with our research: how do the cells know where they are and what they need to become during development?
We study this process in the fruit fly. Although fruit flies might not look a lot like us, during early embryonic development we are quite similar, so we can try to answer these questions in fruit flies and what we find might be relevant to other organisms like us.
During development, the first element that an embryo needs to know is the orientation of its body, where the head and tail, the left and right and the back and front of the body will be. We concentrate on studying how the head to tail axis, which we call the anterior-posterior axis, is formed.
To know where the head is going to be, the embryo releases proteins called morphogens that broadcast instructions to other genes so that cells know where they are and what they should become. We study a morphogen called Bicoid. Its concentration is high in the anterior, the region that will become the head of the embryo, and lower as you move towards the posterior where the tail will form. Bicoid activates a gene called hunchback, which ends up dividing the embryo in two large parts, the top and the bottom. However, Bicoid’s message fades away during each cell division and needs to be read again at the beginning of each new nuclear cycle. So how is the message read and how long does this process take? This last question is particularly critical during the period of very fast cell division.
My thesis tries to answer this question. We found that it takes 3 minutes for a nucleus to read the Bicoid concentration, activate hunchback and express it correctly. However, in contrast to what was believed before, namely that only Bicoid was involved in this process, we found that other players help relay this message. This way hunchback can accurately divide the body into two parts, exactly in the middle and without mistakes, in such a short period of time.
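The abstract for this record refers to a two-state model of bursty transcription without spelling it out. A minimal sketch of the usual reading of that term follows: a promoter that switches at random between an OFF and an ON state and initiates transcripts only while ON. The rate names and the numerical values are hypothetical, chosen only to make bursts visible; they are not taken from the thesis.

```python
import random


def simulate_two_state(k_on, k_off, k_ini, t_end, seed=0):
    """Gillespie simulation of a promoter switching between OFF and ON.

    Returns (initiation_times, on_intervals). Transcripts start only while
    the promoter is ON, which produces bursts separated by silent periods.
    """
    rng = random.Random(seed)
    t, on, on_since = 0.0, False, None
    initiations, on_intervals = [], []
    while True:
        # Rates of the reactions available in the current promoter state.
        switch_rate = k_off if on else k_on
        total_rate = switch_rate + (k_ini if on else 0.0)
        t += rng.expovariate(total_rate)  # waiting time to the next event
        if t >= t_end:
            break
        if rng.random() * total_rate < switch_rate:
            # The promoter changes state.
            if on:
                on_intervals.append((on_since, t))
            else:
                on_since = t
            on = not on
        else:
            # A new transcript is initiated (only reachable while ON).
            initiations.append(t)
    if on:
        on_intervals.append((on_since, t_end))
    return initiations, on_intervals


# Hypothetical rates in events per minute, simulated for 30 minutes.
starts, on_intervals = simulate_two_state(k_on=0.5, k_off=1.0, k_ini=10.0, t_end=30.0)
fraction_on = sum(end - begin for begin, end in on_intervals) / 30.0
```

Plotting `starts` on a time axis shows clusters of initiation events inside the ON intervals, which is the bursty signature that live imaging of nascent RNA is meant to capture.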
|
14 |
Modèles de théorie des jeux pour la formation de réseaux / Game theoretic Models of network Formation
Cesari, Giulia 13 December 2016
This thesis deals with the theoretical analysis and the application of a new family of cooperative games, where the worth of each coalition can be computed from the contributions of single players via an additive operator describing how the individual abilities interact within groups. Specifically, we introduce a large class of games, the Generalized Additive Games, which encompasses several classes of cooperative games from the literature, in particular graph games, where a network describes the restrictions on the interaction possibilities among players. Some properties and solutions of this class of games are studied, with the objective of providing useful tools for the analysis of known classes of games, as well as for the construction of new classes of games with interesting properties from a theoretical point of view. Moreover, we introduce a class of solution concepts for communication situations, where the formation of a network is described by means of an additive pattern, and in the last part of the thesis we present two approaches that apply our model to real-world problems described by graph games, in the fields of Argumentation Theory and Biomedicine.
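One way to read the additive construction described above is that the worth of a coalition S is the sum of the individual values of a subset M(S) of players selected by some rule. The sketch below assumes a particular, hypothetical rule on a network (a member counts only if it has a neighbour inside the coalition) and computes the Shapley value by brute force. The rule, the function names and the three-player example are illustrative assumptions, not definitions taken from the thesis.

```python
from fractions import Fraction
from itertools import permutations


def worth(coalition, individual_value, neighbours):
    """Additive worth: sum the values of members with a neighbour inside."""
    active = {i for i in coalition if neighbours[i] & coalition}
    return sum(individual_value[i] for i in active)


def shapley(players, individual_value, neighbours):
    """Exact Shapley value: average marginal contribution over all orders."""
    totals = {i: Fraction(0) for i in players}
    orders = list(permutations(players))
    for order in orders:
        coalition = set()
        for i in order:
            before = worth(coalition, individual_value, neighbours)
            coalition.add(i)
            totals[i] += worth(coalition, individual_value, neighbours) - before
    return {i: totals[i] / len(orders) for i in players}


# Hypothetical path network 1 - 2 - 3 with individual values 1, 2 and 3.
players = [1, 2, 3]
individual_value = {1: 1, 2: 2, 3: 3}
neighbours = {1: {2}, 2: {1, 3}, 3: {2}}
phi = shapley(players, individual_value, neighbours)
```

In this toy game the central player 2 receives 10/3, more than its individual value of 2, because neither end player contributes anything without it. Enumerating all orders only works for a handful of players; larger games need sampling or closed-form results.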
|
15 |
A method for the identification of biological pathways
Honti, Frantisek January 2014
Plenty of gene variants have been associated with disease, indicating widespread genetic heterogeneity, which leaves the molecular basis of complex diseases unclear. However, it is widely postulated that the products of genes whose mutations are implicated in the same disease function together in the same biological pathways and it is the disruption of these pathways that underlies the disease. Such pathways are not well defined and their identification could help elucidate disease mechanisms. To discern molecular pathways of relevance to complex disease, I have inferred functional associations between human genes from diverse data types and assessed these associations with a novel phenotype-based method. I could confirm the hypothesis that dysfunctions of genes associated with each other in terms of functional genomic and proteomic data tend to give rise to the same disease. Examining the functional association between disease-associated gene variants, I have found that genes implicated through de novo sequence variants are biased in their coding sequence length and that longer genes tend to cluster together in gene networks, leading to exaggerated p-values in functional studies. I have controlled for the confounding bias and, testing different data sources, found that an integrated phenotypic-linkage network offers superior power to detect functional associations among genes mutated in the same disease. Applying these methods to clinical phenotypes related to intellectual disability, I have observed an increased predictive potential in identifying genes associated with these phenotypes. I have also performed case–control association analyses of variants from an exome-sequencing study of Parkinson’s disease and tested the functional associations of the mutated genes. 
I have advanced a framework for the identification of biological pathways disrupted in complex disorders, also demonstrating the suitability of this method to functionally sub-cluster the gene variants underlying a complex disorder, with implications for the understanding of disease mechanisms.
|
16 |
Desenvolvimento de uma ferramenta computacional para análise de co-expressão gênica e sua aplicação na biologia de sistemas / Development of a computational tool for gene co-expression analyses and its application in systems biology
Russo, Pedro de Sa Tavares 09 May 2019
Systems biology provides a holistic view of biological processes, integrating the various intracellular components through highly complex networks. In particular, co-expression networks have in recent years allowed an increasing understanding of biological systems and of the molecular mechanisms that drive them. On the other hand, the mathematical and statistical tools already developed for analysing these networks are in general dense and unfamiliar to life and health scientists. Therefore, to provide an analysis that is both relevant and easy to perform, our group created the CEMiTool package, which aims to identify gene co-expression modules automatically, in an easy and intuitive way for users with little or no programming experience. To demonstrate the tool's ease of use, we applied CEMiTool to over 1000 transcriptomics studies and used the results to build a gene co-expression database, which allows information to be integrated across studies. Moreover, to make this type of analysis even more accessible, we developed an online version of the tool, named webCEMiTool, which lets users run co-expression analyses in the browser. Finally, we also developed annotator, a package that automatically defines sample groups in transcriptomics studies by clustering the character strings present in the annotation data. All code is freely available to the community.
|
17 |
Integração de dados na inferência de redes de genes: avaliação de informações biológicas e características topológicas / Data integration in gene networks inference: evaluation of biological and topological features
Fabio Fernandes da Rocha Vicente 02 May 2016
It is widely known that cellular components do not act in isolation but through a network of interactions. It is therefore essential to discover how genes relate to one another and to understand the dynamics of the biological system. This knowledge can contribute, for example, to the treatment of diseases, to plant breeding and to increased agricultural production. Many gene networks are unknown or only partially known. In this context, the inference of gene networks (GNs) has emerged as a possible solution; it aims to recover the network from gene expression data using probabilistic models. However, an intrinsic problem of network inference is formally described as the curse of dimensionality (the number of variables is much larger than the number of samples). In the biological setting the problem is even worse, since there are thousands of genes and only one or two dozen expression samples. Inference models try to work around this problem by proposing solutions that minimize the estimation error. Even so, the prediction models still produce many ties; that is, expression data alone are often not enough to decide on the correct interaction between genes. Integrating other biological data with the expression data is proposed as a possible solution. However, these data are heterogeneous: they refer to physical interactions, functional relationships and localization, among others. They are also represented in different ways: as quantitative or qualitative data, as nominal or ordinal attributes, sometimes organized in a hierarchical structure, sometimes as a graph, and sometimes as descriptive annotation. In addition, it is unclear how each type of data can contribute to the inference and to reducing the error of the models. It is therefore very important to understand the relationship between the available biological data and to investigate how to integrate them into the inference. This work developed three data integration methodologies and analysed the contribution of each type of data. The results showed that the combined use of expression data and other biological data improves the prediction of the networks. They also pointed to differences in the potential for error reduction according to the type of data. Finally, the results showed that knowledge of the network topology also reduces the error and yields inferred networks that are topologically consistent with the expected topology.
|
19 |
On text mining to identify gene networks with a special reference to cardiovascular disease / Identifiering av genetiska nätverk av betydelse för kärlförkalkning med hjälp av automatisk textsökning i Medline, en medicinsk litteraturdatabas
Strandberg, Per Erik January 2005
The rate at which articles are published grows exponentially, and texts are increasingly accessible in machine-readable formats. The need for an automated system that gathers relevant information from text, known as text mining, is thus growing.
The goal of this thesis is to find a biologically relevant gene network for atherosclerosis, the main cause of cardiovascular disease, by inspecting gene co-occurrences in abstracts from PubMed. In addition, gene nets for yeast were generated to evaluate the validity of text mining as a method.
The nets found were validated in many ways; for example, they were found to have the well-known power-law link distribution. They were also compared with gene nets generated by other, often microbiological, methods from different sources. In addition to classic measures of similarity such as overlap, precision, recall and F-score, a new way to measure similarity between nets is proposed and used. The method uses an urn approximation and measures, in standard deviations, the distance from what a comparison of two unrelated nets would give. The validity of this approximation is supported both analytically and with simulations, for both Erdős-Rényi nets and nets with a power-law link distribution. The new method shows that very poor overlap, precision, recall and F-score can still be very far from random, and also how much overlap one could expect at random. The cutoff was also investigated.
Results are typically on the order of only 1% overlap, but at the remarkable distance of 100 standard deviations from what one could have expected at random. Of particular interest is that one can expect an overlap of only 2 edges, with a variance of 2, when comparing two trees with the same set of nodes. The use of a cutoff at one for co-occurrence graphs is discussed and motivated by, for example, the observation that this eliminates about 60-70% of the false positives but only 20-30% of the overlapping edges. This thesis shows that text mining of PubMed can be used to generate a biologically relevant gene subnet of the human gene net. A reasonable extension of this work is to combine the nets with gene expression data to find a more reliable gene net.
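The urn idea in this abstract can be illustrated with a small calculation. The sketch below assumes the simplest possible urn: of the n(n-1)/2 possible edges on n shared nodes, one net marks its edges and the other draws its edges without replacement, so the overlap follows a hypergeometric distribution. This is one plausible reading rather than the thesis's exact formulation, and the function names are made up for the example.

```python
from math import sqrt


def urn_overlap(n_nodes, edges_a, edges_b):
    """Mean and variance of the edge overlap of two unrelated nets.

    Net A marks edges_a of the n(n-1)/2 possible edges; net B then draws
    edges_b edges without replacement (a hypergeometric urn).
    """
    possible = n_nodes * (n_nodes - 1) // 2
    p = edges_a / possible
    mean = edges_b * p
    variance = edges_b * p * (1 - p) * (possible - edges_b) / (possible - 1)
    return mean, variance


def distance_in_sd(observed, n_nodes, edges_a, edges_b):
    """How many standard deviations the observed overlap lies above chance."""
    mean, variance = urn_overlap(n_nodes, edges_a, edges_b)
    return (observed - mean) / sqrt(variance)


# Two unrelated trees on the same 1000 nodes each have 999 edges.
tree_mean, tree_var = urn_overlap(1000, 999, 999)
```

For two trees on n nodes the mean works out to 2(n-1)/n, which approaches 2 as n grows, and the variance approaches 2 as well, in line with the figures quoted in the abstract. `distance_in_sd` then turns an observed overlap into the kind of standard-deviation distance the abstract reports.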
|
20 |
Inference Of Piecewise Linear Systems With An Improved Method Employing Jump Detection
Selcuk, Ahmet Melih 01 September 2007
Inference of regulatory relations in dynamical systems is a promising and active research area. Recently, most investigations in this field have been stimulated by research in functional genomics. In this thesis, the inferential modeling problem for switching hybrid systems is studied. Hybrid systems are dynamical systems in which discrete and continuous variables regulate each other; in other words, the jumps and flows are interrelated. In this study, piecewise linear approximations are used for modeling, and it is shown that piecewise linear models can approximately reproduce the evolutionary characteristics of switching hybrid systems. For these systems, detecting the switching instances and inferring the locally linear parameters from empirical data provide a solid understanding of the system dynamics, so the inference methodology is built on these two tasks. The distinguishing feature of the inference algorithm is the idea of transforming the switching detection problem into a jump detection problem through derivative estimation from discrete data. Jump detection has been studied extensively in the signal processing literature, so related techniques from that literature were analyzed carefully and suitable ones were adopted in this thesis. The primary advantage of the proposed method would be its robustness in switching detection and derivative estimation. The theoretical background of this robustness claim and the importance of robustness for real-world applications are explained in detail.
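The core idea in this abstract, turning switch detection into jump detection on an estimated derivative, can be sketched on a noise-free toy system. The example below is a deliberately naive illustration under stated assumptions: a hypothetical one-dimensional system, a plain forward difference as the derivative estimate, and a fixed threshold as the jump detector. The robust estimators the thesis adopts from the signal processing literature are not reproduced here.

```python
def finite_difference(samples, dt):
    """Forward-difference derivative estimate at every sample but the last."""
    return [(b - a) / dt for a, b in zip(samples, samples[1:])]


def detect_jumps(derivative, threshold):
    """Indices where the derivative estimate changes by more than threshold."""
    return [k + 1 for k in range(len(derivative) - 1)
            if abs(derivative[k + 1] - derivative[k]) > threshold]


def fit_line(xs, ys):
    """Least-squares slope and intercept of ys against xs."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    return slope, mean_y - slope * mean_x


# Hypothetical switching system: dx/dt = -x + 2 before step 100, -x - 2 after.
dt, steps, switch_step = 0.01, 200, 100
x, trajectory = 0.0, []
for k in range(steps):
    trajectory.append(x)
    offset = 2.0 if k < switch_step else -2.0
    x += dt * (-x + offset)  # explicit Euler step

derivative = finite_difference(trajectory, dt)
jumps = detect_jumps(derivative, threshold=1.0)

# Locally linear parameters of the first regime, from samples before the jump.
first = jumps[0]
slope, intercept = fit_line(trajectory[:first], derivative[:first])
```

On this clean trajectory the single jump is found at the switching step, and the fit recovers the slope -1 and offset 2 of the first regime. With measurement noise a raw finite difference amplifies the errors, which is presumably why the abstract stresses robustness in both derivative estimation and switching detection.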
|