Global ETD Search

11	A computational approach for comparative oncogenomics using mouse models Brett, Benjamin Thomas 01 May 2014 (has links) Cancer is the second most common cause of death in the United States. It is a complex disease with environmental, genetic, and lifestyle factors influencing the likelihood of getting cancer and the development of any resulting tumor. Understanding the genetics of cancer is integral to developing novel patient-specific treatments. However, due to this complexity, hundreds to thousands of tumors are required for sufficient power to identify the network of relationships among these genes. Animal models of cancer are commonly used to reduce cost and to control experimental variables, allowing for more specific hypothesis testing. The Sleeping Beauty transposon mutagenesis system can be used to model cancer in mice. While the Sleeping Beauty mutagenesis system is an important tool in understanding cancer, it has specific computational needs. Experiments need to be analyzed in a fast, unbiased, and efficient manner. A computational method must also accurately model the system, allowing for validation and interpretation. Here I present an updated Integration Analysis System and use this system to validate the assumptions present in forward genetic screens of cancer using the Sleeping Beauty system. This system allows for rapid identification of cancer genes, but does not directly aid in understanding the relationship between the genes. Given the complexity of cancer, understanding the relationship between cancer genes is very difficult. I have created a connectedness network utilizing the STRING database to better derive an understanding of cancer genes. STRING is a database of known and predicted protein-protein interactions. The connectedness between pairs of genes is calculated using a network reliability metric. This database allows for increased power to detect known pathways when compared to STRING alone. 
Combining this connectivity network with the set of cancer genes identified by the Integration Analysis System is a strategy for rapid and efficient interpretation of the genetic results. Bioinformatics Cancer Comparative Oncogenomics Insertional Mutagenesis Network Biology Sleeping Beauty Genetics
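The connectedness calculation mentioned in the abstract above can be illustrated with a minimal Python sketch. It assumes, without confirmation from the thesis, that "network reliability" means two-terminal reliability: the probability that two genes remain connected when each interaction is present independently with a probability taken from its confidence score. The gene names A, B, C and the edge probabilities are hypothetical, and exact enumeration is only feasible for very small graphs.

```python
from itertools import product


def two_terminal_reliability(edges, source, target):
    """Exact probability that source and target are connected.

    edges maps an undirected pair (u, v) to the probability that the
    edge is present. All 2^m edge subsets are enumerated, so this is
    an illustration for tiny graphs, not a scalable method.
    """
    edge_list = list(edges.items())
    total = 0.0
    for present in product([False, True], repeat=len(edge_list)):
        prob = 1.0
        adjacency = {}
        for keep, ((u, v), p) in zip(present, edge_list):
            prob *= p if keep else (1.0 - p)
            if keep:
                adjacency.setdefault(u, set()).add(v)
                adjacency.setdefault(v, set()).add(u)
        # Depth-first search over the edges present in this subset.
        seen = {source}
        stack = [source]
        while stack:
            node = stack.pop()
            for neighbour in adjacency.get(node, ()):
                if neighbour not in seen:
                    seen.add(neighbour)
                    stack.append(neighbour)
        if target in seen:
            total += prob
    return total


# Hypothetical toy network: one direct edge and one two-step path.
toy_edges = {("A", "B"): 0.9, ("B", "C"): 0.8, ("A", "C"): 0.5}
reliability_ac = two_terminal_reliability(toy_edges, "A", "C")
```

In this toy case the indirect path A-B-C raises the connectedness of A and C above the 0.5 score of their direct edge, which is the kind of gain over raw pairwise scores the abstract alludes to.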
12	Network-based approaches to studying healthy and disease development Gao, Long 01 May 2017 (has links) Network biology has proven to be a powerful tool for representing and analyzing complex molecular networks. It has also been applied successfully across biology to help understand various biological processes. However, our current knowledge about the dynamics of gene networks during disease progression is rather limited. Moreover, network construction is a prerequisite of network analysis. When the number of samples is limited, state-of-the-art computational methods for network construction are not robust because of low statistical power. In addition, molecular networks have been used extensively to improve the inference accuracy of causal coding variants, but this potential has not been investigated to the same extent for noncoding variants. To address those limitations, I first developed the inference of multiple differential modules (iMDM) algorithm to study network dynamics. This method is able to identify both unique and shared modules from multiple gene networks, each of which denotes a different perturbation condition. Using the iMDM algorithm, I identified different types of modules to understand heart failure progression and disease dynamics. Next, I developed a computational framework to construct condition-specific transcriptional regulatory networks. I also developed a computational method to rank transcription factors in the transcriptional regulatory network. Applying this framework to RNA-seq data for hematopoietic stem cell development, I successfully constructed the corresponding transcriptional regulatory network and identified key transcription factors that play important roles. Finally, I developed Annotation of Regulatory Variants using Integrated Networks (ARVIN), a network-based algorithm, to identify causal genetic variants for diseases. 
By applying ARVIN to various diseases, we obtained a systems understanding of the gene circuitry that is affected by all enhancer mutations in a given disease. Complex diseases Genetic variants Hematopoietic stem cell Network Biology
13	Systems Biology of Human Colorectal Cancer Nibbe, Rod K. January 2009 (has links) No description available. Bioinformatics Biology Biomedical Research Genetics human colon cancer proteomics systems biology bioinformatics networks network biology
14	Algorithms for regulatory network inference and experiment planning in systems biology Pratapa, Aditya 17 July 2020 (has links) I present novel solutions to two different classes of computational problems that arise in the study of complex cellular processes. The first problem arises in the context of planning large-scale genetic cross experiments that can be used to validate predictions of multigenic perturbations made by mathematical models. (i) I present CrossPlan, a novel methodology for systematically planning genetic crosses to make a set of target mutants from a set of source mutants. CrossPlan is based on a generic experimental workflow used in performing genetic crosses in budding yeast. CrossPlan uses an integer-linear-program (ILP) to maximize the number of target mutants that we can make under certain experimental constraints. I apply it to a comprehensive mathematical model of the protein regulatory network controlling cell division in budding yeast. (ii) I formulate several natural problems related to efficient synthesis of a target mutant from source mutants. These formulations capture experimentally-useful notions of verifiability (e.g., the need to confirm that a mutant contains mutations in the desired genes) and permissibility (e.g., the requirement that no intermediate mutants in the synthesis be inviable). I present several polynomial time or fixed-parameter tractable algorithms for optimal synthesis of a target mutant for special cases of the problem that arise in practice. The second problem I address is inferring gene regulatory networks (GRNs) from single cell transcriptomic (scRNA-seq) data. These GRNs can serve as starting points to build mathematical models. (iii) I present BEELINE, a comprehensive evaluation of state-of-the-art algorithms for inferring gene regulatory networks (GRNs) from single-cell gene expression data. 
The evaluations from BEELINE suggest that the area under the precision-recall curve and early precision of these algorithms are moderate. Techniques that do not require pseudotime-ordered cells are generally more accurate. Based on these results, I present recommendations to end users of GRN inference methods. BEELINE will aid the development of gene regulatory network inference algorithms. (iv) Based on the insights gained from BEELINE, I propose a novel graph convolutional neural network (GCN) based supervised algorithm for GRN inference from single-cell gene expression data. This GCN-based model has considerably better accuracy than existing supervised learning algorithms for GRN inference from scRNA-seq data and can infer cell-type specific regulatory networks. / Doctor of Philosophy / A small number of key molecules can completely change the cell's state, for example, a stem cell differentiating into distinct types of blood cells or a healthy cell turning cancerous. How can we uncover the important cellular events that govern complex biological behavior? One approach to answering the question has been to elucidate the mechanisms by which genes and proteins control each other in a cell. These mechanisms are typically represented in the form of a gene or protein regulatory network. The resulting networks can be modeled as a system of mathematical equations, also known as a mathematical model. The advantage of such a model is that we can computationally simulate the time courses of various molecules. Moreover, we can use the model simulations to predict the effect of perturbations such as deleting one or more genes. A biologist can perform experiments to test these predictions. Subsequently, the model can be iteratively refined by reconciling any differences between the prediction and the experiment. In this thesis I present two novel solutions aimed at dramatically reducing the time and effort required for this build-simulate-test cycle. 
The first solution I propose is in prioritizing and planning large-scale gene perturbation experiments that can be used for validating existing models. I then focus on taking advantage of the recent advances in experimental techniques that enable us to measure gene activity at single-cell resolution, known as scRNA-seq. This scRNA-seq data can be used to infer the interactions in gene regulatory networks. I perform a systematic evaluation of existing computational methods for building gene regulatory networks from scRNA-seq data. Based on the insights gained from this comprehensive evaluation, I propose novel algorithms that can take advantage of prior knowledge in building these regulatory networks. The results underscore the promise of my approach in identifying cell-type specific interactions. These context-specific interactions play a key role in building mathematical models to study complex cellular processes such as a developmental process that drives transitions from one cell type to another. network biology experiment planning gene regulatory networks deep learning single cell transcriptomics
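The two accuracy measures named in the abstract above, early precision and the area under the precision-recall curve, can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the BEELINE code: early precision is taken here as the fraction of true edges among the top k predictions, with k equal to the number of edges in the reference network, and average precision is used as one common estimate of the area under the precision-recall curve. The gene names and the ranking are hypothetical.

```python
def early_precision(ranked_edges, true_edges):
    """Fraction of true edges among the top-k predictions, k = len(true_edges)."""
    k = len(true_edges)
    top_k = ranked_edges[:k]
    return sum(1 for edge in top_k if edge in true_edges) / k


def average_precision(ranked_edges, true_edges):
    """Mean of the precision values at each rank where a true edge is recovered."""
    hits = 0
    precision_sum = 0.0
    for rank, edge in enumerate(ranked_edges, start=1):
        if edge in true_edges:
            hits += 1
            precision_sum += hits / rank
    return precision_sum / len(true_edges)


# Hypothetical reference network and a predicted ranking, best edge first.
reference = {("g1", "g2"), ("g2", "g3")}
ranking = [("g1", "g2"), ("g1", "g3"), ("g2", "g3"), ("g3", "g1")]

ep = early_precision(ranking, reference)
ap = average_precision(ranking, reference)
```

Both functions take a ranked list of directed edges, so any inference method that outputs edge scores can be compared after sorting its edges by score.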
15	Working Together: Using protein networks of bacterial species to compare essentiality, centrality, and conservation in Escherichia coli. Wimble, Christopher 01 January 2015 (has links) Proteins in Escherichia coli were compared in terms of essentiality, centrality, and conservation. The hypotheses of this study are: for proteins in Escherichia coli, (1) there is a positive, measurable correlation between protein conservation and essentiality, (2) there is a positive relationship between conservation and degree centrality, and (3) essentiality and centrality also have a positive correlation. The third hypothesis was supported by a moderate correlation and the first by a weak correlation; the second hypothesis was not supported. When proteins that did not map to orthologous groups and proteins that had no interactions were removed, the relationship between essentiality and conservation increased to a strong relationship. This was due to the effect of proteins that did not map to orthologous groups and suggests that protein orthology represented by clusters of orthologous groups does not accurately depict protein conservation among the species studied. Essentiality Protein Conservation Centrality Graph Theory Protein-Protein Interaction Network PPI Escherichia coli Saccharomyces cerevisiae Baker’s Yeast Network Biology Aging Replicative Aging Target of Rapamycin TOR Interactomics Bioinformatics
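The third hypothesis above, a positive correlation between centrality and essentiality, can be made concrete with a small Python sketch. The abstract does not say which correlation coefficient was used, so Pearson correlation is an assumption here, as are the protein names P1 to P4, the interactions, and the essentiality labels.

```python
from math import sqrt


def degree_centrality(edges):
    """Degree of each node divided by (n - 1), for an undirected edge list."""
    degree = {}
    for u, v in edges:
        degree[u] = degree.get(u, 0) + 1
        degree[v] = degree.get(v, 0) + 1
    n = len(degree)
    return {node: d / (n - 1) for node, d in degree.items()}


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sd_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    sd_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (sd_x * sd_y)


# Hypothetical interaction network and essentiality labels (1 = essential).
interactions = [("P1", "P2"), ("P1", "P3"), ("P1", "P4"), ("P2", "P3")]
essential = {"P1": 1, "P2": 1, "P3": 0, "P4": 0}

centrality = degree_centrality(interactions)
proteins = sorted(centrality)
r = pearson([centrality[p] for p in proteins], [essential[p] for p in proteins])
```

Proteins with no interactions never appear in the edge list, which mirrors the filtering step the abstract describes for proteins that had no interactions.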
16	Analysis, integration and applications of the human interactome Chaurasia, Gautam 12 December 2012 (has links) Protein-protein interaction (PPI) networks provide a scaffold for systematic studies of the complex molecular machinery in the cell. The complexity of protein interactions, however, poses a major experimental and computational challenge with respect to their identification, validation and annotation. In this work I analyzed these problems and provided solutions to overcome the limitations of current human PPI networks. My work can be divided into two parts: In the first part, I carried out a critical comparison of eight independently constructed human PPI networks to detect possible experimental biases. The results showed strong biases in the selection and detection of interactions, which should be taken into account in future applications of these networks. One of the most important conclusions of this study was that the current human interaction networks are complementary; therefore a database called the Unified Human Interactome (UniHI) was developed, which integrates human PPI data from the twelve most important sources. In the second part of this research, I used the data from the UniHI database to characterize the genetic modifiers in a specific disease, Chorea Huntington (HD), an autosomal dominant neurodegenerative disease. To identify the proteins that can modify the course of the disease, protein interaction data were integrated with gene expression data from HD patients in combination with a multi-step filtering procedure. 
With this novel approach, a caudate nucleus-specific HD protein interaction (PPI) network was predicted that links 14 potentially dysregulated proteins directly or indirectly to the huntingtin protein, with possible connections to molecular processes such as apoptosis, metabolism and neuronal development. / Protein interaction networks aim to provide the scaffold maps for systematic studies of the complex molecular machinery in the cell. The complexity of protein interactions poses, however, large experimental and computational challenges regarding their identification, validation and annotation. Additionally, storage and linking are demanding since new data are rapidly accumulating. In this research work, I addressed these issues and provided solutions to overcome the limitations of current human protein-protein interaction (PPI) maps. In particular, my thesis can be partitioned into two parts: In the first part, I conducted a comparative assessment of eight recently constructed human protein-protein interaction networks to identify experimental biases. Results showed strong selection and detection biases which are necessary to take into consideration in future applications of these maps. One of the important conclusions of this study was that the current human interaction networks contain complementary information; hence, a database was developed, termed the Unified Human Interactome (UniHI), integrating human PPI data from twelve major sources. Several new tools were included for querying, analyzing and visualizing human PPI networks. In the second part of this research work, the UniHI dataset was applied to characterize the genetic modifiers involved in a specific disease: Chorea Huntington (HD), an autosomal dominant neurodegenerative disease. To find the modifiers, a network-based modeling approach was implemented by integrating a huntingtin-specific protein interaction network with gene expression data from HD patients in multiple steps. 
Using this approach, a Caudate Nucleus-specific HD protein interaction (PPI) network was predicted, connecting 14 potentially dysregulated proteins directly or indirectly to the disease protein, showing a possible link to molecular processes such as pro-apoptotic pathways, cell survival, anti-apoptotic, growth, and neuronal diseases. Systems Biology Network Biology Protein-protein Interaction Graph Analysis Huntington Disease 570 Life sciences, biology 32 Biology WD 5100 ddc:570
17	Pathway-centric approaches to the analysis of high-throughput genomics data Hänzelmann, Sonja, 1981- 11 October 2012 (has links) In the last decade, molecular biology has expanded from a reductionist view to a systems-wide view that tries to unravel the complex interactions of cellular components. Owing to the emergence of high-throughput technology it is now possible to interrogate entire genomes at an unprecedented resolution. The dimension and unstructured nature of these data made it evident that new methodologies and tools are needed to turn data into biological knowledge. To contribute to this challenge we exploited the wealth of publicly available high-throughput genomics data and developed bioinformatics methodologies focused on extracting information at the pathway rather than the single gene level. First, we developed Gene Set Variation Analysis (GSVA), a method that facilitates the organization and condensation of gene expression profiles into gene sets. GSVA enables pathway-centric downstream analyses of microarray and RNA-seq gene expression data. The method estimates sample-wise pathway variation over a population and allows for the integration of heterogeneous biological data sources with pathway-level expression measurements. To illustrate the features of GSVA, we applied it to several use-cases employing different data types and addressing biological questions. GSVA is made available as an R package within the Bioconductor project. Secondly, we developed a pathway-centric genome-based strategy to reposition drugs in type 2 diabetes (T2D). This strategy consists of two steps: first a regulatory network is constructed that is used to identify disease driving modules and then these modules are searched for compounds that might target them. 
Our strategy is motivated by the observation that disease genes tend to group together in the same neighborhood forming disease modules and that multiple genes might have to be targeted simultaneously to attain an effect on the pathophenotype. To find potential compounds, we used compound-exposed genomics data deposited in public databases. We collected about 20,000 samples that have been exposed to about 1,800 compounds. Gene expression can be seen as an intermediate phenotype reflecting underlying dysregulatory pathways in a disease. Hence, genes contained in the disease modules that elicit similar transcriptional responses upon compound exposure are assumed to have a potential therapeutic effect. We applied the strategy to gene expression data of human islets from diabetic and healthy individuals and identified four potential compounds, methimazole, pantoprazole, bitter orange extract and torcetrapib, that might have a positive effect on insulin secretion. This is the first time a regulatory network of human islets has been used to reposition compounds for T2D. In conclusion, this thesis contributes two pathway-centric approaches to important bioinformatic problems, such as the assessment of biological function and in silico drug repositioning. These contributions demonstrate the central role of pathway-based analyses in interpreting high-throughput genomics data. / In the last decade, molecular biology has evolved from a reductionist perspective towards a systems-level perspective that attempts to decipher the complex interactions between cellular components. With the emergence of high-throughput technologies it is now possible to interrogate entire genomes at an unprecedented resolution. The size and unstructured nature of these data have made clear the need to develop new tools and methodologies to turn these data into biological knowledge. 
To contribute to this challenge we have exploited the abundance of publicly available genomic data from high-throughput instruments and developed bioinformatics methods focused on extracting information at the level of the molecular pathway rather than at the level of each individual gene. First, we developed GSVA (Gene Set Variation Analysis), a method that facilitates the organization and condensation of gene expression profiles into gene sets. GSVA enables downstream analyses in terms of molecular pathways with gene expression data from microarrays and RNA-seq. The method estimates the variation of molecular pathways across a population of samples and allows the integration of heterogeneous sources of biological data with pathway-level expression measurements. To illustrate the features of GSVA, we applied it to several cases using different data types and addressing biological questions. GSVA is available as a free software package for R within the Bioconductor project. Second, we developed a pathway-centric, genome-based strategy to reposition drugs for type 2 diabetes (T2D). This strategy consists of two phases: first a regulatory network is constructed and used to identify gene regulation modules that drive the disease; then these modules are searched for compounds that could affect them. Our strategy is motivated by the observation that genes that cause a disease tend to cluster together, forming pathogenic modules, and by the fact that simultaneous action on multiple genes might be needed to achieve an effect on the disease phenotype. To find potential compounds, we used compound-exposed genomic data deposited in public databases. We collected about 20,000 samples that had been exposed to about 1,800 compounds. 
Gene expression can be interpreted as an intermediate phenotype that reflects the dysregulated molecular pathways underlying a disease. We therefore consider that the genes of a disease module that respond, at the transcriptional level, in a similar way to drug exposure potentially have a therapeutic effect. We applied this strategy to gene expression data from human pancreatic islets of healthy and diabetic individuals and identified four potential compounds (methimazole, pantoprazole, bitter orange extract and torcetrapib) that could have a positive effect on insulin secretion. This is the first time a regulatory network of human pancreatic islets has been used to reposition compounds for T2D. In conclusion, this thesis contributes two different pathway-based approaches to important bioinformatics problems, such as the assessment of biological function and in silico drug repositioning. These contributions demonstrate the central role of pathway-based analyses in interpreting genomic data from high-throughput instruments. Functional Genomics Systems Biology Network Biology Microarray analysis RNA-seq Drug repurposing Diabetes Gene Set Enrichment Analysis Reverse-engineering of networks RNA sequencing Network inference 577
18	Global functional association network inference and crosstalk analysis for pathway annotation Ogris, Christoph January 2017 (has links) Cell functions are steered by complex interactions of gene products, like forming a temporary or stable complex, altering gene expression or catalyzing a reaction. Mapping these interactions is key to understanding biological processes and therefore is the focus of numerous experiments and studies. Small-scale experiments deliver high quality data but lack coverage whereas high-throughput techniques cover thousands of interactions but can be error-prone. Unfortunately, all of these approaches can only focus on one type of interaction at a time. This makes experimental mapping of the genome-wide network a cost- and time-intensive procedure. To overcome these problems, different computational approaches have been suggested that integrate multiple data sets and/or different evidence types. This widens the stringent definition of an interaction and introduces a more general term - functional association. FunCoup is a database for genome-wide functional association networks of Homo sapiens and 16 model organisms. FunCoup distinguishes between five different functional associations: co-membership in a protein complex, physical interaction, participation in the same signaling cascade, participation in the same metabolic process and for prokaryotic species, co-occurrence in the same operon. For each class, FunCoup applies naive Bayesian integration of ten different evidence types of data, to predict novel interactions. It further uses orthologs to transfer interaction evidence between species. This considerably increases coverage, and allows inference of comprehensive networks even for not well studied organisms. BinoX is a novel method for pathway analysis and determining the relation between gene sets, using functional association networks. 
Traditionally, pathway annotation has been done using gene overlap only, but these methods capture only a small part of the whole picture. Placing the gene sets in context of a network provides additional evidence for pathway analysis, revealing a global picture based on the whole genome. PathwAX is a web server based on the BinoX algorithm. A user can input a gene set and get online network crosstalk based pathway annotation. PathwAX uses the FunCoup networks and 280 pre-defined pathways. Most runs take just a few seconds and the results are summarized in an interactive chart the user can manipulate to gain further insights into the gene set's pathway associations. / At the time of the doctoral defense, the following paper was unpublished and had a status as follows: Paper 2: Manuscript. biological networks global gene association networks gene networks protein networks functional association functional coupling network biology pathway analysis pathway annotation pathway enrichment network-based enrichment enrichment Bioinformatics and Systems Biology Bioinformatik och systembiologi
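The naive Bayesian integration mentioned in the abstract above can be illustrated generically in Python. This is a textbook sketch, not FunCoup's implementation: it assumes each evidence type contributes an independent likelihood ratio for a gene pair, and that these are multiplied with prior odds to obtain a posterior probability of functional association. The likelihood ratios and the prior odds below are hypothetical values.

```python
from math import exp, log


def integrate_evidence(likelihood_ratios, prior_odds):
    """Combine independent likelihood ratios into a posterior probability.

    Under the naive (independence) assumption the posterior odds are the
    prior odds multiplied by every likelihood ratio; the sum is done in
    log space to stay numerically stable with many evidence types.
    """
    log_odds = log(prior_odds) + sum(log(lr) for lr in likelihood_ratios)
    odds = exp(log_odds)
    return odds / (1.0 + odds)


# Hypothetical evidence for one gene pair: two supporting sources and
# one weakly contradicting source, with prior odds of 1 in 100.
posterior = integrate_evidence([4.0, 2.5, 0.5], prior_odds=0.01)
```

A likelihood ratio above 1 raises the posterior and one below 1 lowers it, so weak or contradicting evidence types are down-weighted rather than discarded.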
19	Integrative approaches to investigate the molecular basis of diseases and adverse drug reactions: from multivariate statistical analysis to systems biology Bauer-Mehren, Anna 08 November 2010 (has links) Despite some great successes, many human diseases cannot yet be effectively treated, prevented or cured. Moreover, prescribed drugs are often not very efficient and cause undesired side effects. Hence, there is a need to investigate the molecular basis of diseases and adverse drug reactions in more detail. For this purpose, relevant biomedical data needs to be gathered, integrated and analysed in a meaningful way. In this regard, we have developed novel integrative analysis approaches based on both perspectives, classical multivariate statistics and systems biology. A novel multilevel statistical method has been developed for exploiting molecular and pharmacological information for a set of drugs in order to investigate undesired side effects. Systems biology approaches have been used to study the genetic basis of human diseases at a global scale. For this purpose, we have developed an integrated gene-disease association database and tools for user-friendly access and analysis. We showed that modularity applies for mendelian, complex and environmental diseases and identified disease-related core biological processes. We have constructed a workflow to investigate adverse drug reactions using our gene-disease association database. A detailed study of currently available pathway data has been performed to evaluate its applicability to build network models. Finally, a strategy to integrate information about sequence variations with biological pathways has been implemented to study the effect of the sequence variations on biological processes. 
In summary, the developed methods are of immense practical value for other biomedical researchers and can help improve the understanding of the molecular basis of diseases and adverse drug reactions. / Although effective treatments exist for some diseases, many still have no cure or effective treatment. Likewise, drugs can be ineffective or cause undesirable side effects. It is therefore necessary to investigate in depth the molecular basis of diseases and of drug side effects. To this end, the relevant biomedical data need to be identified and analysed in an integrated way. In this regard, we have developed new methods for the analysis and integration of biomedical data, ranging from multivariate statistical analysis to systems biology. First, we developed a new multilevel statistical method for exploiting the molecular and pharmacological information of a set of drugs in order to investigate undesired side effects. We then used systems biology methods to study the genetic basis of human diseases at a global scale. For this purpose, we integrated gene-disease associations into a database and developed tools for easy access to and analysis of the data. We show that mendelian, complex and environmental diseases exhibit modularity and identify the biological processes related to these diseases. We built a tool to investigate adverse drug reactions based on our gene-disease association database. We carried out a detailed study of the available data on biological processes to evaluate their applicability to building dynamic models. 
Finally, we developed a strategy to integrate information about gene sequence variations with biological processes in order to study the effect of these variations on biological processes. In summary, the methods presented in this thesis constitute a valuable tool for other researchers and can help improve the understanding of the molecular basis of diseases and adverse drug reactions. biological networks biological processes network analysis drug safety signal genetic origin of disease network biology biological pathway gene-disease associations multivariate statistical analysis adverse drug reactions disease biology biomedical research systems biology data integration computational biology bioinformatics 57
20	Functional association networks for disease gene prediction Guala, Dimitri January 2017 (has links) Mapping of the human genome has been instrumental in understanding diseases caused by changes in single genes. However, disease mechanisms involving multiple genes have proven to be much more elusive. Their complexity emerges from interactions of intracellular molecules and makes them immune to the traditional reductionist approach. Only by modelling this complex interaction pattern using networks is it possible to understand the emergent properties that give rise to diseases. The overarching term used to describe both physical and indirect interactions involved in the same functions is functional association. FunCoup is one of the most comprehensive networks of functional association. It uses a naïve Bayesian approach to integrate high-throughput experimental evidence of intracellular interactions in humans and multiple model organisms. In the first update, both the coverage and the quality of the interactions were increased and a feature for comparing interactions across species was added. The latest update involved a complete overhaul of all data sources, including a refinement of the training data and addition of new classes and sources of interactions as well as six new species. Disease-specific changes in genes can be identified using high-throughput genome-wide studies of patients and healthy individuals. To understand the underlying mechanisms that produce these changes, they can be mapped to collections of genes with known functions, such as pathways. BinoX was developed to map altered genes to pathways using the topology of FunCoup. This approach combined with a new random model for comparison enables BinoX to outperform traditional gene-overlap-based methods and other network-based techniques. Results from high-throughput experiments are challenged by noise and biases, resulting in many false positives. 
Statistical attempts to correct for these challenges have led to a reduction in coverage. Both limitations can be remedied using prioritisation tools such as MaxLink, which ranks genes using guilt by association in the context of a functional association network. MaxLink’s algorithm was generalised to work with any disease phenotype and its statistical foundation was strengthened. MaxLink’s predictions were validated experimentally using FRET. The availability of prioritisation tools without an appropriate way to compare them makes it difficult to select the correct tool for a problem domain. A benchmark to assess performance of prioritisation tools in terms of their ability to generalise to new data was developed. FunCoup was used for prioritisation while testing was done using cross-validation of terms derived from Gene Ontology. This resulted in a robust and unbiased benchmark for evaluation of current and future prioritisation tools. Surprisingly, previously superior tools based on global network structure were shown to be inferior to a local network-based tool when performance was analysed on the most relevant part of the output, i.e. the top ranked genes. This thesis demonstrates how a network that models the intricate biology of the cell can contribute valuable insights for researchers that study diseases with complex genetic origins. The developed tools will help the research community to understand the underlying causes of such diseases and discover new treatment targets. The robust way to benchmark such tools will help researchers to select the proper tool for their problem domain. / At the time of the doctoral defense, the following papers were unpublished and had a status as follows: Paper 5: Manuscript. 
Paper 6: Manuscript. network biology biological networks network prediction functional association functional coupling network integration functional association networks genome-wide association networks gene networks protein networks fret functional enrichment analysis network cross-talk pathway annotation gene prioritisation network-based gene prioritization benchmarking Bioinformatics and Systems Biology Bioinformatik och systembiologi
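The guilt-by-association ranking described in the abstract above can be sketched in Python as follows. This is a deliberately simplified illustration and not the MaxLink algorithm itself: it assumes candidates are ranked purely by the number of direct network links to known disease (seed) genes, and it omits any statistical testing. The gene names S1, S2, C1, C2, C3 and the network are hypothetical.

```python
def rank_by_seed_links(edges, seeds):
    """Rank non-seed genes by their number of direct links to seed genes.

    edges is an undirected edge list; ties are broken alphabetically so
    the ordering is deterministic.
    """
    seed_set = set(seeds)
    counts = {}
    for u, v in edges:
        if u in seed_set and v not in seed_set:
            counts[v] = counts.get(v, 0) + 1
        elif v in seed_set and u not in seed_set:
            counts[u] = counts.get(u, 0) + 1
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))


# Hypothetical network: C1 touches both seeds, C2 one seed, C3 none.
network = [
    ("S1", "C1"),
    ("S2", "C1"),
    ("S1", "C2"),
    ("C2", "C3"),
    ("S2", "S1"),
]
ranking = rank_by_seed_links(network, seeds=["S1", "S2"])
```

Because only direct neighbours of the seeds are scored, this is a local network method in the sense used in the abstract, as opposed to approaches that propagate scores over the global network structure.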
