Return to search

Unraveling the Structure and Assessing the Quality of Protein Interaction Networks with Power Graph Analysis

Molecular biology has entered an era of systematic and automated experimentation. High-throughput techniques have moved biology from small-scale experiments focused on specific genes and proteins to genome and proteome-wide screens. One result of this endeavor is the compilation of complex networks of interacting proteins. Molecular biologists hope to understand life's complex molecular machines by studying these networks. This thesis addresses tree open problems centered upon their analysis and quality assessment.

First, we introduce power graph analysis as a novel approach to the representation and visualization of biological networks. Power graphs are a graph theoretic approach to lossless and compact representation of complex networks. It groups edges into cliques and bicliques, and nodes into a neighborhood hierarchy. We demonstrate power graph analysis on five examples, and show its advantages over traditional network representations. Moreover, we evaluate the algorithm performance on a benchmark, test the robustness of the algorithm to noise, and measure its empirical time complexity at O (e1.71)- sub-quadratic in the number of edges e.

Second, we tackle the difficult and controversial problem of data quality in protein interaction networks. We propose a novel measure for accuracy and completeness of genome-wide protein interaction networks based on network compressibility. We validate this new measure by i) verifying the detrimental effect of false positives and false negatives, ii) showing that gold standard networks are highly compressible, iii) showing that authors' choice of confidence thresholds is consistent with high network compressibility, iv) presenting evidence that compressibility is correlated with co-expression, co-localization and shared function, v) showing that complete and accurate networks of complex systems in other domains exhibit similar levels of compressibility than current high quality interactomes.

Third, we apply power graph analysis to networks derived from text-mining as well to gene expression microarray data. In particular, we present i) the network-based analysis of genome-wide expression profiles of the neuroectodermal conversion of mesenchymal stem cells. ii) the analysis of regulatory modules in a rare mitochondrial cytopathy: emph{Mitochondrial Encephalomyopathy, Lactic acidosis, and Stroke-like episodes} (MELAS), and iii) we investigate the biochemical causes behind the enhanced biocompatibility of tantalum compared with titanium.

Identiferoai:union.ndltd.org:DRESDEN/oai:qucosa.de:bsz:14-qucosa-62562
Date12 December 2017
CreatorsRoyer, Loic
ContributorsTechnische Universtiät Dresden, Fakultät Informatik, Prof. Dr. Michael Schroeder, Prof. Dr. Ralf Zimmer
PublisherSaechsische Landesbibliothek- Staats- und Universitaetsbibliothek Dresden
Source SetsHochschulschriftenserver (HSSS) der SLUB Dresden
LanguageEnglish
Detected LanguageEnglish
Typedoc-type:doctoralThesis
Formatapplication/pdf

Page generated in 0.0021 seconds