Towards the integration of structural and systems biology: structure-based studies of protein-protein interactions on a genome-wide scale

Knowledge of protein-protein interactions (PPIs) is essential to understanding regulatory processes in a cell. High-throughput experimental methods have made significant contributions to PPI determination, but they are known to produce many false positives and to miss a significant portion of bona fide interactions. The same is true of the many computational tools that have been developed. Significantly, although protein structures provide atomic details of PPIs, they have had relatively little impact on large-scale PPI prediction, and there has been only limited overlap between structural and systems biology. In this thesis, I present our progress in combining structural biology and systems biology in the analysis, coarse-grained modeling, and prediction of protein-protein interactions.
I first report a comprehensive analysis of the degree to which the location of a protein interface is conserved among sets of proteins that share different levels of similarity. Our results show that while interface conservation is, in general, most significant among close neighbors, it remains significant even for remote structural neighbors. Based on this finding, we designed PredUs, a method that predicts protein interfaces simply by "mapping" the interface information from the structural neighbors (i.e., "templates") of a target structure onto that target. We developed the PredUs web server to predict protein interfaces using this "template-based" method, with a support vector machine (SVM) to further improve predictions. The PredUs web server outperforms other state-of-the-art methods, which are typically based on amino acid properties, in both prediction precision and recall. PredUs also runs very fast and can be used to study protein interfaces in a high-throughput fashion. Perhaps more importantly, it is not sensitive to local conformational changes or small errors in structures, and thus can be applied to predict the interfaces of protein homology models when experimental structures are not available.
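The template-based "mapping" idea described above can be illustrated with a minimal sketch. This is not the PredUs implementation: the function names, the dictionary-based alignment representation, the per-template weights, and the fixed threshold are all assumptions made for illustration, and a simple weighted vote stands in for the SVM step mentioned in the abstract.

```python
def map_interface_from_templates(target_length, templates):
    """Score each target residue by how often aligned template residues are interfacial.

    `templates` is a list of (alignment, interface_positions, weight) tuples, where
    `alignment` maps a template residue index to a target residue index,
    `interface_positions` is the set of template residue indices in an interface,
    and `weight` is a hypothetical measure of structural similarity to the target.
    """
    scores = [0.0] * target_length
    total_weight = sum(weight for _, _, weight in templates)
    if total_weight == 0:
        return scores
    for alignment, interface_positions, weight in templates:
        for template_pos in interface_positions:
            target_pos = alignment.get(template_pos)
            # Skip template interface residues that have no aligned target residue.
            if target_pos is not None and 0 <= target_pos < target_length:
                scores[target_pos] += weight
    # Normalize so each score is the weighted fraction of templates voting "interface".
    return [score / total_weight for score in scores]


def predict_interface(scores, threshold=0.5):
    """Return target residue indices whose mapped score reaches the (assumed) threshold."""
    return [index for index, score in enumerate(scores) if score >= threshold]
```

For example, with one close template (weight 3.0) whose interface residue aligns to target residue 1 and one remote template (weight 1.0) whose interface residue aligns to target residue 3, the scores are 0.75 and 0.25 respectively, so only residue 1 passes a threshold of 0.5.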
I then describe a novel structural modeling method that uses geometric relationships between protein structures, including both PDB structures and homology models, to accurately predict PPIs on a genome-wide scale. We applied the method with considerable success to both the yeast and the human genomes. We found that the accuracy and coverage of our structure-based predictions compare favorably with those of methods derived from sequence and functional clues, e.g., sequence similarity, co-expression, and phylogenetic similarity. Results improve further when a naive Bayesian classifier is used to combine structural information with non-structural clues (PREPPI), yielding predictions of comparable quality to high-throughput experiments. Our data further suggest that PREPPI predictions are substantially complementary to those from experimental methods, thus providing a way to dissect interactions that would be hard to identify on a purely high-throughput experimental basis.
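How a naive Bayesian classifier combines independent sources of evidence can be sketched as follows. This is a generic illustration of the technique, not the PREPPI code: the function names and the idea of supplying one precomputed likelihood ratio per clue are assumptions. Under the naive independence assumption, the per-clue likelihood ratios multiply, and the product scales the prior odds of interaction.

```python
import math


def combined_likelihood_ratio(likelihood_ratios):
    """Multiply per-clue likelihood ratios, assuming the clues are independent.

    Each ratio is P(evidence | interacting) / P(evidence | non-interacting)
    and must be strictly positive. Summing logs avoids overflow for many clues.
    """
    log_ratio = sum(math.log(ratio) for ratio in likelihood_ratios)
    return math.exp(log_ratio)


def posterior_probability(likelihood_ratios, prior_probability):
    """Convert prior probability to posterior probability via Bayes' rule on odds."""
    prior_odds = prior_probability / (1.0 - prior_probability)
    posterior_odds = prior_odds * combined_likelihood_ratio(likelihood_ratios)
    return posterior_odds / (1.0 + posterior_odds)
```

In this form, a pair supported by several moderately informative clues (for instance a structural score and a co-expression score, each with a ratio above 1) can reach a higher combined ratio than any single clue provides on its own.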
We have, for the first time, designed a "template-based" method that predicts protein interfaces with high precision and recall. We have also, for the first time, used 3D structure as part of the repertoire of experimental and computational information and found a way to accurately infer PPIs on a large scale. The success of PredUs and PREPPI can be attributed to the exploitation of both the information contained in imperfect models and the remote structure-function relationships between proteins that have usually been considered unrelated. Our results constitute a significant paradigm shift in both structural and systems biology and suggest that the two fields can be integrated to an extent that has not been possible in the past.

Identifier: oai:union.ndltd.org:columbia.edu/oai:academiccommons.columbia.edu:10.7916/D8D234GH
Date: January 2012
Creators: Zhang, Qiangfeng Cliff
Source Sets: Columbia University
Language: English
Detected Language: English
Type: Theses
