Biological data are inherently interconnected: protein sequences are linked to their annotations, the annotations are organized into ontologies, and so on. While protein-protein interactions are already represented as graphs, this work shows how a graph structure can be used to enrich the annotation of protein sequences through algorithms that analyze the graph topology. It also describes a novel solution that limits the data generation needed to build such a graph by combining constraints on the data with dynamic programming. In the ideal case, the proposed algorithm improves the generation time by a factor of 5. The graph representation is then exploited to build a comprehensive database using the emerging technology of graph databases. While graph databases are widely used for other kinds of data, from Twitter tweets to recommendation systems, their application to bioinformatics is new. The proposed graph database has a structure that can be easily expanded and queried.
Identifier | oai:union.ndltd.org:unibo.it/oai:amsdottorato.cib.unibo.it:6914 |
Date | 04 June 2015 |
Creators | Profiti, Giuseppe <1980> |
Contributors | Casadio, Rita |
Publisher | Alma Mater Studiorum - Università di Bologna |
Source Sets | Università di Bologna |
Language | English |
Detected Language | English |
Type | Doctoral Thesis, PeerReviewed |
Format | application/pdf |
Rights | info:eu-repo/semantics/openAccess |