Return to search

An XML-based Database of Molecular Pathways / En XML-baserad databas för molekylära reaktioner

Research of protein-protein interactions produce vast quantities of data and there exists a large number of databases with data from this research. Many of these databases offers the data for download on the web in a number of different formats, many of them XML-based. With the arrival of these XML-based formats, and especially the standardized formats such as PSI-MI, SBML and BioPAX, there is a need for searching in data represented in XML. We wanted to investigate the capabilities of XML query tools when it comes to searching in this data. Due to the large datasets we concentrated on native XML database systems that in addition to search in XML data also offers storage and indexing specially suited for XML documents. A number of queries were tested on data exported from the databases IntAct and Reactome using the XQuery language. There were both simple and advanced queries performed. The simpler queries consisted of queries such as listing information on a specified protein or counting the number of reactions. One central issue with protein-protein interactions is to find pathways, i.e. series of interconnected chemical reactions between proteins. This problem involve graph searches and since we suspected that the complex queries it required would be slow we also developed a C++ program using a graph toolkit. The simpler queries were performed relatively fast. Pathway searches in the native XML databases took long time even for short searches while the C++ program achieved much faster pathway searches.

Identiferoai:union.ndltd.org:UPSALLA1/oai:DiVA.org:liu-3717
Date January 2005
CreatorsHall, David
PublisherLinköpings universitet, Institutionen för datavetenskap, Institutionen för datavetenskap
Source SetsDiVA Archive at Upsalla University
LanguageEnglish
Detected LanguageEnglish
TypeStudent thesis, info:eu-repo/semantics/bachelorThesis, text
Formatapplication/pdf
Rightsinfo:eu-repo/semantics/openAccess

Page generated in 0.0021 seconds