In proteomic mass spectrometry, proteins can be sequenced by two independent yet complementary algorithms: de novo sequencing, which uses no prior knowledge, and database search, which relies on existing protein databases. For cases where an organism’s protein database is not available, the software Spider was developed to search sequence tags produced by de novo sequencing against a database from a related organism while accounting for both errors in the sequence tags and mutations.
This thesis further develops Spider by introducing the concept of reconstruction, which predicts the real sequence by considering both the sequence tags and their matched homologous peptides. The significant value of these reconstructed sequences is demonstrated. Additionally, the runtime is greatly reduced, and the computation is separated into independent caching and matching steps.
This new approach allows for the development of an efficient algorithm for search. In addition, the algorithm’s output can be used for new applications. This is illustrated by a contribution to a complete protein sequencing application.
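As a rough illustration of the homology search idea described above, the following minimal sketch scores a de novo sequence tag against candidate peptides from a related organism using plain edit distance. It is not Spider's algorithm: the function names `tag_distance` and `best_match` are hypothetical, the unit costs are an assumption, and the abstract does not say how Spider separates sequencing errors from mutations. The only domain fact relied on is that leucine (L) and isoleucine (I) have identical mass, so mass spectrometry alone cannot tell them apart.

```python
from typing import Iterable


def tag_distance(tag: str, peptide: str) -> int:
    """Edit distance between a sequence tag and a peptide.

    Assumption for illustration: every substitution, insertion, and
    deletion costs 1, and I/L are treated as the same residue because
    they are isobaric.
    """
    a = tag.replace("I", "L")
    b = peptide.replace("I", "L")
    # prev[j] holds the distance between the processed prefix of a and b[:j].
    prev = list(range(len(b) + 1))
    for i, residue_a in enumerate(a, start=1):
        curr = [i]
        for j, residue_b in enumerate(b, start=1):
            cost = 0 if residue_a == residue_b else 1
            curr.append(min(prev[j] + 1,          # deletion from the tag
                            curr[j - 1] + 1,      # insertion into the tag
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1]


def best_match(tag: str, peptides: Iterable[str]) -> str:
    """Return the candidate peptide closest to the tag (hypothetical helper)."""
    return min(peptides, key=lambda p: tag_distance(tag, p))
```

A real tool would need a scoring model that distinguishes likely de novo errors from likely mutations; uniform costs are used here only to keep the sketch short.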
Identifier | oai:union.ndltd.org:WATERLOO/oai:uwspace.uwaterloo.ca:10012/5853 |
Date | January 2011 |
Creators | Yuen, Denis |
Source Sets | University of Waterloo Electronic Theses Repository |
Language | English |
Detected Language | English |
Type | Thesis or Dissertation |