In shotgun proteomics, liquid chromatography separates peptides so that as few as possible enter the mass spectrometer at the same time. Each peptide has a retention time, that is, the time it takes to pass through the chromatography column. Predicting the retention time can increase the number of identified peptides and can help in designing targeted proteomics experiments. Machine learning methods such as support vector machines have achieved high prediction accuracy, but they require the features on which retention time depends to be known in advance. In this thesis we let a convolutional network learn to rank peptides by retention time instead of predicting the retention times themselves. We also tested how prediction accuracy depends on the size of the training set. We found that pairwise ranking of peptides outperforms pointwise ranking, and that adding training data kept increasing accuracy up to the largest training set tested, without increasing training time. This suggests that accuracy can be increased further by training on even larger training sets.
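The difference between pointwise and pairwise ranking can be illustrated with a minimal sketch. The abstract does not state which loss functions the thesis used, so the squared-error pointwise loss and the logistic (RankNet-style) pairwise loss below are assumptions chosen for illustration, as are all function names and the example retention times and scores.

```python
import math
from itertools import combinations


def pointwise_loss(scores, times):
    """Mean squared error between each score and its retention time (assumed loss)."""
    return sum((s - t) ** 2 for s, t in zip(scores, times)) / len(times)


def pairwise_loss(scores, times):
    """Mean logistic loss over all peptide pairs with distinct retention times (assumed loss)."""
    losses = []
    for i, j in combinations(range(len(times)), 2):
        if times[i] == times[j]:
            continue  # a tie carries no ordering information
        # sign is +1 if peptide i elutes later than peptide j, otherwise -1
        sign = 1.0 if times[i] > times[j] else -1.0
        margin = sign * (scores[i] - scores[j])
        losses.append(math.log1p(math.exp(-margin)))
    return sum(losses) / len(losses) if losses else 0.0


def pair_accuracy(scores, times):
    """Fraction of peptide pairs whose score order matches their retention-time order."""
    correct = total = 0
    for i, j in combinations(range(len(times)), 2):
        if times[i] == times[j]:
            continue
        total += 1
        if (scores[i] - scores[j]) * (times[i] - times[j]) > 0:
            correct += 1
    return correct / total if total else 0.0


# Hypothetical retention times (minutes) and model scores on an arbitrary scale.
times = [12.5, 30.1, 45.0, 60.2]
scores = [0.1, 0.4, 0.3, 0.9]

print(pair_accuracy(scores, times))
print(pairwise_loss(scores, times))
```

The pairwise loss depends only on score differences, so the network's output does not need to be on the same scale as the retention times; only the ordering has to be right. A pointwise loss, by contrast, penalizes any deviation from the measured time itself.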
Identifier | oai:union.ndltd.org:UPSALLA1/oai:DiVA.org:kth-224794 |
Date | January 2018 |
Creators | Kruczek, Daniel |
Publisher | KTH, Skolan för elektroteknik och datavetenskap (EECS) |
Source Sets | DiVA Archive at Uppsala University |
Language | English |
Detected Language | English |
Type | Student thesis, info:eu-repo/semantics/bachelorThesis, text |
Format | application/pdf |
Rights | info:eu-repo/semantics/openAccess |
Relation | TRITA-EECS-EX ; 2018:77 |