
Improving Protein Identification In Mass Spectrometry Imaging Using Machine Learning and Spatial Spectral Information

Shahryari Fard, Soroush 17 January 2022
Mass spectrometry imaging (MSI) is a high-throughput technique that, in addition to performing protein identification, can capture the spatial localization of proteins within biological tissue. Nevertheless, sample pre-processing and MSI instrumentation limit protein identification capability in MSI compared to more standard tandem mass spectrometry-based proteomics methods. Despite these limitations, the protein identification approaches currently used in MSI were originally designed for standard mass spectrometry-based proteomics and do not take advantage of the spatial information acquired in MSI. Herein, I explore the benefit of using spatial spectral information for protein identification through two objectives. For the first objective, I developed a novel supervised-learning, spatially-aware protein identification algorithm (SAPID) for mass spectrometry imaging and benchmarked it against ProteinProphet and Percolator, which are state-of-the-art tools for protein identification confidence assessment. I showed that SAPID identifies on average 20% more proteins at <1% false discovery rate than the other two algorithms. Furthermore, more proteins are identified when spatial features are used than when they are not, suggesting that the spatial features provide additional benefit. For the second objective, I used SAPID to rescue false positive and false negative protein identifications made by ProteinProphet. By examining combinations of data sampling and learning algorithms, I was able to achieve good classification performance relative to the baseline despite the extreme imbalance in the dataset. Finally, by improving proteome characterization in MSI, this approach will help provide a better understanding of the processes taking place in biological tissues.
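
To make the "<1% false discovery rate" comparison concrete, the following is a minimal sketch of how identifications might be counted at a fixed FDR threshold. It assumes a target-decoy estimate (decoys divided by targets among accepted identifications), which the abstract does not state; the function name `count_at_fdr` and the example scores are hypothetical and are not taken from SAPID, ProteinProphet, or Percolator.

```python
def count_at_fdr(scores, is_decoy, fdr_threshold=0.01):
    """Count target identifications accepted at a given estimated FDR.

    Assumption (not from the abstract): FDR is estimated with a
    target-decoy strategy as decoys / targets among all identifications
    scoring at or above the cutoff.
    """
    # Rank identifications from most to least confident.
    ranked = sorted(zip(scores, is_decoy), key=lambda pair: pair[0], reverse=True)

    targets = 0
    decoys = 0
    best = 0  # largest number of targets seen while the estimate stays within the threshold
    for _score, decoy in ranked:
        if decoy:
            decoys += 1
        else:
            targets += 1
        if targets > 0 and decoys / targets <= fdr_threshold:
            best = targets
    return best


# Hypothetical scores for illustration only.
example_scores = [0.99, 0.95, 0.90, 0.80, 0.70]
example_is_decoy = [False, False, False, True, False]
accepted = count_at_fdr(example_scores, example_is_decoy, fdr_threshold=0.01)
print(accepted)
```

Under this kind of counting, two scoring methods can be compared on the same data by how many target proteins each accepts at the same threshold.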
