
Artificial Intelligence Platform for Mass Spectrometry based Lipidomics Experimentation and Data Analysis

Connor Hammond Beveridge (19818204) 11 October 2024
Structural elucidation of unknown small molecules using tandem mass spectrometry (MS/MS) is a significant challenge because it requires either expert interpretation or peak-matching software. These methods are time-consuming or constrained by the molecular entries available in databases. To address these limitations, we introduce a machine learning (ML) framework designed to predict functional groups from MS/MS spectra without requiring database searches. The ML models were trained using data from the MassBank of North America (MoNA) and a custom MS/MS database generated using a high-throughput desorption electrospray ionization (DESI) MS/MS platform. This approach showcases the transferability of model predictions across different instruments, independent of the acquisition method. Our models achieved an average molecular F1 score of 87% for MoNA data and 76% for data generated with DESI, with corresponding molecular accuracies of 94% and 87%, respectively. These results demonstrate the robustness of the ML framework across diverse datasets and confirm its validity in predicting structural information.

To further validate the robustness of our machine learning framework, the models were tested on blind datasets from independent sources, provided by Corteva Agriscience and Merck & Co., Inc. These datasets were entirely separate from those used in training and validation, ensuring no prior exposure during model development. The blind datasets included spectra generated using a different instrument and a different ionization method, electrospray ionization (ESI), in contrast to the DESI method used to generate the custom MS/MS database on which one of the models was trained. When tested on this unseen data, the model trained on the MoNA database, which also contains ESI spectra, achieved an average molecular F1 score of 80% with an accuracy of 88%. The model trained on DESI-generated data performed comparably, achieving an average molecular F1 score of 78% with an accuracy of 85%. These findings emphasize the practical applicability of the ML framework for mass spectrometry in real-world scenarios, where diverse data sources, instruments, and ionization methods are commonly encountered.

In a separate application, lipidomics (the comprehensive study of lipids in a biological system) relies heavily on tandem MS experiments. In particular, multiple reaction monitoring (MRM) profiling has gained considerable attention in MS-based lipidomics because it enables rapid screening of lipids in biological samples. While effective, lipidomic MRM-profiling workflows generate large amounts of data that can be daunting to process. Moreover, existing software tools for lipid annotation often lack comprehensive automated workflows and integration with statistical and bioinformatics platforms. To address these challenges, we developed the Comprehensive Lipidomic Automated Workflow (CLAW) platform, which includes end-to-end modules for worklist generation, lipid annotation, parsing, statistical analysis, and bioinformatic analysis. CLAW is designed specifically for MRM profiling, including isomer-specific MRM profiling conducted using ozone electrospray ionization (OzESI), which affords lipid species annotation at the carbon-carbon double bond (C=C) level. Notably, CLAW is a first-of-its-kind platform with integrated AI agents. Through an integrated language user interface (LUI) built on large language models (LLMs), users can interact with CLAW via a chatbot terminal. Specifically, AI agents use LLMs to interpret human language and interact with the tools and functionalities provided by the CLAW framework, offering robust and context-aware assistance. This marks the first end-to-end application of an AI agent in mass spectrometry lipidomics. The innovation represents a significant step forward in automating complex workflows and making advanced analytical tools more accessible to researchers, ultimately accelerating scientific discovery in the field. In short, this AI-driven interface allows users to perform complex statistical analyses more efficiently, positioning CLAW as a powerful tool for high-throughput lipidomics analysis. Successful applications of CLAW to a broad range of biological samples afforded unprecedented insights into disease pathogenesis, including that of neurodegenerative diseases such as Alzheimer's disease.
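The abstract reports an "average molecular F1 score" and "molecular accuracy" for the functional-group models without defining them. One plausible reading, offered here only as an assumption, is that each molecule's predicted functional-group labels are scored against its true labels and the per-molecule scores are then averaged. The sketch below illustrates that reading in plain Python; the function names and the toy label vectors are hypothetical and are not taken from the thesis.

```python
from typing import List, Sequence, Tuple


def per_molecule_scores(y_true: Sequence[int], y_pred: Sequence[int]) -> Tuple[float, float]:
    """Return (F1, accuracy) for one molecule's binary functional-group labels."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    denom = 2 * tp + fp + fn
    # Assumption: a molecule with no true and no predicted groups counts as a perfect match.
    f1 = 2 * tp / denom if denom else 1.0
    accuracy = (tp + tn) / len(y_true)
    return f1, accuracy


def average_scores(Y_true: List[Sequence[int]], Y_pred: List[Sequence[int]]) -> Tuple[float, float]:
    """Average the per-molecule F1 and accuracy over all molecules."""
    scores = [per_molecule_scores(t, p) for t, p in zip(Y_true, Y_pred)]
    n = len(scores)
    mean_f1 = sum(f for f, _ in scores) / n
    mean_acc = sum(a for _, a in scores) / n
    return mean_f1, mean_acc


# Toy data: two molecules, four hypothetical functional-group labels each.
Y_true = [[1, 0, 1, 0], [0, 1, 0, 0]]
Y_pred = [[1, 0, 0, 0], [0, 1, 1, 0]]
mean_f1, mean_acc = average_scores(Y_true, Y_pred)
print(f"average molecular F1 = {mean_f1:.3f}, average molecular accuracy = {mean_acc:.3f}")
```

Under this reading, accuracy can sit well above F1 when most functional groups are absent from a given molecule, because true negatives raise accuracy but do not enter the F1 calculation. That would be consistent with the gap between the two metrics reported in the abstract, though the thesis itself should be consulted for the exact definitions.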
