Return to search

DEMOCRATISING DEEP LEARNING IN MICROBIAL METABOLITES RESEARCH / DEMOCRATISING DEEP LEARNING IN NATURAL PRODUCTS RESEARCH

Deep learning models are dominating performance across a wide variety of tasks. From protein folding to computer vision to voice recognition, deep learning is changing the way we interact with data. The field of natural products, and more specifically genomic mining, has been slow to adapt to these new technological innovations. As we are in the midst of a data explosion, it is not for lack of training data. Instead, it is due to the lack of a blueprint demonstrating how to correctly integrate these models to maximise performance and inference. During my PhD, I showcase the use of large language models across a variety of data domains to improve common workflows in the field of natural product drug discovery. I improved natural product scaffold comparison by representing molecules as sentences. I developed a series of deep learning models to replace archaic technologies and create a more scalable genomic mining pipeline decreasing running times by 8X. I integrated deep learning-based genomic and enzymatic inference into legacy tooling to improve the quality of short-read assemblies. I also demonstrate how intelligent querying of multi-omic datasets can be used to facilitate the gene signature prediction of encoded microbial metabolites. The models and workflows I developed are wide in scope with the hopes of blueprinting how these industry standard tools can be applied across the entirety of natural product drug discovery. / Thesis / Doctor of Philosophy (PhD)

Identiferoai:union.ndltd.org:mcmaster.ca/oai:macsphere.mcmaster.ca:11375/28495
Date January 2023
CreatorsDial, Keshav
ContributorsMagarvey, Nathan, Biochemistry
Source SetsMcMaster University
LanguageEnglish
Detected LanguageEnglish
TypeThesis

Page generated in 0.0022 seconds