Global ETD Search

Exploring the performance of Conformal Prediction on Chemical Properties and Its Influencing Factors

Machine learning has gained much attention and has been extended to the field of drug discovery. However, because datasets carry uncertainty, the uncertainty of predictions should be quantified. Conformal prediction is a powerful method for quantifying this uncertainty: given a predefined confidence level, it generates a corresponding interval within which the true target is expected to fall. This paper explores how the conformal prediction results are influenced by different chemical representations of SMILES structures used for training (chemical descriptors, Morgan fingerprints), machine learning algorithms (k-nearest neighbor, support vector machine, random forest, extreme gradient boosting, and artificial neural network), and normalization methods (k-nearest neighbor, Mondrian regression). We find that Morgan fingerprints outperform chemical descriptors, and Mondrian regression outperforms k-nearest neighbor normalization, on one or several of the evaluated measures: coverage and the mean, median, and standard deviation of the output interval. None of the investigated machine learning methods substantially outperforms the others. The conformal predictive system, an alternative form of conformal prediction, was also investigated to explore its usefulness in drug discovery.
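To make the interval construction described in the abstract concrete, the following is a minimal sketch of split (inductive) conformal regression in plain Python. It is not the thesis's implementation, which is not shown in this record. The function names and the `cal_sigma`/`test_sigma` difficulty estimates are hypothetical, and the sketch assumes point predictions from any already-trained model are available for a held-out calibration set.

```python
import math


def conformal_quantile(scores, confidence):
    """Return the calibration score at rank ceil((n + 1) * confidence)."""
    n = len(scores)
    rank = math.ceil((n + 1) * confidence)
    if rank > n:
        # Too few calibration points to guarantee this confidence level.
        return math.inf
    return sorted(scores)[rank - 1]


def conformal_interval(cal_true, cal_pred, test_pred, confidence=0.9,
                       cal_sigma=None, test_sigma=1.0):
    """Build a prediction interval around test_pred.

    cal_sigma and test_sigma are hypothetical difficulty estimates used for
    normalization; leaving them at their defaults gives the unnormalized
    interval test_pred +/- q.
    """
    if cal_sigma is None:
        cal_sigma = [1.0] * len(cal_true)
    # Nonconformity score: absolute residual scaled by estimated difficulty.
    scores = [abs(y - p) / s for y, p, s in zip(cal_true, cal_pred, cal_sigma)]
    q = conformal_quantile(scores, confidence)
    half_width = q * test_sigma
    return test_pred - half_width, test_pred + half_width


# Toy calibration data (made up for illustration): residuals are 1..9.
cal_true = [10, 20, 30, 40, 50, 60, 70, 80, 90]
cal_pred = [11, 18, 33, 36, 55, 54, 77, 72, 99]
low, high = conformal_interval(cal_true, cal_pred, test_pred=50.0, confidence=0.5)
```

Normalization changes only how the half-width is scaled per test object: an object judged harder (a larger `test_sigma`) receives a wider interval at the same confidence level. How the difficulty estimate is obtained, for example from nearest neighbors or from Mondrian categories, is outside this sketch.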

http://urn.kb.se/resolve?urn=urn:nbn:se:uu:diva-531352

Bioinformatics (Computational Biology)

Bioinformatik (beräkningsbiologi)

Identifier	oai:union.ndltd.org:UPSALLA1/oai:DiVA.org:uu-531352
Date	January 2024
Creators	Chen, Yuhang
Publisher	Uppsala universitet, Institutionen för farmaceutisk biovetenskap
Source Sets	DiVA Archive at Uppsala University
Language	English
Detected Language	English
Type	Student thesis, info:eu-repo/semantics/bachelorThesis, text
Format	application/pdf
Rights	info:eu-repo/semantics/openAccess
