1 |
Improving ligand-based modelling by combining various features
Omran, Abir January 2021
Background: In drug discovery, morphological profiles can be used to identify and establish a drug's biological activity or mechanism of action. Quantitative structure-activity relationship (QSAR) modelling is an approach that uses chemical structures to predict properties such as biological activity. Support Vector Machine (SVM) is a machine learning algorithm that can be used for classification. Confidence measures such as conformal prediction can be implemented on top of machine learning algorithms. There are several methods that can be applied to improve a model's predictive performance.
Aim: The aim of this project is to evaluate whether ligand-based modelling can be improved by combining features from chemical structures, target predictions and morphological profiles.
Method: The project was divided into three experiments. In experiment 1, five bioassay datasets were used. In experiments 2 and 3, a cell painting dataset was used that contained morphological profiles from three different classes of kinase inhibitors, and the classes were used as endpoints. Support vector machine (liblinear) models were built in all three experiments. A significance level of 0.2 was set to calculate the efficiency. The mean observed fuzziness and the efficiency were used to evaluate model performance.
Results: Similar trends were observed for all datasets in experiment 1. Signatures+CDK13+TP, which is the most complex model, obtained the lowest mean observed fuzziness in four out of five cases. With a confidence level of 0.8, TP+Signatures obtained the highest efficiency. Signatures+Morphological Profiles+TP obtained the lowest mean observed fuzziness in experiments 2 and 3. Signatures obtained the most correct single-label predictions at a confidence of 80%.
Discussion: Fewer correct single-label predictions were observed for the active class than for the inactive class. This could be because active compounds were harder to predict. The morphological profiles did not improve the models' predictive performance compared to Signatures. This could be due to the lack of information obtained from the dataset.
Conclusion: A combination of features from chemical structures and target predictions improved ligand-based modelling compared to models built on only one of the features. The combination of features from chemical structures and morphological profiles did not improve the ligand-based models compared to the model built only on chemical structures. Adding features from target predictions to a model built with features from chemical structures and morphological profiles decreased the mean observed fuzziness.
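The two evaluation measures named in this abstract can be illustrated with a minimal Python sketch. It is not the thesis's code. It assumes that observed fuzziness is the mean, over test compounds, of the summed conformal p-values of the false labels (lower is better), and that efficiency is the fraction of compounds whose prediction set at the chosen significance level contains exactly one label. The function names and the p-values below are hypothetical and made up for illustration only.

```python
from typing import Dict, List

# Hypothetical type alias: maps each class label to its conformal p-value.
PValues = Dict[str, float]


def prediction_set(p_values: PValues, significance: float) -> List[str]:
    """Return the labels whose p-value exceeds the significance level."""
    return sorted(label for label, p in p_values.items() if p > significance)


def observed_fuzziness(p_values: List[PValues], true_labels: List[str]) -> float:
    """Mean over examples of the summed p-values of the false labels."""
    total = 0.0
    for pvals, truth in zip(p_values, true_labels):
        total += sum(p for label, p in pvals.items() if label != truth)
    return total / len(true_labels)


def efficiency(p_values: List[PValues], significance: float) -> float:
    """Fraction of examples with a single-label prediction set (assumed definition)."""
    singles = sum(
        1 for pvals in p_values if len(prediction_set(pvals, significance)) == 1
    )
    return singles / len(p_values)


# Made-up p-values for four compounds; these are not results from the thesis.
examples = [
    {"active": 0.5, "inactive": 0.125},     # single label: active
    {"active": 0.25, "inactive": 0.75},     # both labels
    {"active": 0.125, "inactive": 0.625},   # single label: inactive
    {"active": 0.0625, "inactive": 0.125},  # empty set
]
truth = ["active", "inactive", "inactive", "active"]

if __name__ == "__main__":
    print("efficiency at 0.2:", efficiency(examples, 0.2))
    print("mean observed fuzziness:", observed_fuzziness(examples, truth))
```

With a significance level of 0.2 (a confidence of 80%), two of the four made-up compounds receive a single-label prediction set. Observed fuzziness does not depend on the significance level, which is why the abstract can report it alongside an efficiency computed at one fixed level.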
|
2 |
Predicting morphological effect of compounds on COVID-19 infected cells
Öhrner, Viktor January 2023
The cost of developing new drugs is high, and the aim of computer-assisted drug discovery is to reduce that cost, either through virtual screening or by generating novel compounds. Systems biology is one approach to drug discovery in which the response of a biological system is the subject of study, rather than the drug-target interaction. One way to observe a biological system is through microscopy images taken of cells perturbed with compounds. Image analysis software extracts information called morphological profiles from the images, which can be used for data-hungry models. One way artificial intelligence has been applied to drug discovery is through generative models that can generate new compounds. One such approach uses reinforcement learning, in which a critic guides the generation of compounds towards desirable behaviors. In this study, different machine learning models were tested to see whether they could predict the morphological response of COVID-19 infected cells to compounds from the compounds' structures. No models showed any promising results. The poor performance was attributed to the dataset. There is a lot of variance in the dataset, meaning that the response to the same compound varies. The compounds in the dataset also differ considerably from one another, meaning that any representation a model learns does not transfer to other compounds. The dataset was also imbalanced, with more inactive compounds than active ones.
|