
Data Deconvolution for Drug Prediction

Treating cancer is difficult because the disease is complex and drug responses often depend on the patient's characteristics. Precision medicine aims to address this by selecting individualized treatments. Since this involves the analysis of large datasets, machine learning can be used to make the drug selection process more efficient. Traditionally, such models use bulk gene expression data. However, bulk data potentially masks information from small cell populations and fails to address tumor heterogeneity. Therefore, this thesis applies data deconvolution methods to bulk gene expression data and estimates the corresponding cell type-specific gene expression profiles. This "increases" the resolution of the input data for the drug response prediction. A hold-out dataset, LODOCV, and LOCOCV were used to evaluate this approach. Furthermore, all results were compared against a baseline model trained on bulk data. Overall, the cell type-specific model did not improve accuracy compared to the bulk model. It also prioritizes information from bulk samples, which makes the additional data unnecessary. Its robustness is slightly lower than that of the bulk model. Note that these outcomes are not necessarily due to a flaw in the underlying concept, but may stem from poor deconvolution results, as the same reference matrix was used for the deconvolution of all bulk samples regardless of the cancer type or disease.

Identifier: oai:union.ndltd.org:UPSALLA1/oai:DiVA.org:liu-205016
Date: January 2024
Creators: Menacher, Lisa Maria
Publisher: Linköpings universitet, Statistik och maskininlärning
Source Sets: DiVA Archive at Upsalla University
Language: English
Detected Language: English
Type: Student thesis, info:eu-repo/semantics/bachelorThesis, text
Format: application/pdf
Rights: info:eu-repo/semantics/openAccess
