
Using Transcriptomic Data to Predict Biomarkers for Subtyping of Lung Cancer

Lung cancer is one of the most dangerous types of cancer. Several studies have explored the use of machine learning methods to predict and diagnose this cancer. This study explored the potential of decision tree (DT) and random forest (RF) classification models, applied to a small transcriptome dataset, for outcome prediction of different subtypes of lung cancer. We compared three subtypes, adenocarcinomas (AC), small cell lung cancer (SCLC) and squamous cell carcinomas (SCC), with normal lung tissue by applying the two machine learning methods from the caret R package. The DT and RF models and their validation showed different results for each subtype of the lung cancer data. The DT found more features and validated them with better metrics. Analysis of biological relevance focused on the features identified for each of the subtypes AC, SCLC and SCC. The DT gave detailed insight into the biological data, which was essential for classifying a feature as a biomarker. The features identified in this research may serve as potential candidate genes that could be explored further to confirm their role in the corresponding lung cancer types and contribute to targeted diagnostics of different subtypes.

Identifier oai:union.ndltd.org:UPSALLA1/oai:DiVA.org:his-21598
Date January 2021
Creators Daran, Rukesh
Publisher Högskolan i Skövde, Institutionen för biovetenskap
Source Sets DiVA Archive at Upsalla University
Language English
Detected Language English
Type Student thesis, info:eu-repo/semantics/bachelorThesis, text
Format application/pdf
Rights info:eu-repo/semantics/openAccess
