Machine Learning for Variant Detection and Population Analysis in Heterogeneous Cancer Sample

Cancer is a complex and deadly disease caused by genetic lesions in somatic cells. Further research on computational methods for detecting and characterizing somatic mutations is needed to build a comprehensive, systems-level model of the roles these lesions play in cancer development. In the first project, I trained a set of supervised machine learning classifiers to distinguish true positive from false positive somatic single nucleotide variants (SNVs), and showed improved somatic SNV detection on the data set compared with the reported classifier. In the second project, we developed PhyloSub, a model that uses a nonparametric Bayesian prior over a set of trees to cluster SNVs and to infer the subclonal phylogenetic structure of tumors, with uncertainty, from SNV sequencing data. Experiments showed that PhyloSub could infer the subclonal phylogenetic structure from both single and multiple tumor samples.

Identifier: oai:union.ndltd.org:TORONTO/oai:tspace.library.utoronto.ca:1807/42971
Date: 28 November 2013
Creators: Jiao, Wei
Contributors: Stein, Lincoln; Morris, Quaid
Source Sets: University of Toronto
Language: en_ca
Detected Language: English
Type: Thesis
