
UNSUPERVISED LEARNING IN PHYLOGENOMIC ANALYSIS OVER THE SPACE OF PHYLOGENETIC TREES

A phylogenetic tree represents the evolutionary history among species or other entities. Phylogenomics is a new field at the intersection of phylogenetics and genomics, and it is well known that statistical learning methods are needed to handle and analyze the large amounts of data that new technologies can generate relatively cheaply. Building on existing Markov models, we introduce a new method, CURatio, to identify outliers in a given gene data set. The method is intrinsically unsupervised and can find outliers among thousands of genes or more. This ability to analyze large numbers of genes, even with missing information, sets it apart from many parametric methods. At the same time, the exploration of statistical analysis in the high-dimensional space of phylogenetic trees has continued, and many tree metrics have been proposed for statistical methodology. The tropical metric is one of them. We implement an MCMC sampling method to estimate the principal components in a tree space equipped with the tropical metric, achieving dimension reduction and visualizing the result in a 2-D tropical triangle.
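
To illustrate the tropical metric named in the abstract, here is a minimal Python sketch. It assumes the standard definition on the tropical projective torus, d_tr(u, v) = max_i(u_i - v_i) - min_i(u_i - v_i), which the abstract itself does not state. The function name `tropical_distance` is hypothetical and is not taken from the thesis.

```python
def tropical_distance(u, v):
    """Tropical distance between two equal-length vectors.

    Assumed definition (not stated in the abstract):
    max_i(u_i - v_i) - min_i(u_i - v_i).
    """
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    if len(u) == 0:
        raise ValueError("vectors must be non-empty")
    # Coordinate-wise differences between the two vectors.
    diffs = [a - b for a, b in zip(u, v)]
    # The spread of the differences is unchanged when a constant
    # is added to every coordinate of either vector.
    return max(diffs) - min(diffs)
```

Under this definition, adding the same constant to every coordinate of a vector leaves the distance unchanged, which is why the metric is usually considered on vectors modulo such shifts.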

Identifier: oai:union.ndltd.org:uky.edu/oai:uknowledge.uky.edu:statistics_etds-1043
Date: 01 January 2019
Creators: Kang, Qiwen
Publisher: UKnowledge
Source Sets: University of Kentucky
Detected Language: English
Type: text
Format: application/pdf
Source: Theses and Dissertations--Statistics
