Single-cell RNA-sequencing (scRNA-seq) technology enables researchers to investigate
a genome at the cellular level with unprecedented resolution. An organism
consists of a heterogeneous collection of cell types, each of which plays a distinct
role in various biological processes. Hence, the first step of scRNA-seq data analysis
is often to distinguish cell types so that they can be investigated separately. Researchers
have recently developed several automated cell type annotation tools based
on supervised machine learning algorithms, requiring neither biological knowledge
nor subjective human decisions. Dropout, a crucial characteristic of scRNA-seq
data, is widely used in differential expression analysis but is not used by existing
cell annotation methods. We present scAnnotate, a cell annotation tool that fully
utilizes dropout information. We model every gene’s marginal distribution using a
mixture model, which describes both the dropout proportion and the distribution of
the non-dropout expression levels. Then, using an ensemble machine learning approach,
we combine the mixture models of all genes into a single model for cell-type
annotation. This combining approach avoids estimating the numerous parameters of
the high-dimensional joint distribution of all genes. Using fourteen real scRNA-seq
datasets, we demonstrate that scAnnotate is competitive with nine existing annotation
methods, and that it accurately annotates cells when training and test data are
(1) similar, (2) cross-platform, and (3) cross-species. Most of the cells that
scAnnotate annotates incorrectly differ from the cells that other methods annotate
incorrectly. / Graduate / 2023-07-27
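
The per-gene mixture model described in the abstract can be illustrated with a minimal Python sketch. It is not the scAnnotate implementation. It assumes a log-normal distribution for the non-dropout expression levels, add-one smoothing of the dropout proportion, and a naive Bayes style sum of per-gene log-likelihoods as a simple stand-in for the ensemble step, none of which the abstract specifies. The function names (`fit_gene_mixture`, `gene_loglik`, `train`, `predict`) and the toy data are hypothetical.

```python
import numpy as np

def fit_gene_mixture(x):
    """Fit one gene in one cell type: dropout proportion plus log-normal part.

    Assumption: non-dropout values are log-normal; the abstract does not
    name the distribution.
    """
    x = np.asarray(x, dtype=float)
    nonzero = x[x > 0]
    # Smoothed dropout proportion, so neither mixture part has probability 0.
    pi = (np.sum(x == 0) + 1.0) / (x.size + 2.0)
    if nonzero.size >= 2:
        logs = np.log(nonzero)
        mu, sigma = logs.mean(), max(logs.std(ddof=1), 1e-3)
    else:
        # Too few non-dropout values to estimate; fall back to a default.
        mu, sigma = 0.0, 1.0
    return pi, mu, sigma

def gene_loglik(x, params):
    """Log-likelihood of one expression value under the fitted mixture."""
    pi, mu, sigma = params
    if x == 0:
        return np.log(pi)
    z = (np.log(x) - mu) / sigma
    # log(1 - pi) plus the log-normal log-density at x.
    return (np.log1p(-pi) - np.log(x) - np.log(sigma)
            - 0.5 * np.log(2.0 * np.pi) - 0.5 * z * z)

def train(X, labels):
    """Fit a mixture for every (cell type, gene) pair. X is cells x genes."""
    X = np.asarray(X, dtype=float)
    labels = np.asarray(labels)
    model = {}
    for cell_type in np.unique(labels):
        rows = X[labels == cell_type]
        model[str(cell_type)] = {
            "log_prior": np.log(rows.shape[0] / X.shape[0]),
            "genes": [fit_gene_mixture(rows[:, g]) for g in range(X.shape[1])],
        }
    return model

def predict(model, cell):
    """Assumed combiner: sum per-gene log-likelihoods, pick the best type."""
    scores = {
        cell_type: m["log_prior"]
        + sum(gene_loglik(x, p) for x, p in zip(cell, m["genes"]))
        for cell_type, m in model.items()
    }
    return max(scores, key=scores.get)

# Toy data (made up): type A expresses gene 0, type B expresses gene 1.
X_train = [[5, 0], [6, 0], [7, 0], [8, 1],
           [0, 4], [0, 5], [1, 6], [0, 7]]
y_train = ["A"] * 4 + ["B"] * 4
model = train(X_train, y_train)
print(predict(model, [6.0, 0.0]), predict(model, [0.0, 5.0]))
```

Because each gene is fitted on its own, the number of parameters grows linearly with the number of genes, which reflects the abstract's point about avoiding the high-dimensional joint distribution.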
Identifier | oai:union.ndltd.org:uvic.ca/oai:dspace.library.uvic.ca:1828/14093 |
Date | 11 August 2022 |
Creators | Ji, Xiangling |
Contributors | Zhang, Xuekui, Tsao, Min |
Source Sets | University of Victoria |
Language | English |
Detected Language | English |
Type | Thesis |
Format | application/pdf |
Rights | Available to the World Wide Web |