A latent factor model for count data is popularly applied when deconvoluting mixed signals in biological data as exemplified by sequencing data for transcriptome or microbiome studies. Due to the availability of pure samples such as single-cell transcriptome data, the estimators can enjoy much better accuracy by utilizing the extra information. However, such an advantage quickly disappears in the presence of excessive zeros. To correctly account for such a phenomenon, we propose a zero-inflated non-negative matrix factorization that models excessive zeros in both mixed and pure samples and derive an effective multiplicative parameter updating rule. In simulation studies, our method yields smaller bias comparing to other deconvolution methods. We applied our approach to gene expression from brain tissue as well as fecal microbiome datasets, illustrating the superior performance of the approach. Our method is implemented as a publicly available R-package, iNMF.
In zero-inflated non-negative matrix factorization (iNMF) for the deconvolution of mixed signals of biological data, pure-samples play a significant role by solving the identifiability issue as well as improving the accuracy of estimates. One of the main issues of using single-cell data is that the identities(labels) of the cells are not given. Thus, it is crucial to sort these cells into their correct types computationally. We propose a nonlinear latent variable model that can be used for sorting pure-samples as well as grouping mixed-samples via deep neural networks. The computational difficulty will be handled by adopting a method known as variational autoencoding. While doing so, we keep the NMF structure in a decoder neural network, which makes the output of the network interpretable.
Identifer | oai:union.ndltd.org:bu.edu/oai:open.bu.edu:2144/43944 |
Date | 25 February 2022 |
Creators | Kong, Yixin |
Contributors | Carvalho, Luis E., Chun, Hyonho |
Source Sets | Boston University |
Language | en_US |
Detected Language | English |
Type | Thesis/Dissertation |
Page generated in 0.0016 seconds