1 |
Correction of batch effects in single cell RNA sequencing data using ComBat-Seq
Dullea, Jonathan Tyler 20 February 2021
Single cell RNA sequencing allows expression profiles to be obtained for individual cells, offering unprecedented insight into their behavior. Insight gained from exploring individual cells has implications for both cancer and developmental biology. Much of the power of these models is derived from the sheer amount and granularity of the data that can be collected; however, with this power comes the deleterious introduction of batch effects. Samples sequenced on different days or by different technicians can show variance that cannot be attributed to biological condition but is due only to the batch in which they were sequenced. These batch effects can alter the perceived relationship between the main effect and the outcome of interest; for instance, the main effect of cancer status may be hidden by the unwanted and unmodeled variance. Two known methods for the correction of batch effects in bulk RNA sequencing data are ComBat-Seq and Surrogate Variable Analysis (SVA). In this work, we demonstrate that when cell type is known, including that covariate in ComBat-Seq results in an appropriate correction of the batch effect. We also demonstrate that when cell type is not known, SVA can be used to infer cell-type information from the latent structure of the count matrix, with some loss of accuracy compared to the correction with cell type. This inferred cell-type information can be used in place of the actual cell-type covariate to correct single cell RNA sequencing data with ComBat-Seq; inclusion of surrogate variables improves the accuracy of the correction in certain scenarios. Additionally, when cell type is not known and the cell proportions are balanced between batches, we demonstrate that ComBat-Seq can be used naive to cell-type information. The efficacy of this procedure is demonstrated with two simulated datasets and a dataset containing Jurkat and t293 cells.
These results are then compared to Harmony, a recently reported batch correction algorithm. The procedure reported herein has benefits over Harmony in certain situations, such as when a counts matrix is needed for further analysis or when there is thought to be substantial intra-cell-type variability across different batches.
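The abstract's point about balanced cell proportions can be illustrated with a toy simulation. The sketch below is not ComBat-Seq and does not use the negative binomial model that ComBat-Seq fits; it assumes a single hypothetical gene on a log scale, an additive batch shift, and ordinary least squares. The function name, effect sizes, and cell counts are all invented for illustration. It shows only why omitting the cell-type covariate may be harmless when both batches have the same cell-type mix and misleading when they do not.

```python
import numpy as np


def estimate_batch_shift(n_a1, n_b1, n_a2, n_b2, use_cell_type,
                         true_shift=1.0, seed=0):
    """Estimate an additive batch shift for one simulated gene.

    n_a1, n_b1: cells of type A and type B in batch 1 (hypothetical counts).
    n_a2, n_b2: cells of type A and type B in batch 2.
    use_cell_type: if True, include cell type as a covariate in the fit.
    """
    rng = np.random.default_rng(seed)
    # Indicator vectors: cell type (0 = A, 1 = B) and batch (0 = first, 1 = second).
    cell_type = np.concatenate([
        np.zeros(n_a1), np.ones(n_b1), np.zeros(n_a2), np.ones(n_b2)
    ])
    batch = np.concatenate([np.zeros(n_a1 + n_b1), np.ones(n_a2 + n_b2)])

    # Assumed toy model: baseline 2, type B expresses 3 units higher,
    # batch 2 is shifted by true_shift, plus a little Gaussian noise.
    y = (2.0 + 3.0 * cell_type + true_shift * batch
         + rng.normal(0.0, 0.1, size=batch.size))

    # Design matrix: intercept, batch, and optionally cell type.
    columns = [np.ones_like(batch), batch]
    if use_cell_type:
        columns.append(cell_type)
    design = np.column_stack(columns)

    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coef[1]  # coefficient on the batch indicator


if __name__ == "__main__":
    # Balanced: both batches are 50% type A, 50% type B.
    print("balanced, naive:     ", estimate_batch_shift(50, 50, 50, 50, False))
    # Unbalanced: batch 1 is mostly type A, batch 2 mostly type B.
    print("unbalanced, naive:   ", estimate_batch_shift(90, 10, 10, 90, False))
    print("unbalanced, adjusted:", estimate_batch_shift(90, 10, 10, 90, True))
```

In this toy setup the naive estimate in the unbalanced case absorbs part of the cell-type difference into the batch term, so a correction based on it would remove biological signal along with the batch effect. Including the cell-type covariate, or an inferred stand-in for it, keeps the two separate.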
|
2 |
Adversarial Deep Neural Networks Effectively Remove Nonlinear Batch Effects from Gene-Expression Data
Dayton, Jonathan Bryan 01 July 2019
Gene-expression profiling enables researchers to quantify transcription levels in cells, thus providing insight into functional mechanisms of diseases and other biological processes. However, because of the high dimensionality of these data and the sensitivity of measuring equipment, expression data often contain unwanted confounding effects that can skew analysis. For example, collecting data in multiple runs causes nontrivial differences in the data (known as batch effects), known covariates that are not of interest to the study may have strong effects, and there may be large systemic effects when integrating multiple expression datasets. Additionally, many of these confounding effects represent higher-order interactions that may not be removable using existing techniques that identify linear patterns. We created Confounded to remove these effects from expression data. Confounded is an adversarial variational autoencoder that removes confounding effects while minimizing the amount of change to the input data. We tested the model on artificially constructed data and commonly used gene expression datasets and compared it against other common batch adjustment algorithms. We also applied the model to remove cancer-type-specific signal from a pan-cancer expression dataset. Our software is publicly available at https://github.com/jdayton3/Confounded.
|