
Improving batch effect correction of metagenomic data: applications in the Black Women’s Health Study

The microbiome has become a focus of research, particularly in human health and precision medicine, because of its role in human development, immunity, and nutrition. Microbiome profiling studies have become more tractable, in large part thanks to advances in metagenomics. One such study is the Black Women’s Health Study (BWHS), which aims to better understand health risks and disease development specific to Black women, who are more susceptible to certain health conditions. However, a major obstacle to the reproducibility of microbiome research is the high sensitivity of microbial compositions to external factors and batch-to-batch technical variability, which produces batch effects that often hinder analysis of the factors of interest. Although batch effect adjustment methods have been developed for other biomedical data, they do not appropriately account for two distinctive features of microbiome data: 1) its compositional nature, and 2) its extreme overdispersion and zero-inflation.
My dissertation addresses these challenges by evaluating and improving batch effect correction methods for microbiome data and then applying these approaches to data from BWHS. First, I evaluated how well ComBat-Seq and existing microbiome-specific tools remove batch effects from both simulated 16S rRNA and real-world shotgun metagenomic sequencing data while preserving the effects of biological factors of interest. Second, I applied ComBat-Seq in an epidemiological study in which I identified several oral health-related genera among adult Black women that are associated with the host’s geographic location in the US. Finally, I introduced an extension to ComBat-Seq that uses imputation to improve its batch effect correction on rare taxa with outliers. I demonstrated that, by replacing zeroes with predicted non-zero read counts that follow the observed compositional structure of the data, imputation effectively reduced the number of problematic cases in which outliers were intensified after batch effect correction.
Collectively, my thesis demonstrates that 1) when the specific features of microbiome data are accounted for, batch effect correction methods offer a promising way to address batch effects in microbiome data and improve microbiome profiling studies, and 2) it is important to consider social and environmental factors associated with the host’s physical location when studying the oral microbiome.

Identifier: oai:union.ndltd.org:bu.edu/oai:open.bu.edu:2144/47924
Date: 11 January 2024
Creators: Fan, Howard James
Contributors: Johnson, W. Evan; Siggers, Trevor W.
Source Sets: Boston University
Language: en_US
Detected Language: English
Type: Thesis/Dissertation
Rights: Attribution 4.0 International, http://creativecommons.org/licenses/by/4.0/
