Identification of Environmental Alphaproteobacteria with Conserved Signature Proteins in Metagenomic Datasets

Microbial metagenomics is the exploration of the taxonomic diversity of microbial communities in environmental habitats using large, exhaustive DNA sequence datasets. However, due to inherent limitations of sequencing technology and the complexity of environmental genomes, current analytical approaches do not reveal all of the microbes that may be present. This study proposes a new classification approach based on proteins that are specific for different clades of Alphaproteobacteria, and uses them to predict the presence or absence of species from these groups in published metagenomic datasets. In this work, 264 previously identified, published conserved signature proteins (CSPs) characteristic of individual taxonomic clades of Alphaproteobacteria are used as probes to detect the presence of these bacteria in metagenomic datasets. Although public genome sequence information has increased manifold since these CSPs were initially identified 6 years ago, the results indicate that nearly all of these CSPs (259 of 265) remain specific for their previously characterized clades. Furthermore, they are confirmed to be present in the newly identified and sequenced members of these clades. In view of their specificity and predictive ability in different monophyletic clades of Alphaproteobacteria, the sequences of these CSPs provide reliable probes to determine the presence or absence of these Alphaproteobacteria in metagenomic datasets. Here, CSPs are used to assess Alphaproteobacteria diversity in 10 published metagenomic datasets (bioreactor, compost, wastewater, activated sludge, groundwater, freshwater sediment, microbial mat, marine, hydrothermal vent and whale fall metagenomes), which cover diverse environments and ecosystems.
The results indicate that BLAST searches with these CSPs can efficiently identify Alphaproteobacteria species in these metagenomic datasets, and that the distribution and relative abundance of different Alphaproteobacteria species differ substantially among the tested datasets. Thus, CSPs that are specific for different microbial taxa provide a novel and powerful means for identifying microbes and for their taxonomic profiling in metagenomic datasets.
Master of Science (MSc)
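As a rough illustration of the probe-based profiling described in the abstract, the following Python sketch tallies BLAST hits per clade. It is not the thesis's actual pipeline. It assumes the BLAST results are in the standard 12-column tabular layout (query ID in column 1, subject ID in column 2, E-value in column 11). The CSP identifiers, the CSP-to-clade mapping, the sample lines, and the E-value cutoff of 1e-5 are all hypothetical placeholders.

```python
from collections import defaultdict

# Hypothetical mapping from CSP identifier to the clade it is specific for.
# The identifiers are illustrative only and do not come from the thesis.
CSP_CLADE = {"csp_001": "Rhizobiales", "csp_002": "Rickettsiales"}

# Made-up BLAST tabular lines (12 tab-separated columns) for demonstration.
SAMPLE_LINES = [
    "csp_001\tread_1\t85.0\t120\t18\t0\t1\t120\t1\t120\t1e-30\t150",
    "csp_001\tread_2\t40.0\t50\t30\t0\t1\t50\t1\t50\t0.5\t25",
    "csp_002\tread_3\t90.0\t100\t10\t0\t1\t100\t1\t100\t2e-12\t95",
    "csp_002\tread_3\t88.0\t90\t11\t0\t5\t94\t3\t92\t1e-8\t70",
]


def count_hits_per_clade(blast_lines, csp_clade, max_evalue=1e-5):
    """Count distinct metagenomic sequences hit by each clade's CSPs.

    max_evalue is an assumed significance cutoff, not a value from the source.
    """
    hits = defaultdict(set)
    for line in blast_lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment headers
        fields = line.split("\t")
        if len(fields) < 12:
            continue  # skip malformed rows
        query_id, subject_id = fields[0], fields[1]
        evalue = float(fields[10])
        clade = csp_clade.get(query_id)
        if clade is not None and evalue <= max_evalue:
            # A set ensures each metagenomic sequence is counted once per clade.
            hits[clade].add(subject_id)
    return {clade: len(subjects) for clade, subjects in hits.items()}


if __name__ == "__main__":
    print(count_hits_per_clade(SAMPLE_LINES, CSP_CLADE))
```

Counting distinct subject sequences rather than raw alignment rows avoids inflating a clade's tally when several CSPs, or several alignments of one CSP, match the same metagenomic sequence. Comparing tallies across datasets would additionally require normalizing for dataset size, which this sketch does not attempt.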

Identifier: oai:union.ndltd.org:mcmaster.ca/oai:macsphere.mcmaster.ca:11375/15324
Date: 21 December 2014
Creators: Yao, Quan
Contributors: Schellhorn, Herb E., Gupta, Radhey S., Igdoura, Suleiman A., Biology
Source Sets: McMaster University
Detected Language: English
Type: thesis