Genomic Island Discovery through Enrichment of Statistical Modeling with Biological Information

Horizontal gene transfer enables the acquisition and dissemination of novel traits, including antibiotic resistance and virulence, among bacteria. Frequently, such traits are gained through the acquisition of clusters of functionally related genes, often referred to as genomic islands (GIs). Quantifying the horizontal flow of GIs and assessing their contributions to the emergence and evolution of novel metabolic traits in bacterial organisms are central to understanding the evolution of bacteria in general and the evolution of pathogenicity and antibiotic resistance in particular, which is a focus of this dissertation study. Methods for GI detection have evolved with advances in sequencing and bioinformatics; however, a comprehensive assessment of these methods has been lacking. This motivated us to assess the performance of current methods for identifying islands on broad datasets of well-characterized bacterial genomes and synthetic genomes, and to leverage this information to develop a novel approach that circumvents the limitations of the current state of the art in GI detection. The main findings from our assessment studies were that 1) the methods have complementary strengths, 2) a gene-clustering method utilizing codon usage bias as the discriminant criterion, namely JS-CB, is the most efficient in localizing genomic islands, specifically the well-studied SCCmec resistance island in methicillin-resistant Staphylococcus aureus (MRSA) genomes, and 3) in general, bottom-up, gene-by-gene analysis methods are inherently limited in their ability to decipher large structures such as GIs as single entities within bacterial genomes. We adapted a top-down approach based on recursive segmentation and agglomerative clustering and developed a GI prediction tool, GEMINI, which combines compositional features with segment context information to localize GIs in the Liverpool epidemic strain of Pseudomonas aeruginosa. Application of GEMINI to the genome of P. aeruginosa LESB58 demonstrated its ability to delineate experimentally verified GIs in that genome. GEMINI also identified several novel islands, including pathogenicity islands, and revealed the mosaic structure of several LESB58-harbored GIs. We then developed a new, broadly applicable GI identification approach, CAFE. CAFE incorporates biological information encoded in a genome within the statistical framework of segmentation and clustering to localize GIs more robustly. It identifies genomic islands that lack markers by virtue of their association with marker-bearing genomic islands originating from the same source. This is made possible by performing marker enrichment and phyletic pattern analyses within the integrated framework of recursive segmentation and clustering. CAFE compared favorably with frequently used methods for genomic island detection on synthetic test datasets and on a test set of known islands from 15 well-characterized bacterial species. These tools can be readily adapted for cataloging GIs in newly sequenced but as yet uncharacterized genomes.
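The top-down idea of recursive segmentation mentioned in the abstract can be illustrated with a minimal sketch. This is not GEMINI or CAFE. It assumes a Jensen-Shannon divergence on single-nucleotide composition as the splitting criterion and a fixed cutoff in place of a statistical significance test, and it omits the agglomerative clustering step entirely. The function names and the `min_len` and `threshold` parameters are hypothetical choices made for illustration.

```python
import math
from collections import Counter


def entropy(counts, total):
    """Shannon entropy in bits of a table of symbol counts."""
    return -sum((c / total) * math.log2(c / total)
                for c in counts.values() if c > 0)


def best_split(seq, min_len):
    """Find the split point that maximizes the length-weighted
    Jensen-Shannon divergence between the left and right parts.

    Returns (position, divergence); position is None if the sequence
    is too short to yield two parts of at least min_len symbols.
    """
    n = len(seq)
    h_total = entropy(Counter(seq), n)
    left, right = Counter(), Counter(seq)
    best_pos, best_jsd = None, 0.0
    for i in range(1, n):
        # Move one symbol from the right part to the left part.
        left[seq[i - 1]] += 1
        right[seq[i - 1]] -= 1
        if i < min_len or n - i < min_len:
            continue
        jsd = (h_total
               - (i / n) * entropy(left, i)
               - ((n - i) / n) * entropy(right, n - i))
        if jsd > best_jsd:
            best_pos, best_jsd = i, jsd
    return best_pos, best_jsd


def segment(seq, start=0, min_len=50, threshold=0.05):
    """Recursively split seq into compositionally homogeneous segments.

    threshold is an assumed fixed cutoff (in bits), standing in for a
    proper significance test. Returns a list of (start, end) pairs.
    """
    pos, jsd = best_split(seq, min_len)
    if pos is None or jsd < threshold:
        return [(start, start + len(seq))]
    return (segment(seq[:pos], start, min_len, threshold)
            + segment(seq[pos:], start + pos, min_len, threshold))


if __name__ == "__main__":
    # Toy sequence: an AT-only region followed by a GC-only region.
    toy = "AT" * 200 + "GC" * 200
    print(segment(toy))
```

On the toy input, the two compositionally distinct halves are separated at their junction. In a full pipeline of the kind the abstract describes, the resulting segments would next be grouped by similarity, so that atypical clusters can be flagged as candidate islands.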

Identifier: oai:union.ndltd.org:unt.edu/info:ark/67531/metadc1248417
Date: 08 1900
Creators: Jani, Mehul
Contributors: Azad, Rajeev K., Mathee, Kalai, Shulaev, Vladimir, Wright, Amanda, Hughes, Lee
Publisher: University of North Texas
Source Sets: University of North Texas
Language: English
Detected Language: English
Type: Thesis or Dissertation
Format: xii, 145 pages, Text
Rights: Public, Jani, Mehul, Copyright, Copyright is held by the author, unless otherwise noted. All rights Reserved.