
Markov Model of Segmentation and Clustering: Applications in Deciphering Genomes and Metagenomes

Rapidly accumulating genomic data as a result of high-throughput sequencing has necessitated the development of efficient computational methods to decode the biological information underlying these data. DNA composition varies across structurally or functionally different regions of a genome, as well as across regions of distinct evolutionary origins. We adapted an integrative framework that combines a top-down, recursive segmentation algorithm with a bottom-up, agglomerative clustering algorithm to decipher compositionally distinct regions in genomes. The recursive segmentation procedure fragments a genome into compositionally distinct segments within a statistical hypothesis testing framework. This is followed by an agglomerative clustering procedure that groups compositionally similar segments within the same framework. One of our main objectives was to decipher distinctive evolutionary patterns in sex chromosomes by unraveling the underlying compositional heterogeneity. Application of this approach to the human X chromosome provided novel insights into the stratification of the X chromosome as a consequence of punctuated recombination suppressions between the X and Y from the distal long arm to the distal short arm. Novel "evolutionary strata" were identified, particularly in the X conserved region (XCR), which is not amenable to X-Y comparative analysis because of the massive loss of Y gametologs following recombination cessation. Our composition-based approach could circumvent the limitations of current methods that depend on X-Y (or Z-W for the ZW sex-determination system) comparisons by deciphering the stratification even if only the sequence of the sex chromosome of the homogametic sex (i.e., the X or Z chromosome) is available.
These studies were extended to plant sex chromosomes, which are known to have a number of evolutionary strata that formed at the initial stage of their evolution, presenting an opportunity to examine the onset of stratum formation on the sex chromosomes. Further applications included the detection of horizontally acquired DNA in the extremophilic eukaryote Galdieria sulphuraria, which encodes a variety of potentially adaptive functions, and the taxonomic profiling of metagenomic sequences. Finally, we discussed how the Markovian segmentation and clustering method can be made more sensitive and robust for further applications in the biological and biomedical sciences.
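
The two-stage idea described above (top-down recursive segmentation followed by bottom-up agglomerative clustering) can be illustrated with a minimal Python sketch. This is not the method used in the thesis: it assumes a zeroth-order model (plain nucleotide frequencies) instead of a higher-order Markov model, and a fixed Jensen-Shannon divergence cutoff instead of a statistical hypothesis test. The function names, the `threshold` value, and the `min_len` parameter are all hypothetical choices made for illustration.

```python
import math
from collections import Counter


def entropy(counts, n):
    """Shannon entropy (bits) of a symbol count table covering n symbols."""
    return -sum((c / n) * math.log2(c / n) for c in counts.values() if c > 0)


def js_divergence(a, b):
    """Length-weighted Jensen-Shannon divergence between two sequences."""
    na, nb = len(a), len(b)
    ca, cb = Counter(a), Counter(b)
    n = na + nb
    return entropy(ca + cb, n) - (na / n) * entropy(ca, na) - (nb / n) * entropy(cb, nb)


def best_split(seq, min_len):
    """Return (position, divergence) of the split that maximizes divergence."""
    n = len(seq)
    total = Counter(seq)
    h_total = entropy(total, n)
    left = Counter()
    best_pos, best_d = None, 0.0
    for i, ch in enumerate(seq[:-1], start=1):  # i symbols go to the left part
        left[ch] += 1
        if i < min_len or n - i < min_len:
            continue
        right = total - left
        d = h_total - (i / n) * entropy(left, i) - ((n - i) / n) * entropy(right, n - i)
        if d > best_d:
            best_pos, best_d = i, d
    return best_pos, best_d


def segment(seq, threshold=0.1, min_len=20, offset=0):
    """Top-down step: split recursively while the best split exceeds the cutoff.

    Assumption: a fixed cutoff stands in for the hypothesis test in the source.
    """
    pos, d = best_split(seq, min_len)
    if pos is None or d < threshold:
        return [(offset, offset + len(seq))]
    return (segment(seq[:pos], threshold, min_len, offset)
            + segment(seq[pos:], threshold, min_len, offset + pos))


def cluster(seq, segments, threshold=0.1):
    """Bottom-up step: merge the most similar pair of clusters until none is close."""
    clusters = [[s] for s in segments]

    def text(c):
        return "".join(seq[a:b] for a, b in c)

    while len(clusters) > 1:
        d, i, j = min(
            (js_divergence(text(clusters[i]), text(clusters[j])), i, j)
            for i in range(len(clusters))
            for j in range(i + 1, len(clusters))
        )
        if d >= threshold:
            break
        merged = clusters.pop(j)  # j > i, so index i stays valid
        clusters[i].extend(merged)
    return clusters


# Toy usage: an AT-rich region interrupted by a GC-rich insert.
toy = "AT" * 100 + "GC" * 100 + "AT" * 100
toy_segments = segment(toy)
print(toy_segments)
print(cluster(toy, toy_segments))
```

In this toy input, the two AT-rich flanks are non-contiguous but compositionally alike, so the clustering step is what allows them to be grouped together after segmentation has separated them from the GC-rich insert.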

Identifier: oai:union.ndltd.org:unt.edu/info:ark/67531/metadc1011827
Date: 08 1900
Creators: Pandey, Ravi Shanker
Contributors: Azad, Rajeev; Mikler, Armin; Shulaev, Vladimir; Padilla, Pamela Anne; Jagadeeswaran, Pudur
Publisher: University of North Texas
Source Sets: University of North Texas
Language: English
Detected Language: English
Type: Thesis or Dissertation
Format: xii, 135 pages, Text
Rights: Public, Pandey, Ravi Shanker, Copyright, Copyright is held by the author, unless otherwise noted. All rights Reserved.
