
Computational Methods for Annotation and Expression Profiling of Bacterial Pathogens using "Omics" Approaches

The scope and application of high-throughput techniques have expanded from studying a single genome, transcriptome, or proteome to understanding complex environments at greater resolution with the help of novel computational frameworks. Comprehensive structural annotation, i.e., a description of all functional elements in the genome, is required to measure genome response accurately using high-throughput methods. Annotating genome sequences with high-throughput data from RNA-seq and proteomics experiments complements computational methods for identifying functional elements: it can help validate existing in silico annotation, correct annotation errors, and potentially identify novel functional elements. Recent re-annotation studies have revealed shortcomings of automated methods and the need to validate existing annotations using experimental data. This dissertation describes the re-annotation of Mannheimia haemolytica, Pasteurella multocida, and Histophilus somni, bacterial pathogens associated with bovine respiratory disease. Experimental re-annotation of these bacterial genomes using RNA-seq and proteomics enabled the validation of existing annotation and the discovery of novel functional elements that can be used in future functional genomics studies. We also addressed the need for an automated, broadly applicable bioinformatics workflow for bacterial genome re-annotation by developing an open-source Perl pipeline that takes RNA-seq and proteomics data as input.
Simultaneous profiling of host and pathogen gene expression using metatranscriptomics approaches is necessary to improve our understanding of infectious diseases. Traditional methods for analyzing RNA-seq data do not address the cross-mapping of reads to multiple genomes that arises when the data originate from a metatranscriptomic study. Analysis of sequence conservation between species can help determine a metric for cross-mapping that separates signal from noise. We generated artificial RNA-seq data and evaluated the impact of read length and sequence conservation on cross-mapping. Comparative genomics was used to identify a core and pan-genome for quantifying gene expression. Our results show that cross-mapping between genomes is directly related to the evolutionary distance between them and that an increase in RNA-seq read length tends to negate cross-mapping.
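
The relationship between read length, sequence divergence, and cross-mapping can be illustrated with a toy simulation. The sketch below is not the dissertation's pipeline: the function names, the random "genomes", the substitution-only divergence model, and the use of exact substring matching in place of a real aligner are all assumptions made for illustration.

```python
import random

BASES = "ACGT"

def mutate(genome, divergence, rng):
    """Copy `genome`, substituting each base with probability `divergence`.
    (Hypothetical divergence model: substitutions only, no indels.)"""
    out = []
    for base in genome:
        if rng.random() < divergence:
            out.append(rng.choice([b for b in BASES if b != base]))
        else:
            out.append(base)
    return "".join(out)

def cross_mapping_rate(source, other, read_length, n_reads, rng):
    """Fraction of error-free reads sampled from `source` that also occur
    exactly in `other`. Exact matching stands in for a real aligner."""
    hits = 0
    for _ in range(n_reads):
        start = rng.randrange(len(source) - read_length + 1)
        read = source[start:start + read_length]
        if read in other:
            hits += 1
    return hits / n_reads

def simulate(genome_length=20000, divergences=(0.01, 0.05),
             read_lengths=(25, 50, 100), n_reads=500, seed=0):
    """Return {(divergence, read_length): cross-mapping rate} for toy genomes."""
    rng = random.Random(seed)
    genome_a = "".join(rng.choice(BASES) for _ in range(genome_length))
    results = {}
    for d in divergences:
        genome_b = mutate(genome_a, d, rng)  # simulated related genome
        for length in read_lengths:
            results[(d, length)] = cross_mapping_rate(
                genome_a, genome_b, length, n_reads, rng)
    return results

if __name__ == "__main__":
    for (d, length), rate in sorted(simulate().items()):
        print(f"divergence={d:.2f} read_length={length:3d} cross_mapping={rate:.3f}")
```

Under this simplified model a read cross-maps only if none of its bases were substituted, so the rate falls as either divergence or read length grows. Real aligners tolerate mismatches and real genomes contain repeats and indels, so absolute rates from this sketch should not be compared with the dissertation's results.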

Identifier: oai:union.ndltd.org:MSSTATE/oai:scholarsjunction.msstate.edu:td-2137
Date: 07 May 2016
Creators: Reddy, Joseph S
Publisher: Scholars Junction
Source Sets: Mississippi State University
Detected Language: English
Type: text
Format: application/pdf
Source: Theses and Dissertations
