Global ETD Search

Genomic data mining for the computational prediction of small non-coding RNA genes

The objective of this research is to develop a novel computational prediction algorithm for non-coding RNA (ncRNA) genes using features that can be computed for any genomic sequence without comparative analysis. Existing comparative methods require knowledge of closely related organisms in order to search for sequence and structural similarities. This requirement constrains the types of ncRNAs, the organisms, and the genomic regions in which ncRNAs can be found. We have developed a novel approach to ncRNA gene prediction that avoids these limitations. Our work has established an ncRNA database required for subsequent feature and genomic analysis. Furthermore, we have identified significant features from folding-, structural-, and ensemble-based statistics for use in ncRNA prediction. We have also examined higher-order gene structures, namely operons, for potential insights into how ncRNAs are transcribed. The ability to identify ncRNAs automatically on a genome-wide scale makes the approach well suited to incorporation into a pipeline for large-scale genome annotation. This work will contribute to a more comprehensive annotation of ncRNA genes in microbial genomes to meet the demands of functional and regulatory genomic studies.

http://hdl.handle.net/1853/33966

Computational biology

Non-coding RNA

Data mining

Genomics -- Data processing

Genomes -- Data processing

Identifier	oai:union.ndltd.org:GATECH/oai:smartech.gatech.edu:1853/33966
Date	20 January 2009
Creators	Tran, Thao Thanh Thi
Publisher	Georgia Institute of Technology
Source Sets	Georgia Tech Electronic Thesis and Dissertation Archive
Detected Language	English
Type	Dissertation
