Finding the approximate location of short-read sequences by comparing them with
the complete genome sequence of a closely related organism is a challenging
research problem. The major challenges are predicting intron locations in the
short mRNA-derived sequences called Expressed Sequence Tags (ESTs) and handling
the variability of intron lengths. More specifically, finding the intron positions
in an EST sequence by comparing it with a reference genome sequence is
time-consuming, because it is currently done manually.
In my thesis, I designed a pipeline that predicts the intron positions in ESTs
of non-model organisms. I first compared the ESTs with the most closely related
completely sequenced genome. The pipeline then aligns the ESTs, the reference
genome sequence, and the coding region of the gene (the Coding DNA Sequence, or CDS) from the reference genome.
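The abstract does not describe how the pipeline turns an alignment into intron positions, so the following is only a minimal sketch of one common idea, not the thesis's method. It assumes a pairwise gapped alignment of an EST against the reference genome is already available as two equal-length strings, and treats any long run of gaps in the EST row as a candidate intron. The function name `predict_intron_positions` and the `min_intron_len` threshold are hypothetical choices made for illustration.

```python
def predict_intron_positions(est_aligned, genome_aligned, min_intron_len=20):
    """Return (est_position, intron_length) pairs for candidate introns.

    est_aligned and genome_aligned are the two rows of a pairwise gapped
    alignment ('-' marks a gap). A candidate intron is a run of at least
    min_intron_len gaps in the EST row that is flanked by EST bases.
    The threshold is an illustrative assumption, not a value from the thesis.
    """
    if len(est_aligned) != len(genome_aligned):
        raise ValueError("aligned rows must have equal length")

    introns = []
    est_pos = 0   # number of EST bases consumed so far
    gap_run = 0   # length of the current run of gaps in the EST row

    for est_char, genome_char in zip(est_aligned, genome_aligned):
        if est_char == "-" and genome_char != "-":
            gap_run += 1
            continue
        # The gap run (if any) has ended; keep it only if it is long enough
        # and does not sit before the first EST base.
        if gap_run >= min_intron_len and est_pos > 0:
            introns.append((est_pos, gap_run))
        gap_run = 0
        if est_char != "-":
            est_pos += 1

    # A gap run that reaches the end of the alignment is not flanked by
    # EST bases on both sides, so it is not reported.
    return introns
```

For example, an EST row of four bases, 25 gaps, and four more bases yields one candidate intron of length 25 after EST position 4, while a two-base gap is treated as an ordinary indel and ignored.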
Identifier | oai:union.ndltd.org:LACETR/oai:collectionscanada.gc.ca:MWU.1993/23206 |
Date | 14 January 2014 |
Creators | Mamun, S.M. Al |
Contributors | Domaratzki, Michael (Computer Science), Sharanowski, Barbara J. (Entomology), Leung, Carson Kai-Sang (Computer Science), Fristensky, Brian W. (Plant Science) |
Source Sets | Library and Archives Canada ETDs Repository / Centre d'archives des thèses électroniques de Bibliothèque et Archives Canada |
Detected Language | English |