
Contributions to In Silico Genome Annotation

Genome annotation is important because it provides the foundation for downstream
genomic and biological research. It can be viewed as a way of summarizing part of the
existing knowledge about the genomic characteristics of an organism. Annotating
different regions of a genome sequence is known as structural annotation, while
identifying the functions of these regions is known as functional annotation. In silico
approaches can facilitate both tasks, which would otherwise be difficult and
time-consuming. This study contributes to genome annotation by introducing several
novel bioinformatics methods, some based on machine learning (ML) approaches.
First, we present Dragon PolyA Spotter (DPS), a method for accurate identification of
polyadenylation signals (PAS) within human genomic DNA sequences. For this, we derived
a novel feature set that characterizes properties of the genomic region surrounding the
PAS, enabling the development of optimized, high-accuracy ML predictive models. DPS
considerably outperformed the state of the art.
The second contribution concerns developing generic models for structural annotation,
i.e., the recognition of different genomic signals and regions (GSR) within eukaryotic DNA.
We developed DeepGSR, a systematic framework that facilitates generating ML models
to predict GSR with high accuracy. To the best of our knowledge, no generic, automated
method that could facilitate the study of newly sequenced organisms is available for this
task. The prediction module of DeepGSR uses deep learning algorithms
to derive highly abstract features that depend mainly on proper data representation and
hyperparameter calibration. DeepGSR, which was evaluated on recognition of PAS and
translation initiation sites (TIS) in different organisms, yields a simpler and more precise
representation of the problem under study than some hand-tailored
models, while producing high-accuracy predictions.
Finally, we focus on deriving a model capable of facilitating the functional annotation of
prokaryotes. As far as we know, there is no fully automated system for detailed
comparison of functional annotations generated by different methods. Hence, we
developed BEACON, a method and supporting system that compares gene annotations
from various methods to produce a more reliable and comprehensive annotation. Overall,
our research contributes to different aspects of genome annotation.

Identifier: oai:union.ndltd.org:kaust.edu.sa/oai:repository.kaust.edu.sa:10754/626265
Date: 30 November 2017
Creators: Kalkatawi, Manal M.
Contributors: Bajic, Vladimir B., Computer, Electrical and Mathematical Sciences and Engineering (CEMSE) Division, Moshkov, Mikhail, Arold, Stefan T., Zhang, Zhang
Source Sets: King Abdullah University of Science and Technology
Language: English
Detected Language: English
Type: Dissertation
Rights: 2018-11-30, At the time of archiving, the student author of this dissertation opted to temporarily restrict access to it. The full text of this dissertation became available to the public after the expiration of the embargo on 2018-11-30.
