To understand the processes of life, it is crucial to study proteins and
their functions. Proteins carry out almost all cellular activities, and their functions are described in a standardized way by the Gene Ontology (GO). The number of known protein sequences is growing rapidly as a consequence of the fast development of
gene sequencing technologies. UniProtKB contains more than 200 million
proteins. Still, less than 1% of the proteins in UniProtKB are experimentally GO-annotated, owing to the high cost of biological
experiments. To narrow this gap, developing an efficient and effective
method for automatic protein function prediction (AFP) is essential.
Many approaches have been proposed to solve the AFP problem. Still, these
methods are limited in how domain knowledge is presented and in what types of knowledge are included. In this work, we formulate the
task of AFP as an entailment problem and exploit the structure of the related
knowledge in a set and reusable framework. To achieve this goal, we construct a
knowledge base of formal GO axioms and protein-protein interactions to use as
background knowledge for AFP. Our experiments show that the proposed approach, which is ontology-aware, improves AFP results;
they also show the importance of including protein-protein interactions when predicting protein functions.
Identifier | oai:union.ndltd.org:kaust.edu.sa/oai:repository.kaust.edu.sa:10754/686536 |
Date | 22 November 2022 |
Creators | Qathan, Shahad |
Contributors | Hoehndorf, Robert, Computer, Electrical and Mathematical Science and Engineering (CEMSE) Division, Arold, Stefan T., Moshkov, Mikhail |
Source Sets | King Abdullah University of Science and Technology |
Language | English |
Detected Language | English |
Type | Thesis |
Rights | 2023-12-19, At the time of archiving, the student author of this thesis opted to temporarily restrict access to it. The full text of this thesis will become available to the public after the expiration of the embargo on 2023-12-19. |