Return to search

Investigation and application of artificial intelligence algorithms for complexity metrics based classification of semantic web ontologies

M. Tech. (Department of Information Technology, Faculty of Applied and Computer Sciences), Vaal University of Technology. / The increasing demand for knowledge representation and exchange on the semantic web has resulted in an increase in both the number and size of ontologies. This increased features in ontologies has made them more complex and in turn difficult to select, reuse and maintain them. Several ontology evaluations and ranking tools have been proposed recently. Such evaluation tools provide a metrics suite that evaluates the content of an ontology by analysing their schemas and instances. The presence of ontology metric suites may enable classification techniques in placing the ontologies in various categories or classes. Machine Learning algorithms mostly based on statistical methods used in classification of data makes them the perfect tools to be used in performing classification of ontologies.
In this study, popular Machine Learning algorithms including K-Nearest Neighbors, Support Vector Machines, Decision Trees, Random Forest, Naïve Bayes, Linear Regression and Logistic Regression were used in the classification of ontologies based on their complexity metrics. A total of 200 biomedical ontologies were downloaded from the Bio Portal repository. Ontology metrics were then generated using the OntoMetrics tool, an online ontology evaluation platform. These metrics constituted the dataset used in the implementation of the machine learning algorithms.
The results obtained were evaluated with performance evaluation techniques, namely, precision, recall, F-Measure Score and Receiver Operating Characteristic (ROC) curves. The Overall accuracy scores for K-Nearest Neighbors, Support Vector Machines, Decision Trees, Random Forest, Naïve Bayes, Logistic Regression and Linear Regression algorithms were 66.67%, 65%, 98%, 99.29%, 74%, 64.67%, and 57%, respectively. From these scores, Decision Trees and Random Forests algorithms were the best performing and can be attributed to the ability to handle multiclass classifications.

Identiferoai:union.ndltd.org:netd.ac.za/oai:union.ndltd.org:vut/oai:digiresearch.vut.ac.za:10352/533
Date11 1900
CreatorsKoech, Gideon Kiprotich
ContributorsFonou-Dombeu, J. V., Dr.
PublisherVaal University of Technology
Source SetsSouth African National ETD Portal
LanguageEnglish
Detected LanguageEnglish
TypeThesis

Page generated in 0.002 seconds