With the growing amount of information available on the internet, making full use of that information has become an urgent issue. One solution is to use ontology alignment to aggregate different sources of information and obtain a more comprehensive and complete picture. Scalability is a known problem for ontology alignment, and it can be addressed by reducing the search space of mapping suggestions. In this paper we propose an automated procedure that prunes the search space, mainly by means of clustering techniques. The main focus of the paper is to evaluate the clustering-related techniques that can be applied in our system. K-means, Chameleon and Birch are studied and evaluated; each parameter of these clustering algorithms is examined in separate experiments in order to find the best clustering setting for the ontology clustering problem. Four different similarity assignment methods are investigated and analyzed as well. Tf-idf vectors and cosine similarity are used to identify similar clusters in the two ontologies, and experiments on the cosine similarity threshold are carried out to find the most suitable value. Our system successfully builds an automated procedure that generates a reduced search space for ontology alignment. On the one hand, the results show that it reduces the number of comparisons the ontology alignment would otherwise have to make by a factor of twenty to ninety, and precision increases as well. On the other hand, it needs only one to two minutes of execution time, while recall and F-score drop only slightly. This trade-off is acceptable for an ontology alignment system that would otherwise take tens of minutes to generate the alignment for the same ontology set. As a result, large-scale ontology alignment becomes more computable and feasible.
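The cluster-matching step described in the abstract (tf-idf vectors compared by cosine similarity against a threshold) can be illustrated with a minimal sketch. This is not the thesis implementation: the function names, the smoothed idf formula, the toy cluster contents and the threshold value of 0.3 are all assumptions chosen for illustration, and each cluster is simply treated as a bag of tokens taken from its concept labels.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Return one sparse tf-idf vector (dict: term -> weight) per token list."""
    n_docs = len(docs)
    # Document frequency: number of token lists in which each term occurs.
    df = Counter(term for doc in docs for term in set(doc))
    vectors = []
    for doc in docs:
        tf = Counter(doc)
        vectors.append({
            # Relative term frequency times a smoothed idf (an assumed variant).
            term: (count / len(doc)) * (math.log((1 + n_docs) / (1 + df[term])) + 1)
            for term, count in tf.items()
        })
    return vectors


def cosine(u, v):
    """Cosine similarity of two sparse vectors; 0.0 if either one is empty."""
    dot = sum(w * v.get(term, 0.0) for term, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def candidate_cluster_pairs(clusters_a, clusters_b, threshold):
    """Return (cluster_a, cluster_b) name pairs whose similarity >= threshold."""
    docs = list(clusters_a.values()) + list(clusters_b.values())
    vecs = tfidf_vectors(docs)
    vecs_a = dict(zip(clusters_a, vecs[:len(clusters_a)]))
    vecs_b = dict(zip(clusters_b, vecs[len(clusters_a):]))
    return [
        (a, b)
        for a, va in vecs_a.items()
        for b, vb in vecs_b.items()
        if cosine(va, vb) >= threshold
    ]


# Hypothetical clusters: tokens from the concept labels in each cluster.
clusters_a = {
    "A1": ["heart", "ventricle", "atrium", "valve"],
    "A2": ["lung", "bronchus", "alveolus"],
}
clusters_b = {
    "B1": ["heart", "valve", "aorta"],
    "B2": ["lung", "alveolus", "trachea"],
}

# The threshold 0.3 is an arbitrary illustrative value, not one from the thesis.
pairs = candidate_cluster_pairs(clusters_a, clusters_b, threshold=0.3)
```

Only concepts inside the cluster pairs that pass the threshold would then be handed to the alignment system, which is where the reduction in comparisons comes from. In this toy case, clusters with no shared vocabulary score zero and are never compared concept by concept.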
Identifier | oai:union.ndltd.org:UPSALLA1/oai:DiVA.org:liu-141887 |
Date | January 2017 |
Creators | Gao, Zhiming |
Publisher | Linköpings universitet, Databas och informationsteknik |
Source Sets | DiVA Archive at Uppsala University |
Language | English |
Detected Language | English |
Type | Student thesis, info:eu-repo/semantics/bachelorThesis, text |
Format | application/pdf |
Rights | info:eu-repo/semantics/openAccess |