
Background annotation of entities in Linked Data vocabularies / Background annotation entit v Linked Data slovníků

One of the key features behind Linked Data is the use of vocabularies that allow datasets to share a common language to describe similar concepts and relationships and to resolve ambiguities between them. The development of vocabularies is often driven by a consensus process among dataset implementers, in which the criterion of interoperability is considered sufficient. This can lead to misrepresentation of real-world entities by vocabulary entities. Such drawbacks can be fixed by using a formal methodology for modelling Linked Data vocabulary entities and identifying ontological distinctions. One proven example is the OntoClean methodology for cleaning up taxonomies. This work presents a software tool that implements the PURO approach to modelling ontological distinctions. PURO models vocabularies as Ontological Foreground Models (OFM) and the structure of ontological distinctions as Ontological Background Models (OBM), which are constructed using meta-properties attached to vocabulary entities in a process known as vocabulary annotation. The tool, named the Background Annotation plugin, is written in Java and integrated into the Protégé ontology editor. It enables a user to annotate vocabulary entities graphically through an annotation workflow that implements, among other things, the persistence and retrieval of annotations. Two kinds of workflow are supported, generic and dataset-specific, in order to differentiate the usage of a vocabulary, in terms of a PURO OBM, with respect to a given Linked Data dataset. The workflow is enhanced by dataset statistical indicators retrieved through the Sindice service for a sample of chosen datasets, such as the number of entities present in a dataset and the relative frequency of vocabulary entities in that dataset. A further enhancement is provided by dataset summaries that offer an overview of the most common entity-property paths found in a dataset.
Foreseen uses of the Background Annotation plugin include: 1) checking the mapping agreement between different datasets, as produced by the R2R framework, and 2) annotating dependent resources in Concise Bounded Descriptions of entities, used in data sampling from Linked Data datasets for data mining purposes.

Identifier oai:union.ndltd.org:nusl.cz/oai:invenio.nusl.cz:162758
Date January 2012
Creators Serra, Simone
Contributors Svátek, Vojtěch, Zamazal, Ondřej
Publisher Vysoká škola ekonomická v Praze
Source Sets Czech ETDs
Language English
Detected Language English
Type info:eu-repo/semantics/masterThesis
Rights info:eu-repo/semantics/restrictedAccess
