Reasoning about quality in the Web of Linked Data

In recent years the Web has evolved from a collection of hyperlinked documents into a vast ecosystem of interconnected documents, devices, services, and agents. However, the open nature of the Web allows anyone or anything to publish any content they choose. Poor quality data can therefore propagate quickly, and an appropriate mechanism for assessing the quality of such data is essential if agents are to identify reliable information for use in decision-making. Existing assessment frameworks investigate the context around data (additional information that describes the situation in which a datum was created). Such metadata can be made available by publishing information to the Web of Linked Data. However, there are situations in which examining context alone is not sufficient, such as when one must identify the agent responsible for data creation or the transformational processes applied to data. In these situations, examining data provenance is critical to identifying quality issues. Moreover, there will be situations in which an agent is unable to perform a quality assessment of their own, for example if the original contextual metadata is no longer available. Here, it may be possible for agents to explore the provenance of previous quality assessments and make decisions about re-using quality results. This thesis explores issues around quality assessment and provenance in the Web of Linked Data. It contributes a formal model of quality assessment designed to align with emerging standards for provenance on the Web. This model is then realised as an OWL ontology, which can be used as part of a software framework to perform data quality assessment. Through a number of real-world examples, spanning environmental sensing, invasive species monitoring, and passenger information domains, the thesis establishes the importance of examining provenance as part of quality assessment. Moreover, it demonstrates that, by examining quality assessment provenance, agents can make re-use decisions about existing quality assessment results. The implementations of these examples include sets of example quality metrics that demonstrate how such metrics can be encoded using the SPARQL Inferencing Notation (SPIN).

Identifier: oai:union.ndltd.org:bl.uk/oai:ethos.bl.uk:665349
Date: January 2015
Creators: Baillie, Chris
Publisher: University of Aberdeen
Source Sets: Ethos UK
Detected Language: English
Type: Electronic Thesis or Dissertation
Source: http://digitool.abdn.ac.uk:80/webclient/DeliveryManager?pid=227177