
Integrating and querying semantic annotations

Semantic annotations are crucial for turning unstructured text into more meaningful, machine-understandable information. Acquiring semantically enriched information at scale would benefit the applications that consume it. At present there is a plethora of commercial and open-source services and tools for enriching documents with semantic annotations. Since there has been limited effort to compare such annotators, this study first surveys and compares them along multiple dimensions, including their techniques, coverage and annotation quality. The overlap and diversity in the annotators' capabilities motivate the need for semantic annotation integration: middleware that produces a unified annotation of improved quality on top of diverse semantic annotators. Integrating semantic annotations raises new challenges, compared both with usual data integration scenarios and with standard aggregation of machine learning tools. A set of approaches to these challenges is proposed that perform ontology-aware aggregation, adapting Maximum Entropy Markov models to the setting of ontology-based annotations. These approaches are then compared with existing ontology-unaware supervised approaches, ontology-aware unsupervised methods and individual annotators, and demonstrate their effectiveness through an overall improvement in all testing scenarios. A middleware system, ROSeAnn, and its corresponding APIs have been developed. This study is also concerned with the availability and usability of semantically rich data. The second focus of this thesis is therefore to allow users to query text annotated by different annotators using both explicit and implicit knowledge. We describe our first step towards this: a query language and a prototype system, QUASAR, that provides a uniform way to query multiple facets of annotated documents. We show how integrating semantic annotations and utilizing external knowledge help to increase the quality of query answers over annotated documents.

Identifier: oai:union.ndltd.org:bl.uk/oai:ethos.bl.uk:629549
Date: January 2014
Creators: Chen, Luying
Contributors: Benedikt, Michael
Publisher: University of Oxford
Source Sets: Ethos UK
Detected Language: English
Type: Electronic Thesis or Dissertation
Source: http://ora.ox.ac.uk/objects/uuid:9998a902-fc12-40fc-81ea-b669079abc95
