Return to search

A framework for semantic web implementation based on context-oriented controlled automatic annotation.

The Semantic Web is the vision of the future Web. Its aim is to enable machines to process Web documents in a way that makes it possible for the computer software to "understand" the meaning of the document contents. Each document on the Semantic Web is to be enriched with meta-data that express the semantics of its contents. Many infrastructures, technologies and standards have been developed and have proven their theoretical use for the Semantic Web, yet very few applications have been created. Most of the current Semantic Web applications were developed for research purposes. This project investigates the major factors restricting the wide spread of Semantic Web applications. We identify the two most important requirements for a successful implementation as the automatic production of the semantically annotated document, and the creation and maintenance of semantic based knowledge base.
This research proposes a framework for Semantic Web implementation based on context-oriented controlled automatic Annotation; for short, we called the framework the Semantic Web Implementation Framework (SWIF) and the system that implements this framework the Semantic Web Implementation System (SWIS). The proposed architecture provides for a Semantic Web implementation of stand-alone websites that automatically annotates Web pages before being uploaded to the Intranet or Internet, and maintains persistent storage of Resource Description Framework (RDF) data for both the domain memory, denoted by Control Knowledge, and the meta-data of the Web site¿s pages. We believe that the presented implementation of the major parts of SWIS introduce a competitive system with current state of art Annotation tools and knowledge management systems; this is because it handles input documents in the
ii
context in which they are created in addition to the automatic learning and verification of knowledge using only the available computerized corporate databases. In this work, we introduce the concept of Control Knowledge (CK) that represents the application¿s domain memory and use it to verify the extracted knowledge. Learning is based on the number of occurrences of the same piece of information in different documents. We introduce the concept of Verifiability in the context of Annotation by comparing the extracted text¿s meaning with the information in the CK and the use of the proposed database table Verifiability_Tab. We use the linguistic concept Thematic Role in investigating and identifying the correct meaning of words in text documents, this helps correct relation extraction. The verb lexicon used contains the argument structure of each verb together with the thematic structure of the arguments. We also introduce a new method to chunk conjoined statements and identify the missing subject of the produced clauses. We use the semantic class of verbs that relates a list of verbs to a single property in the ontology, which helps in disambiguating the verb in the input text to enable better information extraction and Annotation. Consequently we propose the following definition for the annotated document or what is sometimes called the ¿Intelligent Document¿ ¿The Intelligent Document is the document that clearly expresses its syntax and semantics for human use and software automation¿.
This work introduces a promising improvement to the quality of the automatically generated annotated document and the quality of the automatically extracted information in the knowledge base. Our approach in the area of using Semantic Web
iii
technology opens new opportunities for diverse areas of applications. E-Learning applications can be greatly improved and become more effective.

Identiferoai:union.ndltd.org:BRADFORD/oai:bradscholars.brad.ac.uk:10454/3207
Date January 2009
CreatorsHatem, Muna Salman
ContributorsNeagu, Daniel, Ramadan, Haider
PublisherUniversity of Bradford, Department of Computer Science
Source SetsBradford Scholars
LanguageEnglish
Detected LanguageEnglish
TypeThesis, doctoral, PhD
Rights<a rel="license" href="http://creativecommons.org/licenses/by-nc-nd/3.0/"><img alt="Creative Commons License" style="border-width:0" src="http://i.creativecommons.org/l/by-nc-nd/3.0/88x31.png" /></a><br />The University of Bradford theses are licenced under a <a rel="license" href="http://creativecommons.org/licenses/by-nc-nd/3.0/">Creative Commons Licence</a>.

Page generated in 0.0029 seconds