
Ontology Generation, Information Harvesting and Semantic Annotation for Machine-Generated Web Pages

Tao, Cui, 17 December 2008
The current World Wide Web is a web of pages. Users must guess keywords that might lead, through search engines, to pages containing the information of interest, and then browse hundreds or even thousands of returned pages to find what they want. This frustrating problem motivates an approach that turns the web of pages into a web of knowledge, so that web users can query the information of interest directly. This dissertation takes a step in this direction and offers a way to partially overcome the challenges. Specifically, it shows how to turn machine-generated web pages, such as those on the hidden web, into semantic web pages for the web of knowledge. We design and develop three systems to address this challenge: TISP (Table Interpretation for Sibling Pages), TISP++, and FOCIH (Form-based Ontology Creation and Information Harvesting). TISP can automatically interpret hidden-web tables. Given interpreted tables, TISP++ can automatically generate ontologies and semantically annotate the information in those tables. In this way, we offer a means of making the hidden information publicly accessible. We also give users a way to generate personalized ontologies. FOCIH provides an interface through which users can express their own view by creating a form that specifies the information they want. Based on the form, FOCIH can generate user-specific ontologies, and based on patterns in machine-generated pages, FOCIH can harvest information and annotate these pages with respect to the generated ontology. Users can then query the annotated information directly. With these contributions, this dissertation serves as a foundational pillar for turning the current web of pages into a web of knowledge.
