
An ontology-driven concept-based information retrieval approach for web documents

Building computer agents that can utilize the meanings in the text of Web documents is a promising extension of current search technology. Concept-based information retrieval applies "intelligent" agents to identify Web documents that match user queries. A new concept-based information retrieval framework, Hybrid Ontology-based Textual Information Retrieval (HOTIR), is introduced in this thesis. HOTIR accepts conventional keyword-based queries, translates them into concept-based queries, enriches definitions of concepts with supplementary knowledge from a knowledge base, and ranks documents by aggregating "equivalent" concepts identified in them. The concept-based queries in HOTIR are organized in a hierarchy of concepts (HofC) and definitions of concepts are added from a knowledge base to enhance their meanings. The knowledge base is a modified ontology (ModOnt) that can enrich the HofC with concept definitions in the form of related-concepts, terms, their importance values, and their relations. The ModOnt relies on an adaptive assignment of term importance (AATI) scheme that continuously updates the importance of terms/concepts using Web documents. The identified concepts in a Web document that match those in the HofC are evaluated using ordered weighted averaging (OWA) operators, and documents are ranked according to the degree to which they satisfy the HofC. The case studies and experiments presented in the thesis are designed to validate the performance of HOTIR. / Computer Engineering
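The ranking step described above aggregates per-concept matching degrees with an ordered weighted averaging (OWA) operator. The following is a minimal sketch of a standard OWA aggregation, not the thesis's implementation: the abstract does not specify the weight vectors or how matching degrees are computed, so the function names (`owa`, `rank_documents`), the sample scores, and the weights are all assumptions chosen for illustration.

```python
from typing import Dict, List, Sequence, Tuple


def owa(values: Sequence[float], weights: Sequence[float]) -> float:
    """Standard OWA: sort values in descending order, then take a weighted sum.

    The weights attach to positions in the sorted list, not to particular
    concepts. Weights must be non-negative and sum to 1.
    """
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    if any(w < 0 for w in weights) or abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("weights must be non-negative and sum to 1")
    ordered = sorted(values, reverse=True)
    return sum(w * v for w, v in zip(weights, ordered))


def rank_documents(
    doc_scores: Dict[str, List[float]], weights: Sequence[float]
) -> List[Tuple[str, float]]:
    """Rank documents by the OWA aggregate of their concept-matching degrees.

    doc_scores maps a hypothetical document id to matching degrees in [0, 1],
    one per concept in the query hierarchy (assumed to be given).
    """
    scored = [(doc, owa(scores, weights)) for doc, scores in doc_scores.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical matching degrees for two documents against three concepts.
example_scores = {
    "doc_a": [0.9, 0.1, 0.1],  # one strong match, two weak ones
    "doc_b": [0.5, 0.5, 0.5],  # uniformly moderate matches
}
# Weights that emphasize the lower-ranked values ("most concepts must match").
example_weights = [0.2, 0.3, 0.5]
ranking = rank_documents(example_scores, example_weights)
```

With weights `[1, 0, 0]` the operator reduces to the maximum and with `[0, 0, 1]` to the minimum, so the choice of weight vector controls how strictly a document must satisfy all concepts rather than just some of them.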

Identifier: oai:union.ndltd.org:LACETR/oai:collectionscanada.gc.ca:AEU.10048/1235
Date: 11 1900
Creators: Li, Zhan
Contributors: Reformat, Marek (Electrical and Computer Engineering), Kreinovich, Vladik (Computing Science), Szymanski, Jozef (Civil and Environmental Engineering), Pedrycz, Witold (Electrical and Computer Engineering), Musilek, Petr (Electrical and Computer Engineering), Zhao, Vicky (Electrical and Computer Engineering)
Source Sets: Library and Archives Canada ETDs Repository / Centre d'archives des thèses électroniques de Bibliothèque et Archives Canada
Language: English
Detected Language: English
Type: Thesis
Format: 2807161 bytes, application/pdf
