Information retrieval has been a prevasive part of human society since its existence.With the advent of internet and World wide Web it became an extensive area of researchand major foucs, which lead to development of various search engines to locate the de-sired information, mostly for globally connected computer networks viz. internet.Butthere is another major part of computer network viz. intranet, which has not seen muchof advancement in information retrieval approaches, in spite of being a major source ofinformation within a large number of organizations.Most common technique for intranet based search engines is still mere database-centric. Thus practically intranets are unable to avail the benefits of sophisticated tech-niques that have been developed for internet based search engines without exposing thedata to commercial search engines.In this Master level thesis we propose a ”state of the art architecture” for an advancedsearch engine for intranet which is capable of dealing with continuously growing sizeof intranets knowledge base. This search engine employs lexical processing of doc-umetns,where documents are indexed and searched based on standalone terms or key-words, along with the semantic processing of the documents where the context of thewords and the relationship among them is given more importance.Combining lexical and semantic processing of the documents give an effective ap-proach to handle navigational queries along with research queries, opposite to the modernsearch engines which either uses lexical processing or semantic processing (or one as themajor) of the documents. We give equal importance to both the approaches in our design,considering best of the both world.This work also takes into account various widely acclaimed concepts like inferencerules, ontologies and active feedback from the user community to continuously enhanceand improve the quality of search results along with the possibility to infer and deducenew knowledge from the existing one, while preparing for the advent of semantic web.
Identifer | oai:union.ndltd.org:UPSALLA1/oai:DiVA.org:mdh-9408 |
Date | January 2009 |
Creators | Narayan, Nitesh |
Publisher | Mälardalens högskola, Akademin för innovation, design och teknik |
Source Sets | DiVA Archive at Upsalla University |
Language | English |
Detected Language | English |
Type | Student thesis, info:eu-repo/semantics/masterThesis, text |
Format | application/pdf |
Rights | info:eu-repo/semantics/openAccess |
Page generated in 0.0017 seconds