Most information retrieval (IR) systems respond to users' representation of their information needs (queries) with a ranked list of relevant results, usually text documents. XML documents differ from traditional text documents by explicitly separating structure and content. XML-IR systems aim to exploit this separation by searching and retrieving relevant components of documents (called elements) rather than entire documents, thereby better fulfilling users' information needs. Despite the potential benefit of XML-IR systems, most research in this area has not been centered on the needs of users. In particular, current XML-IR query formation interfaces, namely keywords-only and formal language, are not able to optimally address the needs of users. Keywords-only interfaces are too unsophisticated to fully capture users' complex information needs, which contain both content and structural requirements. In contrast, while formal languages are able to capture users' content and structural requirements, they are too difficult to use, even for experts, and are too closely tied to the physical structure of the collection. This thesis addresses these problems by presenting NLPX, a natural language interface for XML-IR systems. NLPX allows users to enter XML-IR queries in natural language and translates them into a formal language (NEXI) to be processed by existing XML retrieval systems. When evaluated by system testing, NLPX outperformed alternative translation approaches. When tested in a user-based experiment, NLPX performed comparably to a query-by-template interface, the baseline user-oriented interface for formulating structured queries. It is hoped that the outcomes of this thesis will help to refocus the field of XML-IR around the user. This will lead to the development of more useful XML-IR systems, which will hopefully result in the more widespread use of XML-IR systems.
Identifier | oai:union.ndltd.org:ADTP/265633 |
Date | January 2008 |
Creators | Woodley, Alan Paul |
Publisher | Queensland University of Technology |
Source Sets | Australasian Digital Theses Program |
Detected Language | English |
Rights | Copyright Alan Paul Woodley |