This dissertation studies the creation of maps that help people seek information in Internet documents. The study draws on several research areas in computer and information science, including web mining, data mining, artificial neural networks (in particular self-organising maps, SOM), information visualisation, user interfaces and information retrieval. The purpose of this dissertation is to offer an alternative way to retrieve information by visually representing the characteristics of unseen documents and their relationships on the 2-dimensional surface of the SOM. The process starts with collecting documents that include text and images from the Internet, then extracts important features from them; in other words, it performs the indexing step of information retrieval. The document features are then clustered using the SOM, so that documents with similar features are grouped together on 2-dimensional maps. The maps are labelled, and the documents are connected to locations on the maps based on the labels. The maps are then arranged hierarchically and visualised so that they can be used as a browsing and exploration tool for information retrieval.

We propose a novel method for automatically labelling the SOM, called HLabelSOM, that produces hierarchical maps and allows documents to be placed at more than one location on the map. A visualisation interface, called DocMap, displays these hierarchical maps to help people seeking information. The different levels of the hierarchical maps serve users with different information needs, from the need for general information to the need for documents on specific topics. Moreover, users may change their intent during the search process, switching from a more general to a more detailed focus or vice versa. The flexibility of placing documents in more than one location increases the chance of finding the desired documents. Most importantly, DocMap establishes a mental contact between the user and the set of documents: the user is able to see the relationships among document topics and find the desired documents with reasonable time and effort.

Thesis (PhD)--University of South Australia, 2005.
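The clustering step described in the abstract can be illustrated with a minimal sketch of a generic self-organising map. This is not HLabelSOM or the code behind DocMap: the function names, the grid size, the training schedule and the toy term-count vectors are all assumptions made for illustration.

```python
import math
import random


def best_matching_unit(weights, vector):
    """Return the grid node whose weight vector is closest to `vector`."""
    return min(
        weights,
        key=lambda node: sum((w - x) ** 2 for w, x in zip(weights[node], vector)),
    )


def train_som(vectors, rows, cols, epochs=200, lr0=0.5, seed=0):
    """Train a small rectangular SOM with a Gaussian neighbourhood (hypothetical helper)."""
    rng = random.Random(seed)
    dim = len(vectors[0])
    # One randomly initialised weight vector per grid node.
    weights = {
        (r, c): [rng.random() for _ in range(dim)]
        for r in range(rows)
        for c in range(cols)
    }
    sigma0 = max(rows, cols) / 2.0
    for epoch in range(epochs):
        frac = epoch / epochs
        lr = lr0 * (1.0 - frac)                  # learning rate decays linearly
        sigma = max(sigma0 * (1.0 - frac), 0.5)  # neighbourhood radius shrinks
        for vector in rng.sample(vectors, len(vectors)):
            bmu = best_matching_unit(weights, vector)
            for node, w in weights.items():
                grid_d2 = (node[0] - bmu[0]) ** 2 + (node[1] - bmu[1]) ** 2
                h = math.exp(-grid_d2 / (2.0 * sigma * sigma))
                # Pull each node towards the input, weighted by grid distance to the BMU.
                for i in range(dim):
                    w[i] += lr * h * (vector[i] - w[i])
    return weights


def place_documents(weights, docs):
    """Map each document name to the grid node that best matches its features."""
    placement = {}
    for name, vector in docs.items():
        placement.setdefault(best_matching_unit(weights, vector), []).append(name)
    return placement


# Toy term-count vectors over an assumed four-word vocabulary (illustrative only).
docs = {
    "doc_a": [3.0, 2.0, 0.0, 0.0],
    "doc_b": [3.0, 2.0, 0.0, 0.0],
    "doc_c": [0.0, 0.0, 2.0, 3.0],
}
weights = train_som(list(docs.values()), rows=3, cols=3)
placement = place_documents(weights, docs)
```

In this sketch each document is attached to a single node. Allowing a document to appear at more than one location, as the abstract describes, would need a different assignment rule, which is not attempted here.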
Identifier | oai:union.ndltd.org:ADTP/267537 |
Creators | Tan, Hiong Sen. |
Source Sets | Australasian Digital Theses Program |
Language | English |
Detected Language | English |
Rights | copyright under review |