Return to search

Linking textual resources to support information discovery

A vast amount of information is today stored in the form of textual documents, many of which are available online. These documents come from different sources and are of different types. They include newspaper articles, books, corporate reports, encyclopedia entries and research papers. At a semantic level, these documents contain knowledge, which was created by explicitly connecting information and expressing it in the form of a natural language. However, a significant amount of knowledge is not explicitly stated in a single document, yet can be derived or discovered by researching, i.e. accessing, comparing, contrasting and analysing, information from multiple documents. Carrying out this work using traditional search interfaces is tedious due to information overload and the difficulty of formulating queries that would help us to discover information we are not aware of. In order to support this exploratory process, we need to be able to effectively navigate between related pieces of information across documents. While information can be connected using manually curated cross-document links, this approach not only does not scale, but cannot systematically assist us in the discovery of sometimes non-obvious (hidden) relationships. Consequently, there is a need for automatic approaches to link discovery. This work studies how people link content, investigates the properties of different link types, presents new methods for automatic link discovery and designs a system in which link discovery is applied on a collection of millions of documents to improve access to public knowledge.

Identiferoai:union.ndltd.org:bl.uk/oai:ethos.bl.uk:647512
Date January 2015
CreatorsKnoth, Petr
PublisherOpen University
Source SetsEthos UK
Detected LanguageEnglish
TypeElectronic Thesis or Dissertation
Sourcehttp://oro.open.ac.uk/42677/

Page generated in 0.0019 seconds