• Refine Query
  • Source
  • Publication year
  • to
  • Language
  • 1
  • Tagged with
  • 1
  • 1
  • 1
  • 1
  • 1
  • 1
  • About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
1

Graph-Based Representations for Textual Case-Based Reasoning

Valle, Kjetil January 2011 (has links)
This thesis presents a graph-based approach to the problem of text representation. The work is motivated by the need for better representations for use in textual Case-Based Reasoning (CBR). In CBR new problems are solved by reasoning based on similar past problem cases. When the cases are represented in free text format, measuring the similarity between a new problem and previously solved problems become a challenging task. The case documents need to be re-represented before they can be compared/matched.Textual CBR (TCBR) addresses this issue. We investigate automatic re-representation of textual cases, in particular measuring the salience of features (entities in the text) towards this end. We use the classical vector space model in Information Retrieval (IR) but investigate whether graph-representation and salience inference using graphs can improve on the Term Frequency (TF) and Term Frequency-Inverse Document Frequency (TF-IDF) measures, emph{bag of words} approaches predominant in IR.Our special focus is whether, and possibly how, the co-occurrence and the syntactic dependency relations between terms have an impact on feature weighting. We measure salience through the notion of graph centrality. We experiment with two types of application tasks, classification and case retrieval. Although classification is not a typical TCBR task, it is easier to find datasets for this application, and the centrality measures we have studied are not specific to TCBR. The experiments on this task are therefore relevant to the second application task which is our ultimate target. We test various centrality metrics described in the literature, make a distinction between local and global weighting measures and compare them for both application tasks. In general, our graph-based salience inference methods perform better than TF and TF-IDF.

Page generated in 0.0153 seconds