In this dissertation, we present a framework to enhance the retrieval, ranking, and categorization of text documents in medical domain. The contributions of this study are the introduction of a similarity model to retrieve and rank medical textdocuments and the introduction of rule-based categorization method based on lexical syntactic patterns features. We formulate the similarity model by combining three features to model the relationship among document and construct a document network. We aim to rank retrieved documents according to their topics / making highly relevant document on the top of the hit-list. We have applied this model on OHSUMED collection (TREC-9) in order to demonstrate the performance effectiveness in terms of topical ranking, recall, and precision metrics.
In addition, we introduce ROLEX-SP (Rules Of LEXical Syntactic Patterns) / a method for the automatic induction of rule-based text-classifiers relies on lexical syntactic patterns as a set of features to categorize text-documents. The proposed method is dedicated to solve the problem of multi-class classification and feature imbalance problems in domain specific text documents. Furthermore, our proposed
method is able to categorize documents according to a predefined set of characteristics such as: user-specific, domain-specific, and query-based categorization which facilitates browsing documents in search-engines and increase
users ability to choose among relevant documents. To demonstrate the applicability of ROLEX-SP, we have performed experiments on OHSUMED (categorization
collection). The results indicate that ROLEX-SP outperforms state-of-the-art methods in categorizing short-text medical documents.
Identifer | oai:union.ndltd.org:METU/oai:etd.lib.metu.edu.tr:http://etd.lib.metu.edu.tr/upload/2/12611996/index.pdf |
Date | 01 June 2010 |
Creators | Al Zamil, Mohammed Gh. I. |
Contributors | Betin Can, Aysu |
Publisher | METU |
Source Sets | Middle East Technical Univ. |
Language | English |
Detected Language | English |
Type | Ph.D. Thesis |
Format | text/pdf |
Rights | To liberate the content for METU campus |
Page generated in 0.0095 seconds